myrkraverk.......sourceforge.... wrote:
> In a plain text environment, there is often a need to encode more than
> just the plain character.
...
> Since I'm using 64 bits, I call it Excessive Memory Usage Encoding, or
> EMUE.
...
> I thought of dividing the 64 bit code space into 32 variably wide
> plains, one for control characters, one for latin characters, one for
> han characters, and so on;

This all seems to me like something of a pointless exercise. Or maybe you're not making clear who your intended audience of users is and what problems you're trying to solve.

Decent libraries already exist that do nice things with strings having attributes. That, in my opinion, is a better model than bit-hacking in a 64-bit space with vague implementation-defined attributes that change depending on the "script" of a character. Such "attributed strings" are easy to work with and provide a much higher-level model than this.

You might want to check out Apple's Cocoa environment, particularly the definitions of the attributed string classes. For example:

http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Java/Classes/NSAttributedString.html

or even the intro:

http://developer.apple.com/documentation/Cocoa/Conceptual/AttributedStrings/index.html

I'm sure there are libraries with similar capabilities for storing characters plus attributes in Java and other languages; I'm just not familiar with them. Maybe some of the developers can chime in with their favorite attributed string libraries. Even if you don't use one, you might find the attributed string model educational.

(All of the above of course reflects only my personal opinion.)

Rick
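
To make the attributed string model mentioned above a little more concrete, here is a minimal sketch using java.text.AttributedString from the standard Java library. It is not the Cocoa API linked above and not part of the EMUE proposal. The class name AttributedStringSketch, the sample sentence, and the choice of the LANGUAGE attribute are assumptions made purely for illustration. The point is that the characters stay ordinary characters, while the extra information is attached to ranges of the string.

```java
import java.text.AttributedCharacterIterator;
import java.text.AttributedCharacterIterator.Attribute;
import java.text.AttributedString;
import java.text.CharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; the name is not taken from any library.
class AttributedStringSketch {

    // Builds ordinary text and tags one range of it with a language attribute.
    static AttributedString build() {
        // \u6771\u4EAC are the two han characters for "Tokyo", at indices 9 and 10.
        AttributedString text = new AttributedString("Tokyo is \u6771\u4EAC in Japanese");
        // The end index (11) is exclusive, so this covers exactly those two characters.
        text.addAttribute(Attribute.LANGUAGE, Locale.JAPANESE, 9, 11);
        return text;
    }

    // Walks the string run by run; a run is a maximal range with identical attributes.
    static List<String> describeRuns(AttributedString text) {
        List<String> runs = new ArrayList<>();
        AttributedCharacterIterator it = text.getIterator();
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.setIndex(it.getRunLimit())) {
            Object language = it.getAttribute(Attribute.LANGUAGE); // null when the run is untagged
            runs.add(it.getRunStart() + "-" + it.getRunLimit() + ":" + language);
        }
        return runs;
    }

    public static void main(String[] args) {
        for (String run : describeRuns(build())) {
            System.out.println(run);
        }
    }
}
```

Adding a new kind of attribute in this model means defining another attribute key, not carving out another region of a code space, and the underlying text remains usable by anything that only understands plain characters.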

