> -----Original Message-----
> From: Philippe Verdy [mailto:[EMAIL PROTECTED]]
>
> The same
> optimization can be done in Java by subclassing the String
> class to add a "form" field and related form conversion (getters)
> and test methods.

Only slightly confused about this. The Java String class is declared final in the API, and therefore cannot be subclassed. One would have to write an alternative String class (not rocket science of course, but still a tad more involved than subclassing).
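
For anyone following along, here is roughly what I mean by an alternative class. This is only a minimal sketch, not Philippe's actual code: the name FormString and its methods are made up for illustration, and it leans on java.text.Normalizer, which is an assumption about the runtime (it is in the standard library from Java 6 onwards). It wraps a String rather than extending it, and remembers the normalization form once that is known, so that asking for the same form again costs nothing.

```java
import java.text.Normalizer;

// Hypothetical wrapper: String is final, so we hold one instead of extending it.
final class FormString implements CharSequence {
    private final String value;
    // Normalization form this text is known to be in; null means "unknown".
    private final Normalizer.Form form;

    FormString(String value) {
        this(value, null);
    }

    private FormString(String value, Normalizer.Form form) {
        this.value = value;
        this.form = form;
    }

    // Getter for the cached form (may be null if never normalized).
    Normalizer.Form getForm() {
        return form;
    }

    // Test method: trusts the cached form first, otherwise asks Normalizer.
    boolean isInForm(Normalizer.Form target) {
        return form == target || Normalizer.isNormalized(value, target);
    }

    // Conversion: returns this object unchanged when the form is already cached.
    FormString toForm(Normalizer.Form target) {
        if (form == target) {
            return this;
        }
        return new FormString(Normalizer.normalize(value, target), target);
    }

    @Override public int length() { return value.length(); }
    @Override public char charAt(int index) { return value.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return value.subSequence(start, end);
    }
    @Override public String toString() { return value; }
}
```

Because it implements CharSequence, such a class can be passed to many library methods that do not insist on a real String, though anything declared to take String still needs toString().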

> In fact, to further optimize and reduce the
> memory footprint of Java strings, I chose to store
> the String in an array of bytes

Okay. That explains that then.
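
A byte-backed string might look something like the sketch below. The quoted message does not say which encoding is used, so UTF-8 is purely an assumption on my part, and ByteString is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical compact string: stores encoded bytes instead of 16-bit chars.
final class ByteString {
    // Assumption: UTF-8 encoding; the original message does not name one.
    private final byte[] utf8;

    ByteString(String s) {
        this.utf8 = s.getBytes(StandardCharsets.UTF_8);
    }

    // Number of bytes actually stored (not the number of characters).
    int byteLength() {
        return utf8.length;
    }

    // Decodes back to an ordinary String on demand.
    @Override public String toString() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    @Override public boolean equals(Object other) {
        return other instanceof ByteString
                && Arrays.equals(utf8, ((ByteString) other).utf8);
    }

    @Override public int hashCode() {
        return Arrays.hashCode(utf8);
    }
}
```

For mostly-ASCII text this roughly halves the storage compared with an array of chars, at the price of decoding whenever a real String is needed.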


> It is possible, with a custom class loader, to override the default
> String class used in the Java core libraries

Ouch. Never taken Java that far myself. I like the idea though. Is it difficult?


> Looking at the Java VM machine specification, there does not
> seem to be something implying that a Java "char" is necessarily a
> 16-bit entity. So I think that there will be sometime a conforming
> Java VM that will return UTF-32 codepoints in a single char, or
> some derived representation using 24-bit storage units.

I've wondered about that ever since Unicode went to 21 bits. Actually, of course, it's C (and C++), not Java, which has the real problem. C is (supposed to be) portable but fast on all architectures, so all of the built-in types have platform-dependent widths. So far so good. The annoying thing is that, BY DEFINITION, the sizeof operator returns the size of an object measured in chars, so sizeof(char) is always 1. Therefore, it is a violation of the rules of C to have an addressable object smaller than a char. One can have 32-bit chars, but only by disallowing 8-bit bytes and 16-bit words, because sizeof is not allowed to return a fraction. Sigh! If only C had seen fit to measure addressable locations in bits, or even in architecture-specific atoms (which would have been 8 bits wide on most systems), then we could have had sizeof(char) returning 4 or something. Ah well.
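
As a footnote on the Java side: whatever a future VM might do, the Java library as it stands exposes char as a 16-bit unit, and a code point beyond U+FFFF occupies two of them. A small sketch (CharWidthDemo is just an illustrative name) shows this using standard Character and String methods:

```java
// Illustrative only: shows how many 16-bit chars a code point needs in Java.
final class CharWidthDemo {
    // Returns 1 for code points up to U+FFFF, 2 for supplementary ones.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println(Character.SIZE);                        // 16
        System.out.println(clef.length());                         // 2
        System.out.println(clef.codePointCount(0, clef.length())); // 1
        System.out.println(charsNeeded('A'));                      // 1
    }
}
```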


 
> This leads to many discussions about what is a "character"

I think we just had that discussion. If it happens again I'm probably not going to join in (though it was quite amusing).

Jill


