David P Grove wrote:

I've been tracking down a bug using Classpath to run JSPs on top of Jikes RVM, and I think the root of the problem is that EncoderUTF8.java strictly follows the UTF8 encoding scheme instead of the "pseudo-UTF8" that JVMs actually need. In particular, the character \u0000 is being encoded as the single byte 0 instead of the two-byte sequence that Java uses.

I'm happy to contribute a bug fix for this. My question is should I change EncoderUTF8 to implement the Java treatment of \u0000,

That would be wrong. EncoderUTF8 is used to convert 16-bit Unicode to the *external* UTF8 encoding used for files etc., not to the Java pseudo-UTF8.
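To make the difference concrete, here is a small sketch that uses only standard JDK calls. It assumes a modern JDK (StandardCharsets needs Java 7 or later); the class name Utf8Comparison is made up for illustration and has nothing to do with Classpath's EncoderUTF8. DataOutputStream.writeUTF emits the Java pseudo-UTF8 form, so it serves as the reference for that side.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Comparison {
    // Standard ("external") UTF-8, as used for files and network data.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java's pseudo-UTF8 (modified UTF-8), as written by DataOutputStream.writeUTF.
    // writeUTF prefixes the data with a two-byte length, which is dropped here.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // Java bytes are signed, so 0xC0 0x80 prints as -64, -128.
        System.out.println(Arrays.toString(standardUtf8("\u0000"))); // [0]
        System.out.println(Arrays.toString(modifiedUtf8("\u0000"))); // [-64, -128]
    }
}
```

For \u0000 the external encoding is the single byte 0x00, while the Java form is the two bytes 0xC0 0x80, so the Java form never contains an embedded zero byte.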

I can only think of one reason why you'd want to create the Java
pseudo-UTF8 format:  when writing a Java class file.  Implement
that however you wish, but don't change the behavior of EncoderUTF8.
You could add a flag to EncoderUTF8 to enable "Java-style UTF8",
but it can't be the default.
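
A rough sketch of what such a flag could look like follows. Everything in it is an assumption for illustration: the class Utf8EncoderSketch, the method encode and the parameter javaStyle are hypothetical and are not the actual EncoderUTF8 interface. With the flag off, the output is standard UTF8; with it on, \u0000 becomes 0xC0 0x80 and surrogates are encoded individually, as Java's pseudo-UTF8 does.

```java
import java.io.ByteArrayOutputStream;

public class Utf8EncoderSketch {
    // Hypothetical helper, not the Classpath EncoderUTF8 API.
    // javaStyle == false: standard UTF-8 (what the default should remain).
    // javaStyle == true:  Java's pseudo-UTF8 (modified UTF-8).
    static byte[] encode(String s, boolean javaStyle) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            // Standard UTF-8 combines a surrogate pair into one 4-byte sequence;
            // the Java form encodes each surrogate separately as 3 bytes.
            if (!javaStyle && Character.isHighSurrogate((char) c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                c = Character.toCodePoint((char) c, s.charAt(++i));
            }
            if (c == 0 && javaStyle) {
                // The one place where the two schemes differ for \u0000.
                out.write(0xC0);
                out.write(0x80);
            } else if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else if (c < 0x10000) {
                // Note: an unpaired surrogate also lands here; a real
                // encoder should reject or replace it in standard mode.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xF0 | (c >> 18));
                out.write(0x80 | ((c >> 12) & 0x3F));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

Existing callers would go through the javaStyle == false path and see no change; only code that writes class files would ask for the Java-style output.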
--
        --Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/




