David P Grove wrote:
> I've been tracking down a bug using classpath to run JSPs on top of Jikes
> RVM and I think the root of the problem is that EncoderUTF8.java is
> strictly following the UTF8 encoding scheme instead of the "pseudo-UTF8"
> that JVMs actually need. In particular, the character \u0000 is being
> encoded as the one byte 0 instead of the 2 byte sequence that Java uses.
> I'm happy to contribute a bug fix for this. My question is should I
> change EncoderUTF8 to implement the Java treatment of \u0000,
That would be wrong. EncoderUTF8 is used to convert 16-bit Unicode
to the *external* UTF8 encoding used for files etc. Not the Java
pseudo-UTF8.
I can only think of one reason why you'd want to create the Java
pseudo-UTF8 format: when writing a Java class file. Implement
that however you wish, but don't change the behavior of EncoderUTF8.
You could add a flag to EncoderUTF8 to enable "Java-style UTF8",
but it can't be the default.
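To make the difference concrete, here is a minimal sketch using only the
standard JDK classes (not Classpath's EncoderUTF8). The class name Utf8Demo
and its helper methods are made up for illustration. String.getBytes with
UTF-8 stands in for the external encoding, and DataOutputStream.writeUTF
stands in for the Java pseudo-UTF8, which it writes after a 2-byte length
prefix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
public class Utf8Demo {
    // External (standard) UTF-8: U+0000 is encoded as the single byte 0x00.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java pseudo-UTF8 as produced by DataOutputStream.writeUTF:
    // U+0000 is encoded as the two bytes 0xC0 0x80.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // writeUTF emits a 2-byte length first; drop it to see the payload.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String nul = "\u0000";
        System.out.println(Arrays.toString(standardUtf8(nul))); // prints [0]
        System.out.println(Arrays.toString(modifiedUtf8(nul))); // prints [-64, -128]
    }
}
```

The two encodings agree for plain ASCII text such as "A", so the
difference only shows up for \u0000 (and for characters outside the
16-bit range, which the Java format writes as surrogate pairs).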
--
--Per Bothner
[EMAIL PROTECTED] http://per.bothner.com/