Theodore H. Smith wrote:

> I'm just curious about the \0 thing. What problems would having a \0 in 
> UTF-8 present, that are not presented by having \0 in ASCII? I can't 
> see any advantage there.

AFAICT it was a hack so that arbitrary Java strings could be encoded
as C strings, that is, with no 0x00 bytes in them, even when the
string contained a U+0000 (which is written as the two bytes 0xC0 0x80
instead).  This is also the format used for string constants in Java
class files.
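
To make the difference concrete, here is a minimal sketch comparing
standard UTF-8 with what writeUTF emits for a string containing U+0000.
The class and helper names (ModifiedUtf8Demo, writeUtfBytes, hex) are
made up for illustration; only DataOutputStream.writeUTF and
String.getBytes are real library calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any library.
public class ModifiedUtf8Demo {
    // Serialize a string with DataOutputStream.writeUTF and return the raw bytes.
    static byte[] writeUtfBytes(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    // Render bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\0b"; // 'a', U+0000, 'b'

        // Standard UTF-8 encodes U+0000 as a single 0x00 byte.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62

        // writeUTF emits a 2-byte big-endian length, then the payload,
        // in which U+0000 becomes C0 80.
        System.out.println(hex(writeUtfBytes(s))); // 00 04 61 C0 80 62
    }
}
```

Note that the 2-byte length prefix can itself contain 0x00 bytes; it is
only the encoded string data that follows it which is NUL-free.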

The important thing to note is that the readUTF and writeUTF methods are
*binary* I/O; they are the standard way of serializing strings,
just as the standard way of serializing ints is to write them out
as a 4-byte big-endian sequence.

They simply have nothing to do with character encoding at all.
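
A small sketch of that binary-record view, assuming an invented record
layout of one int followed by one string (BinaryRecordDemo, writeRecord
and readRecord are hypothetical names; the Data*Stream calls are the
real ones):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; the record layout is invented for illustration.
public class BinaryRecordDemo {
    // Write an int, then a string, as one binary record.
    static byte[] writeRecord(int n, String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(n);  // 4 bytes, big-endian
            out.writeUTF(s);  // 2-byte length, then the encoded string
        }
        return buf.toByteArray();
    }

    // Read the record back with the matching read calls, in the same order.
    static String readRecord(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            String s = in.readUTF();
            return n + ":" + s;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecord(0x01020304, "hi");
        // Bytes are 01 02 03 04 (the int), 00 02 (the length), 68 69 ("hi").
        System.out.println(data.length);       // 8
        System.out.println(readRecord(data));  // 16909060:hi
    }
}
```

The reader never chooses a charset here; it simply calls readUTF where
the writer called writeUTF, the same way readInt pairs with writeInt.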

-- 
He made the Legislature meet at one-horse       John Cowan
tank-towns out in the alfalfa belt, so that     [EMAIL PROTECTED]
hardly nobody could get there and most of       http://www.reutershealth.com
the leaders would stay home and let him go      http://www.ccil.org/~cowan
to work and do things as he pleased.    --Mencken, Declaration of Independence
