In a message dated 2001-02-05 5:19:59 Pacific Standard Time,
[EMAIL PROTECTED] writes:
I have heard a rumour (i.e. my source is not involved in the reported
activity) that:
quote
SAP, PeopleSoft, Siebel, Oracle and others are actually
in the process of proposing a new format
Within a String, the encoding of char values is practically irrelevant. It is a
hidden encoding that is never exposed to the user...or developer. When you access
String char values, you use an index to 16-bit Unicode values. To my knowledge,
Sun does not claim that its internal encoding of String
John O'Conner wrote:
Within a String, the encoding of char values is practically irrelevant. It is a
hidden encoding that is never exposed to the user...or developer. When you access
String char values, you use an index to 16-bit Unicode values. To my knowledge,
Sun does not claim that its
John,
It does impact developers.
The API for DataInputStream defines FSS_UTF, which includes the funky
null behavior.
http://java.sun.com/products/jdk/1.2/docs/api/java/io/DataInputStream.html
Since this API and others use this UTF, it gets into file formats, and
applications end up supporting
Perhaps the methods readUTF and writeUTF should be deprecated in favor of
read/writeString. I will submit an RFE (request for enhancement) for this.
I noticed that although the Data{Input,Output} interfaces clearly say that
write/readUTF handle a "Java modified UTF-8", the actual javadoc in
John,
I am not clear from your comments which is the bug, since the doc
goes both ways. Are the doc bugs that they say
it is UTF-8, or that they say it is modified UTF-8?
It would be great to learn that the functions are actually unmodified
UTF-8, as I know of some interfaces that are writing
Tex Texin wrote:
I am not clear from your comments which is the bug, since the doc
goes both ways. Are the doc bugs that they say
it is UTF-8, or that they say it is modified UTF-8?
It uses modified UTF-8, modified in three ways:
1) U+0000 is encoded in two bytes as 0xc0 0x80;
2) values
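The first modification listed above can be observed directly. The following is a minimal sketch, not taken from the thread: the class name ModifiedUtf8Demo and the helpers writeUtfPayload and hex are hypothetical, introduced only for illustration. It compares the bytes that DataOutputStream.writeUTF emits for a string containing U+0000 with the bytes from the standard UTF-8 encoder.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Hypothetical helper: returns the bytes writeUTF emits for s,
    // with the leading 2-byte length prefix stripped off.
    static byte[] writeUtfPayload(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical helper: formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "A\0B"; // 'A', U+0000, 'B'
        // prints "writeUTF:       41 c0 80 42"
        System.out.println("writeUTF:       " + hex(writeUtfPayload(s)));
        // prints "standard UTF-8: 41 00 42"
        System.out.println("standard UTF-8: " + hex(s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A reader that expects standard UTF-8 would see 0xc0 0x80 as an overlong (invalid) sequence, which is one way this encoding leaks into file formats.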
Here's what I see about the Java API docs:
1. The Data{Input, Output}Stream methods {read, write}UTF could be named better. More
appropriate names would be {read, write}String. Strictly speaking, this is not a bug, but
it could be better. That's why I call it an RFE (request for enhancement).
2.