Section 9.2.2 of the current Web Apps 1.0 draft states:
Bytes or sequences of bytes in the original byte stream that could not be converted to Unicode characters must be converted to U+FFFD REPLACEMENT CHARACTER code points.
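I'm concerned about the "or". For example, suppose there are six high surrogates (upper halves of a Unicode surrogate pair) in a row and no low surrogates (lower halves) to pair with them. Does that turn into six replacement characters or one? Both interpretations seem possible: "bytes" suggests one replacement per bad unit, while "sequences of bytes" suggests one replacement for the whole run.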
I suppose I prefer six rather than one, but I don't care a great deal as long as this is locked down one way or the other.
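To make the two readings concrete, here is a minimal Python sketch. It assumes the input has already been split into UTF-16 code units, and the function name and the collapse_runs flag are my own invention for illustration; nothing here comes from the draft.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def is_high(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low(unit):
    return 0xDC00 <= unit <= 0xDFFF


def decode_utf16_units(units, collapse_runs):
    """Decode a list of 16-bit code units (hypothetical helper).

    collapse_runs=False: one U+FFFD per unpaired surrogate.
    collapse_runs=True:  one U+FFFD per run of adjacent unpaired surrogates.
    """
    out = []
    in_bad_run = False
    i = 0
    while i < len(units):
        unit = units[i]
        if is_high(unit) and i + 1 < len(units) and is_low(units[i + 1]):
            # Well-formed pair: combine into one supplementary code point.
            cp = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            out.append(chr(cp))
            in_bad_run = False
            i += 2
        elif is_high(unit) or is_low(unit):
            # Unpaired surrogate: emit U+FFFD unless we are collapsing
            # and have already emitted one for this run.
            if not (collapse_runs and in_bad_run):
                out.append(REPLACEMENT)
            in_bad_run = True
            i += 1
        else:
            out.append(chr(unit))
            in_bad_run = False
            i += 1
    return "".join(out)


six_high = [0xD800] * 6
per_unit = decode_utf16_units(six_high, collapse_runs=False)
per_run = decode_utf16_units(six_high, collapse_runs=True)
print(len(per_unit), len(per_run))
```

With collapse_runs=False the six unpaired surrogates become six replacement characters; with collapse_runs=True they become one. The current wording of the draft seems to permit either.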
-- Elliotte Rusty Harold
