Henry Spencer wrote on 2002-12-20 23:38 UTC:
> It might be worth mention, because Java's not the only thing using it.
> It's actually quite convenient to be able to make applications
> NUL-transparent without having to recode all the string operations.

Is there a proper full specification of this encoding somewhere online?

Merely replacing 0x00 with its overlong UTF-8 equivalent 0xc0 0x80
can't be the full story, because what you are interested in, in the
end, must surely be binary transparency, not merely NUL-transparency.
I don't see what NUL-transparency alone would be good for, as NUL is
usually only a problem in arbitrary binary strings. So you also have
to specify how to represent any byte sequence, including overlong
UTF-8 sequences such as 0xc0 0x80 itself.

Until someone shows me the full spec behind this frequently quoted but
unnamed Java derivative of UTF-8, I am not convinced that it is useful
for anything in practice.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
