Markus Kuhn <[EMAIL PROTECTED]>:

> Is there a proper full specification of this encoding somewhere
> online? Merely replacing 0x00 with its overlong UTF-8 equivalent
> 0xc0 0x80 can't be the full story, because what you are interested
> in the end must surely be binary transparency, not merely
> NUL-transparency. I don't see what NUL-transparency alone would
> be good for, as NUL is usually only a problem in arbitrary binary
> strings.

True, but pedantically correct handling of e-mail messages is an
exception. According to RFC 822, all 7-bit characters, including '\0',
are valid in a Subject line, for example. You are even allowed to have
a bare '\r' or a bare '\n'; only the sequence "\r\n" is special: inside
a header field it must be followed by ' ' or '\t' (a folded line),
since otherwise it ends the field. Of course, nobody really implements
this.

Edmund
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
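
For illustration, the RFC 822 rule described in the reply above could be
checked roughly as follows. This is only a minimal sketch, not taken from
any existing mail library: the function name is_valid_rfc822_field_body is
hypothetical, and it assumes the caller passes the raw field body as bytes
without the CRLF that terminates the field.

```python
def is_valid_rfc822_field_body(body: bytes) -> bool:
    """Hypothetical helper: check a raw header field body against the
    rule described above (7-bit only; CRLF only as part of folding).

    Assumes `body` excludes the CRLF that terminates the field.
    """
    i = 0
    n = len(body)
    while i < n:
        b = body[i]
        if b > 0x7F:
            # Not a 7-bit character.
            return False
        if b == 0x0D and i + 1 < n and body[i + 1] == 0x0A:
            # CRLF is only allowed when followed by SPACE or TAB
            # (a folded continuation line).
            if i + 2 >= n or body[i + 2] not in (0x20, 0x09):
                return False
            i += 2
            continue
        # Everything else, including NUL, bare CR and bare LF, is accepted.
        i += 1
    return True


if __name__ == "__main__":
    print(is_valid_rfc822_field_body(b"NUL \x00 inside"))   # accepted
    print(is_valid_rfc822_field_body(b"bare \r and \n"))    # accepted
    print(is_valid_rfc822_field_body(b"folded\r\n line"))   # accepted
    print(is_valid_rfc822_field_body(b"broken\r\nline"))    # rejected
```

A parser that is this permissive has to carry NUL bytes through its string
handling, which is where a NUL-transparent encoding becomes relevant.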
