Markus Kuhn <[EMAIL PROTECTED]>:

> Is there a proper full specification of this encoding somewhere
> online? Merely replacing 0x00 with its overlong UTF-8 equivalent
> 0xc0 0x80 can't be the full story, because what you are interested
> in the end must surely be binary transparency, not merely
> NUL-transparency. I don't see what NUL-transparency alone would
> be good for, as NUL is usually only a problem in arbitrary binary
> strings.
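
For reference, the substitution described in the quoted text can be
sketched as below. This is only an illustration under my own
assumptions, not a specification of the encoding: the function names
are hypothetical, and the input is assumed to be ordinary Unicode
text rather than arbitrary binary data.

```python
def encode_nul_overlong(text: str) -> bytes:
    """Encode text as UTF-8, writing U+0000 as the overlong pair C0 80."""
    # In UTF-8 a 0x00 byte can only come from U+0000, so a plain byte
    # replacement after standard encoding is sufficient.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_overlong(data: bytes) -> str:
    """Reverse the substitution, then decode as standard UTF-8."""
    # The byte 0xC0 never occurs in valid UTF-8, so the pair C0 80
    # cannot be confused with any other character.
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")
```

The encoded output never contains a 0x00 byte, which is all that
NUL-transparency asks for. Arbitrary binary input (a lone 0xff byte,
say) is still rejected by the decoder, which is the gap between
NUL-transparency and binary transparency that the question is about.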

True, but pedantically correct handling of e-mail messages is an
exception. According to RFC 822 all 7-bit characters, including '\0',
are valid in a Subject line, for example. You are even allowed to have
a bare '\r' or a bare '\n'; only the sequence "\r\n" is special:
inside a header field it must be followed by ' ' or '\t' (a folded
continuation line), because otherwise it ends the field. Of course,
nobody really implements this.
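
A minimal sketch of that rule, as I read it, for the body of an
unstructured field such as Subject. The function name is made up for
illustration, and this checks only the two conditions mentioned
above; it is not a full RFC 822 parser.

```python
def is_valid_field_body(body: bytes) -> bool:
    """Check a header field body against the rule described above."""
    # Every octet must be a 7-bit character (0..127). NUL is allowed.
    if any(octet > 127 for octet in body):
        return False
    # A bare CR or bare LF is fine; only the pair CR LF is special.
    # Each CR LF must be followed by a space or a tab.
    pos = body.find(b"\r\n")
    while pos != -1:
        following = body[pos + 2 : pos + 3]  # empty if CR LF is last
        if following not in (b" ", b"\t"):
            return False
        pos = body.find(b"\r\n", pos + 2)
    return True
```

So b"a\x00b" and b"a\rb\nc" are accepted, b"a\r\n b" is accepted as a
folded line, and b"a\r\nb" is rejected.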

Edmund
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
