On Sat, 11 Mar 2006, Henri Sivonen wrote:
> The encoding labels with LE or BE in them mean BOMless variants where
> the encoding label on the transfer protocol level gives the endianness.
> See http://www.ietf.org/rfc/rfc2781.txt When the spec refers to UTF-16
> with BOM in a particular endianness, I think the spec should use
> "big-endian UTF-16" and "little-endian UTF-16".
>
> Since declaring endianness on the transfer protocol level has no benefit
> over using the BOM when the label is right and there's a chance to get
> the label wrong, the encoding labels with explicit endianness are
> harmful for interchange. In my opinion, the spec should avoid giving
> authors any bad ideas by reinforcing these labels by repetition.

FWIW, after reading the labeling part of the RFC again and adding your
suggestion, I came up with this:

big-endian UTF-16 = The big-endian encoding of UTF-16 with the BOM (bytes FE FF)
little-endian UTF-16 = The little-endian encoding of UTF-16 with the BOM (bytes FF FE)
UTF-16BE = The big-endian encoding of UTF-16 without the BOM
UTF-16LE = The little-endian encoding of UTF-16 without the BOM
UTF-16 = big-endian UTF-16 or little-endian UTF-16, or fallback to UTF-16BE if there is no BOM
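
For what it's worth, here is a rough sketch of how a decoder might apply
these definitions. It is only an illustration under my reading of the RFC:
the function name decode_utf16 is made up, and the labels are assumed to
arrive as plain strings from the transfer protocol.

```python
import codecs


def decode_utf16(data: bytes, label: str) -> str:
    """Decode bytes according to a UTF-16 family label (hypothetical helper)."""
    label = label.upper()

    # Explicit-endianness labels: no BOM is expected, the label decides.
    # A leading U+FEFF here is ordinary content, so it is not stripped.
    if label == "UTF-16BE":
        return data.decode("utf-16-be")
    if label == "UTF-16LE":
        return data.decode("utf-16-le")

    if label == "UTF-16":
        # "big-endian UTF-16": BOM serialized as FE FF, then big-endian data.
        if data.startswith(codecs.BOM_UTF16_BE):
            return data[2:].decode("utf-16-be")
        # "little-endian UTF-16": BOM serialized as FF FE, then little-endian data.
        if data.startswith(codecs.BOM_UTF16_LE):
            return data[2:].decode("utf-16-le")
        # No BOM: fall back to big-endian, i.e. treat it as UTF-16BE.
        return data.decode("utf-16-be")

    raise ValueError(f"not a UTF-16 label: {label!r}")
```

With the plain "UTF-16" label the BOM is consumed and decides the byte
order; with "UTF-16BE" or "UTF-16LE" the bytes are decoded exactly as
labelled, so a wrong label silently produces garbage, which is the
interchange risk Henri describes.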

--
Michael
