Edmund GRIMLEY EVANS writes:

> So, UTF-16 gives you bigendian with BOM

No. That was the official, though never really standardized, reading
before RFC 2781 came out, and a well-known proprietary operating system
on little-endian CPUs did not follow it.

Now the RFC says:
  - When programs produce UTF-16, they must provide a BOM, and the
    endianness is not necessarily big-endian.
  - When programs interpret UTF-16 text, they must check whether there
    is a BOM. Only if there is no BOM may they interpret it as
    big-endian. (But I wouldn't bet on this either.)
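
To illustrate the interpretation rule, here is a minimal sketch in Python.
It assumes the RFC 2781 reading: FE FF means big-endian, FF FE means
little-endian, and text without a BOM is treated as big-endian. The
function names detect_utf16_order and decode_utf16 are made up for this
example and are not part of any library.

```python
import codecs
from typing import Tuple


def detect_utf16_order(data: bytes) -> Tuple[str, int]:
    """Return (codec name, number of BOM bytes to skip) for text labelled UTF-16."""
    if data[:2] == codecs.BOM_UTF16_BE:   # FE FF
        return "utf-16-be", 2
    if data[:2] == codecs.BOM_UTF16_LE:   # FF FE
        return "utf-16-le", 2
    # No BOM: assume big-endian (the RFC 2781 fallback, an assumption here).
    return "utf-16-be", 0


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 text, honouring a leading BOM if there is one."""
    codec, skip = detect_utf16_order(data)
    return data[skip:].decode(codec)
```

For example, decode_utf16(b"\xff\xfeA\x00") and decode_utf16(b"\x00A")
both give "A". Real-world producers may not follow the no-BOM fallback,
so treat it as a guess rather than a guarantee.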

> UTF-16BE gives you big-endian without BOM and
> UTF-16LE gives you little-endian without BOM.

Yes.

> How do I ask for the machine's native ordering with or without BOM?

If you do want a BOM, use "UTF-16". If not, use UTF-16BE or UTF-16LE,
depending on autoconf's AC_C_BIGENDIAN result.
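
The same choice can be sketched at run time. This is a small Python
illustration, not the autoconf approach itself: sys.byteorder stands in
for the AC_C_BIGENDIAN result, and native_utf16_name is a hypothetical
helper. Note that Python's "UTF-16" codec writes a BOM followed by
native byte order; an iconv implementation may choose differently.

```python
import sys


def native_utf16_name(with_bom: bool) -> str:
    """Pick an encoding name for UTF-16 in the machine's native byte order."""
    if with_bom:
        # The BOM tells the reader which byte order was used.
        return "UTF-16"
    # Without a BOM, the byte order has to be named explicitly.
    return "UTF-16BE" if sys.byteorder == "big" else "UTF-16LE"


# Usage: "A" takes 2 bytes without a BOM and 4 bytes with one.
without_bom = "A".encode(native_utf16_name(with_bom=False))
with_bom = "A".encode(native_utf16_name(with_bom=True))
```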

Or better, use UTF-8 instead.

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
