Followup to: <[EMAIL PROTECTED]>
By author: Bruno Haible <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> H. Peter Anvin writes:
>
> > The point is that I don't think iconv should emit BOMs unless you
> > explicitly ask for them.
>
> The only existing standard for UTF-16 is RFC 2781, and it recommends
> this behaviour:
>
> Any labelling application that uses UTF-16 character encoding, and
> puts an explicit charset label on the text, and does not know the
> serialization order of the characters in text, MUST label the text as
> "UTF-16", and SHOULD make sure the text starts with 0xFEFF.
>
> You could argue that putting a BOM is the application's duty, not
> iconv's business, but that would be painful for all applications which
> try to use iconv. And unlabelled data (e.g. files on a filesystem)
> shouldn't use UTF-16 or its variants in the first place; that's what
> UTF-8 is for.
>
Well, the issue is that iconv() is also used for, say, text strings
embedded in data. However, it sounds like the solution is simply to
request UTF-16BE instead.
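
As a rough illustration of the difference between the generic label and
the explicit byte-order label, here is a minimal sketch. It uses Python's
built-in codecs rather than iconv() itself, so treat it as an analogy
under that assumption, not as a statement about any particular iconv
implementation. The helper has_bom() and the sample string are made up
for the example.

```python
# Illustration only: Python's built-in codecs, NOT iconv().
# The names below (text, has_bom, BOMS) are hypothetical helpers.

text = "hi"

# Generic "utf-16" label: the encoder prepends a BOM and uses the
# platform's native byte order.
with_bom = text.encode("utf-16")

# Explicit byte order: no BOM is written, the bytes are always big-endian.
big_endian = text.encode("utf-16-be")

# U+FEFF serialized big-endian and little-endian, respectively.
BOMS = (b"\xfe\xff", b"\xff\xfe")


def has_bom(data: bytes) -> bool:
    """Return True if data starts with a UTF-16 byte order mark."""
    return data[:2] in BOMS


print("utf-16    :", with_bom.hex(), "BOM" if has_bom(with_bom) else "no BOM")
print("utf-16-be :", big_endian.hex(), "BOM" if has_bom(big_endian) else "no BOM")
```

For strings embedded in a binary format whose byte order is already
fixed, the second form is the one you want: the two extra BOM bytes
would otherwise end up in the middle of the data.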
-hpa
--
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt