At 07:14 AM 7/21/00 -0800, [EMAIL PROTECTED] wrote:
>Why does it say there are three varieties when a 16-bit datum can only be
>serialised in two orders? If the scheme UTF-16 doesn't have a BOM, isn't it
>just one of the other two? When it does have a BOM, it can still be
>serialised in two ways, so aren't there four schemes - 2 serialisations x
>±BOM? I barely manage to make sense of forms and schemes and then they
>confuse me with this stuff!

The problem is that the labels were invented to tag data streams, not to
'label' the result of autodetection. As you point out, there are 4 results
of autodetection:

UTF-16, no BOM
UTF-16, no BOM, but arriving in reverse byte order (for my processor)
UTF-16 with BOM
UTF-16 with BOM, arriving in reverse byte order (for my processor)
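
To make the receiving side concrete, here is a minimal Python sketch of the BOM half of that autodetection. The function name `classify_bom` and its return strings are made up for illustration. Note that it only distinguishes the two BOM cases; for a stream without a BOM, the byte order cannot be read off the first two bytes and would need a heuristic or outside information.

```python
import codecs
import sys


def classify_bom(data: bytes, native: str = sys.byteorder) -> str:
    """Report whether a UTF-16 byte stream starts with a BOM, and in which order.

    `native` is "little" or "big" and stands for "my processor".
    """
    if native == "little":
        native_bom, reversed_bom = codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE
    else:
        native_bom, reversed_bom = codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE

    if data.startswith(native_bom):
        return "BOM, native order"
    if data.startswith(reversed_bom):
        return "BOM, reversed order"
    # Without a BOM the byte order is not determined by this check alone.
    return "no BOM"
```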

When I send a data stream, I have these conditions:

1) don't know byte order
a) send it out bare
b) send it out with BOM

2) do know byte order
a) send it out with BOM, but don't tell recipient the byte order
b) don't use BOM, and tell recipient the byte order in an external label

Labels UTF-16BE and UTF-16LE are to be used for case 2b *only*.
Label UTF-16 is required for 1a, 1b, and 2a.

The hypothetical case of telling the recipient the byte order *and* using 
the BOM at the same time is not supported.
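
The labeling rules above can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function `serialize` and its parameters `use_bom` and `declare_order` are hypothetical names, and the sketch always needs a concrete byte order to produce bytes, so cases 1a/1b and 2a differ only in what the sender chooses to announce.

```python
import codecs


def serialize(text: str, big_endian: bool, use_bom: bool, declare_order: bool):
    """Return (label, data) for a UTF-16 data stream.

    declare_order=True means the byte order is announced in the external
    label (case 2b); otherwise the plain "UTF-16" label is used.
    """
    if use_bom and declare_order:
        # Announcing the byte order *and* sending a BOM is the unsupported case.
        raise ValueError("BOM together with UTF-16BE/UTF-16LE label is not supported")

    # Python's "utf-16-be" / "utf-16-le" codecs never emit a BOM themselves.
    data = text.encode("utf-16-be" if big_endian else "utf-16-le")
    if use_bom:
        bom = codecs.BOM_UTF16_BE if big_endian else codecs.BOM_UTF16_LE
        data = bom + data

    if declare_order:
        label = "UTF-16BE" if big_endian else "UTF-16LE"  # case 2b only
    else:
        label = "UTF-16"  # cases 1a, 1b and 2a
    return label, data
```

For example, `serialize("A", big_endian=True, use_bom=False, declare_order=True)` corresponds to case 2b, while any call with `declare_order=False` ends up with the plain UTF-16 label.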

A./
