Stephan Stiller wrote:
With that in mind, there is value in documenting, however briefly,
that reading FF FE 00 00 is by itself technically ambiguous.
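
For illustration only, here is a minimal Python sketch of why those four bytes are ambiguous when looked at in isolation. The helper name bom_candidates is hypothetical and not taken from any Unicode document; the sketch relies only on the BOM constants in Python's standard codecs module.

```python
import codecs

# Byte order marks as defined in Python's standard codecs module.
BOMS = [
    ("utf-8", codecs.BOM_UTF8),
    ("utf-32-le", codecs.BOM_UTF32_LE),  # FF FE 00 00
    ("utf-32-be", codecs.BOM_UTF32_BE),  # 00 00 FE FF
    ("utf-16-le", codecs.BOM_UTF16_LE),  # FF FE
    ("utf-16-be", codecs.BOM_UTF16_BE),  # FE FF
]


def bom_candidates(data):
    """Return every encoding whose BOM is a prefix of `data` (hypothetical helper)."""
    return [name for name, bom in BOMS if data.startswith(bom)]


sample = bytes.fromhex("FFFE0000")

# Both little-endian forms match the same leading bytes.
print(bom_candidates(sample))

# Read as UTF-16LE: a BOM followed by a single U+0000.
print(repr(sample[2:].decode("utf-16-le")))

# Read as UTF-32LE: a BOM followed by no text at all.
print(repr(sample[4:].decode("utf-32-le")))
```

A detector that checks the longer BOM first will report UTF-32LE for this input, while one that checks only the first two bytes will report UTF-16LE. Which order to prefer is a policy decision that the bytes themselves cannot settle.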
I have seen this documented many times, though I can't say for sure that
it was in official Unicode literature.
Even though you can never
I have seen this documented many times, though I can't say for sure
that it was in official Unicode literature.
Excellent, so let's have someone state whether it's in the official
Unicode literature.
And independent of whether it is or not, I know that some mention of the
content of this
As an aside to the BOM discussion - something I've always been meaning
to ask.
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could be
either UTF-16 or UTF-32 under little-endianness. Has this been pointed
out
-text by itself.
2012/7/13 Stephan Stiller stephan.stil...@gmail.com:
As an aside to the BOM discussion - something I've always been meaning to
ask.
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could be either
UTF-16
Null characters are almost always avoided in interchanged plain texts.
This is not a practical problem. The use of nulls as significant
characters is extremely exceptional
Yes, but still I think that the BOM ambiguity needs to be documented. If
it already is, the documentation isn't visible
is in a
file that is not plain-text by itself.
2012/7/13 Stephan Stiller stephan.stil...@gmail.com:
As an aside to the BOM discussion - something I've always been meaning to
ask.
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because
2012/7/13 Asmus Freytag asm...@ix.netcom.com:
A) treating NUL as ignorable is really deep legacy. Totally no longer
appropriate for modern data.
I did not say that. But modern data heavily uses bytes as fillers for
padding, or as terminators in various envelope formats. There are
some more
On 7/13/2012 1:54 PM, Stephan Stiller wrote:
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could be
either UTF-16 or UTF-32 under little-endianness. Has this been pointed
out and discussed beforehand
: the ambiguity
persists in arbitrary plain text files, but not from HTML and XML
documents)
2012/7/14 Ken Whistler k...@sybase.com:
On 7/13/2012 1:54 PM, Stephan Stiller wrote:
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could
On Jul 13, 2012, at 4:54 PM, Stephan Stiller wrote:
As an aside to the BOM discussion - something I've always been meaning to ask.
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could be either
UTF-16 or UTF-32 under
So there is a BOM-ambiguity when a file starts with
FF FE
and then a couple of U+0000 characters, yes? Because this could be
either UTF-16 or UTF-32 under little-endianness. Has this been
pointed out and discussed beforehand?
No, there is not a BOM-ambiguity. Rather, there is an English
PS: I mean, what you (Ken W) are writing is an argument for documenting
the format outside of the file proper, and that's good, but then one
wouldn't/shouldn't use a BOM in the first place.
So if one uses the BOM as a format indicator (not a perfect situation, I
understand), that often