Re: [Python-3000] BOM handling

Antoine Pitrou Wed, 13 Sep 2006 23:19:06 -0700

Hi,

On Wednesday, 13 September 2006 at 16:14 -0700, Josiah Carlson wrote:
> In any case, I believe that the above behavior is correct for the
> context.  Why?  Because utf-8 has no endianness, its 'generic' decoding
> spelling of 'utf-8' is analogous to all three 'utf-16', 'utf-16-be', and
> 'utf-16-le' decoding spellings; two of which don't strip.
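
For reference, the behaviour described in the quoted paragraph can be
checked with a small sketch. It assumes a current CPython 3 interpreter
(the thread itself predates Py3k), and the variable names are made up
for illustration only.

```python
import codecs

text = "abc"

# Byte strings that start with a BOM in each encoding.
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
utf16le_with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
utf16be_with_bom = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# Generic 'utf-8' has no byte order to detect, so the BOM survives
# as a leading U+FEFF character in the decoded string.
utf8_plain = utf8_with_bom.decode("utf-8")

# Generic 'utf-16' consumes the BOM to pick the byte order and strips it.
utf16_generic = utf16le_with_bom.decode("utf-16")

# The explicit-endianness spellings leave the BOM in the result.
utf16_le = utf16le_with_bom.decode("utf-16-le")
utf16_be = utf16be_with_bom.decode("utf-16-be")

for name, value in [
    ("utf-8", utf8_plain),
    ("utf-16", utf16_generic),
    ("utf-16-le", utf16_le),
    ("utf-16-be", utf16_be),
]:
    print(f"{name:10} -> {value!r}")
```

Of the three UTF-16 spellings, only the generic one strips, which is
the "two of which don't strip" point made above.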


Your opinion is probably valid from a theoretical point of view. You are
more knowledgeable than I am.

My point was different: most programmers are not at your level (or
Paul's level, etc.) when it comes to Unicode knowledge. Py3k's str type
is supposed to be an abstract textual type that makes it easy to write
Unicode-friendly applications (isn't it?).
Therefore it should hide the messy issue of superfluous BOMs, unwanted
BOMs, etc. Telling the programmer to use a specific UTF-8 variant
specialized in BOM-stripping will make eyes roll... "why doesn't the
standard UTF-8 do it for me?"
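
To make the complaint concrete, here is a minimal sketch of what the
programmer has to know today. It assumes the BOM-stripping variant in
question is the 'utf-8-sig' codec (the message does not name it), and
decode_utf8_lenient is a hypothetical helper, not an existing API.

```python
import codecs


def decode_utf8_lenient(data: bytes) -> str:
    """Hypothetical helper: decode UTF-8, dropping one leading BOM if present."""
    # 'utf-8-sig' skips a leading UTF-8 BOM and otherwise behaves like 'utf-8'.
    return data.decode("utf-8-sig")


payload = "héllo".encode("utf-8")
raw = codecs.BOM_UTF8 + payload

# Plain 'utf-8' keeps the BOM as an invisible first character.
strict = raw.decode("utf-8")

# The BOM-aware spelling removes it, and is harmless when no BOM is there.
lenient = decode_utf8_lenient(raw)
no_bom = decode_utf8_lenient(payload)

print(repr(strict))
print(repr(lenient))
print(repr(no_bom))
```

The difference only shows up in repr() or in a length check, which is
exactly why it tends to surprise people.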

Regards

Antoine.

