Hi,

On Wednesday, 13 September 2006 at 16:14 -0700, Josiah Carlson wrote:
> In any case, I believe that the above behavior is correct for the
> context. Why? Because utf-8 has no endianness, its 'generic' decoding
> spelling of 'utf-8' is analogous to all three 'utf-16', 'utf-16-be', and
> 'utf-16-le' decoding spellings; two of which don't strip.

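To make the behaviour under discussion concrete, here is a minimal sketch, assuming a current Python 3 interpreter and its standard codecs (including 'utf-8-sig', the BOM-stripping UTF-8 variant). The sample string and variable names are made up for illustration.

```python
import codecs

text = "hello"  # hypothetical sample text

# UTF-8 bytes that start with a BOM, as some editors write them.
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")

# The generic 'utf-8' spelling keeps the BOM as a leading U+FEFF character.
plain = utf8_with_bom.decode("utf-8")

# The 'utf-8-sig' spelling removes a leading BOM if one is present.
stripped = utf8_with_bom.decode("utf-8-sig")

# UTF-16 little-endian bytes that start with a BOM.
utf16_le_with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# The generic 'utf-16' spelling consumes the BOM to detect the byte order.
generic16 = utf16_le_with_bom.decode("utf-16")

# The explicit 'utf-16-le' spelling keeps the BOM as a leading U+FEFF.
explicit16 = utf16_le_with_bom.decode("utf-16-le")

print(repr(plain), repr(stripped), repr(generic16), repr(explicit16))
```

Of the four spellings shown, only 'utf-8-sig' and the generic 'utf-16' strip the BOM, which is the asymmetry being debated here.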
Your opinion is probably valid from a theoretical point of view. You are more knowledgeable than I am. My point was different: most programmers are not at your level (or Paul's level, etc.) when it comes to Unicode knowledge. Py3k's str type is supposed to be an abstracted textual type that makes it easy to write Unicode-friendly applications (isn't it?). Therefore it should hide the messy issue of superfluous BOMs, unwanted BOMs, etc. Telling the programmer to use a specific UTF-8 variant specialized in BOM-stripping will make eyes roll... "Why doesn't the standard UTF-8 do it for me?"

Regards

Antoine.

_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
