On Saturday 09 January 2010 01:47:38, you wrote:
> One concern I have with this implementation encoding="BOM" is that if
> there is no BOM it assumes UTF-8.
If no BOM is found, it falls back to the current heuristic:
os.device_encoding() or the system locale.

> (...) Hence, it might be that someone would expect a UTF-16LE (or any of
> the formats that don't require a BOM, rather than UTF-8), but be willing
> to accept any BOM-discriminated format.
> (...) declare that they will accept
> any BOM-discriminated format, but want to default, in the absence of a
> BOM, to the original national language locale that they historically
> accepted

You mean "if there is a BOM, use it, otherwise fall back to a specific
charset"? How could it be declared? Maybe:

   open("file.txt", check_bom=True, encoding="UTF16-LE")
   open("file.txt", check_bom=True, encoding="latin1")

Falling back to UTF-8 would be written:

   open("file.txt", check_bom=True, encoding="UTF-8")

As explained before, check_bom=True is only accepted for read-only file
modes.

Well, why not. This is a third choice for my point (1) :-) It sits between
Guido's and Antoine's choices, and I like it because we can fall back to
UTF-8 instead of the dummy system locale: Windows users will be happy to be
able to use UTF-8 :-) I prefer falling back to a fixed encoding rather than
depending on the system locale.

-- 
Victor Stinner
http://www.haypocalc.com/
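
To make the "use the BOM if present, otherwise fall back to a given charset" idea concrete, here is a rough sketch of how it might behave. Note that check_bom is only a proposal in this thread, not an existing open() parameter, so the sketch uses a hypothetical helper, open_check_bom(), together with a hypothetical detect_bom(); only the codecs BOM constants and io.TextIOWrapper are real. It is read-only, as described above.

```python
import codecs
import io

# Longest BOMs first: the UTF-32-LE BOM starts with the UTF-16-LE BOM bytes,
# so checking UTF-16 first would misdetect UTF-32-LE files.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head):
    """Return (encoding, bom_length) for the leading bytes, or (None, 0)."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return None, 0


def open_check_bom(path, encoding):
    """Hypothetical stand-in for open(path, check_bom=True, encoding=...).

    Opens the file for reading only. If a BOM is found, it selects the
    encoding and is skipped; otherwise the caller's encoding is used.
    """
    raw = open(path, "rb")
    try:
        detected, bom_len = detect_bom(raw.read(4))
        raw.seek(bom_len)  # skip the BOM so it is not returned as text
        return io.TextIOWrapper(raw, encoding=detected or encoding)
    except Exception:
        raw.close()
        raise
```

With this sketch, open_check_bom("file.txt", "latin1") would decode a BOM-less file as Latin-1 but still honour a UTF-8 or UTF-16 BOM when one is present. The fallback is whatever fixed encoding the caller passes, never the system locale.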