On Sun, Jan 24, 2021 at 10:17 AM Guido van Rossum <gu...@python.org> wrote:
>
> I have definitely seen BOMs written by Notepad on Windows 10.
>
> Why can’t the future be that open() in text mode guesses the encoding?

I don't like guessing. As a Japanese speaker, I have seen a lot of
mojibake caused by wrong guesses.
I don't think guessing the encoding is a good part of reliable software.

On the other hand, if we add `open_utf8()`, it's easy to ignore the BOM:

* When reading, use "utf-8-sig". (It can also read UTF-8 without a BOM.)
* When writing, use "utf-8".
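
The two rules above could be sketched roughly as follows. This is only an
illustration, not a proposed implementation: the signature of `open_utf8()`
and the way it treats modes such as "r+" are assumptions made for the example.

```python
import os
import tempfile


def open_utf8(file, mode="r", **kwargs):
    """Hypothetical helper: open a text file as UTF-8, ignoring a BOM on read."""
    if "b" in mode:
        raise ValueError("open_utf8() is for text mode only")
    # Assumption: any mode that can write ("w", "a", "x", "+") uses plain
    # "utf-8" so that no BOM is ever written. Pure read modes use
    # "utf-8-sig", which skips a leading BOM and also accepts BOM-less UTF-8.
    can_write = any(c in mode for c in "wax+")
    encoding = "utf-8" if can_write else "utf-8-sig"
    return open(file, mode, encoding=encoding, **kwargs)


# Small demo: a file saved with a BOM reads back without it,
# and a file written through the helper has no BOM.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notepad.txt")
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbfhello")  # UTF-8 BOM followed by text
    with open_utf8(path) as f:
        text = f.read()
    with open_utf8(path, "w") as f:
        f.write("hello")
    with open(path, "rb") as f:
        raw = f.read()
```

With plain "utf-8" the first read would return the text with a leading
U+FEFF character, which is the case the helper is meant to hide.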

Although UTF-8 with BOM is not recommended, and Notepad has used UTF-8
without BOM as its default encoding since Windows 10 version 1903,
UTF-8 with BOM is still used in some cases.
For example, Excel reads a CSV file either as UTF-8 with BOM or as a
legacy encoding. So some CSV files are written with a BOM.
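
For that Excel case, a writer that does want the BOM can ask for it
explicitly with "utf-8-sig". A minimal sketch, where the function name
`write_csv_for_excel()` and the sample rows are made up for the example:

```python
import csv
import os
import tempfile


def write_csv_for_excel(path, rows):
    """Hypothetical helper: write rows as a CSV file that starts with a UTF-8 BOM."""
    # "utf-8-sig" writes the BOM once at the start of the file.
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.writer.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    write_csv_for_excel(path, [["word", "language"], ["日本語", "Japanese"]])
    with open(path, "rb") as f:
        head = f.read(3)  # the first three bytes are the UTF-8 BOM
```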

Regards,
-- 
Inada Naoki  <songofaca...@gmail.com>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BJC6LCYNO2HHRLHF4TFHWTG53M4YL6LL/
Code of Conduct: http://python.org/psf/codeofconduct/
