On Fri, Feb 12, 2021 at 5:18 AM Jim J. Jewett <jimjjew...@gmail.com> wrote:
>
> Inada Naoki wrote:
>
> > Default encoding is used for:
> >
> > a. Really need to use locale specific encoding
> > b. UTF-8 (bug. not work on Windows)
> > c. ASCII (not a bug, but slow on Windows)
> >
> > I assume most usages are (b) and (c). This PEP can reduce them soon.
>
> Is this just an assumption, based on those times being visible to someone who
> installs a lot of packages, or has the use of any locale other than UTF-8 and
> ASCII really gone down a lot? Have browsers stopped using charset sniffing?
>

Using "most" was my mistake; my English is not good. I should have said
"many" here.

You can find many bugs caused by not specifying `encoding="utf-8"` on Q&A
sites. I included some numbers about these common bugs in the PEP.

UTF-8 is used by 96.3% of websites [1], although browsers still use charset
sniffing. But how does that relate to this PEP?

[1] https://w3techs.com/technologies/details/en-utf8

> > Additionally, encoding="locale" will be backward/forward compatible
>
> What would be the problem with changing the default from None to locale?

It doesn't work on Python 3.9 and earlier. So using `encoding="locale"` is
not recommended until users drop Python 3.9 support.

> (I think you mentioned that they are the same 99% of the time; is that other
> 1% likely to be cases where locale is wrong but None is right? Would there
> be a better way to represent that 1%?)
>

`encoding="locale"` and `encoding=None` behave the same, except that
`encoding="locale"` does not emit EncodingWarning even when the warning is
opted in.

There is only a small difference between `encoding=None` and
`encoding=locale.getpreferredencoding(False)`. They differ only when all of
the following hold:

* Python is running on Windows, and
* the file is a console, and
* (for open()) PYTHONLEGACYWINDOWSSTDIO is set
* (for TextIOWrapper()) the file is not _WindowsConsoleIO

In that case, encoding=None uses the console code page, but
encoding=locale.getpreferredencoding(False) uses the ANSI code page.

Otherwise, encoding=None and encoding=locale.getpreferredencoding(False)
are the same.

So `encoding=locale.getpreferredencoding(False)` can be used to specify the
locale-specific encoding explicitly. But this PEP doesn't recommend it.
This PEP recommends using EncodingWarning just to find missing
`encoding="utf-8"` (or any other specific encoding).

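As a rough illustration of the two explicit spellings discussed above, here is a minimal sketch. The helper name `write_then_read` and the sample strings are made up for this example and do not come from the PEP. It only uses calls that already exist on Python 3.9 and earlier, so it avoids `encoding="locale"` on purpose.

```python
import locale
import os
import tempfile


def write_then_read(text, encoding):
    """Round-trip text through a temporary file with an explicit encoding.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # Passing encoding= explicitly means the default encoding is never
        # consulted, so the result does not depend on the platform locale.
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# Cases (b) and (c): the file is meant to be UTF-8 (or ASCII), so say so.
utf8_result = write_then_read("caf\u00e9", "utf-8")

# Case (a): the locale-specific encoding is really wanted. This spelling
# also works on Python 3.9 and earlier, unlike encoding="locale".
locale_encoding = locale.getpreferredencoding(False)
locale_result = write_then_read("plain ascii text", locale_encoding)
```

The sample text for the locale case is kept to ASCII because the locale encoding may not be able to represent other characters, which is exactly the portability problem with relying on it.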
-- 
Inada Naoki <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PD4BTBAQHFUYOCF5QKIBDIMHATPVEFPW/