On Tue, Feb 2, 2021 at 5:40 AM Inada Naoki <songofaca...@gmail.com> wrote:
> > In Python 3.10, I added _locale._get_locale_encoding() function which
> > is exactly what the encoding used by open() when no encoding is
> > specified (encoding=None) and when os.device_encoding(fd) returns
> > None. See _Py_GetLocaleEncoding() for the C implementation
> > (Python/fileutils.c).
> >
> > Maybe we should add a public locale.get_locale_encoding() function? On
> > Unix, this function uses nl_langinfo(CODESET) *without* setting
> > LC_CTYPE locale to the user preferred locale.
>
> I can not imagine any use case. Isn't it just confusing?

It's the same as locale.getpreferredencoding(False) but with a more
explicit name, no argument and a *sane default behavior* (don't change
the LC_CTYPE locale temporarily).

The use case is to pass text to the OS (or get text from the OS) when
you cannot pass text directly, but must encode it (or decode it)
manually. Not all use cases involve files ;-)

Examples of locale.getpreferredencoding() usage:

* XML ElementTree uses locale.getpreferredencoding() when
  encoding="unicode" is used
* Deprecated gettext functions use it to encode to bytes
* the cgi module uses it to encode the URL query string for the CGI
  stdin (GET and HEAD methods)

I dislike getpreferredencoding() because by default it temporarily
changes the LC_CTYPE locale, which affects all threads, and this is
bad.

Well, it doesn't have to be part of the PEP ;-)

> > I understand that encoding=locale.get_locale_encoding() would be
> > different from encoding="locale":
> > encoding=locale.get_locale_encoding() doesn't call
> > os.device_encoding(), right?
>
> Yes.

Would it be useful to add an io.get_locale_encoding(fd)->str (maybe
"get_default_encoding"?) function which gives the chosen encoding from
a file descriptor, similar to open(fd, encoding="locale").encoding?
The os.device_encoding() call is not obvious.

> > Maybe the PEP should also explain (in a "How to teach this" section?)
> > when encoding="locale" is better than a specific encoding, like
> > encoding="utf-8" or encoding="cp1252". In my experience, it's mostly
> > for the inter-operability which other applications which also use the
> > current locale encoding.
>
> This option is for experts who are publishing cross-platform
> libraries, frameworks, etc.
>
> For students, I am suggesting another idea that make UTF-8 mode more
> accessible.

Maybe just say that in the "How to teach this" section of the PEP? In
case of doubt, pass encoding="utf-8".
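
To make the os.device_encoding() step and the "pass encoding="utf-8""
advice concrete, here is a minimal sketch. guess_default_encoding() is a
hypothetical helper, not an existing or proposed API; it only follows
the behavior described in this thread (device encoding first, then the
locale encoding without touching LC_CTYPE). Other settings such as UTF-8
mode are not modeled, and the file name is an arbitrary example.

```python
import locale
import os
import tempfile


def guess_default_encoding(fd):
    """Hypothetical helper: approximate the encoding that open(fd) would
    pick when no encoding is given, as described in this thread."""
    # os.device_encoding() returns an encoding name only when fd is
    # connected to a terminal; otherwise it returns None.
    device = os.device_encoding(fd)
    if device is not None:
        return device
    # Passing False asks for the current locale encoding without
    # temporarily switching LC_CTYPE to the user preferred locale.
    return locale.getpreferredencoding(False)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    # Explicit encoding: the bytes on disk do not depend on the locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")
    with open(path, "rb") as f:
        raw = f.read()
        default_encoding = guess_default_encoding(f.fileno())

print(raw)               # the UTF-8 bytes, identical on every platform
print(default_encoding)  # varies with the platform and the user locale
```

The first value is the same everywhere; the second is the one that
varies from machine to machine.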
Only use encoding="locale" if you understand that the encoding changes
depending on the platform and the user locale. The common issue with
encoding="locale" is that such files cannot be safely exchanged between
two computers. encoding="locale" is good for files which remain local.
It's also good for interoperability with other applications which use
the locale encoding and with the terminal.

> > > Opt-in warning
> > > ---------------
> > >
> > > Although ``DeprecationWarning`` is suppressed by default, emitting
> > > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > > would be too noisy.
> >
> > The PEP is not very clear. Does "-X warn_encoding" only emits the
> > warning, or does it also display it by default? Does it add a warning
> > filter for EncodingWarning?
>
> This section is not the spec. This section is the rationale for adding
> EncodingWarning instead of using DeprecationWarning.
>
> As spec saying, EncodingWarning is a subclass of Warning. So it is
> displayed by default. But it is not emitted by default.
>
> When -X encoding_warning (or -X warn_default_encoding) is used, the
> warning is emitted and shown unless the user suppresses warnings.

I understand that EncodingWarning is always displayed by default
(default warning filters don't ignore it, whereas DeprecationWarning
is ignored by default), but no warning is emitted by default. Ok,
that makes sense. Maybe try to say it explicitly in the PEP.

> This PEP doesn't have "backward compatibility" section because the PEP
> doesn't break any backward compatibility.

IMO it's a good thing to always have the section, just to say that you
took time to think about backward compatibility ;-) The section can be
empty, like just say "there is no incompatible change" ;-)

> And if developers want to support Python ~3.9 and use -X
> warn_default_encoding on 3.10, they need to write
> `encoding=getattr(io, "LOCALE_ENCODING", None)`, as written in the
> spec.

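For illustration, the getattr() idiom quoted above could be used as
follows. This is only a sketch: io.LOCALE_ENCODING is the name from the
PEP draft under discussion and is assumed, not guaranteed, to exist.
Where it is missing, getattr() returns None and open() falls back to its
usual default. The file name is an arbitrary example.

```python
import io
import os
import tempfile

# Assumption: io.LOCALE_ENCODING is the constant named in the PEP draft.
# On interpreters that lack it, this evaluates to None, which is the
# closest equivalent (the default locale-dependent behavior of open()).
LOCALE_ENCODING = getattr(io, "LOCALE_ENCODING", None)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "local-notes.txt")
    # The file stays on this machine, so the locale encoding is fine.
    with open(path, "w", encoding=LOCALE_ENCODING) as f:
        f.write("hello")
    with open(path, encoding=LOCALE_ENCODING) as f:
        content = f.read()

print(content)
```
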
Maybe repeat it in the Backward Compatibility section. It's important
to provide a way to prevent the warning without losing support for
old Python versions.

> > The main question is if it's possible to use encoding="locale" on
> > Python 3.6-3.9 (maybe using some ugly hacks).
>
> No.

Hum. To write code compatible with Python 3.9, I understand that
encoding=None is the closest to encoding="locale".

And I understand that encoding=getattr(io, "LOCALE_ENCODING", None) is
backward and forward compatible ;-)

Well, encoding=None will hopefully remain accepted with your PEP
anyway for lazy developers ;-)

> Oh, I'm sorry. I want to make it in 3.10.

Since it doesn't change anything by default (the warning is only
displayed when you opt in to it), IMO targeting Python 3.10 is
reasonable.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XDB6YASB37HJYKYYYNQ43IL2GESNWSFC/
Code of Conduct: http://python.org/psf/codeofconduct/