On Tue, Feb 2, 2021 at 5:40 AM Inada Naoki <songofaca...@gmail.com> wrote:
> > In Python 3.10, I added _locale._get_locale_encoding() function which
> > is exactly what the encoding used by open() when no encoding is
> > specified (encoding=None) and when os.device_encoding(fd) returns
> > None. See _Py_GetLocaleEncoding() for the C implementation
> > (Python/fileutils.c).
> >
> > Maybe we should add a public locale.get_locale_encoding() function? On
> > Unix, this function uses nl_langinfo(CODESET) *without* setting
> > LC_CTYPE locale to the user preferred locale.
> >
>
> I can not imagine any use case. Isn't it just confusing?

It's the same as locale.getpreferredencoding(False), but with a more
explicit name, no argument, and a *sane default behavior* (it doesn't
temporarily change the LC_CTYPE locale).

The use case is to pass text to the OS (or get text from the OS) when
you cannot pass text directly, but must encode it (or decode it)
manually. Not all use cases involve files ;-)
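
A minimal sketch of that use case, assuming two hypothetical helper
names (encode_for_os and decode_from_os are made up for illustration,
they are not existing APIs). It relies only on
locale.getpreferredencoding(False), which reads the current locale
encoding without changing LC_CTYPE:

```python
import locale


def encode_for_os(text: str) -> bytes:
    """Encode text with the current locale encoding (hypothetical helper)."""
    # Passing False skips the temporary setlocale(LC_CTYPE, "") call,
    # so other threads are not affected.
    encoding = locale.getpreferredencoding(False)
    return text.encode(encoding)


def decode_from_os(data: bytes) -> str:
    """Decode bytes received from the OS with the same encoding."""
    encoding = locale.getpreferredencoding(False)
    return data.decode(encoding)
```

Error handlers (for example "surrogateescape") are left out on
purpose; which one is right depends on the data being exchanged.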

Examples of locale.getpreferredencoding() usage:

* XML ElementTree uses locale.getpreferredencoding() when
encoding="unicode" is used
* Deprecated gettext functions use it to encode to bytes
* The cgi module uses it to encode the URL query string for the CGI
stdin (GET and HEAD methods)

I dislike getpreferredencoding() because by default it temporarily
changes the LC_CTYPE locale, which affects all threads, and this is
bad.

Well, it doesn't have to be part of the PEP ;-)

> > I understand that encoding=locale.get_locale_encoding() would be
> > different from encoding="locale":
> > encoding=locale.get_locale_encoding() doesn't call
> > os.device_encoding(), right?
> >
>
> Yes.

Would it be useful to add an io.get_locale_encoding(fd)->str (maybe
"get_default_encoding"?) function which returns the encoding chosen
for a file descriptor, similar to open(fd, encoding="locale").encoding?
The os.device_encoding() call is not obvious.
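
Roughly what I have in mind, as a sketch only: get_default_encoding()
is a hypothetical name, it does not exist in io, and the fallback
order below is my reading of what open() does when no encoding is
given.

```python
import locale
import os


def get_default_encoding(fd: int) -> str:
    """Guess the encoding used for fd when none is specified (sketch)."""
    # os.device_encoding() returns None unless fd is connected to a
    # terminal.
    encoding = os.device_encoding(fd)
    if encoding is None:
        # Fall back to the locale encoding, without changing LC_CTYPE.
        encoding = locale.getpreferredencoding(False)
    return encoding
```

For a regular file, the first call returns None, so the result is just
the locale encoding.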


> > Maybe the PEP should also explain (in a "How to teach this" section?)
> > when encoding="locale" is better than a specific encoding, like
> > encoding="utf-8" or encoding="cp1252". In my experience, it's mostly
> > for interoperability with other applications which also use the
> > current locale encoding.
>
> This option is for experts who are publishing cross-platform
> libraries, frameworks, etc.
>
> For students, I am suggesting another idea that makes UTF-8 mode more
> accessible.

Maybe just say that in "How to teach this" section in the PEP?

In case of doubt, pass encoding="utf-8". Only use encoding="locale" if
you understand that the encoding changes depending on the platform and
the user locale. The common pitfall with encoding="locale" is that the
resulting files cannot be safely exchanged between two computers.
encoding="locale" is good for files which remain local. It's also good
for interoperability with other applications which use the locale
encoding and with the terminal.
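
To illustrate the exchange problem, here is a small sketch which
pretends that one computer uses cp1252 as its locale encoding and the
other one uses UTF-8. The can_decode() helper is only there for the
demonstration:

```python
text = "caf\u00e9"

# The same text stored by two machines with different locale encodings.
data_cp1252 = text.encode("cp1252")
data_utf8 = text.encode("utf-8")


def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if data decodes without error using encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Reading the cp1252 file as UTF-8 fails outright.
cp1252_read_as_utf8 = can_decode(data_cp1252, "utf-8")
# Reading the UTF-8 file as cp1252 "works" but yields mojibake.
utf8_read_as_cp1252 = data_utf8.decode("cp1252")
```

The second case is the nasty one: no exception, just wrong text.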


> > > Opt-in warning
> > > ---------------
> > >
> > > Although ``DeprecationWarning`` is suppressed by default, emitting
> > > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > > would be too noisy.
> >
> > The PEP is not very clear. Does "-X warn_encoding" only emit the
> > warning, or does it also display it by default? Does it add a warning
> > filter for EncodingWarning?
> >
>
> This section is not the spec. This section is the rationale for adding
> EncodingWarning instead of using DeprecationWarning.
>
> As the spec says, EncodingWarning is a subclass of Warning. So it is
> displayed by default. But it is not emitted by default.
>
> When -X encoding_warning (or -X warn_default_encoding) is used, the
> warning is emitted and shown unless the user suppresses warnings.

I understand that EncodingWarning is always displayed by default
(default warning filters don't ignore it, whereas DeprecationWarning
is ignored by default), but no warning is emitted by default. Ok,
that makes sense. Maybe try to say it explicitly in the PEP.
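
A way to check this behavior, assuming the option ends up being
spelled -X warn_default_encoding and that the interpreter implements
the PEP (older versions simply ignore unknown -X options). run_child()
is a made-up helper which runs a tiny script in a fresh interpreter
and returns what was written to stderr:

```python
import subprocess
import sys

# The child opens a file in text mode without passing encoding.
CHILD_CODE = "import os; open(os.devnull).close()"


def run_child(*interpreter_args: str) -> str:
    """Run CHILD_CODE in a new interpreter and return its stderr."""
    proc = subprocess.run(
        [sys.executable, *interpreter_args, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
    )
    return proc.stderr


if __name__ == "__main__":
    # Without the option, nothing is emitted, so stderr stays quiet.
    print(repr(run_child()))
    # With the option, the warning is emitted and the default filters
    # let it through.
    print(repr(run_child("-X", "warn_default_encoding")))
```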


> This PEP doesn't have "backward compatibility" section because the PEP
> doesn't break any backward compatibility.

IMO it's a good thing to always have the section, just to say that you
took time to think about backward compatibility ;-) The section can be
nearly empty: it can just say "there is no incompatible change" ;-)


> And if developers want to support Python ~3.9 and use -X
> warn_default_encoding on 3.10, they need to write
> `encoding=getattr(io, "LOCALE_ENCODING", None)`, as written in the
> spec.

Maybe repeat it in the Backward Compatibility section.

It's important to provide a way to prevent the warning without losing
the support for old Python versions.


> > The main question is if it's possible to use encoding="locale" on
> > Python 3.6-3.9 (maybe using some ugly hacks).
>
> No.

Hum. To write code compatible with Python 3.9, I understand that
encoding=None is the closest to encoding="locale".

And I understand that encoding=getattr(io, "LOCALE_ENCODING", None) is
backward and forward compatible ;-)
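
For example, a sketch based on the spelling quoted above.
io.LOCALE_ENCODING is the name from the draft spec, so it may not
exist on a given interpreter, in which case the expression evaluates
to None and open() behaves as before. read_local_text() is a
hypothetical helper:

```python
import io

# Falls back to None (the legacy "use the locale encoding" default)
# when the io module has no LOCALE_ENCODING attribute.
ENCODING = getattr(io, "LOCALE_ENCODING", None)


def read_local_text(path: str) -> str:
    """Read a file which is expected to use the locale encoding."""
    with open(path, encoding=ENCODING) as f:
        return f.read()
```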

Well, encoding=None will hopefully remain accepted with your PEP
anyway for lazy developers ;-)


> Oh, I'm sorry. I want to make it in 3.10.

Since it doesn't change anything by default and the warning is only
displayed when you opt in to it, IMO targeting Python 3.10 is
reasonable.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XDB6YASB37HJYKYYYNQ43IL2GESNWSFC/
Code of Conduct: http://python.org/psf/codeofconduct/
