Thanks all. The PR is in process, and I believe it includes everything brought up here.
If you have any more thoughts, please post them there.

-CHB

On Thu, Aug 26, 2021 at 1:54 AM Petr Viktorin <encu...@gmail.com> wrote:

> On 26. 08. 21 9:54, Marc-Andre Lemburg wrote:
> > On 26.08.2021 06:07, Christopher Barker wrote:
> >> I'm working on a PR now. It seems there is little support for keeping the
> >> python2 content in the docs, so I'm re-writing it as though it was never there.
> >> If someone wants to add a note about Python 2, of course that can be added later.
> >>
> >> Note that "moving the Python 2 content to a section at the end" is not all that
> >> straightforward, as it is pretty mixed in with the text at this point.
> >>
> >> But now a question -- the current text reads:
> >>
> >> "Code in the core Python distribution should always use UTF-8"
> >>
> >> and then:
> >>
> >> "In the standard library, non-default encodings should be used only for
> >> test purposes or when a comment or docstring needs to mention an author
> >> name that contains non-ASCII characters ..."
> >>
> >> I *think* that's a remnant of the Py2 ASCII encoding days -- but I wanted to
> >> make sure, a bit later on, it says:
> >>
> >> "The following policy is prescribed for the
> >> standard library ... In addition, string literals and comments must also be in
> >> ASCII."
> >
> > For Python 2 code we mandated ASCII for the stdlib, with some exceptions
> > using the source code encoding for testing purposes or in case e.g.
> > Martin von Löwis or Marc-André Lemburg wanted to put his name into the code
> > without escaping part of it ;-)
> >
> > Note that Python 2 defaults to ASCII as source code encoding.
> >
> > With UTF-8 as standard source code encoding, this is no longer
> > necessary.
> >
> > So the second quote can be changed to "In the standard library, non-default
> > source code encodings should be used only for test purposes ...".
> >
> >> Is that still correct for string literals and comments? And what about docstrings?
> >>
> >> It seems to me that if we really are utf-8, then there is no need for those
> >> "textual" elements to be ASCII, e.g. they can still contain non-ascii characters,
> >> and escaping those makes things less readable, not more.
> >>
> >> So I think that section should now read:
> >>
> >> """
> >> Source File Encoding
> >> --------------------
> >>
> >> Code in the core Python distribution should always use UTF-8, and should not
> >> have an encoding declaration.
> >>
> >> In the standard library, non-UTF-8 encodings should be used only for
> >> test purposes.
> >
> > I think the above should be limited to Python code. In C or other
> > source files you may well still need a source code encoding.
> >
> >> The following policy is prescribed for the standard library (see PEP
> >> 3131): All identifiers in the Python standard library MUST use
> >> ASCII-only identifiers, and SHOULD use English words wherever feasible
> >> (in many cases, abbreviations and technical terms are used which aren't
> >> English). In comments and docstrings, authors whose names are not
> >> based on the Latin alphabet (latin-1, ISO/IEC 8859-1 character set)
> >> MUST provide a transliteration of their names in this character set.
> >>
> >> Open source projects with a global audience are encouraged to adopt a
> >> similar policy.
> >> """
> >>
> >> But maybe we do want to keep comments, docstrings and literals as ASCII with
> >> escapes?
> >
> > No need for the stdlib, since UTF-8 is widely accepted by now
> > and why should people with non-ASCII names not be able to write
> > their true name ?
> >
> > You may have noted that I rarely do... the reason is that in the
> > past, the accent on the "e" caused me too many problems. Perhaps
> > one of these days, I'll go back to adding it again :-)
>
> I would drop the weirdly specific "(latin-1, ISO/IEC 8859-1 character
> set)" note, and only keep "based on the Latin alphabet".
> The Ł in Łukasz's name is not in latin-1, and I don't think it needs
> different treatment than German or French names. (As opposed to a
> Russian or Chinese name, where an average English speaker isn't able
> to type an approximation of the name on their keyboard.)
>
> - Peťa Viktorin
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/E6B6INCC5IH5477XF5BGXPC3GPIEER5R/
> Code of Conduct: http://python.org/psf/codeofconduct/
>

--
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AUDBMYIGNYJ5O37IWP2PW33HUVW24DB5/
Code of Conduct: http://python.org/psf/codeofconduct/
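
For anyone who wants to see what the policy drafted in this thread would mean for a given file, here is a minimal sketch, not part of the PR and not an existing stdlib tool. The helper names `has_encoding_declaration` and `non_ascii_identifiers` are made up for illustration; the regular expression is the one given in PEP 263. It reports an encoding declaration and any non-ASCII identifiers, and deliberately ignores comments and string literals, which the thread agrees may contain non-ASCII text.

```python
import io
import re
import tokenize

# Pattern for an encoding declaration, as given in PEP 263.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def has_encoding_declaration(source: str) -> bool:
    """Return True if either of the first two lines declares an encoding."""
    return any(CODING_RE.match(line) for line in source.splitlines()[:2])


def non_ascii_identifiers(source: str) -> list[str]:
    """Return NAME tokens that contain non-ASCII characters.

    Comments and string literals are separate token types, so names
    such as "Löwis" inside them are not reported.
    """
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens
            if tok.type == tokenize.NAME and not tok.string.isascii()]


if __name__ == "__main__":
    sample = 'naïve = 1\nauthor = "Löwis"  # thanks, André\n'
    print(has_encoding_declaration(sample))
    print(non_ascii_identifiers(sample))
```

Only the identifier `naïve` in the sample would be flagged; the author names in the string and the comment pass, which matches the "no need for escapes" position above.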