[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Jim J. Jewett
Stephen J. Turnbull wrote: > Jim J. Jewett writes: > > At the time, we considered it, and we also considered a narrower > > restriction on using multiple scripts in the same identifier, or at > > least the same identifier portion (so it was OK if separated by > > _). > > This would ban "παν語",
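
The per-portion restriction mentioned above (no mixing of scripts within an identifier portion, with `_` as a separator) could be sketched roughly as follows. This is an illustration only, not anything proposed in the thread: the standard library has no Unicode script property, so the helper names are hypothetical and the first word of each character's Unicode name is assumed to be a good-enough stand-in for its script.

```python
import unicodedata


def scripts_in(text):
    """Return a rough set of script labels for the letters in text.

    Assumption: the first word of a character's Unicode name
    (e.g. 'GREEK', 'CJK', 'LATIN') approximates its script.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


def mixed_script_portions(identifier):
    """Return the '_'-separated portions that mix more than one script."""
    return [
        part for part in identifier.split("_")
        if len(scripts_in(part)) > 1
    ]
```

Under this heuristic, "παν語" is reported as a mixed-script portion, while an identifier such as "x_π" passes because each portion uses a single script.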

[Python-Dev] 3.9.8 and 3.11a2 temporarily on hold due to Tcl/Tk

2021-11-03 Thread Łukasz Langa
As you might have noticed, 3.9.8 was scheduled for release on Monday. This didn't happen yet. There's a bunch of ongoing work fixing Tcl/Tk problems. macOS Monterey got released with a new incompatible Tcl/Tk version, some fixes were required for tkinter compatibility. Details in

[Python-Dev] Re: PEP 663:

2021-11-03 Thread Brett Cannon
Rendered versions can be found at the following links: https://www.python.org/dev/peps/pep-0663/ https://python.github.io/peps/pep-0663/ On Tue, Nov 2, 2021 at 8:41 PM Ethan Furman wrote: > See the latest changes, which are mostly a (hopefully) improved abstract, > better tables, and some

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Chris Jerdonek
On Tue, Nov 2, 2021 at 7:21 AM Petr Viktorin wrote: > That brings us to possible changes in Python in this area, which is an > interesting topic. Is there a use case or need for allowing the comment-starting character “#” to occur when text is still in the right-to-left direction? Disallowing
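
A related check that a tool could run today is to report explicit bidirectional control characters in source text. The sketch below is not the rule being asked about (it does not track whether "#" appears while the text direction is right-to-left); it only locates the control characters that can change direction. The function name and the choice to report every such character are assumptions.

```python
# Explicit bidirectional formatting characters:
# ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = {
    "\u061c", "\u200e", "\u200f",
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(source):
    """Return (line, column, code point) for each bidi control in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```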

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Serhiy Storchaka
03.11.21 15:14, Stephen J. Turnbull wrote: > So the only > time that wouldn't be true is if escape sequences are allowed to > represent characters. I believe unicode_escape is the only codec > that does. Also raw_unicode_escape and utf_7. And maybe punycode or idna, I am not sure.
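
As a small illustration of the three codecs named with certainty above, each of the byte strings below is pure ASCII yet decodes to the non-ASCII character U+00E9. The sample inputs are chosen for this sketch; punycode and idna are left out because the message itself is unsure about them.

```python
# ASCII-only byte strings that decode to a non-ASCII character.
ascii_bytes = {
    "unicode_escape": b"\\u00e9",
    "raw_unicode_escape": b"\\u00e9",
    "utf_7": b"+AOk-",
}

# Decode each sample with the codec it is keyed by.
decoded = {name: data.decode(name) for name, data in ascii_bytes.items()}
```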

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Serhiy Storchaka
03.11.21 14:31, Petr Viktorin wrote: > For example: should the parser emit a lightweight audit event if it > finds a non-ASCII identifier? (See below for why ASCII is special.) > Or for encoding declarations? There are audit events for import and compile. You can also register import hooks if you
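
The existing "compile" audit event can already be used for a coarse version of this. The sketch below is an assumption-laden illustration rather than the proposed parser event: it flags any compiled source containing non-ASCII characters at all (including in strings and comments), not specifically non-ASCII identifiers, and the list and function names are hypothetical.

```python
import sys

# Filenames of compiled sources that contained any non-ASCII character.
non_ascii_compiles = []


def audit_hook(event, args):
    """Record compile events whose source is not pure ASCII."""
    if event != "compile":
        return
    source, filename = args
    # source may also be None or an AST object; only text is checked here.
    if isinstance(source, (str, bytes)) and not source.isascii():
        non_ascii_compiles.append(filename)


# Note: audit hooks cannot be removed once added.
sys.addaudithook(audit_hook)

compile("naïve = 1", "<demo>", "exec")
compile("plain = 1", "<plain>", "exec")
```

Because the hook runs for every audit event in the process, it returns immediately for anything other than "compile".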

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Stephen J. Turnbull
Chris Angelico writes: > Ah, okay, so much for that, then. What about the weaker sense: > Characters below 128 are always and only represented by those byte > values? So if you find byte value 39, it might not actually be an > apostrophe, but if you're looking for an apostrophe, you know for

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin
On 03. 11. 21 12:33, Serhiy Storchaka wrote: 03.11.21 12:36, Petr Viktorin wrote: On 03. 11. 21 2:58, Kyle Stanley wrote: I'd suggest both: briefer, easier to read write up for average user in docs, more details/semantics in informational PEP. Thanks for working on this, Petr! Well, this is

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin
On 03. 11. 21 12:37, Chris Angelico wrote: On Wed, Nov 3, 2021 at 10:22 PM Steven D'Aprano wrote: On Wed, Nov 03, 2021 at 11:21:53AM +1100, Chris Angelico wrote: TBH, I'm not entirely sure how valid it is to talk about *security* considerations when we're dealing with Python source code and

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Petr Viktorin
We seem to agree that this is work for linters. That's reasonable; I'd generalize it to "tools and policies". But even so, discussing what we'd expect linters to do is on topic here. Perhaps we can even find ways for the language to support linters -- type checking is also for external tools,

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Chris Angelico
On Wed, Nov 3, 2021 at 10:22 PM Steven D'Aprano wrote: > > On Wed, Nov 03, 2021 at 11:21:53AM +1100, Chris Angelico wrote: > > > TBH, I'm not entirely sure how valid it is to talk about *security* > > considerations when we're dealing with Python source code and variable > > confusions, but

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Serhiy Storchaka
03.11.21 12:36, Petr Viktorin wrote: > On 03. 11. 21 2:58, Kyle Stanley wrote: >> I'd suggest both: briefer, easier to read write up for average user in >> docs, more details/semantics in informational PEP. Thanks for working >> on this, Petr! > > Well, this is the brief write-up :) > Maybe it

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Steven D'Aprano
On Wed, Nov 03, 2021 at 11:11:00AM +0100, Marc-Andre Lemburg wrote: > Coming back to the thread topic, many of the Unicode security > considerations don't apply to non-Unicode encodings, since those > usually don't support e.g. changing the bidi direction within a > stream of text or other

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Steven D'Aprano
On Wed, Nov 03, 2021 at 11:21:53AM +1100, Chris Angelico wrote: > TBH, I'm not entirely sure how valid it is to talk about *security* > considerations when we're dealing with Python source code and variable > confusions, but that's a term that is well understood. It's not like Unicode is the

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Steven D'Aprano
On Tue, Nov 02, 2021 at 05:55:55PM +0200, Serhiy Storchaka wrote: > All control characters except CR, LF, TAB and FF are banned outside > comments and string literals. I think it is worth to ban them in > comments and string literals too. In string literals you can use > backslash-escape
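
A checker for the stricter rule suggested above (no control characters other than CR, LF, TAB and FF anywhere in a file, comments and string literals included) might look like the following sketch. The function name is hypothetical, and "control character" is assumed here to mean Unicode general category Cc.

```python
import unicodedata

# The four control characters the quoted message lists as permitted.
ALLOWED_CONTROLS = {"\r", "\n", "\t", "\f"}


def find_control_chars(source):
    """Return (index, code point) for each disallowed control character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(source)
        if unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS
    ]
```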

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin
On 03. 11. 21 2:58, Kyle Stanley wrote: I'd suggest both: briefer, easier to read write up for average user in docs, more details/semantics in informational PEP. Thanks for working on this, Petr! Well, this is the brief write-up :) Maybe it would work better if the info was integrated into

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Paul Moore
On Wed, 3 Nov 2021 at 10:11, Marc-Andre Lemburg wrote: > I don't think limiting the source code encoding is the right approach > to making code more secure. Instead, tooling has to be used to detect > potentially malicious code points in code. +1 Discussing "making code more secure" without

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Marc-Andre Lemburg
On 03.11.2021 01:21, Chris Angelico wrote: > On Wed, Nov 3, 2021 at 11:09 AM Steven D'Aprano wrote: >> >> On Wed, Nov 03, 2021 at 03:03:54AM +1100, Chris Angelico wrote: >>> On Wed, Nov 3, 2021 at 1:06 AM Petr Viktorin wrote: Let me know if it's clear in the newest version, with this note:

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Chris Angelico
On Wed, Nov 3, 2021 at 8:01 PM Stephen J. Turnbull wrote: > > Chris Angelico writes: > > > But I was surprised to find that Python would let you use > > unicode_escape for source code. > > I'm not surprised. Today it's probably not necessary, but I've > exchanged a lot of code (not Python,

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Serhiy Storchaka
03.11.21 11:01, Stephen J. Turnbull wrote: > And of > course UTF-16 is incompatible in that sense, although I don't know if > anybody actually saves Python code in UTF-16. CPython does not currently support UTF-16 for source files.

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Stephen J. Turnbull
Chris Angelico writes: > But I was surprised to find that Python would let you use > unicode_escape for source code. I'm not surprised. Today it's probably not necessary, but I've exchanged a lot of code (not Python, though) with folks whose editors were limited to 8 bit codes or even just

[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Serhiy Storchaka
02.11.21 18:49, Jim J. Jewett wrote: > If escape sequences were also allowed in comments (or at least in strings > within comments), this would make sense. I don't like banning them > otherwise, since odd characters are often a good reason to need a comment, > but it is definitely a "mention,

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Chris Angelico
On Wed, Nov 3, 2021 at 5:12 PM Stephen J. Turnbull wrote: > > Chris Angelico writes: > > > Huh. Is that level of generality actually still needed? Can Python > > deprecate all but a small handful of encodings? > > I think that's pointless. With few exceptions (GB18030, Big5 has a > couple of

[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Stephen J. Turnbull
Chris Angelico writes: > Huh. Is that level of generality actually still needed? Can Python > deprecate all but a small handful of encodings? I think that's pointless. With few exceptions (GB18030, Big5 has a couple of code point pairs that encode the same very rare characters, ISO 2022