On Sat, Nov 10, 2018 at 12:42 PM Joao S. O. Bueno <jsbu...@python.org.br> wrote:
>
> I just saw some document which reminded me of strings with a
> backslash followed by 3 octal digits. When a backslash is followed by
> 3 octal digits, that means a character with the corresponding
> codepoint and all is well.
>
> The "valid scenario":
>
> In [42]: "\777"
> Out[42]: 'ǿ'
>
> The problem is when you have just two valid octal digits
>
> In [40]: "\778"
> Out[40]: '?8'
>
> Which is ambiguous at least -- why is this not "\x07" "77" for
> example? (octal 77 actually corresponds to the "?" (63) character)

Not ambiguous. It takes as many valid octal digits as it can.

https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals

\ooo ==> Character with octal value ooo
Note 1: As in Standard C, up to three octal digits are accepted.

"Up to" means that one or two digits can also define a character. For
obvious reasons, it has to take digits greedily (otherwise "\777"
would be "\x07" followed by "77"), and it's not an error to have fewer
digits. Permitting a single digit means that "\0" means the NUL
character, which is often convenient.

> And then when the second digit is not valid octal:
> In [43]: "\797"
> Out[43]: '\x0797'
> WAT?
>
> So, between the possibly ambiguous scenario with two octal digits
> followed by a non-octal digit, and the completely unexpected expansion
> to a 4-hexadecimal digit codepoint in the last case

You may be misinterpreting the last result. It's exactly the same as
the previous ones.

>>> list("\797")
['\x07', '9', '7']

The octal escape grabs as many digits as it can, and when it finds a
character in the literal that isn't a valid octal digit (same whether
it's a '9' or a 'q'), it stops. The remaining characters have no
special meaning; this does not become four hex digits. A "\xNN" escape
in Python must be exactly two digits, no more and no less.

> what do you say
> of deprecating any r"\[0-9]{1,3}" sequence that don't match full 3
> octal digits, and yield a syntax error for that from Python 3.9 (or
> 3.10) on?

Nope. Would break code for no good reason.

ChrisA
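
PS: For anyone who wants the rule spelled out, here is a minimal sketch
of the greedy "up to three octal digits" behaviour described above.
decode_octal_escapes is a hypothetical helper written purely for
illustration. It is not how CPython's tokenizer is implemented, and it
deliberately handles octal escapes only.

```python
# Illustrative sketch only: decode_octal_escapes is a hypothetical helper,
# not CPython's actual tokenizer code. It ignores every other escape type.
OCTAL_DIGITS = "01234567"


def decode_octal_escapes(text):
    """Expand backslash-octal escapes, taking up to three digits greedily."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in OCTAL_DIGITS:
            j = i + 1
            # Consume at most three octal digits; stop at the first
            # character that is not an octal digit ('8', '9', 'q', ...).
            while j < len(text) and j - i <= 3 and text[j] in OCTAL_DIGITS:
                j += 1
            out.append(chr(int(text[i + 1:j], 8)))
            i = j
        else:
            # Everything else is copied through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Raw strings keep the backslash so the helper sees the escape itself.
print(repr(decode_octal_escapes(r"\778")))   # '?8'
print(list(decode_octal_escapes(r"\797")))   # ['\x07', '9', '7']
print(repr(decode_octal_escapes(r"\0")))     # '\x00'
```

The second result is the one from the thread: one octal digit is
consumed, and '9' and '7' are left as ordinary characters, which is why
the repr reads '\x0797'.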