[Python-Dev] Re: Boundaries between numbers and identifiers

Victor Stinner Tue, 13 Apr 2021 16:03:56 -0700

It would be useful to first estimate how many projects would be broken
by such incompatible change (stricter syntax).


Inada-san wrote
https://github.com/methane/notes/blob/master/2020/wchar-cache/download_sdist.py
to download source files using
https://hugovk.github.io/top-pypi-packages/ API (top 4000 PyPI
projects).

Victor

On Tue, Apr 13, 2021 at 10:59 PM Guido van Rossum <gu...@python.org> wrote:
>
> On Tue, Apr 13, 2021 at 12:55 PM Serhiy Storchaka <storch...@gmail.com> wrote:
>>
>> 26.04.18 21:37, Serhiy Storchaka пише:
>> > In Python 2.5 `0or[]` was accepted by the Python parser. It became an
>> > error in 2.6 because "0o" became recognizing as an incomplete octal
>> > number. `1or[]` still is accepted.
>> >
>> > On other hand, `1if 2else 3` is accepted despites the fact that "2e" can
>> > be recognized as an incomplete floating point number. In this case the
>> > tokenizer pushes "e" back and returns "2".
>> >
>> > Shouldn't it do the same with "0o"? It is possible to make `0or[]` be
>> > parseable again. Python implementation is able to tokenize this example:
>> >
>> > $ echo '0or[]' | ./python -m tokenize
>> > 1,0-1,1:            NUMBER         '0'
>> > 1,1-1,3:            NAME           'or'
>> > 1,3-1,4:            OP             '['
>> > 1,4-1,5:            OP             ']'
>> > 1,5-1,6:            NEWLINE        '\n'
>> > 2,0-2,0:            ENDMARKER      ''
>> >
>> > On other hand, all these examples look weird. There is an assymmetry:
>> > `1or 2` is a valid syntax, but `1 or2` is not. It is hard to recognize
>> > visually the boundary between a number and the following identifier or
>> > keyword, especially if numbers can contain letters ("b", "e", "j", "o",
>> > "x") and underscores, and identifiers can contain digits. On both sides
>> > of the boundary can be letters, digits, and underscores.
>> >
>> > I propose to change the Python syntax by adding a requirement that there
>> > should be a whitespace or delimiter between a numeric literal and the
>> > following keyword.
>> >
>>
>> New example was found recently (see https://bugs.python.org/issue43833).
>>
>>  >>> [0x1for x in (1,2)]
>>  [31]
>>
>> It is parsed as [0x1f or x in (1,2)] instead of [0x1 for x in (1,2)].
>>
>> Since this code is clearly ambiguous, it makes more sense to emit a
>> SyntaxWarning if there is no space between number and identifier.
>
>
> I would totally make that a SyntaxError, and backwards compatibility be 
> damned.
>
> --
> --Guido van Rossum (python.org/~guido)
> Pronouns: he/him (why is my pronoun here?)
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/OU3USHVMXZJD4SA3FJGQQVQYAORHY5BM/
> Code of Conduct: http://python.org/psf/codeofconduct/



-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5VYOQRW4DOVDNSIB3G7GBHSUL5ZC3QZO/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: Boundaries between numbers and identifiers

Reply via email to