New submission from Erik Soma <stillusing...@gmail.com>:
'<>' is not recognized by the tokenize module as a single token; instead it is reported as two tokens.

```
$ python -c "import tokenize; import io; import pprint; pprint.pprint(list(tokenize.tokenize(io.BytesIO(b'<>').readline)))"
[TokenInfo(type=62 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line=''),
 TokenInfo(type=54 (OP), string='<', start=(1, 0), end=(1, 1), line='<>'),
 TokenInfo(type=54 (OP), string='>', start=(1, 1), end=(1, 2), line='<>'),
 TokenInfo(type=4 (NEWLINE), string='', start=(1, 2), end=(1, 3), line=''),
 TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]
```

I would expect:

```
[TokenInfo(type=62 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line=''),
 TokenInfo(type=54 (OP), string='<>', start=(1, 0), end=(1, 2), line='<>'),
 TokenInfo(type=4 (NEWLINE), string='', start=(1, 2), end=(1, 3), line=''),
 TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]
```

This is the behavior of the CPython tokenizer, which the tokenize module tries "to match the working of".

----------
messages: 383384
nosy: esoma
priority: normal
severity: normal
status: open
title: tokenize module does not recognize Barry as FLUFL
versions: Python 3.10, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42687>
_______________________________________
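For anyone who wants to check this on their own interpreter, here is a minimal sketch of the same reproduction as a small script rather than a one-liner. The helper name `operator_tokens` is made up for illustration and is not part of the tokenize module. It only collects the strings of the OP tokens, so the output shows directly whether '<>' came back as one token or two. No expected output is given, since the result may differ between Python versions.

```python
import io
import tokenize


def operator_tokens(source: bytes) -> list[str]:
    """Return the string of every OP token that tokenize reports for source.

    Hypothetical helper for illustration; not part of the tokenize module.
    """
    readline = io.BytesIO(source).readline
    return [
        tok.string
        for tok in tokenize.tokenize(readline)
        if tok.type == tokenize.OP
    ]


if __name__ == "__main__":
    # Two entries mean '<>' was split into '<' and '>';
    # a single entry means it was kept as one token.
    print(operator_tokens(b"<>"))
    # The '!=' spelling, for comparison.
    print(operator_tokens(b"!="))
```

A list of length two for `b"<>"` corresponds to the behavior reported above; a list of length one corresponds to the expected behavior.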