[issue42687] tokenize module does not recognize Barry as FLUFL

Erik Soma Sat, 19 Dec 2020 07:46:40 -0800


New submission from Erik Soma <stillusing...@gmail.com>:


'<>' is not recognized by the tokenize module as a single token, instead it is 
two tokens.

```
$ python -c "import tokenize; import io; import pprint; 
pprint.pprint(list(tokenize.tokenize(io.BytesIO(b'<>').readline)))"
[TokenInfo(type=62 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), 
line=''),
 TokenInfo(type=54 (OP), string='<', start=(1, 0), end=(1, 1), line='<>'),
 TokenInfo(type=54 (OP), string='>', start=(1, 1), end=(1, 2), line='<>'),
 TokenInfo(type=4 (NEWLINE), string='', start=(1, 2), end=(1, 3), line=''),
 TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]
```


I would expect:
```
[TokenInfo(type=62 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), 
line=''),
 TokenInfo(type=54 (OP), string='<>', start=(1, 0), end=(1, 2), line='<>'),
 TokenInfo(type=4 (NEWLINE), string='', start=(1, 2), end=(1, 3), line=''),
 TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]
```

This is the behavior of the CPython tokenizer which the tokenizer module tries 
"to match the working of".

----------
messages: 383384
nosy: esoma
priority: normal
severity: normal
status: open
title: tokenize module does not recognize Barry as FLUFL
versions: Python 3.10, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42687>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue42687] tokenize module does not recognize Barry as FLUFL

Reply via email to