[issue12675] tokenize module happily tokenizes code with syntax errors

Gareth Rees Fri, 05 Aug 2011 14:11:57 -0700

Gareth Rees <[email protected]> added the comment:

Terry: agreed. Does anyone actually use this module? Does anyone know what the 
design goals are for tokenize? If someone can tell me, I'll do my best to make 
it meet them.


Meanwhile, here's another bug. Each character of trailing whitespace is 
tokenized as an ERRORTOKEN.

    Python 3.3.0a0 (default:c099ba0a278e, Aug  2 2011, 12:35:03) 
    [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from tokenize import tokenize,untokenize
    >>> from io import BytesIO
    >>> list(tokenize(BytesIO('1 '.encode('utf8')).readline))
    [TokenInfo(type=57 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line=''),
     TokenInfo(type=2 (NUMBER), string='1', start=(1, 0), end=(1, 1), line='1 '),
     TokenInfo(type=54 (ERRORTOKEN), string=' ', start=(1, 1), end=(1, 2), line='1 '),
     TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')]
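
A minimal sketch for checking this on other inputs or interpreter builds. The helper names error_tokens and token_names are hypothetical, not part of tokenize. The exact token stream for '1 ' may differ between Python versions, so the sketch reports what it finds rather than assuming the ERRORTOKEN shown above.

```python
from io import BytesIO
from tokenize import tokenize, ERRORTOKEN, tok_name


def error_tokens(source):
    """Return (string, start, end) for every ERRORTOKEN in source.

    Hypothetical helper for illustration; not part of the tokenize module.
    """
    readline = BytesIO(source.encode('utf-8')).readline
    return [(tok.string, tok.start, tok.end)
            for tok in tokenize(readline)
            if tok.type == ERRORTOKEN]


def token_names(source):
    """Return the symbolic name of each token type, in order."""
    readline = BytesIO(source.encode('utf-8')).readline
    return [tok_name[tok.type] for tok in tokenize(readline)]


if __name__ == '__main__':
    # Compare input with and without trailing whitespace.
    for src in ('1', '1 ', '1  '):
        print(repr(src), token_names(src), error_tokens(src))
```

On a build that has the behaviour reported here, error_tokens('1 ') would contain one entry per trailing space. An empty list means the interpreter in use no longer reports them.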

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue12675>
_______________________________________