Dingyuan Wang added the comment:
Sorry for the inconvenience. I failed to find this old bug.
I think there is another problem. The docs for `untokenize` say: "The iterable
must return sequences with **at least** two elements, the token type and the
token string. Any additional sequence elements are ignored." So if I feed in,
say, a 3-tuple, `untokenize` should accept it and treat it as tok[:2].
The attached patch should address the problems above.
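To illustrate the documented contract quoted above, here is a minimal sketch
that truncates longer token sequences to tok[:2] before passing them to
`untokenize`. The helper name `untokenize_pairs` is hypothetical and is not
taken from the attached patch; it only shows the behaviour the docs describe.

```python
import io
import tokenize


def untokenize_pairs(tokens):
    """Hypothetical helper: keep only (type, string) from each token.

    Extra elements are dropped by hand, which is what the docs say
    untokenize should do with them.
    """
    return tokenize.untokenize(tok[:2] for tok in tokens)


source = "x = 1 + 2\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Build 3-tuples: token type, token string, plus one extra element.
triples = [(t.type, t.string, t.start) for t in tokens]

result = untokenize_pairs(triples)

# Spacing may differ from the source, but the (type, string) pairs
# should survive a round trip through the tokenizer.
roundtrip = [
    (t.type, t.string)
    for t in tokenize.generate_tokens(io.StringIO(result).readline)
]
print(repr(result))
print(roundtrip == [tok[:2] for tok in triples])
```

With only two elements per token, `untokenize` has no position information,
so the spacing of the result is not expected to match the original source.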
While preparing the patch, I found a bug in tokenize. Consider the newly
attached tab.py: the tabs between comments and code, and the tabs between
expressions, are lost during tokenization, so when untokenizing, position
information is used to produce an equivalent number of spaces instead of tabs.
Despite the tokenization problem, the patch produces syntactically correct
code and reproduces the original as accurately as it can.
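The tab loss described above can be observed with a small sketch like the one
below. The one-line sample source is made up for illustration and is much
simpler than tab.py. The exact whitespace in the rebuilt text may depend on
the Python version, so only the token contents are checked here.

```python
import io
import tokenize

# Tabs separate the expression parts and the trailing comment.
source = "x =\t1\t# note\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# No token string carries the separating tabs; only the start and end
# positions record that there was a gap between tokens.
for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)

# Full 5-tuples let untokenize rebuild the gaps from position information.
rebuilt = tokenize.untokenize(tokens)
print(repr(rebuilt))


def pairs(text):
    """Return the (type, string) pairs produced by tokenizing text."""
    return [
        (t.type, t.string)
        for t in tokenize.generate_tokens(io.StringIO(text).readline)
    ]


# The rebuilt source tokenizes to the same (type, string) pairs even if
# its whitespace differs from the original.
print(pairs(rebuilt) == pairs(source))
```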
PEP 8 recommends spaces for indentation, but the use of tabs should not
be ignored.
New tab.py (as a Python string):
'#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\ndef foo():\n\t"""\n\tTests tabs in tokenization\n\t\tfoo\n\t"""\n\tpass\n\tpass\n\tif 1:\n\t\t# not indent correctly\n\t\tpass\n\t\t# correct\ttab\n\t\tpass\n\tpass\n\tbaaz = {\'a\ttab\':\t1,\n\t\t\t\'b\': 2}\t\t# also fails\n\npass\n#if 2:\n\t#pass\n#pass\n'
----------
keywords: +patch
nosy: +gumblex
Added file: http://bugs.python.org/file39748/tokenize.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue20387>
_______________________________________