Terry J. Reedy added the comment:
The no-encoding issue was mentioned in #12691, but it needed to be opened as a
separate issue, which is this one. The doc, as opposed to the docstring, says
"Converts tokens back into Python source code." Python 3.3 source code is
defined in the reference manual.
Tomasz Maćkowiak added the comment:
Attached is a patch for untokenize, its tests and docs, plus some minor PEP 8
improvements.
The patch should fix unicode output and the handling of some corner cases in
untokenize.
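For illustration, here is a minimal round-trip sketch of the behavior the
patch targets (my own example, not code from the patch; it uses only the
public tokenize API):

import io
import tokenize

# Non-ASCII source to exercise the unicode handling mentioned above.
source = 'x = "zażółć"\n'.encode('utf-8')
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

result = tokenize.untokenize(tokens)
assert isinstance(result, bytes)  # ENCODING token present, so bytes expected

# Documented guarantee: the output tokenizes back to the same tokens.
tokens2 = list(tokenize.tokenize(io.BytesIO(result).readline))
assert ([(t.type, t.string) for t in tokens]
        == [(t.type, t.string) for t in tokens2])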
Changes by Tomasz Maćkowiak kur...@kurazu.net:
keywords: +patch
Added file: http://bugs.python.org/file30838/bug16223.patch
Tomasz Maćkowiak added the comment:
Attached is a corrected patch ('^' and '$' anchors added to the regexps in the tests).
Added file: http://bugs.python.org/file30841/bug16223_2.patch
Tomasz Maćkowiak added the comment:
untokenize also has some other problems, especially when it takes the compat
path: it will skip the first significant token if no ENCODING token is present
in the input. For example, for input like this (code simplified):
tokens = tokenize(b'1 + 2')
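Through the public API, the reported behavior looks roughly like this (my own
reconstruction of the simplified example; the final comment reflects the
report, not a guaranteed output):

import io
import tokenize

source = b'1 + 2\n'
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# Reduce to (type, string) 2-tuples so untokenize() takes its compat path,
# and drop the leading ENCODING token.
pairs = [(tok.type, tok.string) for tok in tokens[1:]]

print(tokenize.untokenize(pairs))
# On affected versions the first significant token ('1') is reportedly
# skipped, so the output is missing the leading '1'.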
Changes by Eric Snow ericsnowcurren...@gmail.com:
assignee:  -> eric.snow
New submission from Eric Snow:
If you pass an iterable of tokens and none of them is an ENCODING token,
tokenize.untokenize() returns a string. This is contrary to what the docs say:
"It returns bytes, encoded using the ENCODING token, which is the first token
sequence output by tokenize()."
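A minimal reproduction sketch, based on the description above:

import io
import tokenize

source = b'1 + 2\n'
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

with_encoding = tokenize.untokenize(tokens)
print(type(with_encoding))     # <class 'bytes'>: ENCODING token present

without_encoding = tokenize.untokenize(tokens[1:])
print(type(without_encoding))  # <class 'str'>, contrary to the documented bytes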