[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-11-18 Thread Jason R. Coombs
Changes by Jason R. Coombs: -- status: open -> closed

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-11-18 Thread Martin Panter
Martin Panter added the comment: It seems the problem with tabs in the indentation is fixed. Can we close this? For the problem with tabs between tokens in a line, I think that would be better handled in a separate report. But I suspect it is at worst a documentation problem. You can’t guarantee

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-28 Thread Jason R. Coombs
Jason R. Coombs added the comment: For the sake of expediency, I've gone ahead and backported and pushed the fix to 2.7. Please back out the changes if appropriate. -- resolution: -> fixed

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-28 Thread Roundup Robot
Roundup Robot added the comment: New changeset 524a0e755797 by Jason R. Coombs in branch '2.7': Issue #20387: Backport test from Python 3.4 https://hg.python.org/cpython/rev/524a0e755797 New changeset cb9df1ae287b by Jason R. Coombs in branch '2.7': Issue #20387: Backport fix from Python 3.4 htt

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-28 Thread Jason R. Coombs
Jason R. Coombs added the comment: Benjamin, any objections to a backport of this patch? -- nosy: +benjamin.peterson

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-28 Thread Jason R. Coombs
Jason R. Coombs added the comment: Patch and test applied to 3.4+. I'm inclined to backport this to Python 2.7, as that was where I encountered it originally. -- versions: +Python 3.5 -Python 3.3

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-28 Thread Roundup Robot
Roundup Robot added the comment: New changeset b784c842a63c by Jason R. Coombs in branch '3.4': Issue #20387: Add test capturing failure to roundtrip indented code in tokenize module. https://hg.python.org/cpython/rev/b784c842a63c New changeset 49323e5f6391 by Jason R. Coombs in branch '3.4': I

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-26 Thread Dingyuan Wang
Dingyuan Wang added the comment: I mean the patch only restores tabs in indentation. The reports above should be corrected. Tabs between tokens and other such cases can't be restored exactly given only the token stream. This won't affect the syntax. I wonder if it's also a bug or a wont-fi

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-25 Thread Jason R. Coombs
Jason R. Coombs added the comment: @gumblex, I've applied your updated patch (though I couldn't figure out why it wouldn't apply mechanically; I had to paste it). I also corrected the test. Thanks for the advice on that. I don't understand the second half of your message. Are you stating cavea

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-21 Thread Dingyuan Wang
Dingyuan Wang added the comment: The new patch should now pass all tests correctly. The main idea is:
* if the token is INDENT, push it on the `indents` stack and continue
* if a new line starts, AND the position of the first token >= the length of the last indent level, we assume the indent is
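
The idea in the comment above can be sketched as follows. This is a simplified illustration and not the committed patch: `rebuild` is a hypothetical helper, the second rule is cut off in this archive so the branch that writes the saved indent is an assumption about how it continues, and backslash continuations are not handled.

```python
import io
from tokenize import generate_tokens, INDENT, DEDENT, NEWLINE, NL, ENDMARKER


def rebuild(source):
    """Rebuild source text from its tokens, reusing the saved indent strings.

    Hypothetical helper for illustration only; it is not tokenize.untokenize.
    """
    out = []
    indents = []          # stack of indent strings taken from INDENT tokens
    at_line_start = True  # True until the first real token of a line is seen
    col = 0               # column already written on the current line
    readline = io.StringIO(source).readline
    for tok_type, text, start, end, _line in generate_tokens(readline):
        if tok_type == ENDMARKER:
            break
        if tok_type == INDENT:
            # Rule 1: remember the indent string and emit nothing yet.
            indents.append(text)
            continue
        if tok_type == DEDENT:
            indents.pop()
            continue
        if at_line_start:
            # Rule 2 (assumed continuation): if the first token of the line
            # starts at or beyond the saved indent, write that indent as-is.
            if (indents and tok_type not in (NEWLINE, NL)
                    and start[1] >= len(indents[-1])):
                out.append(indents[-1])
                col = len(indents[-1])
            at_line_start = False
        # Gaps between tokens can only be filled with spaces.
        out.append(" " * (start[1] - col))
        out.append(text)
        col = end[1]
        if tok_type in (NEWLINE, NL):
            at_line_start = True
            col = 0
    return "".join(out)


tabbed = "if False:\n\tx=3\n\ty=3\n"
print(repr(rebuild(tabbed)))
print(repr(rebuild("x\t=\t3\n")))
```

In this sketch the tab indentation survives because the INDENT token carries the original string, while tabs between tokens are replaced by spaces because only column positions are known there.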

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-20 Thread Jason R. Coombs
Jason R. Coombs added the comment: I've committed the patch without the change for "at least two elements" as https://bitbucket.org/jaraco/cpython-issue20387/commits/b7fe3c865b8dbdb33d26f4bc5cbb6096f5445fb2. The patch corrects the new test, demonstrating its effectiveness, but yields two new t

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-20 Thread Jason R. Coombs
Jason R. Coombs added the comment: I've created a repo clone and have added a version of Terry's test to it and will now test Dingyuan's patch. -- assignee: -> jason.coombs hgrepos: +313

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-20 Thread Jason R. Coombs
Jason R. Coombs added the comment: @gumblex: This is a good start. It certainly provides a candidate implementation. First, can I suggest that you remove the changes pertinent to the "at least two elements" and address that in a separate ticket/discussion? Second, any patch will necessarily n

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-20 Thread Dingyuan Wang
Dingyuan Wang added the comment: Sorry for the inconvenience. I failed to find this old bug. I think there is another problem. The docs for `untokenize` say "The iterable must return sequences with **at least** two elements, the token type and the token string. Any additional sequence elements

[issue20387] tokenize/untokenize roundtrip fails with tabs

2015-06-19 Thread Terry J. Reedy
Terry J. Reedy added the comment: #24447, closed as duplicate, has another example.

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-02-02 Thread Terry J. Reedy
Terry J. Reedy added the comment: The untokenize docstring has a stronger guarantee, and in the direction you were claiming. "Round-trip invariant for full input: Untokenized source will match input source exactly". For this to be true, the indent strings must be saved and not replaced by spac
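
As a small illustration of the point about saving indent strings, the snippet below lists the strings carried by INDENT tokens for the input used in this report. The variable names are only for this example.

```python
import io
from tokenize import generate_tokens, INDENT

source = "if False:\n\tx=3\n\ty=3\n"

# Collect the string attached to every INDENT token in the stream.
indent_strings = [
    tok.string
    for tok in generate_tokens(io.StringIO(source).readline)
    if tok.type == INDENT
]
print(indent_strings)
```

The tokenizer reports the tab itself as the INDENT string, so the information needed to reproduce the indentation is present in the token stream.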

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-02-02 Thread Terry J. Reedy
Terry J. Reedy added the comment: I think the problem is with untokenize.

    s = b"if False:\n\tx=3\n\ty=3\n"
    t = tokenize(io.BytesIO(s).readline)
    for i in t: print(i)

produces a token stream that seems correct.

    TokenInfo(type=56 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line='')
    Token

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-02-02 Thread Terry J. Reedy
Terry J. Reedy added the comment: I read the manual more carefully and noticed that the guarantee is that tokenizing the result of untokenize matches the input to untokenize. "The result is guaranteed to tokenize back to match the input so that the conversion is lossless and round-trips are a
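
A minimal sketch of checking the guarantee as read from the manual here: feed `untokenize` only (type, string) pairs, tokenize its output again, and compare the pairs. `type_and_string` is a helper name made up for this example.

```python
import io
from tokenize import generate_tokens, untokenize


def type_and_string(source):
    """Return the (type, string) pair of every token in source."""
    readline = io.StringIO(source).readline
    return [(tok.type, tok.string) for tok in generate_tokens(readline)]


source = "if False:\n\tx=3\n\ty=3\n"
pairs = type_and_string(source)

# With two-element tokens, untokenize has no column information,
# so the spacing of its output may differ from the original text.
regenerated = untokenize(pairs)

guarantee_holds = type_and_string(regenerated) == pairs
print(repr(regenerated))
print(guarantee_holds)
```

This compares token types and strings only, which is a weaker check than comparing the source text character by character.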

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-02-01 Thread Terry J. Reedy
Terry J. Reedy added the comment: Whitespace equivalence is explicitly disclaimed. "The guarantee applies only to the token type and token string as the spacing between tokens (column positions) may change." The assert is not a valid test. I think you should close this. (Note that there are se

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-01-24 Thread Arfrever Frehtes Taifersar Arahesis
Changes by Arfrever Frehtes Taifersar Arahesis: -- nosy: +Arfrever

[issue20387] tokenize/untokenize roundtrip fails with tabs

2014-01-24 Thread Jason R. Coombs
New submission from Jason R. Coombs: Consider this simple unit test:

    def test_tokenize():
        input = "if False:\n\tx=3\n\ty=3\n"
        g = list(generate_tokens(io.StringIO(input).readline))
        assert untokenize(g) == input

According to the docs, untokenize guarantees the output equals the input
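
A self-contained version of the test above, with the imports it needs, might look like the following. It reports whether the text round-trips exactly instead of asserting it, since the result depends on whether the interpreter includes the fix discussed in this thread. The variable names are chosen for this example.

```python
import io
from tokenize import generate_tokens, untokenize

source = "if False:\n\tx=3\n\ty=3\n"

# Full five-element tokens, so untokenize can use the column positions.
tokens = list(generate_tokens(io.StringIO(source).readline))
result = untokenize(tokens)

roundtrip_exact = (result == source)
print(repr(source))
print(repr(result))
print("exact round trip:", roundtrip_exact)
```

Printing both strings with `repr` makes it easy to see whether the tab indentation was replaced by spaces.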