[issue17061] tokenize unconditionally emits NL after comment lines & blank lines

2015-10-04 Thread Martin Panter

Martin Panter added the comment:

The plain Python shell does respond to lines with only a comment and/or 
horizontal space with a continuation prompt. It only treats completely blank 
lines without any horizontal space specially:

>>> 
... # Indented blank line above; completely blank line below:
... 
>>> 

Meador: The documentation already says what you proposed: “NL tokens are 
generated when a logical line of code is continued over multiple physical 
lines”.

Thomas: It sounds like you actually want to differentiate newlines inside 
bracketed expressions from newlines outside of statements. I think this would 
require a new feature.

Also, I noticed that an escaped continued newline doesn’t seem to generate any 
token at all. Not sure if this is a bug or intended, but it does seem 
inconsistent with the other uses of the NL token.

$ ./python -btWall -m tokenize
1 + \
1,0-1,1:NUMBER '1'
1,2-1,3:OP '+'
1
2,0-2,1:NUMBER '1'
2,1-2,2:NEWLINE '\n'
3,0-3,0:ENDMARKER ''
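
The same observation can be reproduced from Python rather than the command line. This is a minimal sketch, assuming a Python 3 interpreter; the helper name token_names is made up for illustration and is not part of tokenize.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names that tokenize produces for source."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# A backslash-continued line: the escaped newline produces no token of its own.
print(token_names("1 + \\\n1\n"))

# For comparison, a newline inside brackets is reported as NL.
print(token_names("(1 +\n1)\n"))
```

Comparing the two lists shows the inconsistency: only the bracketed form contains an NL entry.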

--
nosy: +martin.panter
type: behavior -> enhancement
versions: +Python 3.6 -Python 2.6, Python 2.7, Python 3.2, Python 3.3

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17061] tokenize unconditionally emits NL after comment lines & blank lines

2013-02-13 Thread Thomas Kluyver

Thomas Kluyver added the comment:

Hmm, that's interesting.

For our purposes, a blank line or a comment line shouldn't result in a 
continuation prompt. This is consistent with what the plain Python shell does.

As part of this, we're tokenizing the code, and if the final \n results in a NL 
token (instead of NEWLINE), we wait to build a 'Python line'. (Likewise if the 
final \n doesn't appear before EOFError, indicating that a string continues to 
the next line.) Since tokenize doesn't expose parenlev (parentheses level), my 
modification to tokenize makes this work as we need.

Maybe another way forward would be to make parenlev accessible in some way, so 
that we can use that rather than using NL == parenlev > 0?
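
Until something like parenlev is exposed, a caller can approximate it by counting bracket tokens itself. The following is only a sketch under that assumption: nl_tokens_with_depth, OPENERS and CLOSERS are hypothetical names, not part of tokenize, and the count is derived from OP tokens rather than from the tokenizer's internal state.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


def nl_tokens_with_depth(source):
    """Yield (row, depth) for each NL token in source.

    depth is the number of brackets still open when the NL is seen, so
    depth > 0 means the NL sits inside a bracketed expression.
    """
    depth = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            if tok.string in OPENERS:
                depth += 1
            elif tok.string in CLOSERS:
                depth -= 1
        elif tok.type == tokenize.NL:
            yield tok.start[0], depth


source = "x = (1,\n     2)\n\n# c\n"
for row, depth in nl_tokens_with_depth(source):
    where = "inside brackets" if depth > 0 else "outside any statement"
    print(row, where)
```

With this, an NL after a blank or comment line (depth 0) can be told apart from an NL that continues a bracketed expression (depth above 0).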

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17061
___



[issue17061] tokenize unconditionally emits NL after comment lines & blank lines

2013-02-02 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
nosy: +meador.inge




[issue17061] tokenize unconditionally emits NL after comment lines & blank lines

2013-02-02 Thread Meador Inge

Meador Inge added the comment:

The current behavior seems consistent with the lexical definition for
blank lines [1]:


A logical line that contains only spaces, tabs, formfeeds and possibly a
comment, is ignored (i.e., no NEWLINE token is generated).


NL and COMMENT are used for items that the CPython tokenizer
ignores (and are not really tokens).  Also, the test suite explicitly
tests for this case.
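
A quick way to see this behavior is to list only the line-ending tokens for a few inputs. This is a minimal sketch, assuming Python 3; newline_kinds is a made-up helper name.

```python
import io
import tokenize


def newline_kinds(source):
    """Return the names of the NEWLINE and NL tokens in source, in order."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)
            if tok.type in (tokenize.NEWLINE, tokenize.NL)]


print(newline_kinds("x = 1\n"))              # a complete statement
print(newline_kinds("# comment\n"))          # a comment-only line
print(newline_kinds("\n"))                   # a blank line
print(newline_kinds("x = 1  # trailing\n"))  # a statement with a trailing comment
```

Only the lines that carry a statement end in NEWLINE; the comment-only and blank lines end in NL, which matches the lexical definition quoted above.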

Perhaps the tokenize documentation should be updated
to say something like:


NL tokens are generated when a logical line of code is continued over
multiple physical lines and for blank lines.


[1] http://docs.python.org/3.4/reference/lexical_analysis.html#blank-lines

--
type:  -> behavior




[issue17061] tokenize unconditionally emits NL after comment lines & blank lines

2013-01-28 Thread Thomas Kluyver

New submission from Thomas Kluyver:

The docs describe the NL token as "Token value used to indicate a 
non-terminating newline. The NEWLINE token indicates the end of a logical line 
of Python code; NL tokens are generated when a logical line of code is 
continued over multiple physical lines."

However, after a comment or a blank line, tokenize emits NL, even when it's not 
inside a multi-line statement. For example:

In [15]: for tok in tokenize.generate_tokens(StringIO('#comment\n').readline):
             print(tok)
TokenInfo(type=54 (COMMENT), string='#comment', start=(1, 0), end=(1, 8), line='#comment\n')
TokenInfo(type=55 (NL), string='\n', start=(1, 8), end=(1, 9), line='#comment\n')
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')

This makes it difficult to use tokenize to detect multi-line statements, as we 
want to do in IPython.
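
To make the difficulty concrete, here is a rough sketch of the kind of check described above. It is not IPython's actual implementation; looks_complete is a hypothetical name, and the only assumption is that the last NEWLINE or NL token decides whether more input is needed.

```python
import io
import tokenize


def looks_complete(source):
    """Guess whether source ends a logical line.

    Returns True if the last line-ending token is NEWLINE, and False if it
    is NL or if tokenize gives up because the input ends too early.
    """
    last = None
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type in (tokenize.NEWLINE, tokenize.NL):
                last = tok.type
    except tokenize.TokenError:
        # Raised when EOF arrives inside a multi-line statement or string.
        return False
    return last == tokenize.NEWLINE


print(looks_complete("x = 1\n"))        # complete statement
print(looks_complete("x = (1,\n"))      # open bracket: more input needed
print(looks_complete('s = """abc\n'))   # open string: more input needed
print(looks_complete("# comment\n"))    # comment only: also reported as incomplete
```

The last call is the problematic one: because a comment-only line ends in NL, this naive check treats it the same as an unfinished statement.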

In my tests so far, changing two instances of NL to NEWLINE in this block 
(lines 530 & 533) makes it behave as I expect:
http://hg.python.org/cpython/file/a375c3d88c7e/Lib/tokenize.py#l524

--
messages: 180846
nosy: takluyver
priority: normal
severity: normal
status: open
title: tokenize unconditionally emits NL after comment lines & blank lines
versions: Python 2.6, Python 2.7, Python 3.2, Python 3.3
