[issue2182] tokenize: does not allow CR for a newline

2008-04-08 Thread Jared Grubb

Jared Grubb [EMAIL PROTECTED] added the comment:

Yes, but exec(string) also gives a syntax error for \r\n:

exec('x=1\r\nprint x') 

The only explanation I could find for permitting ONLY \n as a newline in
exec(string) comes from PEP 278: "There is no support for universal
newlines in strings passed to eval() or exec. It is envisioned that such
strings always have the standard \n line feed, if the strings come from
a file that file can be read with universal newlines." (This is why my
original example had to be exec(file) and not just a simple exec(string).)
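
A minimal sketch of what the PEP 278 passage implies for callers: normalize line endings yourself before handing a string to exec(). It is written for Python 3 (this thread concerns Python 2.5), and the helper name normalize_newlines is made up for illustration, not part of any library.

```python
def normalize_newlines(source):
    # Hypothetical helper: convert CRLF and lone CR line endings to LF.
    # CRLF is replaced first so that it does not turn into two line feeds.
    return source.replace("\r\n", "\n").replace("\r", "\n")


# The string mixes CRLF and CR endings; after normalization it only has LF.
namespace = {}
exec(normalize_newlines("x = 1\r\ny = x + 1\r"), namespace)
```

After this runs, namespace holds x and y, and exec() never saw anything but \n line endings.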

Of the three newline types (\n, \r\n and \r), exec allows one (for a
string) or all three (for a file), and tokenize allows exactly two of
them (\n and \r\n). I honestly am not sure what the right way is (or
should be), but either way, the tokenize module is not consistent with
exec.

(By the way, if you're curious why I filed this issue and Issue#2180,
I'm working on the PyPy project to help improve its current Python
lexer/parser. In order to ensure that it is correct and robust, I was
experimenting with corner cases in Python syntax and I found these cases
where tokenize disagrees with exec.)

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2182
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2182] tokenize: does not allow CR for a newline

2008-04-08 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

I recommend that you only care about \n and consider everything else
unspecified.




[issue2182] tokenize: does not allow CR for a newline

2008-04-08 Thread Jared Grubb

Jared Grubb [EMAIL PROTECTED] added the comment:

I actually hadn't thought of that. PyPy should actually use universal
newlines to its advantage; after all, it IS written in Python... Thanks
for the suggestion!

In any case, I wanted to get this standard library bug on record, in
case you wanted to handle it. It is fairly innocuous, so I'll let it go.
Take care.




[issue2182] tokenize: does not allow CR for a newline

2008-04-07 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

I don't think this ought to be changed in exec().  It ought to be done
by opening the file using universal newlines.
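
A rough sketch of this suggestion in Python 3 terms, where text-mode open() with newline=None (the default) translates \r\n and \r to \n on read. The thread itself is about Python 2.5, so this is an illustration rather than the code under discussion; the helper name read_source_universal and the temporary file are assumptions made for the example.

```python
import os
import tempfile


def read_source_universal(path):
    # Hypothetical helper: newline=None enables universal newlines on read,
    # so '\r\n' and '\r' both arrive as '\n'.
    with open(path, "r", newline=None, encoding="utf-8") as f:
        return f.read()


# Write the bytes verbatim (binary mode) so the mixed endings reach the disk.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(b"x = 1\ny = 2\r\nz = 3\r")

source = read_source_universal(path)
os.remove(path)

# The source text now contains only '\n' line endings.
namespace = {}
exec(source, namespace)
```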

--
resolution:  -> rejected
status: open -> closed




[issue2182] tokenize: does not allow CR for a newline

2008-04-07 Thread Jared Grubb

Jared Grubb [EMAIL PROTECTED] added the comment:

This is not a report of a bug in exec(), but rather of a bug in the
tokenize module: the CPython tokenizer and the tokenize module do not
behave consistently. If you look in the tokenize.py
source, it contains code to recognize both \n and \r\n as newlines, but
it ignores the possibility that \r could be the line ending character
(as the Python reference says).
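
One way to sidestep the difference, sketched for Python 3 under the assumption that the caller is free to preprocess the source: convert every line ending to \n before tokenizing, so the tokenize module only ever sees the one newline form it is certain to accept. The function name tokenize_normalized is hypothetical.

```python
import io
import tokenize


def tokenize_normalized(source):
    # Hypothetical helper: normalize CRLF and lone CR to LF, then tokenize.
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")
    readline = io.StringIO(normalized).readline
    return list(tokenize.generate_tokens(readline))


# Three logical lines, each ending with a different newline convention.
tokens = tokenize_normalized("x = 1\ny = 2\r\nz = 3\r")
newline_count = sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)
has_error = any(tok.type == tokenize.ERRORTOKEN for tok in tokens)
```

With the input normalized, each of the three lines ends in a NEWLINE token and no ERRORTOKEN is produced.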




[issue2182] tokenize: does not allow CR for a newline

2008-04-07 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

I still think it shouldn't be tokenize's business to handle this.  I'm
not quite sure how exec() manages to do this; I note that this gives a
syntax error:

exec('x = 1\rprint x\r')




[issue2182] tokenize: does not allow CR for a newline

2008-03-19 Thread Sean Reifschneider

Changes by Sean Reifschneider [EMAIL PROTECTED]:


--
assignee:  -> gvanrossum
components: +Library (Lib) -Extension Modules
nosy: +gvanrossum
priority:  -> normal




[issue2182] tokenize: does not allow CR for a newline

2008-02-24 Thread Jared Grubb

New submission from Jared Grubb:

tokenize recognizes '\n' and '\r\n' as newlines, but does not tolerate '\r':

>>> s = "print 1\nprint 2\r\nprint 3\r"
>>> open('temp.py','w').write(s)
>>> exec(open('temp.py','r'))
1
2
3
>>> tokenize.tokenize(open('temp.py','r').readline)
1,0-1,5:    NAME        'print'
1,6-1,7:    NUMBER      '1'
1,7-1,8:    NEWLINE     '\n'
2,0-2,5:    NAME        'print'
2,6-2,7:    NUMBER      '2'
2,7-2,9:    NEWLINE     '\r\n'
3,0-3,5:    NAME        'print'
3,6-3,7:    NUMBER      '3'
3,7-3,8:    ERRORTOKEN  '\r'
4,0-4,0:    ENDMARKER   ''

--
components: Extension Modules
messages: 62959
nosy: jaredgrubb
severity: minor
status: open
title: tokenize: does not allow CR for a newline
type: behavior
versions: Python 2.5
