Serhiy Storchaka added the comment:
I haven't fixed all the bugs in handling the encoding cookie yet (there are
separate issues). Well, this issue can be closed; I'll open a new issue about
the PEP when it is needed. The PEP should be corrected because it affects how
other Python implementations and
Terry J. Reedy added the comment:
This looks like it could be closed. We normally do not patch PEPs after they
are implemented. Does a corrected version of something in PEP263 need to be
added to the ref manual?
--
components: -IDLE
versions: +Python 3.5 -Python 3.3
Roundup Robot added the comment:
New changeset f16855d6d4e1 by Serhiy Storchaka in branch '2.7':
Remove the use of non-existing re.ASCII.
http://hg.python.org/cpython/rev/f16855d6d4e1
--
___
Python tracker rep...@bugs.python.org
Serhiy Storchaka added the comment:
Thanks, David.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18873
___
___
Python-bugs-list mailing list
Roundup Robot added the comment:
New changeset 2dfe8262093c by Serhiy Storchaka in branch '3.3':
Issue #18873: The tokenize module, IDLE, 2to3, and the findnocoding.py script
http://hg.python.org/cpython/rev/2dfe8262093c
New changeset 6b747ad4a99a by Serhiy Storchaka in branch 'default':
Issue
Serhiy Storchaka added the comment:
> If there is not now, it would be nice if there were just one python-coded
> function in Lib/tokenize.py that could be imported and used by the other
> python code.
Agreed. But look at how many tokenize issues are open.
Thank you for your report, Paul.
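As a rough illustration of the single shared entry point discussed above, the sketch below calls tokenize.detect_encoding(), the standard-library function that takes a readline callable and returns the detected encoding together with the lines it consumed. The helper name sniff_encoding and the sample inputs are assumptions made for this example, and the behaviour described assumes a Python 3 release that already contains the fix for this issue.

```python
import io
import tokenize


def sniff_encoding(source_bytes):
    """Hypothetical helper: return the encoding tokenize detects for a bytes source."""
    # detect_encoding() wants a readline callable that yields bytes lines.
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# A plain assignment named "coding" is not a comment, so no cookie is found
# and the default encoding is reported.
print(sniff_encoding(b"coding=0\n"))

# A real cookie in a comment on the first line is honoured.
print(sniff_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
```

Other tools could call one helper like this instead of carrying their own copy of the cookie regex, which is the consolidation suggested in the quoted comment.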
Terry J. Reedy added the comment:
One of the problems with encoding recognition is that the same logic is
more or less reproduced in multiple places, so any fix needs to be applied
in multiple places. From the detect_encoding_in_comments_only.patch:
Lib/idlelib/IOBinding.py
R. David Murray added the comment:
This appears to be resulting in buildbot lib2to3 test failures. ex:
http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%202.7/builds/2319/steps/test/logs/stdio
Serhiy Storchaka added the comment:
The tokenize module, 2to3, IDLE, and the Tools/scripts/findnocoding.py script
are affected by this bug. The proposed patch fixes this in all of these places
and adds tests for tokenize and 2to3.
--
components: +Demos and Tools, IDLE, Library (Lib)
nosy:
Serhiy Storchaka added the comment:
And here is a patch which fixes the regular expression in PEP 263.
--
Added file: http://bugs.python.org/file31646/pep0263_regex.diff
Changes by Serhiy Storchaka storch...@gmail.com:
--
assignee: - serhiy.storchaka
Terry J. Reedy added the comment:
Nasty bug. Running a file with 'coding=0', a quite legitimate assignment
statement, causes Idle to close, with LookupError, leading to SyntaxError,
reported on the console if there is one ('crash' otherwise). (Idle closing is a
separate problem, with an
Serhiy Storchaka added the comment:
> The code patch adds '^[ \t\f]' to the re. \f = FormFeed? Should that really
> be there? The PEP patch instead adds '^[ \t\v]', \v= VerticalTab? Same
> question, and why the difference?
Good catch. I missed that in the PEP patch; it should be '\f' ('\014') in all
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file31646/pep0263_regex.diff
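To make the '^[ \t\f]' versus '^[ \t\v]' question above concrete, here is a minimal sketch of a line-anchored cookie pattern. The full pattern and the find_cookie helper are assumptions for illustration, not the text of either patch; only the leading character class comes from the discussion.

```python
import re

# Assumed pattern: only spaces, tabs, or form feeds may precede the '#'.
anchored_cookie_re = re.compile(r"^[ \t\f]*#.*coding[:=]\s*([-\w.]+)")


def find_cookie(line):
    """Hypothetical helper: return the declared encoding name, or None."""
    m = anchored_cookie_re.match(line)
    return m.group(1) if m else None


print(find_cookie("\f# coding: utf-8"))        # form feed before the comment
print(find_cookie("\v# coding: utf-8"))        # vertical tab is not in the class
print(find_cookie("coding=0"))                 # plain assignment, no comment
print(find_cookie("x = 1  # coding: utf-8"))   # code precedes the comment
```

With this character class, a form feed before the comment is accepted while a vertical tab is not, which is why the two patches needed to agree.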
Sergey Vishnikin added the comment:
-cookie_re = re.compile(r"coding[:=]\s*([-\w.]+)")
+cookie_re = re.compile(r"#[^\r\n]*coding[:=]\s*([-\w.]+)")
With this change the regex matches only if the encoding declaration is preceded
by a '#' comment character on the same line.
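The effect of the change above can be checked with a small sketch. The two patterns are taken from the diff; the names old_cookie_re, new_cookie_re, and cookies, as well as the sample lines, are assumptions introduced for the example.

```python
import re

old_cookie_re = re.compile(r"coding[:=]\s*([-\w.]+)")
new_cookie_re = re.compile(r"#[^\r\n]*coding[:=]\s*([-\w.]+)")


def cookies(pattern, line):
    """Hypothetical helper: return every encoding name the pattern captures in line."""
    return pattern.findall(line)


# The old pattern treats an ordinary assignment as an encoding cookie.
print(cookies(old_cookie_re, "coding=0"))

# The new pattern requires a '#' earlier on the same line.
print(cookies(new_cookie_re, "coding=0"))
print(cookies(new_cookie_re, "# -*- coding: utf-8 -*-"))
```

Because the new pattern is not anchored to the start of the line, it only requires that a '#' appears somewhere before the declaration.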
--
keywords: +patch
nosy: +armicron
Added file:
Serhiy Storchaka added the comment:
It will fail on:
#coding=0
I'm wondering why findall() is used to match this regexp.
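On the findall() question above, a brief sketch of the difference, using the original pattern from the report and a contrived sample line (both the line and the variable names are assumptions): findall() collects every capture on the line, while search() stops at the first one, which is all a cookie check needs.

```python
import re

cookie_re = re.compile(r"coding[:=]\s*([-\w.]+)")
line = "# coding: utf-8 coding: latin-1"  # contrived line with two declarations

# findall() returns a list with one entry per match.
all_hits = cookie_re.findall(line)

# search() returns only the first match (or None).
first = cookie_re.search(line)
first_hit = first.group(1) if first else None

print(all_hits)
print(first_hit)
```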
--
nosy: +serhiy.storchaka
Changes by Serhiy Storchaka storch...@gmail.com:
--
keywords: +easy
nosy: +benjamin.peterson
stage: - needs patch
type: crash - behavior
versions: +Python 3.4 -Python 3.1, Python 3.2
New submission from Paul Bonser:
lib2to3.pgen2.tokenize:detect_encoding looks for the regex
coding[:=]\s*([-\w.]+) in the first two lines of the file without first
checking if they are comment lines.
You can get 2to3 to fail with SyntaxError: unknown encoding: 0 with a single
line file: