New submission from Serhiy Storchaka:

Here is a preliminary patch that refactors the lowest level of the Python
tokenizer: reading and decoding. It splits the code into smaller, simpler
functions, decreases the source size by 37 lines, and fixes bugs: issue14811,
issue18961, and a number of others. Tests are added for most of the fixed bugs
(except leaks and others that are hard to reproduce). Fixing the remaining
bugs may be harder, especially the issues with null bytes (issue1105770,
issue20115).

Many bugs could be fixed easily by reading the whole Python file into memory
instead of reading it line by line. I don't know whether that is acceptable.
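
To illustrate the whole-file idea, here is a minimal Python sketch. It is not
the C tokenizer code touched by the patch, and `read_source` is a hypothetical
helper name used only for this example. It assumes the source is already
available as a single bytes buffer, detects the encoding with the standard
`tokenize.detect_encoding`, decodes everything at once, and then looks for a
null byte in the decoded text.

```python
import io
import tokenize


def read_source(data: bytes) -> str:
    """Decode a whole Python source buffer at once (hypothetical helper)."""
    # detect_encoding() inspects the BOM and the PEP 263 coding cookie in
    # the first two lines and returns the encoding name it found.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    text = data.decode(encoding)
    # With the full text in memory, a null byte is an ordinary character
    # that can be located directly.
    pos = text.find("\0")
    if pos != -1:
        raise ValueError(f"null byte at offset {pos}")
    return text
```

Because the buffer is decoded in one step, there is no per-line state to keep
in sync between reading and decoding. The cost is that the entire file is held
in memory.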

----------
assignee: serhiy.storchaka
components: Interpreter Core
files: tokenize_input.patch
keywords: patch
messages: 254778
nosy: serhiy.storchaka
priority: normal
severity: normal
status: open
title: Python tokenizer rewriting
type: behavior
versions: Python 3.6
Added file: http://bugs.python.org/file41058/tokenize_input.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25643>
_______________________________________