[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-02-13 Thread Ezio Melotti
Ezio Melotti added the comment: This is now fixed, thanks for the report!
> This should be fixed, and the behavior of _run_check should probably be changed too -- maybe it could test both the char-by-char and the regular feeding.
I created #20623 to track this. -- resolution: -

[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-02-01 Thread Ezio Melotti
Ezio Melotti added the comment: Here's a patch against 2.7. -- keywords: +patch Added file: http://bugs.python.org/file33845/issue20288.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue20288

[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-02-01 Thread Roundup Robot
Roundup Robot added the comment: New changeset 0d50b5851f38 by Ezio Melotti in branch '2.7': #20288: fix handling of invalid numeric charrefs in HTMLParser. http://hg.python.org/cpython/rev/0d50b5851f38 New changeset 32097f193892 by Ezio Melotti in branch '3.3': #20288: fix handling of invalid

[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-01-17 Thread Anders Hammarquist
New submission from Anders Hammarquist: Python 2.7 HTMLParser.py lines 185-199 (similar lines still exist in Python 3.4):

    match = charref.match(rawdata, i)
    if match:
        ...
    else:
        if ";" in rawdata[i:]: #bail by
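To make the quoted branch easier to follow, here is a minimal sketch of which inputs reach the `else` arm. It is not the library's code: the `CHARREF` pattern is an assumption modeled on a numeric character reference matcher, and `classify` is a hypothetical helper that only names the branch taken. What the parser then does in each branch is not modeled here.

```python
import re

# Assumed pattern (illustrative, not copied from the report): "&#" followed
# by decimal digits or "x" plus hex digits, then one terminating character.
CHARREF = re.compile(r"&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]")


def classify(rawdata, i):
    """Name the branch taken for the text starting at rawdata[i].

    Hypothetical helper for illustration only.
    """
    if CHARREF.match(rawdata, i):
        return "charref"  # well-formed numeric reference, e.g. "&#65;"
    if ";" in rawdata[i:]:
        return "bail"     # looks terminated but is not numeric, e.g. "&#bad;"
    return "wait"         # no terminator yet, more input may still arrive


if __name__ == "__main__":
    for sample in ("&#65;", "&#x41;", "&#bad;", "&#ba"):
        print(sample, classify(sample, 0))
```

The point of the sketch is that a reference such as `&#bad;` never matches the numeric pattern, so it always falls into the branch the report is about.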

[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-01-17 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- assignee: - ezio.melotti nosy: +ezio.melotti

[issue20288] HTMLParse handing of non-numeric charrefs broken

2014-01-17 Thread Ezio Melotti
Ezio Melotti added the comment: Thanks for the report, this is indeed a bug. This behavior was covered by a test (see Lib/test/test_htmlparser.py:164), but _run_check feeds the chars one by one to the parser, and in that case it works correctly. While feeding the parser a whole chunk I was
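The difference between the two feeding modes can be checked with a small harness. The sketch below is only an illustration, not the `_run_check` helper from the test suite: it assumes Python 3's `html.parser` module, and `EventCollector` and `collect_events` are hypothetical names. Adjacent data events are merged so that the chunking of text does not affect the comparison.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record parser callbacks as (kind, value) pairs (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False so that charrefs are reported via callbacks.
        super().__init__(convert_charrefs=False)
        self.events = []

    def _add(self, kind, value):
        # Merge adjacent data events: char-by-char feeding splits text runs.
        if kind == "data" and self.events and self.events[-1][0] == "data":
            self.events[-1] = ("data", self.events[-1][1] + value)
        else:
            self.events.append((kind, value))

    def handle_starttag(self, tag, attrs):
        self._add("starttag", tag)

    def handle_endtag(self, tag):
        self._add("endtag", tag)

    def handle_data(self, data):
        self._add("data", data)

    def handle_charref(self, name):
        self._add("charref", name)

    def handle_entityref(self, name):
        self._add("entityref", name)


def collect_events(source, one_char_at_a_time):
    """Feed source either one character at a time or as a single chunk."""
    parser = EventCollector()
    if one_char_at_a_time:
        for ch in source:
            parser.feed(ch)
    else:
        parser.feed(source)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for sample in ("<p>hello</p>", "a &#bad; b"):
        same = collect_events(sample, True) == collect_events(sample, False)
        print(repr(sample), "same events in both modes:", same)
```

A test that runs every input through both modes would catch bugs that only show up when the parser sees a whole chunk at once, which is the gap described in the comment above.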