[issue8135] urllib.unquote doesn't decode mixed-case percent escapes
Senthil Kumaran orsent...@gmail.com added the comment:

Merged into release26-maint as r79492. This issue can be closed.

--
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8135
___

___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Senthil Kumaran orsent...@gmail.com added the comment:

Fixed this in r79047. If we are to backport this to release26-maint, we need Barry's approval. Barry, any thoughts? The change is a minor improvement: we have lived with single-case percent escapes for a long time, so mixed-case support would be a bonus in release26.

--
nosy: +barry
resolution: accepted -> fixed

Senthil Kumaran orsent...@gmail.com added the comment:

I reviewed the patch:

+_hexdig = '0123456789ABCDEFabcdef'
+_hextochr = dict((a+b, chr(int(a+b,16))) for a in _hexdig for b in _hexdig)

This is a really neat way to generate the dict of mixed-case percent escapes for unquote to use. I shall commit the patch to trunk.

Yes, I am following the other bug on unquote; we should be able to reach a fair conclusion on it and include this logic there. Thanks.

--
assignee: -> orsenthil
nosy: +orsenthil
resolution: -> accepted
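
For readers following along, here is a minimal sketch of how a table like the one in the reviewed patch could drive decoding. The two table lines mirror the patch; the function name `unquote_with_table` and its loop are illustrative assumptions, not the committed urllib code.

```python
# Table construction as quoted from the patch: every two-character
# combination of the 22 hex digits (upper and lower case) maps to a character.
_hexdig = '0123456789ABCDEFabcdef'
_hextochr = dict((a + b, chr(int(a + b, 16))) for a in _hexdig for b in _hexdig)


def unquote_with_table(s):
    """Hypothetical helper: decode %XX escapes via the lookup table."""
    parts = s.split('%')
    result = [parts[0]]
    for item in parts[1:]:
        try:
            # The first two characters after '%' are the candidate escape.
            result.append(_hextochr[item[:2]] + item[2:])
        except KeyError:
            # Not a valid two-digit hex escape: leave the text untouched.
            result.append('%' + item)
    return ''.join(result)
```

Because "Fc" and "fC" are now keys in the table, mixed-case escapes decode the same way as "fc" and "FC", while malformed escapes such as "%zz" pass through unchanged.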

Matt Giuca matt.gi...@gmail.com added the comment:

Thanks very much. Importantly, note that unquote is currently duplicated between urllib and urlparse. I have filed a bug on it (#8143), but in the meantime you will have to commit this fix to both modules.

New submission from Matt Giuca matt.gi...@gmail.com:

urllib.unquote fails to decode a percent-escape with mixed case. To demonstrate:

>>> unquote("%fc")
'\xfc'
>>> unquote("%FC")
'\xfc'
>>> unquote("%Fc")
'%Fc'
>>> unquote("%fC")
'%fC'

Expected behaviour:

>>> unquote("%Fc")
'\xfc'
>>> unquote("%fC")
'\xfc'

I actually fixed this bug in Python 3, at Guido's request, as part of the huge fix to issue 3300. To quote Guido:

> > # Maps lowercase and uppercase variants (but not mixed case).
> That sounds like a disaster. Why would %aa and %AA be correct but not %aA and %Aa? (Even though the old code had the same problem.)

(Indeed, RFC 3986 allows mixed-case percent escapes.)

I have attached a patch which fixes it by removing the dict that maps lowercase and uppercase variants to characters and calling int(item[:2], 16) instead. It's slower, but correct. This is the same solution we used in Python 3.

I've also backported a number of test cases from Python 3 which cover this issue, as well as legitimately bad percent encoding.

Note: I've also backported the remainder of the 'unquote' test cases from Python 3 but I found another bug, so I will report that separately, with a patch.

--
components: Library (Lib)
files: urllib-unquote-mixcase.patch
keywords: patch
messages: 101044
nosy: mgiuca
severity: normal
status: open
title: urllib.unquote doesn't decode mixed-case percent escapes
type: behavior
versions: Python 2.6, Python 2.7
Added file: http://bugs.python.org/file16540/urllib-unquote-mixcase.patch
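
As a rough illustration of the table-free approach described in this submission, the sketch below parses each escape directly with int(..., 16). The function name `unquote_direct` and the explicit hex-digit check are assumptions added here, not taken from the attached patch. The check is included because int() on its own also accepts signs and surrounding whitespace, which are not valid in a percent escape.

```python
import string


def unquote_direct(s):
    """Hypothetical helper: decode %XX escapes without a lookup table."""
    parts = s.split('%')
    result = [parts[0]]
    for item in parts[1:]:
        pair = item[:2]
        # Assumption: accept only two characters that are both hex digits.
        # string.hexdigits contains upper and lower case, so the case of
        # each digit is irrelevant.
        if len(pair) == 2 and all(c in string.hexdigits for c in pair):
            result.append(chr(int(pair, 16)) + item[2:])
        else:
            # Malformed escape: keep the original text, including the '%'.
            result.append('%' + item)
    return ''.join(result)
```

Parsing on every call avoids maintaining a table at all, at the cost of the extra per-escape work the submission mentions.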

Changes by Ezio Melotti ezio.melo...@gmail.com:

--
nosy: +ezio.melotti
priority: -> normal
stage: -> patch review

Matt Giuca matt.gi...@gmail.com added the comment:

> Note: I've also backported the remainder of the 'unquote' test cases from Python 3 but I found another bug, so I will report that separately, with a patch.

Filed under issue #8136.

Matt Giuca matt.gi...@gmail.com added the comment:

Oh, I just discovered that urlparse contains a copy of unquote, which will also need to be patched. I've submitted a patch to remove the duplicate (#8143) -- if that is accepted first then there's no need to worry about it.

Matt Giuca matt.gi...@gmail.com added the comment:

I thought more about it and wrote a different patch which doesn't remove the dictionary. I just replaced the dictionary creation code: it now includes keys for all combinations of upper and lower case (for two-letter hex codes). This dictionary isn't much bigger: 484 entries where it previously had 412.

Therefore, here is a replacement patch (urllib-unquote-mixcase.patch2).

--
Added file: http://bugs.python.org/file16551/urllib-unquote-mixcase.patch2
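
The entry counts quoted in this message can be checked with a short sketch. The construction of `old_table` is an assumption about how the previous table was built (one lowercase and one uppercase key per byte value); `new_table` follows the expression quoted elsewhere in this thread. The variable names are illustrative.

```python
# Assumed reconstruction of the old table: lowercase and uppercase keys only.
old_table = dict(('%02x' % i, chr(i)) for i in range(256))
old_table.update(('%02X' % i, chr(i)) for i in range(256))

# New table: every pairing of the 22 hex digit characters, so 22 * 22 keys.
hexdig = '0123456789ABCDEFabcdef'
new_table = dict((a + b, chr(int(a + b, 16))) for a in hexdig for b in hexdig)

# Keys gained by the new construction, such as 'Fc' and 'fC'.
added = sorted(set(new_table) - set(old_table))

print(len(old_table), len(new_table), len(added))  # 412 484 72
```

Under that assumption the old table has 412 keys rather than 512, because the 100 digit-only pairs ("00" to "99") are identical in both cases. The 72 added keys are exactly the pairs made of two letters that differ in case.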

Changes by Matt Giuca matt.gi...@gmail.com:

Removed file: http://bugs.python.org/file16551/urllib-unquote-mixcase.patch2

Matt Giuca matt.gi...@gmail.com added the comment:

Tiny fix to patch2: replaced the list comprehension with a generator expression in the dictionary construction.

--
Added file: http://bugs.python.org/file16552/urllib-unquote-mixcase.patch2