[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-29 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Merged into release26-maint as r79492. This issue can be closed.

--
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8135
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-18 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

Fixed in r79047. To backport this to release26-maint, we need Barry's 
approval. Barry, any thoughts? The change is a minor improvement: we have 
lived with single-case percent escapes for a long time, and mixed-case support 
would be a bonus in release26.

--
nosy: +barry
resolution: accepted -> fixed




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-15 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

I reviewed the patch:

+_hexdig = '0123456789ABCDEFabcdef'
+_hextochr = dict((a+b, chr(int(a+b,16))) for a in _hexdig for b in _hexdig)

is a really neat way to generate the dict of mixed-case percent escapes for 
unquote to use. I shall commit the patch to trunk.

Yes, I am following the other bug on unquote; we should be able to reach a 
fair conclusion on it and include this logic there.

Thanks.

--
assignee:  -> orsenthil
nosy: +orsenthil
resolution:  -> accepted




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-15 Thread Matt Giuca

Matt Giuca matt.gi...@gmail.com added the comment:

Thanks very much. Importantly, note that unquote is currently duplicated 
between urllib and urlparse. I have a bug on it (#8143) but in the meantime, 
you will have to commit this fix to both modules.




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

New submission from Matt Giuca matt.gi...@gmail.com:

urllib.unquote fails to decode a percent-escape with mixed case. To demonstrate:

>>> unquote("%fc")
'\xfc'
>>> unquote("%FC")
'\xfc'
>>> unquote("%Fc")
'%Fc'
>>> unquote("%fC")
'%fC'

Expected behaviour:

>>> unquote("%Fc")
'\xfc'
>>> unquote("%fC")
'\xfc'

I actually fixed this bug in Python 3, at Guido's request as part of the huge 
fix to issue 3300. To quote Guido:

> # Maps lowercase and uppercase variants (but not mixed case).
> That sounds like a disaster.  Why would %aa and %AA be correct but
> not %aA and %Aa?  (Even though the old code had the same problem.)

(Indeed, RFC 3986 allows mixed-case percent escapes.)
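As a quick check of the Python 3 behaviour mentioned above, the following sketch uses `urllib.parse.unquote`. The `latin-1` encoding is chosen here only so that the single byte 0xFC decodes to one character; that choice is an assumption of this example, not something from the report.

```python
from urllib.parse import unquote

# All four case combinations of the same escape should decode identically.
variants = ['%fc', '%FC', '%Fc', '%fC']
decoded = [unquote(v, encoding='latin-1') for v in variants]
all_same = len(set(decoded)) == 1
```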

I have attached a patch which fixes it by removing the dict that maps all 
lowercase and uppercase variants to characters, and calling int(item[:2], 16) 
instead. It's slower, but correct. This is the same solution we used in Python 3.

I've also backported a number of test cases from Python 3 which cover this 
issue, as well as genuinely malformed percent encodings.

Note: I've also backported the remainder of the 'unquote' test cases from 
Python 3 but I found another bug, so I will report that separately, with a 
patch.

--
components: Library (Lib)
files: urllib-unquote-mixcase.patch
keywords: patch
messages: 101044
nosy: mgiuca
severity: normal
status: open
title: urllib.unquote doesn't decode mixed-case percent escapes
type: behavior
versions: Python 2.6, Python 2.7
Added file: http://bugs.python.org/file16540/urllib-unquote-mixcase.patch




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti
priority:  -> normal
stage:  -> patch review




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

Matt Giuca matt.gi...@gmail.com added the comment:

> Note: I've also backported the remainder of the 'unquote' test cases
> from Python 3 but I found another bug, so I will report that separately,
> with a patch.

Filed under issue #8136.




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

Matt Giuca matt.gi...@gmail.com added the comment:

Oh, I just discovered that urlparse contains a copy of unquote, which will also 
need to be patched. I've submitted a patch to remove the duplicate (#8143) -- 
if that is accepted first then there's no need to worry about it.




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

Matt Giuca matt.gi...@gmail.com added the comment:

I thought more about it, and wrote a different patch which doesn't remove the 
dictionary. I just replaced the dictionary creation code -- now it includes 
keys for all combinations of upper and lower case (for two-letter hex codes). 
This dictionary isn't much bigger -- 484 entries where it previously had 412.
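The two sizes can be reproduced with a short count. This sketch assumes the earlier table held only all-lowercase and all-uppercase pairs, as the comment quoted in the original report suggests; the variable names are illustrative.

```python
digits = '0123456789'
lower = digits + 'abcdef'
upper = digits + 'ABCDEF'
mixed = digits + 'ABCDEFabcdef'

# Old table (assumed): lowercase pairs plus uppercase pairs.
# Digit-only pairs such as '42' appear in both sets and are counted once.
old_keys = {a + b for a in lower for b in lower} | {a + b for a in upper for b in upper}
# New table: every pair drawn from all 22 hex digit characters.
new_keys = {a + b for a in mixed for b in mixed}

old_size = len(old_keys)  # 256 + 256 - 100
new_size = len(new_keys)  # 22 * 22
```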

Therefore, here is a replacement patch (urllib-unquote-mixcase.patch2).

--
Added file: http://bugs.python.org/file16551/urllib-unquote-mixcase.patch2




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

Changes by Matt Giuca matt.gi...@gmail.com:


Removed file: http://bugs.python.org/file16551/urllib-unquote-mixcase.patch2




[issue8135] urllib.unquote doesn't decode mixed-case percent escapes

2010-03-14 Thread Matt Giuca

Matt Giuca matt.gi...@gmail.com added the comment:

Tiny fix to patch2 -- replaced list comprehension with generator expression in 
dictionary construction.
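To illustrate the kind of change described, here is a sketch comparing the two constructions. Both produce the same dict; the generator expression avoids building an intermediate list of 484 tuples. The names below are illustrative and not taken from the patch.

```python
hexdig = '0123456789ABCDEFabcdef'

# List comprehension: builds the full list of pairs first, then the dict.
from_list = dict([(a + b, chr(int(a + b, 16))) for a in hexdig for b in hexdig])

# Generator expression: feeds pairs to dict() one at a time.
from_gen = dict((a + b, chr(int(a + b, 16))) for a in hexdig for b in hexdig)

same = from_list == from_gen
```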

--
Added file: http://bugs.python.org/file16552/urllib-unquote-mixcase.patch2
