In porting Django, I ran into this problem:

Python 3.0a3+ (py3k:61727, Mar 22 2008, 01:44:52)
[GCC 4.2.3 (Debian 4.2.3-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> import urllib
py> urllib.quote(b"/path")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/lib/python3.0/urllib.py", line 1161, in quote
    return ''.join(res)
  File "/tmp/lib/python3.0/urllib.py", line 1126, in __call__
    if ord(c) < 256:
TypeError: ord() expected string of length 1, but int found

The problem here is that the elements of a bytes object are integers, so the quoting algorithm fails when it calls ord() on them.

Is this supposed to work, i.e. should urllib operate on bytes? I think it should: a URL *is* a sequence of bytes, not characters, and to support characters, Python would have to support IRIs (which it currently doesn't). It might be helpful to still accept strings as input to quote, but (until IRIs are implemented) to restrict that to ASCII strings.

I'm skeptical about the entire non-ASCII quoting algorithm: why does it check for characters below 256? It seems to attempt something similar to IRIs for characters at 256 and above, encoding them as UTF-8, but it encodes characters below 256 as if they were Latin-1 ...

Regards,
Martin
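
P.S. To make the suggestion concrete, here is a rough sketch of byte-oriented quoting. It is only an illustration, not how urllib is (or will be) implemented: quote_bytes and SAFE are made-up names, and the safe set (ASCII letters, digits, "_.-" and "/") is an assumption. The sketch works on the integer values that iterating over bytes yields, and accepts str input only when it is pure ASCII.

```python
# Hypothetical helper for illustration; not part of urllib.
# Assumed set of byte values that are left unquoted.
SAFE = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789_.-/"
)


def quote_bytes(data, safe=SAFE):
    """Percent-encode every byte of `data` that is not in `safe`."""
    if isinstance(data, str):
        # Text is accepted only if it is pure ASCII;
        # anything else raises UnicodeEncodeError.
        data = data.encode("ascii")
    # Iterating over bytes yields integers in range(256),
    # so there is no need to call ord() on the elements.
    return "".join(
        chr(b) if b in safe else "%%%02X" % b
        for b in data
    )


if __name__ == "__main__":
    print(quote_bytes(b"/path"))
    print(quote_bytes(b"/p\xe4th"))
    print(quote_bytes("/a b"))
```

With this approach the question of Latin-1 versus UTF-8 never arises inside the quoting function: the caller decides how characters become bytes before quoting.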