I'm trying to open

http://пример.испытание

with

urllib2.urlopen(s1)

in Python 2.7 on Windows 7. This produces a Unicode exception:

>>> s1
u'http://\u043f\u0440\u0438\u043c\u0435\u0440.\u0438\u0441\u043f\u044b\u0442\u0430\u043d\u0438\u0435'
>>> fd = urllib2.urlopen(s1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\python27\lib\urllib2.py", line 394, in open
    response = self._open(req, data)
  File "C:\python27\lib\urllib2.py", line 412, in _open
    '_open', req)
  File "C:\python27\lib\urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "C:\python27\lib\urllib2.py", line 1199, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "C:\python27\lib\urllib2.py", line 1168, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "C:\python27\lib\httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "C:\python27\lib\httplib.py", line 988, in _send_request
    self.putheader(hdr, value)
  File "C:\python27\lib\httplib.py", line 935, in putheader
    hdr = '%s: %s' % (header, '\r\n\t'.join([str(v) for v in values]))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)
>>>

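The traceback shows httplib's putheader() calling str() on a header value taken from the URL (presumably the Host header), which fails because the host name is not ASCII. Why isn't "urllib2" handling that?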

What does "urllib2" want? Percent escapes? Punycode?
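
For reference, here is a minimal sketch of the combination I would guess at: Punycode for the host name and percent escapes for everything after it. It assumes an absolute http URL without user:password info. The helper name iri_to_ascii_url is made up for illustration; it is not part of urllib2. The built-in 'idna' codec implements IDNA 2003, so newer IDNA 2008 rules are not covered.

```python
# -*- coding: utf-8 -*-
try:
    from urlparse import urlsplit, urlunsplit              # Python 2
    from urllib import quote
except ImportError:
    from urllib.parse import urlsplit, urlunsplit, quote   # Python 3


def iri_to_ascii_url(iri):
    """Hypothetical helper: turn a Unicode URL into an ASCII-only one."""
    parts = urlsplit(iri)
    # Host labels go through the built-in 'idna' codec (Punycode, "xn--...").
    host = parts.hostname.encode('idna').decode('ascii')
    if parts.port is not None:
        host = '%s:%d' % (host, parts.port)
    # Path, query and fragment are percent-escaped as UTF-8 bytes;
    # '%' is kept safe so existing escapes are not escaped twice.
    path = quote(parts.path.encode('utf-8'), safe='/%')
    query = quote(parts.query.encode('utf-8'), safe='=&%')
    fragment = quote(parts.fragment.encode('utf-8'), safe='%')
    # str() yields a plain byte string on Python 2 and is a no-op on Python 3.
    return str(urlunsplit((parts.scheme, host, path, query, fragment)))


# Intended use (not executed here, since it needs network access):
#     fd = urllib2.urlopen(iri_to_ascii_url(s1))
```

The result contains only ASCII characters, so the str() call in putheader() should no longer raise. Whether a given server accepts the escaped path is a separate question.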

                                John Nagle
