Vajrasky Kok added the comment:

Lars, I see.

For the uninitiated, the issue is that the original URL (containing only ASCII 
characters) redirects to a URL containing non-ASCII characters, which upsets 
urllib.

To handle that situation, you can do something like this:
---------------------
import urllib.request
from urllib.parse import quote

url = "http://www.libon.it/libon/search/isbn/3499155443"
req = urllib.request.Request(url)
# Percent-encode any unsafe characters in the request path before sending.
req.selector = quote(req.selector)
response = urllib.request.urlopen(req, timeout=30)
the_page = response.read().decode('utf-8')
print(the_page)
---------------------
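
A variation on the same idea, as a rough sketch rather than a tested fix: 
percent-encode the redirect target itself. It assumes the Location header 
reaches us decoded as ISO-8859-1, so encoding it the same way recovers the 
bytes the server sent. The helper name quote_redirect_url and the example.com 
URL are made up for illustration.

```python
import string
from urllib.parse import quote


def quote_redirect_url(location):
    """Percent-encode non-ASCII characters in a redirect target.

    Assumption: `location` was decoded as ISO-8859-1, so encoding it the
    same way recovers the original bytes.
    """
    # Keep ASCII punctuation (":", "/", "?", "%", ...) untouched so the URL
    # structure and any existing percent-escapes survive.
    return quote(location, encoding="iso-8859-1", safe=string.punctuation)


# Simulate a UTF-8 Location header that was decoded as ISO-8859-1.
raw = "http://example.com/Sudan-Käfer".encode("utf-8").decode("iso-8859-1")
print(quote_redirect_url(raw))
```

A URL that is already pure ASCII passes through unchanged, so the helper 
should be safe to apply to every redirect target.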

I admit that this code is clunky and not Pythonic.

I also believe the Python standard library should have a module for accessing 
URLs containing non-ASCII characters in an easy manner.

At the very least, maybe we can give a proper error message. Something like this 
would be nice:

"The URL is not valid and contains non-ASCII characters: 
http://www.libon.it/ricerca/7817940/3499155443/dettaglio/3102314/Onkel-Oswald-und-der-Sudan-Käfer/order/date_desc.
This URL was redirected from: 
http://www.libon.it/libon/search/isbn/3499155443"

Otherwise users can be confused: they believe they gave urllib an ASCII-only 
URL (http://www.libon.it/libon/search/isbn/3499155443), so why did they get an 
encoding error?
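
To make the suggestion concrete, here is a minimal sketch of a check that 
could produce a message along those lines. This is not how urllib behaves 
today; the function name check_redirect_target and the example.com URLs are 
hypothetical.

```python
def check_redirect_target(new_url, original_url):
    """Return new_url if it is pure ASCII, else raise a descriptive error."""
    try:
        new_url.encode("ascii")
    except UnicodeEncodeError as exc:
        # Name both URLs so the user can see where the non-ASCII one came from.
        raise ValueError(
            "The URL is not valid and contains non-ASCII characters: "
            f"{new_url}. This URL was redirected from: {original_url}"
        ) from exc
    return new_url


try:
    check_redirect_target("http://example.com/Sudan-Käfer",
                          "http://example.com/search/isbn/1")
except ValueError as err:
    print(err)
```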

What do you say, Christian?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17214>
_______________________________________