On May 25, 2010, at 3:13 PM, Barry wrote:
Hi,

The code below is giving me the error:

Traceback (most recent call last):
  File "C:\Users\Administratör\Desktop\test.py", line 4, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

What am I doing wrong?

Thanks,
Barry

import urllib.request
request = urllib.request.Request(url='http://en.wiktionary.org/wiki/baby', headers={'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'})
response = urllib.request.urlopen(request)
html = response.read().decode('utf-8')

Well, for starters you're assuming that the response content is in UTF-8. You need to examine the Content-Type header to see what the encoding is. If it's not UTF-8, there's your problem.
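
Something along these lines is what I mean by examining the headers, as a rough sketch rather than a drop-in fix. The helper names decode_body and fetch_text are made up for illustration. One extra assumption on my part: a byte 0x8b at position 1 is also what the start of a gzip stream looks like (0x1f 0x8b), so the sketch checks Content-Encoding as well, though I haven't confirmed that is what the server sent in your case.

```python
import gzip
import urllib.request
from email.message import Message


def decode_body(raw, headers, fallback="utf-8"):
    """Decode an HTTP body according to what the response headers declare.

    `headers` is any email.message.Message-like object, which is what
    urllib's response.headers provides. Hypothetical helper, not a
    library function.
    """
    # Assumption: if the server says the body is gzip-compressed, it has
    # to be decompressed before any text decoding is attempted.
    if headers.get("Content-Encoding", "").lower() == "gzip":
        raw = gzip.decompress(raw)
    # get_content_charset() returns the charset parameter of Content-Type,
    # or None when the header does not declare one.
    charset = headers.get_content_charset() or fallback
    return raw.decode(charset)


def fetch_text(url, user_agent="Mozilla/5.0"):
    """Fetch a URL and return its body as text (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return decode_body(response.read(), response.headers)
```

With that in place, html = fetch_text('http://en.wiktionary.org/wiki/baby') would replace the last two lines of your script. Falling back to UTF-8 when no charset is declared is a guess, so adjust it if the site documents something else.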

HTH
P
-- 
http://mail.python.org/mailman/listinfo/python-list