Hi all,

I'm trying to use Python to automatically download and process a (small) number of Wikipedia articles. However, I keep getting a 403 (Forbidden) error when using urllib2:
>>> import urllib2
>>> ip = urllib2.urlopen("http://en.wikipedia.org/wiki/Pythonidae")

which gives this:

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    ip = urllib2.urlopen("http://en.wikipedia.org/wiki/Pythonidae")
  File "G:\Python25\lib\urllib2.py", line 121, in urlopen
    return _opener.open(url, data)
  File "G:\Python25\lib\urllib2.py", line 380, in open
    response = meth(req, response)
  File "G:\Python25\lib\urllib2.py", line 491, in http_response
    'http', request, response, code, msg, hdrs)
  File "G:\Python25\lib\urllib2.py", line 418, in error
    return self._call_chain(*args)
  File "G:\Python25\lib\urllib2.py", line 353, in _call_chain
    result = func(*args)
  File "G:\Python25\lib\urllib2.py", line 499, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

Now, when I use urllib instead of urllib2, something different happens:

>>> import urllib
>>> ip2 = urllib.urlopen("http://en.wikipedia.org/wiki/Pythonidae")
>>> st = ip2.read()

However, st does not contain the hoped-for page - instead it is a page of HTML and (maybe?) JavaScript, which ends in:

    If reporting this error to the Wikimedia System Administrators, please include the following details:<br/>\n<span style="font-style: italic">\nRequest: GET http://en.wikipedia.org/wiki/Pythonidae, from 98.195.188.89 via sq27.wikimedia.org (squid/2.6.STABLE13) to ()<br/>\nError: ERR_ACCESS_DENIED, errno [No Error] at Sat, 27 Oct 2007 06:45:00 GMT\n</span>\n</div>\n\n</body>\n</html>\n'

Could anybody tell me what's going on, and what I should be doing differently?

Thanks for your time,
Alex
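P.S. One thing I haven't tried yet is sending a custom User-Agent header with the request, in case the server is rejecting whatever urllib2 sends by default. A rough sketch of what I mean (untested; the user-agent string and email address are just placeholder values I made up):

    import urllib2

    # Identify the script with a custom User-Agent header (placeholder value)
    headers = {"User-Agent": "WikipediaFetcher/0.1 (alex@example.com)"}
    req = urllib2.Request("http://en.wikipedia.org/wiki/Pythonidae", headers=headers)
    ip = urllib2.urlopen(req)   # file-like object, same as before
    st = ip.read()

Does that seem like the right direction, or is something else going on?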