Sumeet Sandhu wrote:
> Hi,
>
> I use urllib2 to grab google.com webpages on my Mac over my Comcast home
> network.
>
> I see about 1 error for every 50 pages grabbed. Most exceptions are
> ssl.SSLError; very few are socket.error and urllib2.URLError.
>
> The problem is: after a first exception, urllib2 occasionally stalls for
> up to an hour (!), at either the urllib2.urlopen or the response.read stage.
>
> Apparently the urllib2 and socket timeouts are not effective here - how do
> I fix this?
>
> ----------------
> import urllib2
> import socket
> from sys import exc_info as sysExc_info
>
> timeout = 2
> socket.setdefaulttimeout(timeout)
>
> try:
>     req = urllib2.Request(url, None, headers)
>     response = urllib2.urlopen(req, timeout=timeout)
>     html = response.read()
> except:
>     e = sysExc_info()[0]
>     open(logfile, 'a').write('Exception: %s\n' % e)
>
> < code that follows this : after the first exception, I try again for a
> few tries >
I'd use a separate try...except for each stage: one around response = urllib2.urlopen(...) and another around response.read(). That way you can at least tell which stage is stalling. If the problem originates with read(), you could replace the single blocking read with select.select([response.fileno()], [], [], timeout) calls in a loop, reading a chunk only when the socket reports itself readable.
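Here is a minimal sketch of that approach, in Python 2 to match the code above. The fetch() helper name, the 8192-byte chunk size, and the IOError wrapping are my own illustrative choices, not anything from the original post:

import select
import socket
import urllib2

TIMEOUT = 2
socket.setdefaulttimeout(TIMEOUT)

def fetch(url, headers, timeout=TIMEOUT):
    """Hypothetical helper: open and read `url`, guarding each stage
    separately so a stall in read() cannot block indefinitely."""
    req = urllib2.Request(url, None, headers)

    # Stage 1: open the connection.  Failures here are reported
    # separately from failures while reading the body.  Note that
    # ssl.SSLError subclasses socket.error (Python 2.6+), so the
    # SSLErrors the poster sees are caught by this clause too.
    try:
        response = urllib2.urlopen(req, timeout=timeout)
    except (urllib2.URLError, socket.error) as e:
        raise IOError('urlopen failed: %r' % e)

    # Stage 2: read the body in chunks, waiting on the underlying
    # socket with select() before each read so no single read() call
    # can hang for longer than `timeout` seconds.
    chunks = []
    try:
        fd = response.fileno()
        while True:
            readable, _, _ = select.select([fd], [], [], timeout)
            if not readable:
                raise IOError('read timed out after %s seconds' % timeout)
            chunk = response.read(8192)
            if not chunk:  # EOF: the server has finished sending
                break
            chunks.append(chunk)
    finally:
        response.close()
    return ''.join(chunks)

One caveat with this sketch: for HTTPS URLs the SSL layer decrypts and buffers data internally, so select() on the raw file descriptor is an imperfect readiness signal (decrypted bytes may already be buffered, or a partial SSL record may show as readable). It still bounds how long any single wait can last, which is the point here, but it is a workaround rather than a complete fix.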