Hello all, I have a problem using urllib2 with a proxy that needs authentication.
I've tested the 'simple way':

-- code --
import urllib2

# example values for the post
my_url = 'http://www.python.org'
proxy_info = {
    'host': 'netcache.monentreprise.com',
    'port': 3128,
    'user': 'gaston.lagaffe',
    'pass': 'jeanne55'
}

# embed user:pass directly in the proxy URL
proxy_support = urllib2.ProxyHandler(
    {"http": "http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

# print proxies
print "Proxies", urllib2.getproxies()
# always prints "Proxies {}", but I've set another proxy! :-(

req = urllib2.Request(url=my_url)
handle = urllib2.urlopen(req)
# raises an error; it seems the proxy is never contacted
-- code --

But this doesn't work: the proxy does not seem to be recognized by urllib2. I've read a previous post [1] by Ray Slakinski, with an answer from John Lee, but unfortunately it seems this problem in urllib2 is well known.

So my questions are:
- Is there a way to make this work and, if yes, to make it work for another user who doesn't have a custom (patched) urllib2? (See the first sketch in the P.S. below for the handler-based variant I plan to try.)
- What would you advise for robust web crawling with Python, i.e. with good support for proxies (including authenticated ones), cookies, etc.: mechanize [2], pyCurl [3] (see the second sketch below), or something else?

[1] Posted 8 Nov 2005 on comp.lang.python, "urllib2 Opener and Proxy/Authentication issues"
[2] http://wwwsearch.sourceforge.net/mechanize/
[3] http://pycurl.sourceforge.net/

Thank you (and happy new year 2006!)

Tom
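
P.S. For completeness, here is my understanding of the documented handler-based approach: register the proxy and the credentials through separate handlers instead of embedding user:pass in the proxy URL. This is only a sketch, with the same made-up values as above, and it is the code path the thread in [1] suggests is broken, so I'd be glad to hear whether it works for anyone:

-- code --
import urllib2

proxy_info = {
    'host': 'netcache.monentreprise.com',
    'port': 3128,
    'user': 'gaston.lagaffe',
    'pass': 'jeanne55'
}

# declare the proxy itself, without credentials in the URL
proxy_handler = urllib2.ProxyHandler(
    {'http': 'http://%(host)s:%(port)d' % proxy_info})

# declare the credentials separately; the "default realm" manager
# hands them out whatever realm the proxy announces
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None,
                          '%(host)s:%(port)d' % proxy_info,
                          proxy_info['user'],
                          proxy_info['pass'])
auth_handler = urllib2.ProxyBasicAuthHandler(password_mgr)

opener = urllib2.build_opener(proxy_handler, auth_handler)
urllib2.install_opener(opener)

handle = urllib2.urlopen('http://www.python.org')
print handle.read()[:200]
-- code --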
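
P.P.S. And in case it helps compare answers, this is roughly the pyCurl equivalent I would test against, going by the pycurl/libcurl docs (again untested from behind my proxy, same made-up values):

-- code --
import StringIO
import pycurl

body = StringIO.StringIO()

c = pycurl.Curl()
c.setopt(pycurl.URL, 'http://www.python.org')
# libcurl handles authenticated proxies natively
c.setopt(pycurl.PROXY, 'netcache.monentreprise.com:3128')
c.setopt(pycurl.PROXYUSERPWD, 'gaston.lagaffe:jeanne55')
c.setopt(pycurl.WRITEFUNCTION, body.write)
c.perform()
c.close()

print body.getvalue()[:200]
-- code --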