I'm trying to mimic my Firefox browser when requesting a web page with Python. Here are the request headers captured by Wireshark when I accessed the page with Firefox:

    GET /dirName/ HTTP/1.1
    Host: www.website.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

The website responds with these headers:

    HTTP/1.1 200 OK
    Date: Fri, 17 Oct 2008 03:16:19 GMT
    Server: Apache/2.0.59 (FreeBSD) PHP/4.4.7 with Suhosin-Patch
    X-Powered-By: PHP/4.4.7
    Set-Cookie: bbsessionhash=1c9eacae7c56fefc79e627b07a9af8ae; path=/; HttpOnly
    Set-Cookie: bblastvisit=1224613379; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/
    Set-Cookie: bblastactivity=0; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/
    Cache-Control: private
    Pragma: private
    X-UA-Compatible: IE=7
    Content-Encoding: gzip
    Content-Length: 7099
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=ISO-8859-1

So I tried trusty ol' urllib2 to request it in Python:

    import urllib2

    url = 'http://www.website.com'

    # headers copied from the Firefox request
    h = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-us,en;q=0.5',
        'Accept-Encoding': 'gzip,deflate',
        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Keep-Alive': '300',
        'Connection': 'keep-alive',
    }

    # request page
    reqObj = urllib2.Request(url, None, h)
    urlObj = urllib2.urlopen(reqObj)

    # read response
    print urlObj.read()
    print urlObj.geturl()
    print urlObj.info()

    # close urlObj
    urlObj.close()

    raw_input('press a key...')

It returns these headers:

    Date: Fri, 17 Oct 2008 03:39:20 GMT
    Server: Apache/2.0.59 (FreeBSD) PHP/4.4.7 with Suhosin-Patch
    X-Powered-By: PHP/4.4.7
    Content-Length: 1311
    Connection: close
    Content-Type: text/html

Notice that the content length is considerably smaller, and no cookies are sent to me as they were in Firefox. I know only a little about HttpOnly cookies; I gather they are some kind of special cookie, which I suppose might have something to do with Python not being able to access them the way Firefox does. All I want is for Python to receive the same cookies that Firefox did. How can I do this?

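For reference, here is a minimal sketch of one way to have Python keep whatever cookies a server sends, using a cookie jar attached to an opener. It assumes Python 3's `http.cookiejar` and `urllib.request`, falling back to `cookielib` and `urllib2` (the Python 2 names used in the script above). The helper names `make_cookie_opener` and `build_request` are hypothetical, made up for illustration. This is a sketch, not a confirmed fix for the missing cookies.

```python
# Cookie-aware opener sketch. make_cookie_opener and build_request are
# hypothetical helper names, not part of the original script.
try:  # Python 3 module names
    from http.cookiejar import CookieJar
    from urllib.request import HTTPCookieProcessor, Request, build_opener
except ImportError:  # Python 2 names, as used by the urllib2 script
    from cookielib import CookieJar
    from urllib2 import HTTPCookieProcessor, Request, build_opener


def make_cookie_opener():
    """Return (opener, jar); the jar collects cookies set by responses."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def build_request(url, headers):
    """Build a GET request carrying the given header dict."""
    return Request(url, None, headers)
```

With these in place, `opener, jar = make_cookie_opener()` followed by `opener.open(build_request(url, h))` would perform the request, and iterating over `jar` afterwards would list the cookies the server actually set. Note also that the Wireshark capture shows `GET /dirName/` while the script's `url` has no path, so the two requests may not be hitting the same page.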
I read somewhere that HttpOnly cookies were implemented in the Python cookie module: http://glyphobet.net/blog/blurb/285 Yet the other cookies aren't being sent either.
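As far as I understand, HttpOnly is just a flag on the `Set-Cookie` header that asks browsers to hide the cookie from page scripts; the cookie still travels in the HTTP headers like any other. The sketch below parses the `Set-Cookie` values from the Firefox capture above with the standard cookie module (`http.cookies` in Python 3, `Cookie` in Python 2) to show the flag being read. The function name `parse_set_cookie` is a hypothetical helper made up for this example.

```python
try:  # Python 3 module name
    from http.cookies import SimpleCookie
except ImportError:  # Python 2 module name
    from Cookie import SimpleCookie


def parse_set_cookie(header_value):
    """Parse one Set-Cookie header value into a list of plain dicts."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    parsed = []
    for name, morsel in cookie.items():
        parsed.append({
            'name': name,
            'value': morsel.value,
            'path': morsel['path'],
            # The attribute is empty when the flag is absent.
            'httponly': bool(morsel['httponly']),
        })
    return parsed
```

Feeding it the `bbsessionhash` header from the capture should report the HttpOnly flag as set, while a header without the flag should report it as unset. Either way the cookie is parsed normally, which suggests the flag alone would not stop Python from seeing it.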
_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor
