Hi all,

It's probably me, actually; I was hoping someone would spot my error.
I am attempting to use cookielib, and running into difficulties.
I have been following this recipe - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302930
as an example, as the official documentation is a bit sparse, but it seems rather easy.

However, as my code will demonstrate -

>>> import re
>>> import urllib2
>>> import cookielib
>>>
>>> a = re.compile('href\=\"showthread.php\?s\=.+?pagenumber=(?P<pagenum>\d+?)\"', re.IGNORECASE)
>>>
>>> Jar = cookielib.MozillaCookieJar(filename = 'c:/cookies.txt')
>>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(Jar))
>>> urllib2.install_opener(opener)

Now, that's all by the recipe I linked to. No exceptions, so I figured it was good.
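
One thing worth ruling out at this step: constructing a MozillaCookieJar with a filename does not read the file, so load() has to be called explicitly. Below is a minimal sketch of that. The temporary file, the example.com cookie and the load_jar helper are made up for illustration (they stand in for c:/cookies.txt), and the import fallback is only there so the same lines also run on Python 3, where cookielib is named http.cookiejar.

```python
import os
import tempfile

try:
    import cookielib                      # Python 2, as used in this post
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name for the same module

# Hypothetical stand-in for c:/cookies.txt, in Netscape cookies.txt format:
# domain, domain-specified flag, path, secure flag, expiry, name, value.
COOKIE_FILE_TEXT = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t4102444800\tsessionhash\tabc123\n"
)


def load_jar(path):
    """Build a MozillaCookieJar and explicitly read the file into it."""
    jar = cookielib.MozillaCookieJar(filename=path)
    count_before = len(jar)   # the constructor only remembers the filename
    jar.load()                # this is the call that actually reads the file
    return jar, count_before


# Write the made-up cookie file, load it, then clean up.
handle, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as cookie_file:
    cookie_file.write(COOKIE_FILE_TEXT)

demo_jar, demo_before = load_jar(demo_path)
os.remove(demo_path)

print("%d cookies before load(), %d after" % (demo_before, len(demo_jar)))
```

Note that load() skips session cookies (those without an expiry) unless ignore_discard=True is passed, which may or may not matter for this forum.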

>>> f = urllib2.urlopen('http://www.gpforums.co.nz/forumdisplay.php?s=&forumid=7029 ')
>>> j = f.read()
>>> ww = a.finditer(j)
>>> print ww.next().group()
href="">
Now, that's an issue. When I'm in a cookied session in Firefox, that link would be

showthread.php?s=&threadid=267930&pagenumber=2

Hmm... so I check by requesting an url that needs a cookie to get into -

>>> f = urllib2.urlopen(' http://www.gpforums.co.nz/newthread.php?s=&action="">')
>>> print f.read()

<lots snipped>
You are not logged in, or you do not have permission to access this page. This could be due to one of several reasons:
</lots>

Now, I'm using the exact same cookies.txt ol' Firefox uses, so I'm a little perplexed. I check to see if I've actually got a cookie -

>>> print Jar
<_MozillaCookieJar.MozillaCookieJar[<Cookie bblastvisit=1113481269 for .gpforums.co.nz/>, <Cookie sessionhash=f6cba21ed58837ab935a564e6b9c3b05 for .gpforums.co.nz/>, <Cookie bblastvisit=1113481269 for .www.gpforums.co.nz/>, <Cookie sessionhash=f6cba21ed58837ab935a564e6b9c3b05 for .www.gpforums.co.nz/>]>


Which is exactly how that cookie looks, both in my cookies.txt, and when I packet sniff it going out.
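
A slightly more readable way to inspect the jar than print is to iterate over it and pull out the fields of each cookie. This is only a sketch: the jar is rebuilt by hand from the four cookies printed above rather than loaded from cookies.txt, the make_cookie and describe helpers are made up for illustration, and treating the cookies as session cookies with no expiry is an assumption.

```python
try:
    import cookielib                      # Python 2, as used in this post
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name for the same module


def make_cookie(name, value, domain):
    """Build a cookie by hand (illustration only; expiry is assumed absent)."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def describe(jar):
    """Return (domain, name, value) for every cookie in the jar, sorted."""
    return sorted((c.domain, c.name, c.value) for c in jar)


# The same names, values and domains as in the printed jar above.
jar = cookielib.CookieJar()
for domain in (".gpforums.co.nz", ".www.gpforums.co.nz"):
    jar.set_cookie(make_cookie("bblastvisit", "1113481269", domain))
    jar.set_cookie(make_cookie("sessionhash",
                               "f6cba21ed58837ab935a564e6b9c3b05", domain))

for domain, name, value in describe(jar):
    print("%-22s %-12s %s" % (domain, name, value))
```

Listed this way, it is easier to see that each cookie is stored twice, once per domain, which matches the doubled-up Cookie header in the packet capture further down.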

I also tried it the way shown in the recipe, including changing the User-Agent -

>>> txheaders =  {'User-agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
>>> print txheaders
{'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
>>> theurl = 'http://www.gpforums.co.nz/newthread.php?s=&action="" '
>>> req = urllib2.Request(theurl, data = "", headers = txheaders)
>>> handle = urllib2.urlopen(req)
>>> g = handle.read()
>>> print g

<lots snipped>
You are not logged in, or you do not have permission to access this page. This could be due to one of several reasons:
</lots>


So yeah, I'm at a loss. No doubt my mistake is painfully obvious when pointed out, but any pointing would be greatly appreciated.

Regards,

Liam Clarke

<packet captures follow>

 GET /newthread.php?s=&action="" HTTP/1.1\r\n
        Request Method: GET
        Request URI: /newthread.php?s=&action="">
        Request Version: HTTP/1.1
    Accept-Encoding: identity\r\n
    Host: www.gpforums.co.nz\r\n
    Cookie: bblastvisit=1113481269; sessionhash=f6cba21ed58837ab935a564e6b9c3b05; bblastvisit=1113481269; sessionhash=f6cba21ed58837ab935a564e6b9c3b05\r\n
    Connection: close\r\n
    User-agent: Python-urllib/2.4\r\n
    \r\n

...and the response

Hypertext Transfer Protocol
    HTTP/1.1 200 OK\r\n
        Request Version: HTTP/1.1
        Response Code: 200
    Date: Thu, 14 Apr 2005 12:44:12 GMT\r\n
    Server: Apache/2.0.46 (CentOS)\r\n
    Accept-Ranges: bytes\r\n
    X-Powered-By: PHP/4.3.2\r\n
    Set-Cookie: sessionhash=43bcebcf4dba6878802b25cb126ed1f7; path=/; domain=gpforums.co.nz\r\n
    Set-Cookie: sessionhash=43bcebcf4dba6878802b25cb126ed1f7; path=/; domain=www.gpforums.co.nz\r\n
    Set-Cookie: sessionhash=43bcebcf4dba6878802b25cb126ed1f7; path=/; domain=gpforums.co.nz\r\n
    Set-Cookie: sessionhash=43bcebcf4dba6878802b25cb126ed1f7; path=/; domain=www.gpforums.co.nz\r\n
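
To make the two captures easier to compare, here is a rough sketch that parses the Cookie value sent and one of the Set-Cookie values received, then checks whether the sessionhash survived the round trip. The header strings are copied from the captures above (with the duplicates dropped); the cookie_value helper is made up for illustration, and the import fallback is only there so it also runs on Python 3, where the Cookie module is named http.cookies.

```python
try:
    from Cookie import SimpleCookie          # Python 2
except ImportError:
    from http.cookies import SimpleCookie    # Python 3 name for the same class

# Values copied from the request and response captured above.
SENT = "bblastvisit=1113481269; sessionhash=f6cba21ed58837ab935a564e6b9c3b05"
RECEIVED = ("sessionhash=43bcebcf4dba6878802b25cb126ed1f7; "
            "path=/; domain=gpforums.co.nz")


def cookie_value(header_value, name):
    """Parse a Cookie or Set-Cookie header value and return one cookie's value."""
    parsed = SimpleCookie()
    parsed.load(header_value)
    return parsed[name].value


sent_hash = cookie_value(SENT, "sessionhash")
received_hash = cookie_value(RECEIVED, "sessionhash")

# True when the server answered with a different sessionhash than it was sent.
session_replaced = sent_hash != received_hash
print("sent:     %s" % sent_hash)
print("received: %s" % received_hash)
print("replaced: %s" % session_replaced)
```

The hashes differ, so the server handed back a new sessionhash instead of keeping the one it was sent. Why it does that is not something the capture alone shows.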


--
'There is only one basic human right, and that is to do as you damn well please.
And with it comes the only basic human duty, to take the consequences.'