> ----- Forwarded Message -----
> From: Benjamin Fishbein <bfishbei...@gmail.com>
> To: Alan Gauld <alan.ga...@btinternet.com>
> Sent: Friday, 2 November 2012, 3:55
> Subject: Re: [Tutor] running a javascript script with python
>
>>>> cj=cookielib.CookieJar()
>>>> opener=urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
> ....
>>>> data={"bb_isbns":isbns}
>>>> encoded_data=urllib.urlencode(data)
>>>> url='http://www.textbooks.com/BuyBack-Search.php'

You asked about this a month or so ago. This time around you're using
a cookie jar to store the session state, but you're leaving out the CSID
parameter. If you look at the HTML source, you'll see
BuyBack-Search.php?CSID=Some_Value_From_Your_Session.
If you first open http://www.textbooks.com to read the session
cookies, CSID appears to be the value of the 'tb_DSL' cookie.
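
As a rough sketch of that idea (untested against the site): look up the
cookie by name in the jar and append its value to the URL as CSID. The
helper names cookie_value and with_csid are made up for illustration,
and the assumption that CSID equals the 'tb_DSL' value is only what the
cookies appear to show. No request is made here.

```python
try:  # Python 3 module names
    from http.cookiejar import Cookie, CookieJar
    from urllib.parse import urlencode
except ImportError:  # Python 2 names, as in the quoted session
    from cookielib import Cookie, CookieJar
    from urllib import urlencode


def cookie_value(jar, name):
    """Return the value of the first cookie called `name`, or None."""
    for cookie in jar:
        if cookie.name == name:
            return cookie.value
    return None


def with_csid(base_url, jar, cookie_name='tb_DSL'):
    """Append CSID=<cookie value> to base_url as a query string.

    Assumes (unverified) that the session ID is stored in `cookie_name`.
    """
    csid = cookie_value(jar, cookie_name)
    if csid is None:
        raise LookupError('no %r cookie in the jar' % cookie_name)
    return base_url + '?' + urlencode({'CSID': csid})
```

With the names from your session, once the opener has loaded the front
page and filled cj, with_csid(url, cj) would give the URL to post
encoded_data to.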

That said, as I mentioned before, the site's terms of service forbid
scraping (see section II):

http://www.textbooks.com/CustServ-Terms.php
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
