On Wed, May 2, 2012 at 11:37 PM, EPA WC <epawc...@gmail.com> wrote:
> Hi List,
> I am trying to write a crawler to go through web pages at
> http://www.freebookspot.es/CompactDefault.aspx?Keyword=. But I am not
> quite familiar with how asp uses _doPostBack function with the "next"
> button below the book list to advance to the next page. I hope someone
> who knows ASP well can help out here. I need to know how to retrieve
> next page with PHP code.
> Kind regards,
> Tom

Looking at that page source, I think this might be a bit problematic.

Notice that practically the whole page is inside a form. When you get
down to the "Next>" button, clicking it submits the form with the
appropriate fields set. If you look at the beginning of the form,
you'll see some interesting hidden fields; one in particular,
__VIEWSTATE, is pretty clearly an encoded value of some sort.

When your crawler parses the page, it will have to stash the values of
the form's fields so that it can request the next page of entries by
simulating the form's POST submission. This is probably most easily
handled via libcurl (PHP's cURL extension).
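
Here is a rough sketch of that idea, not a tested crawler for this
particular site. In ASP.NET Web Forms, __doPostBack(target, argument)
copies its two arguments into the hidden fields __EVENTTARGET and
__EVENTARGUMENT and then submits the form, so the sketch does the same
by hand. The function names are mine, and the event target string in
the usage comment is a made-up placeholder: you would have to read the
real one out of the "Next" link's __doPostBack(...) call in the page
source.

```php
<?php
// Sketch only: collect a page's hidden form fields and replay a postback.

/** Return name => value for every named <input type="hidden"> in $html. */
function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

/** Mimic __doPostBack(): set the two event fields on top of the stashed ones. */
function build_postback_fields(array $hidden, string $target, string $argument = ''): array
{
    $hidden['__EVENTTARGET'] = $target;
    $hidden['__EVENTARGUMENT'] = $argument;
    return $hidden;
}

/** POST $fields to $url with cURL and return the response body. */
function post_form(string $url, array $fields, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Keep session cookies between requests.
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('POST failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (hypothetical event target; $firstPageHtml is the page you
// already fetched with a plain GET):
//   $url    = 'http://www.freebookspot.es/CompactDefault.aspx?Keyword=';
//   $hidden = extract_hidden_fields($firstPageHtml);
//   $fields = build_postback_fields($hidden, 'HYPOTHETICAL$NextLink');
//   $nextPageHtml = post_form($url, $fields, '/tmp/crawler-cookies.txt');
```

Each response carries a fresh __VIEWSTATE, so the hidden fields need to
be re-extracted from every page before asking for the one after it. If
the form has visible inputs that the server also expects, those would
have to be added to the field array too.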

Unsolicited advice: Many sites do not appreciate scraping activity;
make sure your crawler obeys robots.txt rules, and do not overtax the
site with crawler activity.

PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
