Re: [PHP] Retrieve pages from an ASP driven site
On Wed, May 2, 2012 at 11:37 PM, EPA WC epawc...@gmail.com wrote:
> Hi List,
>
> I am trying to write a crawler to go through web pages at
> http://www.freebookspot.es/CompactDefault.aspx?Keyword=. But I am not
> quite familiar with how ASP uses the __doPostBack function with the next
> button below the book list to advance to the next page. I hope someone
> who knows ASP well can help out here. I need to know how to retrieve
> the next page with PHP code.
>
> Kind regards,
> Tom

Looking at that page source, I think this might be a bit problematic. Notice that practically the whole page is inside a form. When you get down to the Next button, that is going to submit the form with its appropriate fields set. If you look at the beginning of the form, you'll see some interesting fields; one in particular, __VIEWSTATE, is pretty clearly an encoded value of some sort.

When your crawler parses the page, it will have to stash the field values that the form sets, so that it can simulate a POST submission of the form and get the next page of entries. This is probably most easily handled via libcurl (PHP's cURL extension).

Unsolicited advice: many sites do not appreciate scraping activity; make sure your crawler obeys robots.txt rules, and do not overtax the site with crawler activity.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
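A minimal sketch of the field-stashing step described above, assuming PHP with the DOM extension available. The helper name collectHiddenFields() and the sample markup are made up for illustration; only the __VIEWSTATE field name comes from the thread, and the real page may well carry other hidden fields that need the same treatment.

```php
<?php
// Sketch: collect every <input type="hidden"> from a page so the values
// (including __VIEWSTATE) can be sent back in the next POST.
// collectHiddenFields() is a hypothetical helper, not a library function.
function collectHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        $type = strtolower($input->getAttribute('type'));
        if ($type === 'hidden' && $name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up sample markup standing in for the real page source.
$sample = '<form method="post" action="CompactDefault.aspx">'
        . '<input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />'
        . '<input type="hidden" name="__EVENTTARGET" value="" />'
        . '<input type="text" name="Keyword" value="php" />'
        . '</form>';

$fields = collectHiddenFields($sample);
print_r($fields);
```

The returned array only holds the hidden inputs; visible inputs such as the text box are skipped here, so a crawler that needs them would have to add them separately.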
Re: [PHP] Retrieve pages from an ASP driven site
Tom,

Here is how you would paginate in PHP.

// Number of records to show per page:
$display = 4;

// Determine how many records there are.
if (isset($_GET['np'])) {
    $num_pages = (int) $_GET['np'];
} else {
    $query = "SELECT * FROM mytable";
    $query_result = mysql_query($query) or die(mysql_error());
    $num_records = @mysql_num_rows($query_result);
    if ($num_records > $display) {
        $num_pages = ceil($num_records / $display);
    } else {
        $num_pages = 1;
    }
}

// Determine where in the database to start returning results.
if (isset($_GET['s'])) {
    $start = (int) $_GET['s'];
} else {
    $start = 0;
}

-- 
Terry Ally
Re: [PHP] Retrieve pages from an ASP driven site
Terry Ally (Gmail) wrote:
> Here is how you would paginate in PHP.

Terry - Tom is not trying to create this in PHP, but to read existing ASP pages.

Tom - I don't think that it's simply a matter of the ASP code here, but rather how they have constructed the set of information they are sending back. That is done in JavaScript, but the navigation buttons are simple form submits: BNext is submitted for 'next'. Interestingly, the sales side seems to be .php ;)

-- 
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
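As a rough illustration of the form submit described above, the sketch below adds the button to a set of already-collected form fields and POSTs them with PHP's cURL extension. The helper names buildPostBody() and fetchNextPage() are hypothetical, and the button value 'Next' is an assumption: the thread only gives the name BNext, and the full name attribute in the page source may differ.

```php
<?php
// Sketch: build the POST body a browser would send when "next" is clicked.
// buildPostBody() is a hypothetical helper. 'BNext' comes from the thread;
// the value 'Next' is a guess and should be checked against the page source.
function buildPostBody(array $formFields): string
{
    // A submit button is only included in the body when it was the one clicked.
    $post = array_merge($formFields, ['BNext' => 'Next']);
    return http_build_query($post); // application/x-www-form-urlencoded
}

// Hypothetical helper that performs the request (needs the cURL extension).
function fetchNextPage(string $url, array $formFields): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => buildPostBody($formFields),
        CURLOPT_RETURNTRANSFER => true, // return the HTML instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $html;
}

// Placeholder field value; in a real crawl it comes from the page just fetched.
$body = buildPostBody(['__VIEWSTATE' => 'dDwtMTIz']);
echo $body, "\n";
```

Each page returned by fetchNextPage() carries fresh form values, so a crawl loop would re-read the fields from every response before requesting the following page.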
Re: [PHP] Retrieve pages from an ASP driven site
Thanks Lester.

On Thu, May 3, 2012 at 3:49 AM, Lester Caine les...@lsces.co.uk wrote:
> Terry - Tom is not trying to create this in PHP, but to read existing
> ASP pages.
>
> Tom - I don't think that it's simply a matter of the ASP code here, but
> rather how they have constructed the set of information they are
> sending back. That is done in JavaScript, but the navigation buttons
> are simple form submits: BNext is submitted for 'next'. Interestingly,
> the sales side seems to be .php ;)