Re: [PHP] Retrieve pages from an ASP driven site

2012-05-04 Thread tamouse mailing lists
On Wed, May 2, 2012 at 11:37 PM, EPA WC epawc...@gmail.com wrote:
> Hi List,
>
> I am trying to write a crawler to go through web pages at
> http://www.freebookspot.es/CompactDefault.aspx?Keyword=. But I am not
> quite familiar with how ASP uses the __doPostBack function with the next
> button below the book list to advance to the next page. I hope someone
> who knows ASP well can help out here. I need to know how to retrieve
> the next page with PHP code.
>
> Kind regards,
> Tom


Looking at that page source, I think this might be a bit problematic.

Notice that practically the whole page is inside a form. When you get
down to the Next button, that is going to submit the form with
its appropriate fields set. If you look at the beginning of the form,
you'll see some interesting fields; one in particular, __VIEWSTATE, is
pretty clearly an encoded value of some sort.

When your crawler parses the page, it will have to stash the field
values that the form sets in order to process the form correctly to
get the next page of entries by simulating a POST-data submit. This is
(probably?) most easily handled via libcurl.
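
A minimal sketch of that approach, assuming the page keeps its state in
hidden <input> fields such as __VIEWSTATE. The function names
extract_hidden_fields() and post_form() are made up for illustration, and
the real page's markup may need different handling.

```php
<?php
/** Return name => value for every hidden <input> found in the HTML. */
function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect libxml warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/** POST the fields to $url with libcurl and return the response body. */
function post_form(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('POST failed: ' . curl_error($ch));
    }
    return $body;
}
```

The idea is to call extract_hidden_fields() on each page you fetch and pass
the result, plus whatever field the clicked button contributes, to
post_form() to request the following page.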

Unsolicited advice: Many sites do not appreciate scraping activity;
make sure your crawler obeys robots.txt rules, and do not overtax the
site with crawler activity.
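
A rough sketch of the robots.txt check, under heavy simplifying
assumptions: it only looks at Disallow rules in a "User-agent: *" group
and ignores Allow lines, wildcards, and groups that list several user
agents. The function name is_path_allowed() is hypothetical.

```php
<?php
/** Return true if no "User-agent: *" Disallow rule is a prefix of $path. */
function is_path_allowed(string $robotsTxt, string $path): bool
{
    $applies = false;
    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*/', '', $line)); // drop comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$key, $value] = array_map('trim', explode(':', $line, 2));
        $key = strtolower($key);
        if ($key === 'user-agent') {
            // Only rules in the wildcard group are considered here.
            $applies = ($value === '*');
        } elseif ($applies && $key === 'disallow' && $value !== ''
                  && strpos($path, $value) === 0) {
            return false;
        }
    }
    return true;
}
```

To avoid overtaxing the site, a crawler loop could also call sleep() for a
couple of seconds between requests.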

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Retrieve pages from an ASP driven site

2012-05-03 Thread Terry Ally (Gmail)
Tom,

Here is how you would paginate in PHP.

//
// Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO there.
// Number of records to show per page:
$display = 4;
// Determine how many pages there are.
if (isset($_GET['np'])) {
    $num_pages = (int) $_GET['np'];
} else {
    $query = "SELECT * FROM mytable";
    $query_result = mysql_query($query) or die(mysql_error());
    $num_records = mysql_num_rows($query_result);
    if ($num_records > $display) {
        $num_pages = ceil($num_records / $display);
    } else {
        $num_pages = 1;
    }
}
// Determine where in the database to start returning results.
if (isset($_GET['s'])) {
    $start = (int) $_GET['s'];
} else {
    $start = 0;
}


//








-- 
*Terry Ally*
Twitter.com/terryally
Facebook.com/terryally


Re: [PHP] Retrieve pages from an ASP driven site

2012-05-03 Thread Lester Caine

Terry Ally (Gmail) wrote:

Here is how you would paginate in PHP.


Terry - Tom is not trying to create this in PHP, but read existing ASP pages.

Tom - I don't think that it's simply a matter of the ASP code here, but rather
how they have constructed the set of information they are sending back. That is
done in JavaScript, but the navigation buttons are simple form submits. BNext is
submitted for 'next'.
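
As a sketch of what that means for the POST body: a browser sends the
clicked submit button as one more name=value pair alongside the form's
other fields. The helper name next_page_fields() and the button value
'Next' are assumptions; the real value should be read from the BNext
input in the page source.

```php
<?php
/**
 * Merge the stashed form fields with the clicked button, as a browser would.
 * $buttonValue is a guess; take the real one from the page's BNext input.
 */
function next_page_fields(
    array $formFields,
    string $buttonName = 'BNext',
    string $buttonValue = 'Next'
): array {
    // Array union keeps existing keys and appends the button pair.
    return $formFields + [$buttonName => $buttonValue];
}

$fields = next_page_fields(['__VIEWSTATE' => 'dDwtMTIz']);
echo http_build_query($fields), "\n";
// prints: __VIEWSTATE=dDwtMTIz&BNext=Next
```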


Interestingly, the sales side seems to be .php ;)

--
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php




Re: [PHP] Retrieve pages from an ASP driven site

2012-05-03 Thread EPA WC
Thanks Lester.
