Need some help.

I am trying to scrape a page on Amazon that lists all the sellers selling a given book.

For some reason, making a request to this page (see below) with LWP returns a successful status, but there is never any content in the body. Many other pages work fine: I can get back the product detail page, the customer review page, etc.

Amazon's URLs are a maze to figure out. This one URL alone can be written at least five different ways that I have found so far. Maybe that is because they don't want people grabbing this page. What I don't understand is how my browser can get it but LWP can't. It's still just HTML.

I realize these links redirect, but pages other than the store listing page redirect as well, and those still return their HTML.
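
For what it's worth, here is a minimal sketch of how I have been checking where the redirects actually end up. It uses the redirects and request->uri accessors on the response object, so it only shows what LWP itself followed:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;

my $res = $ua->get('http://amazon.com/o/tg/stores/offering/list/-/0596004478');

# Every intermediate redirect LWP followed before the final response
for my $hop ($res->redirects) {
    print "redirect: ", $hop->code, " -> ",
          ($hop->header('Location') || '(no Location header)'), "\n";
}

# Where we actually ended up, and how much content came back
print "final URL:    ", $res->request->uri, "\n";
print "status:       ", $res->status_line, "\n";
print "content-type: ", ($res->header('Content-Type') || '(none)'), "\n";
print "body length:  ", length($res->content), "\n";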

Please let me know if you have a clue as to what is going on.

So, for example:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;

# This link doesn't return content to LWP but does work in a browser.
my $res = $ua->get('http://amazon.com/o/tg/stores/offering/list/-/0596004478');

# check the outcome
if ($res->is_success)
{
    print $res->content;
}
else
{
    print "Error: " . $res->status_line . "\n";
}
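
One difference between my browser and LWP that I can think of is the request headers, especially the User-Agent string. I have not confirmed that this is what Amazon keys on, but here is a minimal sketch of sending more browser-like headers; the agent string and header values are just guesses on my part:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;

# Identify as a browser instead of the default "libwww-perl/x.xx" agent.
# The exact string is only an example.
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20040113');

# Extra name/value pairs to get() are sent as header fields.
my $res = $ua->get(
    'http://amazon.com/o/tg/stores/offering/list/-/0596004478',
    'Accept'          => 'text/html,application/xhtml+xml,*/*',
    'Accept-Language' => 'en-US,en',
);

print $res->is_success
    ? $res->content
    : "Error: " . $res->status_line . "\n";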

Chris Wildman
[EMAIL PROTECTED]
