I've been wrestling with a script to scrape some information from BusinessWeek.com for a while now, but I've run into problems authenticating my agent to the BusinessWeek server.
My script pulls a list of URLs by running some searches. Some of the URLs returned by the searches kick you back to a registration page if you are not authenticated (via HTTP authentication, I thought). If my program worked (that is, if my agent authenticated itself), the agent would GET:

    http://www.businessweek.com/cgi-bin/register/archiveSearch.cgi?h=03_47/b3859655.htm

and be redirected to:

    http://www.businessweek.com/@@KA8WaYYQAhrvjBkA/magazine/content/03_39/b3851617.htm

Here's what I use to try to authenticate. What else could I try? Maybe I don't have the realm right?

    use LWP::UserAgent;

    my $response;
    my $browser = LWP::UserAgent->new();
    $browser->cookie_jar({});    # keep cookies in memory between requests
    $browser->agent('Mozilla/6.0 [en] (WinXP; U)');
    # credentials() takes: "host:port", realm, username, password
    print $browser->credentials(
        'www-secure.businessweek.com:80',
        'viewing Business Week Online',
        'user' => 'password'
    );
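On the realm question, here is a minimal sketch of one way to check what a server actually asks for. It assumes the site really uses HTTP authentication, that is, it answers with a 401 status and a WWW-Authenticate header. The helper names netloc_for and realm_from are hypothetical, and the 401 response below is built by hand for illustration rather than fetched from BusinessWeek. LWP looks up stored credentials by the host:port of the URL being requested together with the realm, so both values passed to credentials() have to match what the request and the challenge contain.

```perl
use strict;
use warnings;
use HTTP::Response;
use URI;

# Hypothetical helper: the "host:port" string for a URL, in the form
# that LWP::UserAgent->credentials() expects as its first argument.
sub netloc_for {
    my ($url) = @_;
    return URI->new($url)->host_port;
}

# Hypothetical helper: pull the realm out of a response's
# WWW-Authenticate challenge; returns undef if there is no challenge.
sub realm_from {
    my ($response) = @_;
    my $challenge = $response->header('WWW-Authenticate') or return;
    return $challenge =~ /realm="([^"]*)"/i ? $1 : undef;
}

# Hand-built 401 response, for illustration only (not real server output).
my $fake = HTTP::Response->new(
    401, 'Unauthorized',
    [ 'WWW-Authenticate' => 'Basic realm="Example Realm"' ],
);

print netloc_for('http://www.businessweek.com/cgi-bin/register/archiveSearch.cgi'), "\n";
print realm_from($fake), "\n";
```

With a real agent, the same realm_from call could be applied to the response from $browser->get($url). If that response is never a 401 (for example, a redirect or an ordinary page containing a login form), the site may be using form-and-cookie login instead, in which case credentials() would have no effect.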