I need some help writing a script that grabs the "WEB PAGE RESULTS" from MSN. First, it is important to know that the default results from MSN do NOT include "WEB DIRECTORY" results. My robot therefore needs to mimic that default behavior and return the results for my query WITHOUT the "WEB DIRECTORY" results.
MSN includes "WEB DIRECTORY" results when cookies are disabled in MSIE browsers. I tried this several times and confirmed it. When cookies are enabled, "WEB DIRECTORY" results are omitted.

I wrote the following code to enable cookies so that MSN returns the results without the "WEB DIRECTORY" results, but the output always includes "WEB DIRECTORY RESULTS" :-(

Can anyone give me some solid direction on what I should try next?

#!/usr/bin/perl -w
use strict;
use HTTP::Cookies;
use HTTP::Request;
use LWP::UserAgent;
use LWP::Debug qw(+);

my $page0 = 'http://www.msn.com';
my $page1 = 'http://search.msn.com/pass/results.asp?RS=CHECKED&FORM=MSNH&v=1&q=christian+web+host&cp=1252';

my $ua = LWP::UserAgent->new();

#----- First prepare the browser environment via the user agent (ua) -----
$ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
$ua->cookie_jar(HTTP::Cookies->new(file=>'/tmp/msn_cookies.txt', autosave=>1, ignore_discard=>1));

## $ua->requests_redirectable ( ['POST'] ); -- this wipes out the default of GET and HEAD
my $redirectable = $ua->requests_redirectable([qw(GET HEAD POST)]);
# OR push @$redirectable, 'GET';

#----- Second prepare the request -----
my $request = HTTP::Request->new(GET => $page1);
# $request->content("htmUserName=$user&htmPassword=$pass&htmPage=>$htmPage");
## my $login_post= POST $login_pg, Content=>[ htmUserName=>'vidals', htmPassword=>'Abalone-212', htmPage=>$htmPage ];

my $response = $ua->request($request);
print "\n\nLogging into $page1\n\n";

#----- Third check request results -----
if ($response->is_error) {
    print "No Workie:".$response->status_line()."\n";
} else {
    ## print "It's good!".$response->as_string()."\n";
    print "\nRESULTS I\n\n".$response->content;
}

__END__

MSN
http://search.msn.com/pass/results.asp?RS=CHECKED&FORM=MSNH&v=1&q=christian+web+host&cp=1252

<form style="margin:0px;padding:0px;" onsubmit="return CheckMT(this.q, this.q.value, this)" action="results.aspx" method="GET" name="STWF" >
<table cellpadding="0" cellspacing="0" border="0">
<tr><td valign="top" colspan="2"><label for="q" accesskey="S"></label></td></tr>
<tr><td><nobr><input type="text" name="q" maxlength="150" VCARD_NAME="SearchText" class="qform" tabindex="1" id="q" size="50" value="christian web host"></input> <input type="submit" id="submitbutton" value="Search" tabindex="2"></input></nobr></td></tr>
</table><br/>
<input type="hidden" name="FORM" value="SMCRT"/>
</form>

[EMAIL PROTECTED]
Position Research, Inc.
Search engine results by research
tel: (760) 480-8291
fax: (760) 480-8271
www.PositionResearch.com