I have a program that fetches robots.txt followed by some web pages,
and it's failing intermittently for this URL:
   http://pinti.geol.u-psud.fr:80/robots.txt

If I run a test program (see below) several times, I see one of two 
results, either:

LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://pinti.geol.u-psud.fr:80/robots.txt
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Internal Server Error
500 Can't connect to pinti.geol.u-psud.fr:80 (Timeout)
robots.txt from http://pinti.geol.u-psud.fr:80/robots.txt:

OR:

LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://pinti.geol.u-psud.fr:80/robots.txt
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()

In the second case, the LWP::UserAgent::request call never returns, and 
no error message is issued.
I put some debug prints in the http.pm module, and it looks like the 
following syswrite call (at about line 203 in http.pm) never returns:

    {
        my $n = $socket->syswrite($req_buf, length($req_buf));
        die $! unless defined($n);
        die "short write" unless $n == length($req_buf);
        #LWP::Debug::conns($req_buf);
    }

Is there a known bug with syswrite in Perl 5.6.1?
Is there any way to trap this so my program can recover?

Here's the test program:

use strict;
use HTTP::Request;
use LWP::Debug qw(+trace +debug +conns);
use LWP::UserAgent;

my $agent = LWP::UserAgent->new;
my $url = 'http://pinti.geol.u-psud.fr:80/robots.txt';
my $request = HTTP::Request->new('GET', $url);

# This is the call that sometimes never returns.
my $resp = $agent->request($request);
if ($resp->is_error) {
        my $code = $resp->code;
        my $msg = $resp->message;
        print "$code $msg\n";
}
print "robots.txt from $url:\n";
print $resp->content;
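
On the question of trapping the hang: one possible approach is to wrap the
request in an alarm()-based timeout. This is only a minimal sketch. It
assumes that SIGALRM actually interrupts the blocked syswrite on this
platform, which has not been verified against this server. with_timeout is
a hypothetical helper name, not part of LWP.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): run $code, giving up after $seconds.
# Returns (1, @results) on success, or (0, $error) if $code died or timed out.
sub with_timeout {
    my ($seconds, $code) = @_;
    my @result;
    my $ok = eval {
        # The handler dies so that control leaves the blocked call
        # and lands in the enclosing eval.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        @result = $code->();
        alarm(0);
        1;
    };
    my $err = $@;
    alarm(0);    # cancel any alarm still pending if $code died on its own
    return $ok ? (1, @result) : (0, $err || "unknown error\n");
}

# Intended use with the test program above (left commented out so that
# this sketch runs without network access):
#
#   my ($ok, $resp) = with_timeout(30, sub { $agent->request($request) });
#   print "request aborted: $resp" unless $ok;
```

On failure the second return value is the error text ("timeout\n" when the
alarm fired), so the caller can log it and move on to the next URL.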


Thanks,

  Bob Fillmore

