--- Begin Message ---
Apologies if this is not an appropriate place to report issues
with libwww - in which case, if you could let me know a better
address I'd be very grateful.
I've noticed at least one case where $response->base does not
match what would be set by a normal web browser.
For the URL http://www.stateline.org/stateline/ the HTTP headers
returned are:
HTTP/1.1 200 OK
Date: Tue, 20 Jan 2004 16:28:28 GMT
Server: Orion/1.5.2
Content-Location: http://www.stateline.org:9090/jsp/staticSite/index2.jsp
Set-Cookie: JSESSIONID=KPDJDBGMOFOL; Domain=.stateline.org; Path=/
Cache-Control: private
Connection: Close
Content-Type: text/html
Transfer-Encoding: chunked
From this $response->base is set to
http://www.stateline.org:9090/jsp/staticSite/index2.jsp
which means any relative URIs start with
http://www.stateline.org:9090/
Unfortunately the server is not listening on port 9090 (or, more
likely, that port is firewalled), so attempts to download any of
the links fail.
Normal web browsers do not set port 9090 in the base so can
access links and content without problem.
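To make the difference concrete, here is a minimal sketch of how one relative link resolves against each of the two candidate bases. The link '/images/logo.gif' and the helper name resolve() are made up for illustration; only the two base URLs come from the report above.

```perl
use strict;
use warnings;
use URI;

# Hypothetical relative link, standing in for one found in the page.
my $relative = '/images/logo.gif';

# Base taken from the Content-Location header (what $response->base returned).
my $header_base = 'http://www.stateline.org:9090/jsp/staticSite/index2.jsp';

# Base taken from the URL that was actually requested.
my $request_base = 'http://www.stateline.org/stateline/';

# Illustrative helper: resolve a link against a base and return a string.
sub resolve {
    my ($link, $base) = @_;
    return URI->new_abs($link, $base)->as_string;
}

# Prints http://www.stateline.org:9090/images/logo.gif
print "Content-Location base: ", resolve($relative, $header_base), "\n";
# Prints http://www.stateline.org/images/logo.gif
print "Request URL base:      ", resolve($relative, $request_base), "\n";
```

Only the first result carries the unreachable port, which matches the failures described above.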
Trivial testlink script, run with
testlink http://www.stateline.org/stateline/
Thanks
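#!/usr/pkg/bin/perl -wT
use strict;
use LWP::UserAgent;
use URI;

# Fetch the URL given on the command line, identifying as a browser.
my $browser  = LWP::UserAgent->new(agent => 'Mozilla/5.0');
my $response = $browser->get($ARGV[0]);
if ($response->is_success && $response->content_type eq 'text/html')
{
    # base() is where the unexpected :9090 URL shows up.
    my $base = $response->base;
    my $data = $response->content;
    print "Base: $base\n";
    # Consume the page one src="..." or <link ... href="..."> at a time.
    # The \b applies to "src" only; placed before "<link" it would never
    # match after whitespace or ">".
    while ($data =~ s/.*?(\bsrc|<link\b[^<>]*\s+href)\s*=\s*"([^"]+)"//is)
    {
        # Resolve each link against the base.
        my $link = URI->new_abs($2, $base);
        print "Link: $link\n";
    }
}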
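
One possible workaround, sketched here as an assumption rather than a proposed fix to libwww, is to resolve links against the URL of the request that produced the response instead of calling $response->base. The helper name request_base() and the link 'style.css' are hypothetical. The response is built by hand from the headers shown earlier, so the sketch runs without contacting the server.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;
use URI;

# Hypothetical helper: use the URL of the request that produced the
# response as the base, ignoring Content-Location entirely.
# Returns undef if no request is attached to the response.
sub request_base {
    my ($response) = @_;
    my $request = $response->request or return;
    return $request->uri->clone;
}

# Rebuild the reported situation offline from the headers above.
my $response = HTTP::Response->new(200, 'OK', [
    'Content-Location' => 'http://www.stateline.org:9090/jsp/staticSite/index2.jsp',
    'Content-Type'     => 'text/html',
]);
$response->request(HTTP::Request->new(GET => 'http://www.stateline.org/stateline/'));

my $base = request_base($response);
print "Base: $base\n";

# 'style.css' is a made-up relative link for illustration.
print "Link: ", URI->new_abs('style.css', $base), "\n";
```

In the testlink script this would mean replacing the $response->base call with request_base($response). The trade-off is that a page relying on a correct Content-Location header would then resolve its links differently.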
--
David Brownlee -- [EMAIL PROTECTED]
--- End Message ---