While working with a Perl proxy server built in-house, I found that I was
getting strange behavior from recent builds of Mozilla. A discussion of
the bug is available here:
http://bugzilla.mozilla.org/show_bug.cgi?id=92140
After digging a little deeper, it appears that
LWP::UserAgent->request and LWP::UserAgent->simple_request return invalid
HTTP headers for some web pages. Here is a simple test script:
==========================================================================
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
# Fetch the page and dump the full response, headers included.
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => 'http://www.math.grin.edu/');
my $response = $ua->request($req);
print "Response is:" . $response->as_string() . "\n";
==========================================================================
In this case, the response contains two 'Content-Type' fields with two
different values, and according to the folks at Mozilla, "the BNF
definition of Content-Type in RFC 2616, Section 14.17 does not allow multiple
values for Content-Type." I suspect that LWP is parsing the HTML, extracting
a second Content-Type declaration from one of the <meta> tags in the
document, and storing it as a header. I believe the UserAgent is responsible
because when I fetch the page with telnet, the server's response contains
only one 'Content-Type' header. The link at the top of this message has
more information on the matter. I am working on a workaround in the proxy
server, but I wonder whether this should be addressed in the libwww-perl
codebase.
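
A minimal sketch of two possible proxy-side workarounds follows. It is not
a definitive fix. The first relies on the parse_head option documented for
LWP::UserAgent, which controls whether headers are initialized from the
<head> section of HTML documents. The second uses a hypothetical helper,
keep_first_content_type (the name is made up for illustration), which
collapses duplicate Content-Type headers to the first value, on the
assumption that the server's own header comes first. The demo response is
built by hand, so no network access is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Option 1: ask LWP not to build headers from the HTML <head> section.
my $ua = LWP::UserAgent->new;
$ua->parse_head(0);

# Option 2 (hypothetical helper, not part of libwww-perl): if a response
# carries several Content-Type headers, keep only the first one.
# Assumption: the header sent by the server is the first in the list.
sub keep_first_content_type {
    my ($response) = @_;
    my @types = $response->header('Content-Type');
    $response->header('Content-Type' => $types[0]) if @types > 1;
    return $response;
}

# Hand-built response that mimics the duplicated header.
my $demo = HTTP::Response->new(200, 'OK');
$demo->push_header('Content-Type' => 'text/html');
$demo->push_header('Content-Type' => 'text/html; charset=iso-8859-1');

keep_first_content_type($demo);
my @remaining = $demo->header('Content-Type');
print scalar(@remaining), " Content-Type header(s): $remaining[0]\n";
```

Disabling parse_head avoids the extra header at the source, while the
helper can be applied to responses that have already been received.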
josh
--
Joshua Vickery
Grinnell College
14-21
Grinnell IA, 50112
[EMAIL PROTECTED]