I just found some behavior that surprised me.

LWP::Simple's perldoc says: 
 
The user agent created by this module will identify itself as 
"LWP::Simple/#.##" (where "#.##" is the libwww-perl version number) 
and will initialize its proxy defaults from the environment (by 
calling $ua->env_proxy). 
 
But it doesn't mention that the get function can end up not using that 
user agent at all, letting _trivial_http_get send its own User-Agent string. 
 
get produces this entry in the web server log:
69.109.167.40 - - [01/Dec/2004:00:41:54 -0800] "GET / HTTP/1.0" 200 
44222 "-" "lwp-trivial/1.40" 
 
getprint produces this:
69.109.167.40 - - [01/Dec/2004:00:41:59 -0800] "GET / HTTP/1.1" 200 
44222 "-" "LWP::Simple/5.79" 
 
I just went through some hair-pulling debugging because getprint was 
working where get was failing, apparently since the site's robots.txt was 
allowing one User-Agent and blocking the other. It's also 
striking that they use different HTTP versions (1.0 for get, 1.1 for getprint).
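
For anyone hitting the same thing, here is a rough sketch of one possible
workaround: go through LWP::UserAgent directly, so the User-Agent string is
whatever you set, regardless of which code path LWP::Simple would have
picked. The agent name "MyFetcher/0.1" and the fetch helper are made up for
illustration; neither comes from LWP itself.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# One explicit user agent object, so every request identifies itself the same way.
my $ua = LWP::UserAgent->new;
$ua->agent('MyFetcher/0.1');   # hypothetical agent name, pick your own
$ua->env_proxy;                # same proxy defaults LWP::Simple documents

# Hypothetical helper: returns the body on success, undef on failure,
# roughly mirroring what LWP::Simple's get returns.
sub fetch {
    my ($url) = @_;
    my $response = $ua->get($url);
    return $response->is_success ? $response->content : undef;
}
```

Calling fetch('http://www.example.com/') should then show up in the server
log with the agent string set above, assuming nothing in between rewrites
the header.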

I'd like to suggest these differences be documented. Does anyone know
why _trivial_http_get uses its own user agent and HTTP version?

Zed
