Hi,

There seems to be a bug in the way that WWW::RobotRules handles
default ports in URLs.  For example, if I use
<http://www.htmlhelp.com:80/robots.txt> as the robot_txt_uri,
WWW::RobotRules 1.21 will not disallow access to
<http://www.htmlhelp.com/award/>, but it will disallow access to
<http://www.htmlhelp.com:80/award/>, even though the two URLs refer
to the same resource.
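
Here is a minimal script that shows the problem for me (the agent
name and the robots.txt content are just placeholders, not the real
ones from that site):

#!/usr/bin/perl
use strict;
use warnings;
use WWW::RobotRules;

# "TestBot" and the robots.txt text below are made up for the test.
my $rules = WWW::RobotRules->new('TestBot/1.0');
my $robots_txt = "User-agent: *\nDisallow: /award/\n";

# Parse the rules as if fetched from a URL with an explicit default port.
$rules->parse('http://www.htmlhelp.com:80/robots.txt', $robots_txt);

# Both URLs should be disallowed; with 1.21 only the second one is.
for my $url ('http://www.htmlhelp.com/award/',
             'http://www.htmlhelp.com:80/award/') {
    my $verdict = $rules->allowed($url) ? "allowed" : "disallowed";
    print "$verdict: $url\n";
}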

I fixed this in my local copy of WWW::RobotRules with the following
diff:

[root@server WWW]# diff RobotRules.pm.orig RobotRules.pm
86c86
<     my $netloc = $robot_txt_uri->authority;
---
>     my $netloc = $robot_txt_uri->host . ":" . $robot_txt_uri->port;
188c188
<     my $netloc = $uri->authority;
---
>     my $netloc = $uri->host . ":" . $uri->port;
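
As far as I can tell, this helps because URI's authority() only
includes the port when it is spelled out in the URL, while port()
falls back to the scheme's default, so host . ":" . port gives the
same key for both forms:

use URI;

my $explicit = URI->new('http://www.htmlhelp.com:80/award/');
my $implicit = URI->new('http://www.htmlhelp.com/award/');

# authority() reports the port only when it is written in the URL:
print $explicit->authority, "\n";                     # www.htmlhelp.com:80
print $implicit->authority, "\n";                     # www.htmlhelp.com

# port() falls back to the scheme default, so host:port matches:
print $explicit->host . ":" . $explicit->port, "\n";  # www.htmlhelp.com:80
print $implicit->host . ":" . $implicit->port, "\n";  # www.htmlhelp.com:80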

-- 
Liam Quinn
