The URL I tested is: http://www.midwestoffroad.com/
The robots.txt reads:
User-agent: *
Disallow: admin.php
Disallow: error.php
Disallow: /admin/
Disallow: /images/
Disallow: /includes/
Disallow: /themes/
Disallow: /blocks/
Disallow: /modules/
Disallow: /language/
User-agent: Baidu
Disallow: /
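For debugging, here is a quick sketch that dumps the raw file as served, with line endings made visible, to confirm exactly what the parser receives (the \r/\n substitutions are purely for display):

use LWP::Simple qw(get);

# Print the raw robots.txt with line endings shown literally,
# to confirm exactly what WWW::RobotRules is being given.
my $txt = get('http://www.midwestoffroad.com/robots.txt');
die "fetch failed\n" unless defined $txt;
$txt =~ s/\r/\\r/g;    # make carriage returns visible
$txt =~ s/\n/\\n\n/g;  # make newlines visible, keep output readable
print $txt;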
WWW::RobotRules reports that the URL is denied by robots.txt, which should not be the case: the User-agent: * record never disallows the root path /. A stripped-down test script:
use WWW::RobotRules;
use LWP::Simple qw(get);

my $rules = WWW::RobotRules->new('MOMspider/1.0');

# Fetch the live robots.txt and hand it to the parser
my $url = 'http://www.midwestoffroad.com/robots.txt';
my $robots_txt = get $url;
$rules->parse($url, $robots_txt) if defined $robots_txt;

if ($rules->allowed('http://www.midwestoffroad.com/')) {
    print "Allowed by robots.txt\n\n";
}
else {
    print "Denied by robots.txt\n\n";
}
exit();
This prints "Denied by robots.txt".
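The same check can be run without the network by feeding the listing above straight to parse() — a minimal sketch, assuming the live file matches that listing (the URL argument only tells WWW::RobotRules which host the rules apply to):

use WWW::RobotRules;

my $rules = WWW::RobotRules->new('MOMspider/1.0');

# Inline copy of the robots.txt shown above, so there is no
# network dependency.
my $robots_txt = <<'END';
User-agent: *
Disallow: admin.php
Disallow: error.php
Disallow: /admin/
Disallow: /images/
Disallow: /includes/
Disallow: /themes/
Disallow: /blocks/
Disallow: /modules/
Disallow: /language/
User-agent: Baidu
Disallow: /
END

$rules->parse('http://www.midwestoffroad.com/robots.txt', $robots_txt);

if ($rules->allowed('http://www.midwestoffroad.com/')) {
    print "Allowed by robots.txt\n";
}
else {
    print "Denied by robots.txt\n";
}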
Thanks