Using Nutch 1.6, I am having a problem with the processing of fetcher.max.crawl.delay.
The description for this property states: "If the Crawl-Delay in robots.txt is set to greater than this value (in seconds) then the fetcher will skip this page, generating an error report. If set to -1 the fetcher will never skip such pages and will wait the amount of time retrieved from robots.txt Crawl-Delay, however long that might be."

I have found that the processing is not as stated when the value is set to -1. If I set fetcher.max.crawl.delay to -1, any URL on a site that has Crawl-Delay specified in the applicable section of robots.txt is rejected with robots_denied(18).

I am not a Java developer and I am completely new to Nutch, but this looks like either a documentation error for the property or a problem with the logic in Fetcher.java at line 682. I can work around it by setting the property to some high value, but perhaps this is something someone would like to look at. I am happy to open a Jira issue if someone can confirm my assessment, or if that is the right way to get this investigated.
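Since I am not a Java developer I may well be misreading the source, but to make the behaviour I am seeing concrete, here is a minimal standalone sketch of the comparison I suspect is being made around that line. The variable names and the helper methods are mine, not Nutch's; this is only an illustration of how a value of -1 for fetcher.max.crawl.delay could end up rejecting every page with a robots.txt Crawl-Delay, rather than the actual Nutch code.

```java
// Standalone sketch (NOT actual Nutch source) of the suspected check.
// Assumption: fetcher.max.crawl.delay is converted to milliseconds, so a
// configured value of -1 becomes a negative number of milliseconds.
public class CrawlDelayCheckSketch {

    // Suspected current behaviour: an unguarded comparison means any positive
    // Crawl-Delay exceeds a negative maxCrawlDelayMs, so the page is skipped.
    static boolean skippedBySuspectedLogic(long maxCrawlDelayMs, long robotsCrawlDelayMs) {
        return robotsCrawlDelayMs > maxCrawlDelayMs;
    }

    // Behaviour the property description implies: -1 means "never skip".
    static boolean skippedByDocumentedLogic(long maxCrawlDelayMs, long robotsCrawlDelayMs) {
        return maxCrawlDelayMs >= 0 && robotsCrawlDelayMs > maxCrawlDelayMs;
    }

    public static void main(String[] args) {
        long maxCrawlDelayMs = -1L * 1000;   // fetcher.max.crawl.delay = -1, in ms
        long robotsCrawlDelayMs = 10_000;    // robots.txt Crawl-Delay: 10

        System.out.println("suspected logic skips page:  "
                + skippedBySuspectedLogic(maxCrawlDelayMs, robotsCrawlDelayMs));   // true
        System.out.println("documented logic skips page: "
                + skippedByDocumentedLogic(maxCrawlDelayMs, robotsCrawlDelayMs));  // false
    }
}
```

If the real code does something along those lines, guarding the comparison with a check for a non-negative maximum (as in the second method) would match the documented behaviour.

Thanks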

