Hi Rainer,

it appears that nutch doesn't obey the "Crawl-Delay:" robots.txt
statement. Our robots.txt defines a crawl-delay of 30, and most robots
seem to obey it, unlike this nutch from tonight:

[snip]

Do current versions of nutch support crawl-delay, or could you add this to future
versions?

I believe you are correct that Nutch currently doesn't honor the "Crawl-Delay" extension to the robots.txt standard.

I expect that something will be contributed in the next day or so that will add this support.
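
For reference, here is a rough sketch of what honoring the directive involves. It is not Nutch code: it uses Python's standard urllib.robotparser, and the helper name crawl_delay_for, the sample robots.txt, and the 1-second fallback are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt body (hypothetical), using the 30-second delay from the report.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 30
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str, default: float = 1.0) -> float:
    """Return the Crawl-delay (in seconds) that applies to user_agent.

    Falls back to `default` (an assumed politeness value) when the
    robots.txt has no Crawl-delay line for this agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None if no directive applies
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    # A polite fetcher would wait this many seconds between requests to the host.
    print(crawl_delay_for(ROBOTS_TXT, "Nutch"))
```

A crawler would then sleep for at least that many seconds between successive requests to the same host.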

Thanks,

-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"