Using Nutch 1.6, I am having a problem with the processing of
fetcher.max.crawl.delay.
The description for this property states that if the Crawl-Delay in
robots.txt is set to greater than this value (in seconds) then the fetcher
will skip this page, generating an error report. If set to -1, the fetcher
will never skip such pages and will wait the amount of time retrieved from
the robots.txt Crawl-Delay, however long that might be.
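For context, this property is normally overridden in conf/nutch-site.xml; a minimal sketch, with an illustrative value of 30 seconds:

```xml
<!-- conf/nutch-site.xml: cap on tolerated robots.txt Crawl-Delay, in seconds.
     The value 30 here is illustrative; -1 disables the cap entirely. -->
<configuration>
  <property>
    <name>fetcher.max.crawl.delay</name>
    <value>30</value>
  </property>
</configuration>
```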
Thanks Iain for raising this. I will look into it. Can you kindly share the
URLs for which you see this behavior? I can run a crawl with those and try
at my end.
On Sat, Apr 27, 2013 at 1:13 PM, Iain Lopata ilopa...@hotmail.com wrote:
> Using Nutch 1.6, I am having a problem with the processing of
> fetcher.max.crawl.delay. [...]
Hi,
@Tejas, you will remember the work undertaken on NUTCH-1284 (the patch for
which you submitted included the fix for NUTCH-1042) relates to this.
I am not sure if the situations are identical, but they are closely linked
by the looks of it.
@Iain, can you look at the commentary and provide
Lewis -- Looks like a duplicate of NUTCH-1284. Sorry for not catching that
before posting.
-Original Message-
From: Lewis John Mcgibbney [mailto:lewis.mcgibb...@gmail.com]
Sent: Saturday, April 27, 2013 3:30 PM
To: user@nutch.apache.org
Subject: Re: Nutch 1.6 Processing of fetcher.max.crawl.delay