Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Iain Lopata
Using Nutch 1.6, I am having a problem with the processing of fetcher.max.crawl.delay. The description for this property states that if the Crawl-Delay in robots.txt is set to greater than this value (in seconds) then the fetcher will skip this page, generating an error report. If set to -1
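
For readers following the thread, the rule quoted from the property description can be sketched roughly as below. This is only an illustration, not Nutch's fetcher code: the class CrawlDelayCheck and the method shouldSkip are hypothetical names, and the handling of -1 (never skip, whatever Crawl-Delay robots.txt gives) is an assumption, since the description quoted above is cut off at that point.

```java
// Illustrative sketch only; not taken from the Nutch source.
class CrawlDelayCheck {

    /**
     * Hypothetical helper: decides whether a page should be skipped.
     *
     * @param robotsDelaySeconds Crawl-Delay read from robots.txt, in seconds
     * @param maxCrawlDelaySeconds value of fetcher.max.crawl.delay, in seconds
     */
    static boolean shouldSkip(long robotsDelaySeconds, long maxCrawlDelaySeconds) {
        // Assumption: a negative setting (-1) disables the check entirely.
        if (maxCrawlDelaySeconds < 0) {
            return false;
        }
        // Skip only when robots.txt asks for a longer delay than allowed.
        return robotsDelaySeconds > maxCrawlDelaySeconds;
    }

    public static void main(String[] args) {
        System.out.println(shouldSkip(60, 30));   // delay exceeds the maximum
        System.out.println(shouldSkip(10, 30));   // delay within the maximum
        System.out.println(shouldSkip(3600, -1)); // check disabled (assumed)
    }
}
```

Under this reading, a Crawl-Delay equal to the configured maximum does not cause a skip; only a strictly greater value does.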

Re: Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Tejas Patil
Thanks Iain for raising this. I will look into it. Can you kindly share URLs for which you see this behavior? I can run a crawl with those and try at my end. On Sat, Apr 27, 2013 at 1:13 PM, Iain Lopata ilopa...@hotmail.com wrote: Using Nutch 1.6, I am having a problem with the processing of

Re: Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Lewis John Mcgibbney
Hi, @Tejas, you will remember the work undertaken on NUTCH-1284 (the patch for which you submitted included the fix for NUTCH-1042) relates to this. I am not sure if the situations are identical, but they are closely linked by the looks of it. @Iain, can you look at the commentary and provide

RE: Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Iain Lopata
Lewis -- Looks like a duplicate of NUTCH-1284. Sorry for not catching that before posting. -Original Message- From: Lewis John Mcgibbney [mailto:lewis.mcgibb...@gmail.com] Sent: Saturday, April 27, 2013 3:30 PM To: user@nutch.apache.org Subject: Re: Nutch 1.6 Processing

Re: Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Tejas Patil
: Lewis John Mcgibbney [mailto:lewis.mcgibb...@gmail.com] Sent: Saturday, April 27, 2013 3:30 PM To: user@nutch.apache.org Subject: Re: Nutch 1.6 Processing of fetcher.max.crawl.delay Hi, @Tejas, you will remember the work undertaken on NUTCH-1284 (the patch for which you submitted included

Re: Nutch 1.6 Processing of fetcher.max.crawl.delay

2013-04-27 Thread Lewis John Mcgibbney
. -Original Message- From: Lewis John Mcgibbney [mailto:lewis.mcgibb...@gmail.com] Sent: Saturday, April 27, 2013 3:30 PM To: user@nutch.apache.org Subject: Re: Nutch 1.6 Processing of fetcher.max.crawl.delay Hi, @Tejas, you will remember the work undertaken on NUTCH-1284 (the patch