RE: Two Nutch parallel crawl with two conf folder.

2010-03-09 Thread Pravin Karne
Hi MilleBii, thanks for your valuable input. Per our requirements, we need to run multiple Nutch instances, each pointing to its own conf dir and crawlDB. crawl-urlfilter.txt is different in each conf folder. But in our case both Nutch instances are picking up the same conf dir instead

Re: Two Nutch parallel crawl with two conf folder.

2010-03-09 Thread Gora Mohanty
On Tue, 9 Mar 2010 14:36:33 +0100 MilleBii mille...@gmail.com wrote: Never tried... Also you may want to check the $NUTCH_HOME variable, which should be different for each instance; otherwise it will only use one of the two conf dirs. [...] Had meant to reply to the original poster, but had

Re: Two Nutch parallel crawl with two conf folder.

2010-03-09 Thread MilleBii
Never tried... Also you may want to check the $NUTCH_HOME variable, which should be different for each instance; otherwise it will only use one of the two conf dirs. 2010/3/9, Pravin Karne pravin_ka...@persistent.co.in: Hi Millebii, Thanks for your valuable inputs. As per our requirements we need
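
The suggestion above (a separate $NUTCH_HOME per instance) could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the install paths, the helper names and the crawl arguments are hypothetical, and it assumes the bin/nutch launcher reads NUTCH_HOME and NUTCH_CONF_DIR from its environment. Some versions of the script recompute NUTCH_HOME from their own location, so using two separate installs, as here, is the safer reading of the advice.

```python
import os
import subprocess


def nutch_env(nutch_home, conf_dir, base_env=None):
    """Return a copy of the environment with per-instance Nutch settings.

    Assumption: bin/nutch honours NUTCH_HOME and NUTCH_CONF_DIR.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NUTCH_HOME"] = nutch_home
    env["NUTCH_CONF_DIR"] = conf_dir
    return env


def crawl_command(nutch_home, urls_dir, crawl_dir, depth=3):
    """Build a one-shot crawl command line for one instance."""
    launcher = os.path.join(nutch_home, "bin", "nutch")
    return [launcher, "crawl", urls_dir, "-dir", crawl_dir, "-depth", str(depth)]


def start_crawl(nutch_home, conf_dir, urls_dir, crawl_dir, depth=3):
    """Start one crawl in the background with its own conf dir and crawl dir."""
    return subprocess.Popen(
        crawl_command(nutch_home, urls_dir, crawl_dir, depth),
        env=nutch_env(nutch_home, conf_dir),
    )


# Hypothetical usage: two installs, two conf dirs, two crawl dirs.
# a = start_crawl("/opt/nutch-a", "/opt/nutch-a/conf", "urls-a", "crawl-a")
# b = start_crawl("/opt/nutch-b", "/opt/nutch-b/conf", "urls-b", "crawl-b")
# a.wait(); b.wait()
```

Each process gets its own copy of the environment, so the two crawls should not share a conf dir or a crawlDB as long as the crawl directories also differ.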

Re: Two Nutch parallel crawl with two conf folder.

2010-03-09 Thread eks dev
coool answer - Original Message From: MilleBii mille...@gmail.com To: nutch-user@lucene.apache.org Sent: Tue, 9 March, 2010 8:35:42 Subject: Re: Two Nutch parallel crawl with two conf folder. Yes it should work, I personally run some test crawls on the same hardware, even on

Re: Two Nutch parallel crawl with two conf folder.

2010-03-09 Thread eks dev
Sorry for the noise... I've mixed up emails - Original Message From: eks dev eks...@yahoo.co.uk To: nutch-user@lucene.apache.org Sent: Tue, 9 March, 2010 18:07:47 Subject: Re: Two Nutch parallel crawl with two conf folder. coool answer - Original Message From:

Abt: Detect slow and timeout servers and drop their URLs

2010-03-09 Thread Yves Petinot
I was wondering if the current release of Nutch provides any support for slow servers? The issue has been previously described in the following JIRA entry:

Re: Abt: Detect slow and timeout servers and drop their URLs

2010-03-09 Thread Julien Nioche
Bonjour Yves, Did you see https://issues.apache.org/jira/browse/NUTCH-770? It has been committed to the trunk back in December. HTH Julien -- DigitalPebble Ltd http://www.digitalpebble.com On 9 March 2010 17:26, Yves Petinot y...@snooth.com wrote: I was wondering if the current release of