Thanks Markus.

Regards,
Suraj Singh

-----Original Message-----
From: Markus Jelsma <markus.jel...@openindex.io> 
Sent: Wednesday, 20 February 2019 13:04
To: user@nutch.apache.org
Subject: RE: Increasing the number of reducer in Deduplication

Hello Suraj,

That should be no problem. Duplicates are grouped by their signature, so
you can have as many reducers as you like.
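One way this could look on the command line (a sketch, not tested here: it assumes the dedup job is launched through Hadoop's ToolRunner, so a -D property placed before the job arguments is picked up as a generic Hadoop option; the value 32 is just an example):

```shell
# Sketch: set the reducer count for the dedup job via a generic
# Hadoop option. Assumes Nutch runs the job through ToolRunner,
# which parses -D key=value pairs before the job arguments.
bin/nutch dedup -Dmapreduce.job.reduces=32 "$CRAWL_PATH"/crawldb
```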

Regards,
Markus
 
 
-----Original message-----
> From:Suraj Singh <ssi...@olbico.nl>
> Sent: Wednesday 20th February 2019 12:56
> To: user@nutch.apache.org
> Subject: Increasing the number of reducer in Deduplication
> 
> Hi All,
> 
> Can I increase the number of reducers in Deduplication on crawldb? Currently 
> it is running with 1 reducer.
> Will it impact the crawling in any way?
> 
> Current command in crawl script:
> __bin_nutch dedup "$CRAWL_PATH"/crawldb
> 
> Can I update it to:
> __bin_nutch dedup "$CRAWL_PATH"/crawldb mapreduce.job.reduces=32
> 
> Thanks in advance.
> 
> Regards,
> Suraj Singh
> 