Hello Suraj,

That should be no problem. Duplicates are grouped by their signature, so you
can have as many reducers as you would like.
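
The reducer count is an ordinary Hadoop property, so passing
-Dmapreduce.job.reduces=32 ahead of the crawldb path, as in your command, is
all that is needed. For illustration only, here is a minimal sketch (not the
actual Nutch code) of the default HashPartitioner arithmetic that makes this
safe; the signature values below are made up:

// Sketch of Hadoop's default HashPartitioner logic:
// partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
// The partition depends only on the key (the signature), so every record
// with the same signature reaches the same reducer, whatever the count.
public class SignaturePartitionDemo {

    static int partition(String signature, int numReduceTasks) {
        // same arithmetic as Hadoop's HashPartitioner.getPartition()
        return (signature.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // made-up MD5-style signatures; sigA and sigB are duplicates
        String sigA = "d41d8cd98f00b204e9800998ecf8427e";
        String sigB = "d41d8cd98f00b204e9800998ecf8427e";
        String sigC = "9e107d9d372bb6826bd81d3542a419d6";

        for (int reducers : new int[] {1, 32}) {
            System.out.printf("reducers=%d: sigA->%d sigB->%d sigC->%d%n",
                reducers,
                partition(sigA, reducers),
                partition(sigB, reducers),
                partition(sigC, reducers));
        }
        // sigA and sigB always hash to the same partition, so the reducer
        // holding them sees the whole duplicate group together.
    }
}

Whether you run with 1 reducer or 32, the two identical signatures land in
the same partition, so a duplicate group is never split across reducers.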

Regards,
Markus
 
 
-----Original message-----
> From: Suraj Singh <ssi...@olbico.nl>
> Sent: Wednesday 20th February 2019 12:56
> To: user@nutch.apache.org
> Subject: Increasing the number of reducers in Deduplication
> 
> Hi All,
> 
> Can I increase the number of reducers in Deduplication on the crawldb?
> Currently it is running with 1 reducer.
> Will it impact the crawling in any way?
> 
> Current command in crawl script:
> __bin_nutch dedup "$CRAWL_PATH"/crawldb
> 
> Can I update it to:
> __bin_nutch dedup -Dmapreduce.job.reduces=32 "$CRAWL_PATH"/crawldb
> 
> Thanks in advance.
> 
> Regards,
> Suraj Singh
> 
