Backport FetcherJob should run more reduce tasks than default
-------------------------------------------------------------
Key: NUTCH-1033
URL: https://issues.apache.org/jira/browse/NUTCH-1033
Project: Nutch
Issue Type: Improvement
Components: fetcher
Affects Versions: 1.3, 1.4
Reporter: Markus Jelsma
Fix For: 1.4
Attachments: NUTCH-1033-1.4-1.patch
Andrzej wrote: "FetcherJob now performs fetching in the reduce phase. This means
that in a typical Hadoop setup there will be many fewer reduce tasks than map
tasks, and consequently the max. total throughput of Fetcher will be
proportionally reduced. I propose that FetcherJob should set the number of
reduce tasks to the number of map tasks. This way the fetching will be more
granular."
This issue covers the backport of NUTCH-884 to Nutch 1.4-dev.
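
The proposal quoted above can be illustrated with a small, self-contained sketch. It is not the attached patch and it does not use the Hadoop or Nutch APIs; the class name FetchReduceSketch, both helper methods, and the numbers in main are hypothetical. It assumes every reduce task fetches at the same rate and that all reduce tasks run concurrently, so the maximum total throughput scales linearly with the number of reduce tasks.

```java
public class FetchReduceSketch {

    // Hypothetical helper: pick the reduce-task count proposed in the issue,
    // i.e. one reduce task per map task (never fewer than one).
    static int reduceTasksFor(int numMapTasks) {
        return Math.max(1, numMapTasks);
    }

    // Hypothetical throughput model: assumes each reduce task fetches at the
    // same rate and that all reduce tasks run at the same time.
    static long maxThroughput(int reduceTasks, long pagesPerSecondPerTask) {
        return (long) reduceTasks * pagesPerSecondPerTask;
    }

    public static void main(String[] args) {
        int mapTasks = 20;            // assumed number of map tasks
        int defaultReduceTasks = 2;   // assumed cluster default
        long ratePerTask = 50;        // assumed pages per second per task

        long before = maxThroughput(defaultReduceTasks, ratePerTask);
        long after = maxThroughput(reduceTasksFor(mapTasks), ratePerTask);

        System.out.println("default reduce tasks: " + before + " pages/s");
        System.out.println("reduce tasks = map tasks: " + after + " pages/s");
    }
}
```

In a real job the equivalent step would presumably be a single call that sets the reduce-task count on the job configuration before submission; the exact call used by the patch is not shown in this message.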