The mapred.map.tasks and mapred.reduce.tasks properties define the approximate number of tasks per job; the actual number also depends heavily on the amount of data being processed. The mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum properties define the maximum number of map and reduce tasks that can run concurrently on a single tasktracker.
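
For example, to cap each tasktracker at two concurrent tasks of each type, you could put something like the following in hadoop-site.xml (the value of 2 is just illustrative, not a recommendation; pick a value that fits your hardware). Note that mapred.map.tasks is only a hint to the framework, while the tasktracker maximums are hard limits per node:

  <!-- hadoop-site.xml: per-tasktracker concurrency limits (example values) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
    <description>Max map tasks run simultaneously by one tasktracker.</description>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
    <description>Max reduce tasks run simultaneously by one tasktracker.</description>
  </property>

The tasktrackers must be restarted for these settings to take effect.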

When you say 20 jobs, I am assuming you mean tasks. Also, what type of hardware are you running this on, what are your memory settings, and are you running in local or DFS mode?

Dennis

Alexander Aristov wrote:
Hi all

Can someone suggest how I can restrict the number of jobs Nutch launches in
Hadoop when it starts the segment merger?

When I run the generate, fetch, and updatedb tasks, Nutch starts about 6-10 MapReduce
jobs (cluster of 2 datanodes); the actual value varies from task to task. But
when the script starts merging segments, it launches about 20 jobs and the servers
get overloaded and crash. The Nutch settings are mostly the defaults.

How can I control the number of jobs?

Best regards
Alexander

