My task_zoom.patch fixes the "10 sec delay before getting another task when a task completes" bug. It is a rather minor part of the patch. Basically, the TaskTracker now updates the JobTracker as soon as a task completes. The patch also fixes another bug in the JobTracker that made it count all tasks rather than just the running tasks, which could cause a delay longer than 10 secs in some cases.
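
To illustrate the counting bug, here is a minimal sketch and not the actual JobTracker code. It assumes the count is compared against a per-tracker maximum before a new task is handed out; the class, enum and method names are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the Hadoop source.
class TaskCountSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }

    // Buggy variant: finished tasks still count against the tracker.
    static int countAll(List<State> tasks) {
        return tasks.size();
    }

    // Fixed variant: only tasks that are still running occupy a slot.
    static int countRunning(List<State> tasks) {
        int running = 0;
        for (State s : tasks) {
            if (s == State.RUNNING) {
                running++;
            }
        }
        return running;
    }

    // Assumed assignment rule: hand out work only while below the maximum.
    static boolean canAssign(int occupied, int maxTasks) {
        return occupied < maxTasks;
    }

    public static void main(String[] args) {
        // Three tasks have finished and one is still running, with a maximum of 4.
        List<State> tasks = Arrays.asList(
            State.SUCCEEDED, State.SUCCEEDED, State.SUCCEEDED, State.RUNNING);
        System.out.println(canAssign(countAll(tasks), 4));     // false: tracker looks full
        System.out.println(canAssign(countRunning(tasks), 4)); // true: three slots are free
    }
}
```

With the buggy count the tracker looks full until the finished tasks are cleared out, so it can sit idle for more than one heartbeat.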

ben

On May 25, 2006, at 8:57 AM, Doug Cutting wrote:

Gianlorenzo Thione wrote:
Thanks for the answer. So far I am still trying to understand how each tasktracker gets multiple map or reduce tasks to be executed simultaneously. I have run a simple job with 53 map tasks on 5 nodes, and at all times each node was executing a single task. Each cluster node is a 4 core machine, so theoretically this was a 16-node cluster and I feel that the resources were actually underutilized. Am I missing something? Is there a parameter for a minimum number of tasks to be executed in parallel (I found a parameter for setting a maximum [which I set to 4])? If I run 4 TaskTrackers per node then each node gets a map task at the same time and execution seems overall much faster.

The task tracker can currently get starved for work when tasks complete too quickly. This is a bug that will hopefully be fixed soon. The problem is that the task tracker only polls for a new task once per heartbeat (10 seconds). Instead it should poll for new tasks as soon as tasks complete. As a short-term workaround you can decrease the heartbeat interval to one second in MRConstants.java. With smaller clusters (< 100 machines) that should not cause any problems.
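
A rough sketch of the starvation arithmetic described above, assuming heartbeats fire at fixed multiples of the interval. The class and method names are hypothetical, and this is not TaskTracker code.

```java
// Hypothetical sketch: models only the idle time of a free task slot.
class HeartbeatSketch {
    // Milliseconds a slot stays idle after a task completes when new work
    // is requested only on the next heartbeat.
    static long idleUntilNextHeartbeat(long completionMs, long intervalMs) {
        long sinceLastHeartbeat = completionMs % intervalMs;
        return sinceLastHeartbeat == 0 ? 0 : intervalMs - sinceLastHeartbeat;
    }

    public static void main(String[] args) {
        long completionMs = 12500; // made-up completion time, for illustration
        // 10 second heartbeat: the slot waits for the heartbeat at 20000 ms.
        System.out.println(idleUntilNextHeartbeat(completionMs, 10000)); // 7500
        // 1 second heartbeat (the workaround): next heartbeat at 13000 ms.
        System.out.println(idleUntilNextHeartbeat(completionMs, 1000));  // 500
    }
}
```

Polling as soon as a task completes would bring this idle time close to zero regardless of the heartbeat interval.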

Doug


