My task_zoom.patch fixes the "10 sec delay before getting another
task when a task completes" bug, although that fix is a rather minor
part of the patch. Basically, the TaskTracker now updates the
JobTracker as soon as a task completes. The patch also fixes another
bug in the JobTracker that made it count all tasks rather than just
the running tasks, which could cause a delay longer than 10 secs in
some cases.
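
To illustrate the counting bug, here is a minimal sketch, not the
actual JobTracker code. It assumes the count is compared against a
per-tracker task limit; the names SlotCountSketch, TaskState,
countAll, countRunning and hasFreeSlot are all hypothetical. If
finished tasks are counted against the limit, a tracker can look full
even though it has free slots.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these types are not Hadoop's real classes.
class SlotCountSketch {
    enum TaskState { RUNNING, SUCCEEDED, FAILED }

    // Buggy variant: every task known for the tracker counts, finished or not.
    static int countAll(List<TaskState> tasks) {
        return tasks.size();
    }

    // Fixed variant: only tasks that are still running occupy a slot.
    static int countRunning(List<TaskState> tasks) {
        int running = 0;
        for (TaskState state : tasks) {
            if (state == TaskState.RUNNING) {
                running++;
            }
        }
        return running;
    }

    // Assumption: a tracker may be given a new task while its count is below maxTasks.
    static boolean hasFreeSlot(int count, int maxTasks) {
        return count < maxTasks;
    }

    public static void main(String[] args) {
        List<TaskState> tasks = Arrays.asList(
            TaskState.SUCCEEDED, TaskState.SUCCEEDED,
            TaskState.SUCCEEDED, TaskState.RUNNING);
        int maxTasks = 4;
        System.out.println("free slot, counting all tasks:     "
            + hasFreeSlot(countAll(tasks), maxTasks));
        System.out.println("free slot, counting running tasks: "
            + hasFreeSlot(countRunning(tasks), maxTasks));
    }
}
```

With three finished tasks and one running task, the first check
reports no free slot and the second reports one.
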
ben

On May 25, 2006, at 8:57 AM, Doug Cutting wrote:

Gianlorenzo Thione wrote:
Thanks for the answer. So far I am still trying to understand how
each tasktracker gets multiple map or reduce tasks to be executed
simultaneously. I have run a simple job with 53 map tasks on 5
nodes, and at all times each node was executing a single task.
Each cluster node is a 4 core machine, so theoretically this was
a 16-node cluster and I feel that the resources were actually
underutilized. Am I missing something? Is there a parameter for a
minimum number of tasks to be executed in parallel (I found a
parameter for setting a maximum [which I set to 4])? If I run 4
TaskTrackers per node then each node gets a map task at the same
time and execution seems overall much faster.

The task tracker can currently get starved for work when tasks
complete too quickly. This is a bug that will hopefully be fixed
soon. The problem is that the task tracker only polls for a new
task once per heartbeat (10 seconds). Instead it should poll for
new tasks as soon as tasks complete. As a short-term workaround
you can decrease the heartbeat interval to one second in
MRConstants.java. With smaller clusters (< 100 machines) that
should not cause any problems.
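
One way to poll as soon as a task completes is to let the heartbeat
wait be cut short by a completion signal. The following is a minimal
sketch of that idea, not the actual TaskTracker code; the class
HeartbeatSketch and its methods are hypothetical, and the use of
wait/notifyAll is an assumption about how it could be done.

```java
// Hypothetical sketch; not the actual TaskTracker code.
class HeartbeatSketch {
    private final Object lock = new Object();
    private boolean taskFinished = false;

    // Called by a task thread when its task completes.
    void taskCompleted() {
        synchronized (lock) {
            taskFinished = true;
            lock.notifyAll();
        }
    }

    // Blocks until the heartbeat interval elapses or a task completes,
    // whichever comes first. Returns true if a completion ended the wait.
    boolean awaitNextPoll(long heartbeatMillis) {
        long deadline = System.nanoTime() + heartbeatMillis * 1_000_000L;
        synchronized (lock) {
            while (!taskFinished) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false; // regular heartbeat is due
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            taskFinished = false;
            return true; // a task finished, so poll for new work now
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatSketch tracker = new HeartbeatSketch();
        // Simulate a short task that finishes well before the 10 second heartbeat.
        Thread task = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            tracker.taskCompleted();
        });
        task.start();
        long start = System.nanoTime();
        boolean early = tracker.awaitNextPoll(10_000);
        long waitedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("woken by completion: " + early
            + ", waited about " + waitedMillis + " ms");
        task.join();
    }
}
```

In this sketch the tracker still polls at least once per heartbeat
interval, but a short task no longer leaves it idle for the rest of
the interval.
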
Doug