On Aug 8, 2006, at 11:07 PM, Gian Lorenzo Thione wrote:
In my understanding, mapred.tasktracker.tasks.maximum is used to
decide how many tasks should be allocated simultaneously per
tasktracker. My problem is that I would like to set this parameter
individually for each tasktracker, each one telling the job tracker
how many tasks that node can deal with simultaneously (my tasks are
extremely CPU and memory intensive), so the number would be a
function of the number of CPUs, the number of other processes
running, the amount of memory, etc.
Your understanding of the current code is correct. Currently the job
tracker assumes that the number is constant across the cluster.
Is that something that Hadoop supports? Is that something that we
could implement and contribute back? Any interest in this
functionality?
In my opinion, it is reasonable to let it vary between task trackers.
The changes needed to support it would not be extensive. If you wrote
such a patch, it would be nice to commit it back.
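
For illustration only, here is a rough sketch of how a node might
derive its own limit from its CPU count and memory, as described in
the question. Everything in it is an assumption: NodeTaskLimit,
maxTasks, and the 512 MB per-task figure are hypothetical names and
values, not part of Hadoop, and the sketch does not show how a task
tracker would report the number to the job tracker.

```java
public class NodeTaskLimit {

    /**
     * Hypothetical helper (not a Hadoop API): the number of tasks a node
     * could run at once, limited by both its CPU count and its memory.
     */
    static int maxTasks(int cpus, long availableMemoryBytes, long memoryPerTaskBytes) {
        if (cpus < 1 || memoryPerTaskBytes < 1) {
            throw new IllegalArgumentException("cpus and memoryPerTaskBytes must be positive");
        }
        // How many tasks fit in memory at the assumed per-task footprint.
        long byMemory = availableMemoryBytes / memoryPerTaskBytes;
        // Take the tighter of the two constraints.
        long limit = Math.min((long) cpus, byMemory);
        // Assumption: always allow at least one task so the node stays usable.
        return (int) Math.max(1L, limit);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // The JVM heap limit is only a stand-in here for the node's free memory.
        long memory = Runtime.getRuntime().maxMemory();
        long perTask = 512L * 1024 * 1024; // assumed footprint of one task
        System.out.println("tasks for this node: " + maxTasks(cpus, memory, perTask));
    }
}
```

A real patch would also have to account for other processes running
on the node, which this sketch ignores.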
Thanks,
Owen