C G wrote:
Hi All:
I have mapred.tasktracker.tasks.maximum set to 4 in our conf/hadoop-site.xml, yet I frequently see 5-6 instances of org.apache.hadoop.mapred.TaskTracker$Child running on the slave nodes. Is there another setting I need to tweak in order to dial back the number of children running? The effect of running this many children is that our boxes have extremely high load factors, and eventually mapred tasks start timing out and failing.
If mapred.tasktracker.tasks.maximum is set to 4, the tasktracker has
4 map slots and 4 reduce slots, for a total of 8 slots. So seeing 5-6
instances of org.apache.hadoop.mapred.TaskTracker$Child is expected. If
you want at most 4 instances, set mapred.tasktracker.tasks.maximum
to 2, which gives 2 map slots and 2 reduce slots.
And as far as I know there is no other config variable for tweaking the
number of children.
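For reference, here is a minimal sketch of what the entry in conf/hadoop-site.xml on each slave might look like with the value lowered to 2. It assumes the rest of your file stays as it is and that the property sits inside the usual <configuration> element:

```xml
<!-- Sketch only: limits each tasktracker to 2 map slots and 2 reduce slots,
     so at most 4 TaskTracker$Child processes per node. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>2</value>
</property>
```

I believe the tasktrackers have to be restarted before the new value takes effect.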
Note that the number of instances is for a single job. I see far more if I run multiple jobs simultaneously (something we do not typically do).
This is on Hadoop 0.15.0; upgrading is not an option at the moment.
Any help appreciated...
Thanks,
C G
Thanks
Amareshwari