OK, with my last change I've run the writer benchmark twice with no
(fatal) problems. But the difference between the two runs is
interesting... On the same 189-node cluster, the first time I had
30-some "ghost" task trackers (i.e., the JobTracker started, then a
TaskTracker stopped and restarted, so you get two tracker_node1100_<id>
entries with different ids, one of which never delivers a heartbeat and
never requests any tasks). The ghost trackers tricked the scheduler
into thinking that the cluster wasn't very busy, so it never scheduled
more than one task on any node. The results are:
run1 (1 task/node; 189 nodes; 1890 maps writing 1 GB of DFS data):
  time: 5405 seconds
  task failures: 0
run2 (4 tasks/node; 189 nodes; 1890 maps writing 1 GB of DFS data):
  time: 5785 seconds
  task failures: 173
That is still a lot of failures, but at least none of them cascaded
into killing the entire job.
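
To make the ghost-tracker effect concrete, here is a toy illustration
(not the real JobTracker code; the class and the exact load check are
invented): with a load-average style heuristic, the ~30 dead trackers
pull the apparent average below one running task per node, so a node
that already has one task never looks idle enough to get a second.

import java.util.ArrayList;
import java.util.List;

// Toy model only; not the actual JobTracker scheduling code.
public class GhostTrackerDemo {

  // Hypothetical tracker record: name plus current running task count.
  static class Tracker {
    final String name;
    final int runningTasks;
    Tracker(String name, int runningTasks) {
      this.name = name;
      this.runningTasks = runningTasks;
    }
  }

  public static void main(String[] args) {
    List<Tracker> trackers = new ArrayList<Tracker>();
    // 189 live trackers, each already running one task...
    for (int i = 0; i < 189; i++) {
      trackers.add(new Tracker("tracker_node" + i + "_live", 1));
    }
    // ...plus ~30 ghosts that registered once and never ran anything.
    for (int i = 0; i < 30; i++) {
      trackers.add(new Tracker("tracker_node" + i + "_ghost", 0));
    }

    // Load-average style check: only hand a tracker another task if it
    // is at or below the cluster-wide average number of running tasks.
    int total = 0;
    for (Tracker t : trackers) {
      total += t.runningTasks;
    }
    double avgLoad = (double) total / trackers.size();  // ~0.86 with ghosts

    Tracker live = trackers.get(0);
    boolean giveSecondTask = live.runningTasks <= avgLoad;
    // Prints false: the ghosts drag the average below 1, so no node ever
    // gets a second task. Without the ghosts the average is exactly 1.0
    // and the check passes.
    System.out.println("average load = " + avgLoad
        + ", assign another task to " + live.name + "? " + giveSecondTask);
  }
}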
We need to time out TaskTrackers sooner (10 minutes? 30 minutes?)
We should probably use pending tasks rather than "current load" for
determining when to give out new tasks.
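
A rough sketch of those last two points (not a patch; the class and
field names are invented and the real JobTracker bookkeeping is
organized differently):

import java.util.ArrayList;
import java.util.List;

// Rough sketch only; names are invented, not the real JobTracker classes.
public class TrackerBookkeepingSketch {

  // Hypothetical per-tracker status, as reported by heartbeats.
  static class TrackerStatus {
    String name;
    long lastHeartbeat;  // wall-clock time of the last heartbeat we saw
    int runningTasks;    // tasks currently running on this tracker
    int maxTasks;        // slots this tracker is willing to run (2-4)
  }

  // Idea 1: expire trackers that have gone silent, so ghosts stop
  // counting. 10 minutes is a straw-man value.
  static final long TRACKER_EXPIRY_MS = 10 * 60 * 1000L;

  static List<TrackerStatus> liveTrackers(List<TrackerStatus> all, long now) {
    List<TrackerStatus> live = new ArrayList<TrackerStatus>();
    for (TrackerStatus t : all) {
      if (now - t.lastHeartbeat <= TRACKER_EXPIRY_MS) {
        live.add(t);
      }
      // else: drop it, and eventually reschedule whatever it was running
    }
    return live;
  }

  // Idea 2: decide on new tasks from the job's pending-task count and the
  // tracker's own free slots, instead of a cluster-wide "current load".
  static boolean shouldAssign(TrackerStatus tracker, int pendingTasks) {
    return pendingTasks > 0 && tracker.runningTasks < tracker.maxTasks;
  }

  public static void main(String[] args) {
    TrackerStatus t = new TrackerStatus();
    t.name = "tracker_node1100_x";
    t.lastHeartbeat = System.currentTimeMillis();
    t.runningTasks = 1;
    t.maxTasks = 3;
    List<TrackerStatus> all = new ArrayList<TrackerStatus>();
    all.add(t);
    List<TrackerStatus> live = liveTrackers(all, System.currentTimeMillis());
    System.out.println("live trackers: " + live.size()
        + ", assign? " + shouldAssign(t, 1889 /* pending maps */));
  }
}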
The 2-CPU Intel boxes are getting overloaded at 4 tasks/node; I should
probably back off at least to 3 (or all the way back to the Hadoop
default of 2).
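
For reference, the knob I mean is the per-node cap the TaskTracker
reads from the site config; I believe the property is
mapred.tasktracker.tasks.maximum with a default of 2, but take the
name as from memory:

import org.apache.hadoop.conf.Configuration;

public class ShowTaskCap {
  public static void main(String[] args) {
    // Assumption: the per-node cap is "mapred.tasktracker.tasks.maximum"
    // (default 2). Backing off from 4 would mean setting it to 3 (or
    // removing the override) in hadoop-site.xml on the tracker nodes.
    Configuration conf = new Configuration();
    int maxTasks = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
    System.out.println("tasks/node = " + maxTasks);
  }
}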
-- Owen