i discovered that some of my code was throwing out-of-bounds
exceptions. after i cleaned up that code, the map tasks seemed to work.
that confuses me -- i'm pretty sure hadoop is resilient to a few map
tasks failing (5 out of 13k). before this fix, the remaining 2% of my
tasks were getting killed.
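One possible explanation, sketched as a toy model rather than actual Hadoop code: a failed map attempt is retried, which covers transient problems, but a bug triggered by a specific record fails on every retry. Assuming the usual defaults of 4 attempts per map (mapred.map.max.attempts) and 0% tolerated map failures (mapred.max.map.failures.percent), a single task that exhausts its attempts is enough to fail the job, and the tasks still running would then be killed. The function names and the three sample tasks below are hypothetical, and the defaults should be checked against the docs for your Hadoop version.

```python
# Toy model of map-task retries. This is NOT Hadoop code; the default
# values are assumptions taken from mapred.map.max.attempts (4) and
# mapred.max.map.failures.percent (0).

def run_task(task, max_attempts=4):
    """Retry a task up to max_attempts times; True if any attempt succeeds."""
    for attempt in range(max_attempts):
        try:
            task(attempt)
            return True
        except Exception:
            continue  # the framework would schedule another attempt
    return False


def run_job(tasks, max_attempts=4, max_failures_percent=0):
    """True if the share of permanently failed tasks is within the limit."""
    if not tasks:
        return True
    failed = sum(1 for task in tasks if not run_task(task, max_attempts))
    return 100.0 * failed / len(tasks) <= max_failures_percent


def healthy(attempt):
    """A task that always succeeds."""
    return None


def flaky(attempt):
    """Transient problem: only the first attempt fails."""
    if attempt == 0:
        raise IOError("simulated lost node")


def buggy(attempt):
    """Deterministic bug: the same bad record breaks every attempt."""
    return [1, 2, 3][5]  # raises IndexError every time


if __name__ == "__main__":
    # A transient failure is absorbed by the retries.
    print(run_job([healthy] * 10 + [flaky]))
    # 5 deterministic failures out of 13k sink the job at 0% tolerance.
    print(run_job([healthy] * 12995 + [buggy] * 5))
    # Raising the tolerated percentage lets the job finish anyway.
    print(run_job([healthy] * 12995 + [buggy] * 5, max_failures_percent=1))
```

In this model the resilience comes from retries, not from ignoring failed tasks, which would be consistent with the job only completing once the out-of-bounds bug was fixed.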
On Jul 1, 2008, at 10:06 PM, Amar Kamat wrote:
Mori Bellamy wrote:
hey all,
i've got a mapreduce task that works on small (~1G) input. when i
try to run the same task on large (~100G) input, i get the
following error when the map tasks are almost done (~98%):
2008-07-01 13:10:59,231 INFO org.apache.hadoop.mapred.ReduceTask:
task_200807011005_0005_r_000000_0: Got 0 new map-outputs & 0
obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
2008-07-01 13:10:59,232 INFO org.apache.hadoop.mapred.ReduceTask:
task_200807011005_0005_r_000000_0 Got 0 known map output
location(s); scheduling...
2008-07-01 13:10:59,232 INFO org.apache.hadoop.mapred.ReduceTask:
task_200807011005_0005_r_000000_0 Scheduled 0 of 0 known outputs (0
slow hosts and 0 dup hosts)
2008-07-01 13:10:59,232 INFO org.apache.hadoop.mapred.ReduceTask:
task_200807011005_0005_r_000000_0 Need 1 map output(s)
...
...
These are not error messages. The reducers are stuck because not all maps
have completed. Mori, could you let us know what is happening to the
other 2% of the maps? Are they getting executed? Are they still pending
(waiting to run)? Were they killed or failed? Is there any lost tracker?
I'm running the task on a cluster of 5 workers, one DFS master, and
one task tracker.
What do you mean by 5 workers and 1 task tracker?
i'm chaining mapreduce tasks, so i'm using SequenceFileOutput and
SequenceFileInput. this error happens before the first link in the
chain successfully reduces.
Can you elaborate on this a bit? Are you chaining MR jobs?
Amar
does anyone have any insight? thanks!