Hi, I'm using hadoop-0.20.2-cdh3u2. I'm not really sure what is happening, but my job stopped making progress at map 99,94% and reduce 100%. That's kind of strange.
    Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
    map    | 99,94%     | 1785      | 0       | 0       | 1785     | 0      | 5 / 7
    reduce | 100,00%    | 12        | 0       | 0       | 12       | 0      | 24 / 3

Yeah, I love w3m... but there is some relevant data in it: I have 1785 map tasks, and the number of completed tasks is equal to that. Shouldn't that read 100%? There are also some failed/killed task attempts. Did Hadoop execute them again, or is it possible that their output is missing? What can I do so that they are executed again?

During the run the job client gives output like this:

    12/02/23 12:54:41 INFO mapred.JobClient:  map 99% reduce 79%
    12/02/23 12:54:41 INFO mapred.JobClient: Task Id : attempt_201201101557_0519_r_000005_2, Status : FAILED
    java.io.IOException: Error Recovery for block blk_2222580152515928964_34420 failed because recovery from primary datanode 10.6.0.19:50010 failed 6 times. Pipeline was 10.6.0.19:50010. Aborting...
            at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2833)
            at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2305)
            at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2477)
    attempt_201201101557_0519_r_000005_2: RedStage1: block_width=16
    attempt_201201101557_0519_r_000005_2: log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
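As far as I understand from the 0.20 docs, failed attempts should be retried automatically, and a task only fails the whole job after it exceeds a per-task retry limit. The properties below are my understanding of how that limit is configured (the values are the defaults I'd expect in 0.20; please correct me if I've got this wrong):

```xml
<!-- mapred-site.xml: per-task retry limits (defaults in Hadoop 0.20, to my knowledge) -->
<property>
  <name>mapred.map.max.attempts</name>
  <value>4</value> <!-- a map task fails the job only after this many failed attempts -->
</property>
<property>
  <name>mapred.reduce.max.attempts</name>
  <value>4</value> <!-- same limit for reduce tasks -->
</property>
```

So if the failed attempts were all retries of the same task I'd have expected the job to fail outright, which confuses me even more.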
    attempt_201201101557_0519_r_000005_2: log4j:WARN Please initialize the log4j system properly.
    12/02/23 12:54:42 INFO mapred.JobClient:  map 99% reduce 72%

Or in some other cases:

    12/02/23 13:22:49 INFO mapred.JobClient:  map 99% reduce 89%
    12/02/23 13:23:19 INFO mapred.JobClient: Task Id : attempt_201201101557_0519_r_000004_2, Status : FAILED
    Task attempt_201201101557_0519_r_000004_2 failed to report status for 600 seconds. Killing!
    12/02/23 13:24:19 WARN mapred.JobClient: Error reading task outputRead timed out
    12/02/23 13:25:19 WARN mapred.JobClient: Error reading task outputRead timed out
    12/02/23 13:26:47 INFO mapred.JobClient:  map 99% reduce 90%

Can anybody help?

Kind regards,
Mat
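The "600 seconds" in the second trace matches what I believe is the default task timeout, i.e. a task is killed if it doesn't report progress for that long. My understanding is that I could either raise the timeout, or have the reducer call reporter.progress() (old API) periodically during long-running work. A sketch of the config change I mean (property name from the 0.20 docs; the value is just an example):

```xml
<!-- mapred-site.xml: raise the no-progress timeout for long-running tasks -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value> <!-- milliseconds; the default 600000 matches the 600 s in the log -->
</property>
```

Is that the right knob, or should I rather fix the reducer to report progress?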