has anyone seen this? basically, a child task is killing itself because a
ping to the parent didn't quite work: the reply from the parent was
unexpected.

hadoop version: 0.19.0
userlogs on slave node:

2009-05-29 13:57:33,551 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting attempt_200905281652_0013_m_000006_1
[r...@domu-12-31-38-01-7c-92 attempt_200905281652_0013_m_000006_1]#

tellingly, the last input line processed right before this WARN is 19K long.
(i log the full input line in the map function for debugging.)

output on map-reduce task:

Task attempt_200905281652_0013_m_000006_2 failed to report status for 600 seconds. Killing!
09/05/29 14:08:01 INFO mapred.JobClient:  map 99% reduce 32%
09/05/29 14:18:05 INFO mapred.JobClient:  map 98% reduce 32%
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at com.adxpose.data.mr.DailyHeatmapAggregator.run(DailyHeatmapAggregator.java:547)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at com.adxpose.data.mr.DailyHeatmapAggregator.main(DailyHeatmapAggregator.java:553)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

i believe this is the code that kills the child:

in org.apache.hadoop.mapred.Task

protected void startCommunicationThread(final TaskUmbilicalProtocol umbilical) {

....

              if (sendProgress) {
                // we need to send progress update
                updateCounters();
                taskStatus.statusUpdate(getState(),
                                        taskProgress.get(),
                                        taskProgress.toString(),
                                        counters);
                taskFound = umbilical.statusUpdate(taskId, taskStatus);
                taskStatus.clearStatus();
              }
              else {
                // send ping
                taskFound = umbilical.ping(taskId);
              }

              // if Task Tracker is not aware of our task ID (probably
              // because it died and came back up), kill ourselves
              if (!taskFound) {
                LOG.warn("Parent died.  Exiting "+taskId);
                System.exit(66);
              }
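
to make the branch above easier to reason about, here is a minimal standalone sketch of the same decision. it is not Hadoop code: UmbilicalSketch, FakeTracker, register, forget and heartbeat are all hypothetical names made up for illustration, and the stand-in parent simply answers "do i know this task ID?" for both ping and statusUpdate. why the real parent would stop recognising a task is exactly the open question, so the sketch just models "parent forgot the task" without claiming a cause.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical model of the heartbeat check quoted above; not Hadoop code. */
public class UmbilicalSketch {

    /** Stand-in for the parent side of the umbilical (assumption: it only tracks known IDs). */
    static class FakeTracker {
        private final Set<String> knownTasks = new HashSet<>();

        void register(String taskId) { knownTasks.add(taskId); }

        /** Simulates the parent losing track of a task, for whatever reason. */
        void forget(String taskId) { knownTasks.remove(taskId); }

        /** Both calls report only whether the parent still knows the task ID. */
        boolean ping(String taskId) { return knownTasks.contains(taskId); }

        boolean statusUpdate(String taskId, String status) { return knownTasks.contains(taskId); }
    }

    /**
     * One iteration of the child's communication loop.
     * Returns true if the child keeps running, false where the quoted code
     * would log "Parent died" and call System.exit(66).
     */
    static boolean heartbeat(FakeTracker tracker, String taskId, boolean sendProgress) {
        boolean taskFound = sendProgress
                ? tracker.statusUpdate(taskId, "RUNNING") // progress to report
                : tracker.ping(taskId);                   // nothing new, plain ping
        return taskFound;
    }

    public static void main(String[] args) {
        FakeTracker tracker = new FakeTracker();
        String taskId = "attempt_200905281652_0013_m_000006_1";

        tracker.register(taskId);
        System.out.println(heartbeat(tracker, taskId, false)); // parent knows the task

        tracker.forget(taskId);
        System.out.println(heartbeat(tracker, taskId, true));  // parent no longer knows it
    }
}
```

the point of the sketch: in this model the child exits on any "not found" answer, whether it came from a ping or from a status update, so the "Parent died" message by itself does not say whether the parent actually restarted.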
