Client should be able to know why an AM crashed.
------------------------------------------------

                 Key: MAPREDUCE-2717
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2717
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv2
            Reporter: Amol Kekre
             Fix For: 0.23.0


Today, if an AM crashes, we have to dig through logs, which is very cumbersome.
The client should print a reason for the AM crash. Possible reasons for an AM
crash:
 (1) The AM container failed during localization itself.
 (2) The AM container launched but failed before starting properly, e.g. due to
classpath issues.
 (3) The AM failed after starting properly.
 (4) The AM expired and was killed by the RM.

Potential fixes:
 - For (1) and (2), the client should obtain the container status, container
diagnostics, and exit code.
 - For (3), the AM should report a reason for the failure in its heartbeat to
the RM, and the client should obtain that reason from the RM.
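
The cases and fixes above could be sketched roughly as follows. This is only an
illustration, not the actual MRv2 code: AmFailureKind, AmAttemptInfo, and
AmFailureReporter are hypothetical names invented here, as are all of their
fields, and the message shown for case (4) is an assumption since no fix is
proposed for it above.

```java
/** Hypothetical failure categories mirroring cases (1)-(4) above. */
enum AmFailureKind {
    LOCALIZATION_FAILED,  // (1)
    LAUNCH_FAILED,        // (2)
    FAILED_AFTER_START,   // (3)
    EXPIRED_BY_RM         // (4)
}

/** Hypothetical snapshot of what the RM knows about a dead AM (not a real YARN type). */
final class AmAttemptInfo {
    final boolean localized;           // container localization completed
    final boolean started;             // AM started properly and reached the RM
    final boolean expiredByRm;         // RM expired and killed the AM
    final int exitCode;                // container exit code
    final String containerDiagnostics; // container diagnostics text
    final String amReportedReason;     // reason sent by the AM in a heartbeat, may be null

    AmAttemptInfo(boolean localized, boolean started, boolean expiredByRm,
                  int exitCode, String containerDiagnostics, String amReportedReason) {
        this.localized = localized;
        this.started = started;
        this.expiredByRm = expiredByRm;
        this.exitCode = exitCode;
        this.containerDiagnostics = containerDiagnostics;
        this.amReportedReason = amReportedReason;
    }
}

final class AmFailureReporter {
    /** Maps the attempt state onto one of the four cases. */
    static AmFailureKind classify(AmAttemptInfo info) {
        if (info.expiredByRm) {
            return AmFailureKind.EXPIRED_BY_RM;
        }
        if (!info.localized) {
            return AmFailureKind.LOCALIZATION_FAILED;
        }
        if (!info.started) {
            return AmFailureKind.LAUNCH_FAILED;
        }
        return AmFailureKind.FAILED_AFTER_START;
    }

    /** Builds the one-line reason the client would print. */
    static String describe(AmAttemptInfo info) {
        // Cases (1) and (2) rely on container status, diagnostics and exit code.
        String containerDetail =
            " (exit code " + info.exitCode + "): " + info.containerDiagnostics;
        switch (classify(info)) {
            case LOCALIZATION_FAILED:
                return "AM container failed during localization" + containerDetail;
            case LAUNCH_FAILED:
                return "AM container launched but failed before starting" + containerDetail;
            case FAILED_AFTER_START:
                // Case (3) relies on the reason the AM reported to the RM.
                String reason = info.amReportedReason != null
                    ? info.amReportedReason : "no reason reported";
                return "AM failed after starting: " + reason;
            default:
                // Case (4): assumed wording, no fix is proposed for it above.
                return "AM expired and was killed by the RM";
        }
    }

    public static void main(String[] args) {
        AmAttemptInfo info = new AmAttemptInfo(
            true, false, false, 1, "ClassNotFoundException", null);
        System.out.println(describe(info));
    }
}
```

The point of the sketch is that the client prints a single line taken from
describe() instead of sending the user to the logs; how the RM would actually
expose these fields to the client is left open here.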

                

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
