Sandy Ryza created YARN-499:
-------------------------------

             Summary: On failure, include last n lines of container logs in diagnostics
                 Key: YARN-499
                 URL: https://issues.apache.org/jira/browse/YARN-499
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
    Affects Versions: 2.0.3-alpha
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza


When a container fails, the only way to diagnose it is to look at the logs.  
ContainerStatuses include a diagnostic string that is reported back to the 
resource manager by the node manager.

Currently in MR2, I believe whatever is sent to the task's standard out is added 
to the diagnostics string, but for MR, standard out is redirected to a file 
called stdout.  In MR1, this string was populated with the last few lines of 
the task's stdout file and was printed to the console, allowing for easy 
debugging.

Handling this would help to soothe the infuriating problem of an AM dying for a 
mysterious reason before setting a tracking URL (MAPREDUCE-3688).

This could be done in one of two ways.
* Use tee to send MR's standard out to both the stdout file and standard out.  
This requires modifying ShellCmdExecutor to roll what it reads in, as we 
wouldn't want to be storing the entire task log in NM memory.
* Read the task's log files.  This would require standardizing the container 
log files or making them configurable.  Right now the log files are determined 
in userland, and all that YARN is aware of is the log directory.
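
As a rough illustration of the two options above (not NodeManager code): the 
first function keeps a rolling window of the last n lines while consuming a 
stream, so memory stays bounded no matter how much the task writes; the second 
reads the last n lines from a log file already on disk.  The function names, 
the chunk size, and the use of Python are assumptions made for this sketch 
only; an actual change would live in the NM's Java code.

```python
import os
from collections import deque


def tail_of_stream(stream, n):
    """Option 1: keep only the last n lines while reading a stream."""
    # A deque with maxlen drops the oldest line on each append once full,
    # so at most n lines are held in memory at any time.
    window = deque(maxlen=n)
    for line in stream:
        window.append(line.rstrip("\n"))
    return list(window)


def tail_of_file(path, n, chunk_size=4096):
    """Option 2: read the last n lines of a log file, scanning backwards."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read chunks from the end until more than n newlines are buffered
        # (enough for n complete lines) or the start of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.splitlines()[-n:]
    return [line.decode("utf-8", errors="replace") for line in lines]
```

Either function returns a short list of lines that could be joined and appended 
to the diagnostics string.  The second depends on knowing which file to read, 
which is the standardization question raised above.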

Does this present any issues I'm not considering?  If so, might this only be 
needed for AMs? 
