Young Chen created YARN-8895:
--------------------------------

             Summary: Improve YARN 
                 Key: YARN-8895
                 URL: https://issues.apache.org/jira/browse/YARN-8895
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Young Chen
            Assignee: Young Chen


Currently, identifying the source of an error is difficult because errors are 
written into an unstructured "diagnostics" string field. This field is present 
in the container statuses returned to the RM and in application attempts in the 
RM. These errors cannot be classified without hard-coding searches over the 
diagnostics string.

This Jira aims to add a structured error field in the NM and RM that preserves 
failure information and the source component, enabling faster and clearer error 
diagnosis.

Old error example:
Application application_1539325316309_0001 failed 1 times due to AM Container 
for appattempt_1539325316309_0001_000001 exited with exitCode: 57005
For more detailed output, check application tracking 
page:http://XXXXXXXX:80/cluster/app/application_1539325316309_0001Then, click 
on links to logs of each attempt.
Diagnostics: Container exited with a non-zero exit code 57005
Failing this attempt. Failing the application.
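
To illustrate why this is brittle, here is a minimal sketch of the kind of 
string search a client needs today to recover the exit code from diagnostics 
like the above. The class name, method name, and regular expression are 
hypothetical and not part of YARN; the point is only that the match depends on 
the exact wording of the message.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a YARN class.
class DiagnosticsScraper {
    // Assumed pattern: tied to the exact phrase "exited with exitCode: N".
    private static final Pattern EXIT_CODE =
        Pattern.compile("exited with\\s+exitCode:\\s*(-?\\d+)");

    // Returns the exit code if the expected phrase is found, otherwise empty.
    static OptionalInt exitCode(String diagnostics) {
        Matcher m = EXIT_CODE.matcher(diagnostics);
        return m.find()
            ? OptionalInt.of(Integer.parseInt(m.group(1)))
            : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String first = "Application application_1539325316309_0001 failed 1 times "
            + "due to AM Container for appattempt_1539325316309_0001_000001 "
            + "exited with exitCode: 57005";
        // Same failure, different wording: the pattern no longer matches.
        String second = "Diagnostics: Container exited with a non-zero exit code 57005";

        System.out.println(exitCode(first).isPresent());
        System.out.println(exitCode(second).isPresent());
    }
}
```

Both strings describe the same failure, yet only the first one is recognized, 
which is the classification problem described above.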
 
Proposed new error example:
{code:java}
{"errors":[{"errorId":"E_SYSTEM_AM_AMCRASHED",
"name":"AM_CRASHED",
"severity":"Error",
"component":"AM",
"source":"System",
"exitType":"CONTAINER_FINISHED",
"containerStatus":57005,
"description":"Application attempt appattempt_1539325316309_0001_000001 encountered an error",
"helpLink":"http://XXXXXXXXXXX:80/proxy/application_1539325316309_0001/"}]}
{code}
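
As a rough sketch of how a consumer might use such a field, errors could be 
selected by component and source rather than by searching text. The class 
below is hypothetical and is not the type proposed in this Jira; it mirrors 
only a subset of the fields from the example and keeps them as plain strings, 
since the actual types are not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class mirroring some fields of the example JSON.
class StructuredError {
    final String errorId;
    final String component;
    final String source;
    final String exitType;
    final int containerStatus;

    StructuredError(String errorId, String component, String source,
                    String exitType, int containerStatus) {
        this.errorId = errorId;
        this.component = component;
        this.source = source;
        this.exitType = exitType;
        this.containerStatus = containerStatus;
    }

    // Select errors by field value instead of searching a diagnostics string.
    static List<StructuredError> select(List<StructuredError> errors,
                                        String component, String source) {
        List<StructuredError> out = new ArrayList<>();
        for (StructuredError e : errors) {
            if (e.component.equals(component) && e.source.equals(source)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<StructuredError> errors = new ArrayList<>();
        // Values copied from the proposed error example.
        errors.add(new StructuredError("E_SYSTEM_AM_AMCRASHED", "AM", "System",
                                       "CONTAINER_FINISHED", 57005));
        for (StructuredError e : select(errors, "AM", "System")) {
            System.out.println(e.errorId + " containerStatus=" + e.containerStatus);
        }
    }
}
```

Because the exit code and originating component are separate fields, a change 
to the wording of "description" would not affect this kind of lookup.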
 
 


