[ 
https://issues.apache.org/jira/browse/EAGLE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayesh updated EAGLE-920:
-------------------------
    Fix Version/s:     (was: v0.5.0)
                   v0.5.1

> mr failed job trouble shooting
> ------------------------------
>
>                 Key: EAGLE-920
>                 URL: https://issues.apache.org/jira/browse/EAGLE-920
>             Project: Eagle
>          Issue Type: Improvement
>          Components: App::Job Performance Monitor
>    Affects Versions: v0.5.0
>            Reporter: wujinhu
>            Assignee: wujinhu
>             Fix For: v0.5.1
>
>
> We follow the steps below when we find a failed MR (MapReduce) job.
> 1. Get the error category distribution of the job via the API:
>    query=TaskAttemptErrorCategoryService[@site="sandbox" and @jobId="job_1486726244016_162594"]<@errorCategory>{count}
> 2. Get the mapping from error category to error message, along with the list of failed task attempts:
>    query=JobErrorMappingService[@site="sandbox" and @jobId="job_1486726244016_162594" and @errorCategory="java.lang.RuntimeException"]
> 3. Dive into a single task attempt:
>    query=TaskAttemptExecutionService[@site="sandbox" and @taskAttemptId="attempt_1486726244016_162594_m_002451_1"]
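The three queries above share one shape: a service name, a bracketed list of `@field="value"` conditions joined by `and`, and an optional aggregation suffix. The following is a minimal Python sketch of how such query strings could be assembled and URL-encoded. The base URL and the helper names (`BASE_URL`, `build_query`, `build_query_url`) are hypothetical and not taken from the issue; check the endpoint against your own Eagle deployment.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint, not stated in the issue. Replace it with
# the query URL of your own Eagle service.
BASE_URL = "http://localhost:9090/rest/entities"


def build_query(service, filters, suffix=""):
    """Build a query string such as Service[@a="x" and @b="y"]<suffix>."""
    conditions = " and ".join(
        f'@{field}="{value}"' for field, value in filters.items()
    )
    return f"{service}[{conditions}]{suffix}"


def build_query_url(service, filters, suffix="", base_url=BASE_URL):
    """URL-encode the query and attach it as the 'query' parameter."""
    query = build_query(service, filters, suffix)
    return f"{base_url}?{urlencode({'query': query})}"


JOB_ID = "job_1486726244016_162594"

# Step 1: error category distribution of the job.
step1 = build_query(
    "TaskAttemptErrorCategoryService",
    {"site": "sandbox", "jobId": JOB_ID},
    suffix="<@errorCategory>{count}",
)

# Step 2: error messages and failed task attempts for one error category.
step2 = build_query(
    "JobErrorMappingService",
    {
        "site": "sandbox",
        "jobId": JOB_ID,
        "errorCategory": "java.lang.RuntimeException",
    },
)

# Step 3: details of a single task attempt.
step3 = build_query(
    "TaskAttemptExecutionService",
    {"site": "sandbox", "taskAttemptId": "attempt_1486726244016_162594_m_002451_1"},
)

if __name__ == "__main__":
    for query in (step1, step2, step3):
        print(query)
```

The sketch only builds strings and sends no request. Encoding the query with `urlencode` matters because the brackets, quotes, `@`, and spaces in the query are not safe to place in a URL unescaped.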



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
