[ https://issues.apache.org/jira/browse/MAPREDUCE-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718206#comment-13718206 ]
Devaraj K commented on MAPREDUCE-5409:
--------------------------------------
Initially the TaskAttemptImpl is in the SUCCEEDED state. It then moves from
SUCCEEDED to KILLED with the reason "Diagnostics report from
attempt_1374560536158_0003_m_000007_0: Container released on a *lost* node".
Some time later, JobImpl receives JOB_TASK_ATTEMPT_FETCH_FAILURE for the same
task attempt, and during that transition it raises TA_TOO_MANY_FETCH_FAILURE
for the TaskAttemptImpl, which causes this invalid transition.
I think we should either not raise TA_TOO_MANY_FETCH_FAILURE for a task
attempt whose state is KILLED, or ignore this event at KILLED in
TaskAttemptImpl.
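To illustrate the second option (ignoring the event at KILLED), here is a
minimal, self-contained sketch. It is a toy model only: ToyTaskAttempt, its
transition table, and the IGNORED map are hypothetical names invented for this
example and are not the Hadoop StateMachineFactory API or the actual patch.
The SUCCEEDED transitions shown are assumptions chosen to mirror the scenario
described above.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Toy model (hypothetical): mirrors the state and event names from the
// comment above, but does not use Hadoop's state machine classes.
class ToyTaskAttempt {
    enum State { SUCCEEDED, KILLED, FAILED }
    enum Event { TA_KILL, TA_TOO_MANY_FETCH_FAILURE }

    // Legal transitions: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TRANSITIONS =
        new EnumMap<>(State.class);
    // Events that are dropped without a state change in a given state.
    private static final Map<State, EnumSet<Event>> IGNORED =
        new EnumMap<>(State.class);

    static {
        Map<Event, State> fromSucceeded = new EnumMap<>(Event.class);
        fromSucceeded.put(Event.TA_KILL, State.KILLED);
        fromSucceeded.put(Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        TRANSITIONS.put(State.SUCCEEDED, fromSucceeded);

        // The proposed fix: a late fetch-failure event for an attempt that
        // is already KILLED is ignored instead of being treated as invalid.
        IGNORED.put(State.KILLED,
            EnumSet.of(Event.TA_KILL, Event.TA_TOO_MANY_FETCH_FAILURE));
    }

    private State state = State.SUCCEEDED;

    State getState() {
        return state;
    }

    void handle(Event event) {
        EnumSet<Event> ignored = IGNORED.get(state);
        if (ignored != null && ignored.contains(event)) {
            return; // late event for a terminal attempt: nothing to do
        }
        Map<Event, State> row = TRANSITIONS.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            // Without the IGNORED entry, the KILLED case would end up here.
            throw new IllegalStateException(
                "Invalid event: " + event + " at " + state);
        }
        state = next;
    }

    public static void main(String[] args) {
        ToyTaskAttempt attempt = new ToyTaskAttempt();
        attempt.handle(Event.TA_KILL);                   // SUCCEEDED -> KILLED
        attempt.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // ignored at KILLED
        System.out.println(attempt.getState());
    }
}
```

The first option (not raising the event at all) would instead put the check on
the sender's side, which requires the job to know the attempt's current state
at the moment it processes the fetch failure.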
> MRAppMaster throws InvalidStateTransitonException: Invalid event:
> TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5409
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5409
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.0.5-alpha
> Reporter: Devaraj K
> Assignee: Devaraj K
>
> {code:xml}
> 2013-07-23 12:28:05,217 INFO [IPC Server handler 29 on 50796]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1374560536158_0003_m_000040_0 is : 0.0
> 2013-07-23 12:28:05,221 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures
> for output of task attempt: attempt_1374560536158_0003_m_000007_0 ... raising
> fetch failure to map
> 2013-07-23 12:28:05,222 ERROR [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle
> this event at current state for attempt_1374560536158_0003_m_000007_0
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> TA_TOO_MANY_FETCH_FAILURE at KILLED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1032)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:143)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1123)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1115)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
> at java.lang.Thread.run(Thread.java:662)
> 2013-07-23 12:28:05,249 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl:
> job_1374560536158_0003Job Transitioned from RUNNING to ERROR
> 2013-07-23 12:28:05,338 INFO [IPC Server handler 16 on 50796]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from
> attempt_1374560536158_0003_m_000040_0
> {code}