[ https://issues.apache.org/jira/browse/MAPREDUCE-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062190#comment-13062190 ]
Thomas Graves commented on MAPREDUCE-2667:
------------------------------------------
The issue here seems to be that, even though the unregister routine in
RMCommunicator sets the state to KILLED and then calls
finishApplicationMaster, finishApplicationMaster just sends the FINISHED
event, which ApplicationImpl does not handle while it is in the RUNNING
state. So the KILLED state that was set is effectively ignored, and the
ResourceManager logs the following error:
2011-07-08 19:50:21,889 ERROR org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.ApplicationImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH at RUNNING
        at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:416)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:331)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:39)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:476)
        at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.ApplicationImpl.handle(ApplicationImpl.java:587)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:202)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:187)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:111)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
        at java.lang.Thread.run(Thread.java:619)
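
To make the failure mode concrete, here is a minimal, self-contained sketch
of a state machine that has no transition registered for FINISH while in
RUNNING. It is not the real ApplicationImpl or StateMachineFactory code: the
class KillStateSketch, its App type, the enums, and the
handleFinishWhileRunning flag are all hypothetical names chosen for
illustration, and the "fixed" variant shows only one possible remedy
(letting the FINISH transition adopt the final state reported at unregister
time), not necessarily the change that should go into this issue.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class KillStateSketch {
    public enum AppState { RUNNING, COMPLETED, KILLED }
    public enum AppEvent { FINISH, KILL }

    /** Hypothetical stand-in for an application state machine. */
    public static class App {
        // Which events each state is willing to handle.
        private final Map<AppState, Set<AppEvent>> handled = new EnumMap<>(AppState.class);
        private AppState state = AppState.RUNNING;
        // Final state reported by the application master when it unregisters.
        private AppState reportedFinalState = AppState.COMPLETED;

        public App(boolean handleFinishWhileRunning) {
            Set<AppEvent> running = EnumSet.of(AppEvent.KILL);
            if (handleFinishWhileRunning) {
                running.add(AppEvent.FINISH); // the transition missing in the buggy variant
            }
            handled.put(AppState.RUNNING, running);
        }

        /** Mirrors the described flow: record the final state, then send FINISH. */
        public void unregister(AppState finalState) {
            reportedFinalState = finalState;
            handle(AppEvent.FINISH);
        }

        public void handle(AppEvent event) {
            Set<AppEvent> allowed = handled.getOrDefault(state, EnumSet.noneOf(AppEvent.class));
            if (!allowed.contains(event)) {
                // The state is left untouched, so the reported KILLED is lost.
                throw new IllegalStateException("Invalid event: " + event + " at " + state);
            }
            if (event == AppEvent.FINISH) {
                state = reportedFinalState;
            } else {
                state = AppState.KILLED;
            }
        }

        public AppState getState() {
            return state;
        }
    }

    public static void main(String[] args) {
        App broken = new App(false);
        try {
            broken.unregister(AppState.KILLED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("broken: " + broken.getState());

        App fixed = new App(true);
        fixed.unregister(AppState.KILLED);
        System.out.println("fixed: " + fixed.getState());
    }
}
```

In the first variant the application stays in RUNNING after the exception,
which matches what mapred job -list reports after a kill; in the second the
reported KILLED state is applied.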
> MR279: mapred job -kill leaves application in RUNNING state
> -----------------------------------------------------------
>
> Key: MAPREDUCE-2667
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2667
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Reporter: Thomas Graves
>
> The mapred job -kill command doesn't seem to fully clean up the application.
> If you kill a job and then run mapred job -list again, it still shows up as running:
> mapred job -kill job_1310072430717_0003
> Killed job job_1310072430717_0003
> mapred job -list
> Total jobs:1
> JobId                   State    StartTime  UserName  Queue    Priority  SchedulingInfo
> job_1310072430717_0003  RUNNING  0          tgraves   default  NORMAL    98.139.92.22:19888/yarn/job/job_1310072430717_3_3
> Running kill again will error out.
> It also still shows up in the RM Applications UI as running with a note of:
> Kill Job received from client
> job_1310072430717_0003 Job received Kill while in RUNNING state.