Jian Fang created YARN-2416:
-------------------------------

             Summary: InvalidStateTransitonException in ResourceManager if 
AMLauncher does not receive response for startContainers() call in time
                 Key: YARN-2416
                 URL: https://issues.apache.org/jira/browse/YARN-2416
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.4.0
            Reporter: Jian Fang


AMLauncher calls startContainers(allRequests) to launch a container for 
application master. Normally, the call comes back immediately so that the 
RMAppAttempt changes its state from ALLOCATED to LAUNCHED. 

However, we do observed that in some cases, the RPC call came back very late 
but the AM container was already started. Because the RMAppAttempt stuck in 
ALLOCATED state, once resource manager received the REGISTERED event from the 
application master, it threw InvalidStateTransitonException as follows.

2014-07-05 08:59:05,021 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
REGISTERED at ALLOCATED
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:744)

For subsequent STATUS_UPDATE and CONTAINER_ALLOCATED events for this job, 
resource manager kept throwing InvalidStateTransitonException.

2014-07-05 08:59:06,152 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
STATUS_UPDATE at ALLOCATED
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:744)
2014-07-05 08:59:07,779 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_02_000002 Container Transitioned from NEW to
 ALLOCATED
2014-07-05 08:59:07,779 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
CONTAINER_ALLOCATED at ALLOCATED
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to