This might be https://issues.apache.org/jira/browse/YARN-2964. That bug was introduced by https://issues.apache.org/jira/browse/YARN-2704.
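
To check whether the timing fits, it may help to pull the failure timestamps out of the AM log and compare them with the RM start time and the RM logs. Below is a minimal sketch, not a definitive tool. It assumes unwrapped log lines in the same "timestamp [thread] logger-LEVEL-message" layout as the quoted Appmaster.stdout, and it relies on the application attempt ID embedding the RM start time in epoch milliseconds. The function names are hypothetical helpers, not part of Hadoop.

```python
import re
from datetime import datetime, timedelta, timezone

# One log record per line, e.g.
# 2015-02-04 12:55:51,402 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-WARN-...
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>[\w.$]+)-(?P<level>DEBUG|INFO|WARN|ERROR)-(?P<msg>.*)"
)
ATTEMPT_RE = re.compile(r"Invalid AMRMToken from (appattempt_\d+_\d+_\d+)")


def find_invalid_token_events(lines):
    """Return (timestamp, level, attempt_id) for each record reporting an invalid AMRM token.

    Only records that carry the message on the same line are matched;
    stack-trace continuation lines are skipped.
    """
    events = []
    for line in lines:
        record = LINE_RE.match(line.strip())
        if record is None:
            continue
        attempt = ATTEMPT_RE.search(record.group("msg"))
        if attempt:
            # Log timestamps carry no zone, so the result is a naive datetime.
            ts = datetime.strptime(record.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, record.group("level"), attempt.group(1)))
    return events


def rm_start_utc(attempt_id):
    """Decode the cluster timestamp (RM start, epoch millis) from an attempt ID."""
    millis = int(attempt_id.split("_")[1])
    whole_seconds = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return whole_seconds + timedelta(milliseconds=millis % 1000)


if __name__ == "__main__":
    sample = [
        "2015-02-04 12:55:51,394 [AMRM Heartbeater thread] "
        "org.apache.hadoop.ipc.Client-DEBUG-The ping interval is 60000 ms.",
        "2015-02-04 12:55:51,402 [AMRM Heartbeater thread] "
        "org.apache.hadoop.ipc.Client-WARN-Exception encountered while connecting to the server : "
        "org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): "
        "Invalid AMRMToken from appattempt_1422871621069_0001_000002",
    ]
    for ts, level, attempt_id in find_invalid_token_events(sample):
        print(ts, level, attempt_id, "RM started (UTC):", rm_start_utc(attempt_id))
```

The log timestamps are in the node's local time zone while the decoded RM start time is in UTC, so convert one of them before computing how long the application ran before the token was rejected.
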

On Thu, Feb 5, 2015 at 1:04 PM, Rahul Chhiber < [email protected]> wrote:

> Hi all,
>
> I am running a Hadoop cluster of 4 nodes (Hadoop 2.6). I am facing an
> irregular error, relating to an *Invalid AMRM token* that causes my YARN
> application to crash anytime from 1 day up to a week after starting.
> Application launches successfully and runs for a period of time which is
> not predictable, then crashes with the following logs (*Appmaster.stdout*).
> When I restart, application works perfectly for some time before crashing
> again with the same error.
>
> 2015-02-04 12:55:51,394 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-DEBUG-The ping interval is 60000 ms.
> 2015-02-04 12:55:51,394 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-DEBUG-Connecting to masternode/192.168.143.23:8030
> 2015-02-04 12:55:51,396 [AMRM Heartbeater thread] org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedAction as:user1 (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
> 2015-02-04 12:55:51,396 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Sending sasl message state: NEGOTIATE
>
> 2015-02-04 12:55:51,397 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Received SASL message state: NEGOTIATE
> auths {
>   method: "TOKEN"
>   mechanism: "DIGEST-MD5"
>   protocol: ""
>   serverId: "default"
>   challenge: "realm=\"default\",nonce=\"FjsVVqgBotmE1OIpCE6f/KVmiuM3ixIolXg/l5et\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
> }
>
> 2015-02-04 12:55:51,397 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB info:org.apache.hadoop.yarn.security.SchedulerSecurityInfo$1@4ae70093
> 2015-02-04 12:55:51,397 [AMRM Heartbeater thread] org.apache.hadoop.yarn.security.AMRMTokenSelector-DEBUG-Looking for a token with service 192.168.143.23:8030
> 2015-02-04 12:55:51,397 [AMRM Heartbeater thread] org.apache.hadoop.yarn.security.AMRMTokenSelector-DEBUG-Token kind is YARN_AM_RM_TOKEN and the token's service name is 192.168.143.23:8030
> 2015-02-04 12:55:51,397 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Creating SASL DIGEST-MD5(TOKEN) client to authenticate to service at default
> 2015-02-04 12:55:51,398 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Use TOKEN authentication for protocol ApplicationMasterProtocolPB
> 2015-02-04 12:55:51,398 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting username: Cg0KCQgBEM3bh860KRACEMyF5q/9/////wE=
> 2015-02-04 12:55:51,398 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting userPassword
> 2015-02-04 12:55:51,398 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting realm: default
> 2015-02-04 12:55:51,398 [AMRM Heartbeater thread] org.apache.hadoop.security.SaslRpcClient-DEBUG-Sending sasl message state: INITIATE
> token: "charset=utf-8,username=\"Cg0KCQgBEM3bh860KRACEMyF5q/9/////wE=\",realm=\"default\",nonce=\"FjsVVqgBotmE1OIpCE6f/KVmiuM3ixIolXg/l5et\",nc=00000001,cnonce=\"dETUXCWNF7Aw2ZReFDsKF5jj9bKcyvoQYEJDU9N5\",digest-uri=\"/default\",maxbuf=65536,response=1bdc86e7222c86e0692d30db6bec1479,qop=auth"
> auths {
>   method: "TOKEN"
>   mechanism: "DIGEST-MD5"
>   protocol: ""
>   serverId: "default"
> }
>
> 2015-02-04 12:55:51,400 [AMRM Heartbeater thread] org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedActionException as:user1 (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
> 2015-02-04 12:55:51,402 [AMRM Heartbeater thread] org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedAction as:user1 (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
> 2015-02-04 12:55:51,402 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-WARN-Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
> 2015-02-04 12:55:51,402 [AMRM Heartbeater thread] org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedActionException as:user1 (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
> 2015-02-04 12:55:51,407 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-DEBUG-closing ipc connection to masternode/192.168.143.23:8030: Invalid AMRMToken from appattempt_1422871621069_0001_000002
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
>     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
>     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
>     at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>     at com.sun.proxy.$Proxy11.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>     at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy12.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
> 2015-02-04 12:55:51,409 [AMRM Heartbeater thread] org.apache.hadoop.ipc.Client-DEBUG-IPC Client (184566950) connection to masternode/192.168.143.23:8030 from user1: closed
> 2015-02-04 12:55:51,422 [AMRM Heartbeater thread] org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-ERROR-Exception on heartbeat
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1422871621069_0001_000002
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>     at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy12.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
>     at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>     at com.sun.proxy.$Proxy11.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>     ... 8 more
> 2015-02-04 12:55:51,423 [AMRM Callback Handler Thread] org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-INFO-Interrupted while waiting for queue
> java.lang.InterruptedException
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> 2015-02-04 12:55:51,423 [AMRM Callback Handler Thread] org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-ERROR-Stopping callback due to:
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1422871621069_0001_000002
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>     at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy12.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1422871621069_0001_000002
>     at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>     at com.sun.proxy.$Proxy11.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>     ... 8 more
> 2015-02-04 12:55:51,424 [AMRM Callback Handler Thread] org.apache.hadoop.service.AbstractService-DEBUG-Service: org.apache.hadoop.yarn.client.api.async.AMRMClientAsync entered state STOPPED
> 2015-02-04 12:55:51,424 [AMRM Callback Handler Thread] org.apache.hadoop.service.AbstractService-DEBUG-Service: org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl entered state STOPPED
> 2015-02-04 12:55:51,424 [AMRM Callback Handler Thread] org.apache.hadoop.ipc.Client-DEBUG-stopping client from cache: org.apache.hadoop.ipc.Client@552d7308
> 2015-02-04 12:56:03,544 [org.eclipse.jetty.server.session.HashSessionManager@425af55fTimer] org.eclipse.jetty.server.session-DEBUG-Scavenging sessions at 1423054563544
>
> Any help is greatly appreciated.
>
> Thanks,
> Rahul Chhiber
