Hi all,
I am running a Hadoop cluster of 4 nodes (Hadoop 2.6). I am facing an irregular
error, relating to an Invalid AMRM token that causes my YARN application to
crash anytime from 1 day up to a week after starting. Application launches
successfully and runs for a period of time which is not predictable, then
crashes with the following logs (Appmaster.stdout). When I restart,
application works perfectly for some time before crashing again with the same
error.
2015-02-04 12:55:51,394 [AMRM Heartbeater thread]
org.apache.hadoop.ipc.Client-DEBUG-The ping interval is 60000 ms.
2015-02-04 12:55:51,394 [AMRM Heartbeater thread]
org.apache.hadoop.ipc.Client-DEBUG-Connecting to masternode/192.168.143.23:8030
2015-02-04 12:55:51,396 [AMRM Heartbeater thread]
org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedAction as:user1
(auth:SIMPLE) from:org.ap
ache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
2015-02-04 12:55:51,396 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Sending sasl message state:
NEGOTIATE
2015-02-04 12:55:51,397 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Received SASL message state:
NEGOTIATE
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
challenge:
"realm=\"default\",nonce=\"FjsVVqgBotmE1OIpCE6f/KVmiuM3ixIolXg/l5et\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
2015-02-04 12:55:51,397 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Get token info proto:interface
org.apache.hadoop.yarn.api.
ApplicationMasterProtocolPB
info:org.apache.hadoop.yarn.security.SchedulerSecurityInfo$1@4ae70093
2015-02-04 12:55:51,397 [AMRM Heartbeater thread]
org.apache.hadoop.yarn.security.AMRMTokenSelector-DEBUG-Looking for a token
with service 192.168.143.23:8
030
2015-02-04 12:55:51,397 [AMRM Heartbeater thread]
org.apache.hadoop.yarn.security.AMRMTokenSelector-DEBUG-Token kind is
YARN_AM_RM_TOKEN and the token's service name is 192.168.143.23:8030
2015-02-04 12:55:51,397 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Creating SASL DIGEST-MD5(TOKEN)
client to authenticate to service at default
2015-02-04 12:55:51,398 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Use TOKEN authentication for
protocol ApplicationMasterProtocolPB
2015-02-04 12:55:51,398 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting
username: Cg0KCQgBEM3bh860KRACEMyF5q/9/////wE=
2015-02-04 12:55:51,398 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting
userPassword
2015-02-04 12:55:51,398 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-SASL client callback: setting
realm: default
2015-02-04 12:55:51,398 [AMRM Heartbeater thread]
org.apache.hadoop.security.SaslRpcClient-DEBUG-Sending sasl message state:
INITIATE
token:
"charset=utf-8,username=\"Cg0KCQgBEM3bh860KRACEMyF5q/9/////wE=\",realm=\"default\",nonce=\"FjsVVqgBotmE1OIpCE6f/KVmiuM3ixIolXg/l5et\",nc=00000001,cnonce=\"dETUXCWNF7Aw2ZReFDsKF5jj9bKcyvoQYEJDU9N5\",digest-uri=\"/default\",maxbuf=65536,response=1bdc86e7222c86e0692d30db6bec1479,qop=auth"
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
}
2015-02-04 12:55:51,400 [AMRM Heartbeater thread]
org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedActionException
as:user1 (auth:SIMPLE)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
2015-02-04 12:55:51,402 [AMRM Heartbeater thread]
org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedAction as:user1
(auth:SIMPLE)
from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
2015-02-04 12:55:51,402 [AMRM Heartbeater thread]
org.apache.hadoop.ipc.Client-WARN-Exception encountered while connecting to the
server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
2015-02-04 12:55:51,402 [AMRM Heartbeater thread]
org.apache.hadoop.security.UserGroupInformation-DEBUG-PrivilegedActionException
as:user1 (auth:SIMPLE)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
2015-02-04 12:55:51,407 [AMRM Heartbeater thread]
org.apache.hadoop.ipc.Client-DEBUG-closing ipc connection to
masternode/192.168.143.23:8030: Invalid AMRMToken from
appattempt_1422871621069_0001_000002
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
at
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
at
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy11.allocate(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.allocate(Unknown Source)
at
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
2015-02-04 12:55:51,409 [AMRM Heartbeater thread]
org.apache.hadoop.ipc.Client-DEBUG-IPC Client (184566950) connection to
masternode/192.168.143.23:8030 from user1: closed
2015-02-04 12:55:51,422 [AMRM Heartbeater thread]
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-ERROR-Exception
on heartbeat
org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken
from appattempt_1422871621069_0001_000002
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.allocate(Unknown Source)
at
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy11.allocate(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
... 8 more
2015-02-04 12:55:51,423 [AMRM Callback Handler Thread]
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-INFO-Interrupted
while waiting for queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
2015-02-04 12:55:51,423 [AMRM Callback Handler Thread]
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl-ERROR-Stopping
callback due to:
org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken
from appattempt_1422871621069_0001_000002
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.allocate(Unknown Source)
at
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:333)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Invalid AMRMToken from appattempt_1422871621069_0001_000002
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy11.allocate(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
... 8 more
2015-02-04 12:55:51,424 [AMRM Callback Handler Thread]
org.apache.hadoop.service.AbstractService-DEBUG-Service:
org.apache.hadoop.yarn.client.api.async.AMRMClientAsync entered state STOPPED
2015-02-04 12:55:51,424 [AMRM Callback Handler Thread]
org.apache.hadoop.service.AbstractService-DEBUG-Service:
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl entered state STOPPED
2015-02-04 12:55:51,424 [AMRM Callback Handler Thread]
org.apache.hadoop.ipc.Client-DEBUG-stopping client from cache:
org.apache.hadoop.ipc.Client@552d7308
2015-02-04 12:56:03,544
[org.eclipse.jetty.server.session.HashSessionManager@425af55fTimer]
org.eclipse.jetty.server.session-DEBUG-Scavenging sessions at 1423054563544
Any help is greatly appreciated.
Thanks,
Rahul Chhiber