[ 
https://issues.apache.org/jira/browse/IGNITE-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nithyananthan N P updated IGNITE-8812:
--------------------------------------
    Description: 
Hello,

As Ignite-YARN is a long-running application in a YARN environment, it should 
have a mechanism to renew the delegation token.

 

In Ignite-YARN, when the ApplicationMaster is started, it acquires delegation 
tokens and stores them in a ByteBuffer [Class: ApplicationMaster, Method: init()].

This ByteBuffer, which holds the token information, is passed to every container 
received from the ResourceManager [Class: ApplicationMaster, Method: 
onContainersAllocated()].
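
As a rough illustration of what a renewal mechanism could look like, the sketch below keeps the serialized tokens in an AtomicReference and refreshes them on a timer, so that each container launch picks up the latest buffer instead of the one captured at startup. The class RefreshingTokens, its fetcher argument and the refresh period are hypothetical names and are not part of Ignite-YARN. In a real ApplicationMaster the fetcher would presumably re-login from a keytab and collect fresh tokens through Hadoop's FileSystem.addDelegationTokens and Credentials.writeTokenStorageToStream; that part is an assumption and is not shown here.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder that periodically replaces the serialized tokens. */
public class RefreshingTokens implements AutoCloseable {
    private final AtomicReference<ByteBuffer> current = new AtomicReference<>();
    private final Supplier<ByteBuffer> fetcher;
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "token-refresh");
            t.setDaemon(true); // do not keep the JVM alive just for refreshes
            return t;
        });

    /** fetcher is assumed to return freshly obtained, serialized tokens. */
    public RefreshingTokens(Supplier<ByteBuffer> fetcher, long period, TimeUnit unit) {
        this.fetcher = fetcher;
        refresh(); // initial fetch, the equivalent of what init() does today
        timer.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep the previous tokens and try again on the next tick;
                // an uncaught exception would cancel the periodic task.
            }
        }, period, period, unit);
    }

    /** Fetches new tokens and publishes them to later callers. */
    public final void refresh() {
        current.set(fetcher.get());
    }

    /** Returns an independent view so callers do not share position/limit. */
    public ByteBuffer tokens() {
        return current.get().duplicate();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

With something like this in place, onContainersAllocated() would call tokens() for each launch context rather than reusing the buffer built once in init(). The refresh period would have to be shorter than the token lifetime.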

Everything works fine until the delegation token reaches the end of its lifetime.

Once the delegation token expires, the ApplicationMaster is no longer able to 
start Ignite inside the containers it receives, and the following exception occurs:

 

WARNING: Error launching container container_e37_1525770484566_11472_01_000020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for HDFS_DELEGATION_TOKEN [|mailto:[email protected]], renewer=yarn, realUser=, issueDate=1527498085587, maxDate=1528102885587, sequenceNumber=231841, masterKeyId=631) can't be found in cache
    at org.apache.hadoop.ipc.Client.call(Client.java:1504)
    at org.apache.hadoop.ipc.Client.call(Client.java:1441)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
    at org.apache.ignite.yarn.utils.IgniteYarnUtils.setupFile(IgniteYarnUtils.java:65)
    at org.apache.ignite.yarn.ApplicationMaster.onContainersAllocated(ApplicationMaster.java:131)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:292)
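
The issueDate and maxDate fields in the exception message are epoch milliseconds, so the maximum lifetime of this token can be read from the log itself. The small sketch below does that arithmetic; the class and method names are only illustrative.

```java
import java.time.Duration;
import java.time.Instant;

public class TokenLifetime {
    /** Maximum lifetime implied by a token's issueDate and maxDate (epoch millis). */
    public static Duration maxLifetime(long issueDateMs, long maxDateMs) {
        return Duration.between(Instant.ofEpochMilli(issueDateMs),
                                Instant.ofEpochMilli(maxDateMs));
    }

    public static void main(String[] args) {
        long issueDate = 1527498085587L; // issueDate from the exception above
        long maxDate = 1528102885587L;   // maxDate from the exception above

        System.out.println("issued:   " + Instant.ofEpochMilli(issueDate));
        System.out.println("max date: " + Instant.ofEpochMilli(maxDate));
        // 604800000 ms between the two values, i.e. 7 days
        System.out.println("lifetime: " + maxLifetime(issueDate, maxDate).toDays() + " days");
    }
}
```

The seven-day gap matches the default value of dfs.namenode.delegation.token.max-lifetime in HDFS. A token cannot be renewed past its maxDate, so an application that runs longer than that has to obtain a new token rather than only renew the old one.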

 

The ApplicationMaster keeps requesting more and more containers [Class: 
ApplicationMaster, Method: run()] but is not able to start Ignite inside any of 
them because of the expired/missing delegation token.

This repeats until all the resources in the cluster are allocated to Ignition.
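
Independently of token renewal, the runaway allocation could be limited by no longer requesting containers after a number of consecutive launch failures. The sketch below shows one possible guard. LaunchFailureGuard and its methods are hypothetical and do not exist in Ignite-YARN, and the threshold is an arbitrary choice.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical guard that stops container requests after repeated launch failures. */
public class LaunchFailureGuard {
    private final int maxConsecutiveFailures;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    public LaunchFailureGuard(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures <= 0)
            throw new IllegalArgumentException("maxConsecutiveFailures must be positive");
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Call when a container was launched successfully; resets the streak. */
    public void onLaunchSucceeded() {
        consecutiveFailures.set(0);
    }

    /** Call from the catch block that currently only logs the launch error. */
    public void onLaunchFailed() {
        consecutiveFailures.incrementAndGet();
    }

    /** True while the failure streak is below the configured limit. */
    public boolean mayRequestMore() {
        return consecutiveFailures.get() < maxConsecutiveFailures;
    }
}
```

The request loop in run() would then check mayRequestMore() before asking the ResourceManager for another container, and stop or back off once it returns false.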

  was:
Hello,

As Ignite-YARN is a long running application in YARN environment it should have 
a mechanism to renew the delegation token.

 

In Ignite-YARN, when the ApplicationMaster is started, it acquires Delegation 
tokens and stores in a ByteBuffer[Class: ApplicationMaster, Method: init()].

This ByteBuffer with token information is given to all the containers received 
from ResourceManager [Class: ApplicationMaster, Method: 
onContainersAllocated()].

Everything works fine till the life time of the delegation token.

Once the delegation token expires, the ApplicationMaster is not able to start 
Ignite inside containers it receive and below exception occurs

 

WARNING: Error launching container container_e37_1525770484566_11472_01_000020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for srv-ae-rtp2: HDFS_DELEGATION_TOKEN [[email protected]|mailto:[email protected]], renewer=yarn, realUser=, issueDate=1527498085587, maxDate=1528102885587, sequenceNumber=231841, masterKeyId=631) can't be found in cache
    at org.apache.hadoop.ipc.Client.call(Client.java:1504)
    at org.apache.hadoop.ipc.Client.call(Client.java:1441)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
    at org.apache.ignite.yarn.utils.IgniteYarnUtils.setupFile(IgniteYarnUtils.java:65)
    at org.apache.ignite.yarn.ApplicationMaster.onContainersAllocated(ApplicationMaster.java:131)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:292)

 

ApplicationMaster keeps on asking for more and more containers [Class: 
ApplicationMaster, Method: run()] but not able to start Ignite inside any of 
the containers due to the expired/missing delegation token.

This repeats until all the resources in the cluster are allocated to Ignition.


> Ignite Yarn HDFS delegation token expiry issue
> ----------------------------------------------
>
>                 Key: IGNITE-8812
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8812
>             Project: Ignite
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.3
>         Environment:  Version : 2.3.0
> Module : Ignite-YARN
> Class : ApplicationMaster
> Platform : Cloudera 2.6.0-cdh5.11.2
>            Reporter: Nithyananthan N P
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
