Rajkumar Singh created HIVE-23753:
-------------------------------------

             Summary: Make LLAP Secretmanager token path configurable
                 Key: HIVE-23753
                 URL: https://issues.apache.org/jira/browse/HIVE-23753
             Project: Hive
          Issue Type: Bug
          Components: llap
    Affects Versions: 4.0.0
            Reporter: Rajkumar Singh
            Assignee: Rajkumar Singh


On a very busy LLAP cluster, if for some reason the tokens under the
zkdtsm_hive_llap0 ZooKeeper path are not cleaned up, LLAP daemon startup takes
a very long time. This can lead to a service outage if the LLAP daemons have
not started by the time the retries for checking the LLAP app status are
exhausted. A jstack of the LLAP daemon suggests that it traverses the
zkdtsm_hive_llap0 ZooKeeper path before starting the secret manager:


{code:java}
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1386)
        - locked <0x00007fef36cdd338> (a org.apache.zookeeper.ClientCnxn$Packet)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
        at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
        at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
        at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
        at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:142)
        at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:138)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.internalRebuildNode(PathChildrenCache.java:591)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:331)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300)
        at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:370)
        at org.apache.hadoop.hive.llap.security.SecretManager.startThreads(SecretManager.java:82)
        at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:223)
        at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:218)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
        at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:218)
        at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:212)
        at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.<init>(LlapDaemon.java:279)
{code}
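
The report does not say how the token path would be made configurable, so the
following is only a minimal sketch of one possible shape: read an optional
override from configuration and fall back to the current znode name. The class
name, the property key, and the "zkdtsm_" + cluster id default are all
assumptions made for illustration; none of them is taken from Hive or Hadoop.

```java
import java.util.Properties;

// Hypothetical sketch: this class and the property key do not exist in Hive.
final class TokenPathSketch {
    // Assumed property name, invented for this illustration only.
    static final String TOKEN_PATH_KEY = "example.llap.secretmanager.token.path";

    // Assumed default: reproduces the znode name seen in this report
    // when clusterId is "hive_llap0".
    static String defaultTokenPath(String clusterId) {
        return "zkdtsm_" + clusterId;
    }

    // Returns the configured path if one is set and non-blank,
    // otherwise the default derived from the cluster id.
    static String resolveTokenPath(Properties conf, String clusterId) {
        String configured = conf.getProperty(TOKEN_PATH_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return defaultTokenPath(clusterId);
        }
        return configured.trim();
    }
}
```

With an override like this, an operator could point the daemons at a fresh,
empty znode when the old one has accumulated stale tokens, so that startup is
no longer tied to traversing the existing children.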




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
