Rajkumar Singh created HIVE-23753:
-------------------------------------
Summary: Make LLAP Secretmanager token path configurable
Key: HIVE-23753
URL: https://issues.apache.org/jira/browse/HIVE-23753
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh
In a very busy LLAP cluster, if the tokens under the zkdtsm_hive_llap0
ZooKeeper path are not cleaned up for some reason, LLAP daemon startup takes a
very long time. This may lead to a service outage if the LLAP daemons have not
started by the time the retries for checking the LLAP app status are exhausted.
A jstack of the LLAP daemon shows it traversing the zkdtsm_hive_llap0 ZooKeeper
path before starting the secret manager.
{code:java}
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1386)
- locked <0x00007fef36cdd338> (a org.apache.zookeeper.ClientCnxn$Packet)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:142)
at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:138)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.internalRebuildNode(PathChildrenCache.java:591)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:331)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300)
at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:370)
at org.apache.hadoop.hive.llap.security.SecretManager.startThreads(SecretManager.java:82)
at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:223)
at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:218)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:218)
at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:212)
at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.<init>(LlapDaemon.java:279)
{code}
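A minimal sketch of what a configurable token path could look like. This is not the actual patch: the property name example.llap.zk.dtsm.token.path and the class TokenPathSketch are hypothetical, and the default is assumed to be a fixed "zkdtsm_" prefix plus a cluster name, as the zkdtsm_hive_llap0 path suggests. It uses java.util.Properties in place of the Hive/Hadoop configuration classes so that it runs standalone.

```java
import java.util.Properties;

public class TokenPathSketch {
    // Hypothetical property name; the issue does not say which key the fix uses.
    public static final String TOKEN_PATH_KEY = "example.llap.zk.dtsm.token.path";

    // Assumed default layout: fixed prefix plus cluster name (e.g. zkdtsm_hive_llap0).
    public static final String DEFAULT_PREFIX = "zkdtsm_";

    // Returns the configured token path if one is set and non-blank,
    // otherwise falls back to the assumed default for the given cluster.
    public static String resolveTokenPath(Properties conf, String clusterId) {
        String configured = conf.getProperty(TOKEN_PATH_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_PREFIX + clusterId;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // No override set: the default path is used.
        System.out.println(resolveTokenPath(conf, "hive_llap0"));

        // Override set: the daemon would start against a different znode.
        conf.setProperty(TOKEN_PATH_KEY, "zkdtsm_hive_llap0_v2");
        System.out.println(resolveTokenPath(conf, "hive_llap0"));
    }
}
```

With an override like this, an operator could point the daemons at a fresh znode when the old one has accumulated stale tokens, instead of waiting for the path traversal shown in the jstack to finish on every startup.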
--
This message was sent by Atlassian Jira
(v8.3.4#803005)