risdenk commented on PR #1743: URL: https://github.com/apache/solr/pull/1743#issuecomment-1746825812
Some of the Hadoop test failures were ordinary thread leaks, which were handled by [de729bb](https://github.com/apache/solr/pull/1743/commits/de729bb20fc41b08552eb79d7a037d176285a711).

Another subset of failures was more interesting. I found a solution to those: [40a8228](https://github.com/apache/solr/pull/1743/commits/40a82288a5a4999457a9222def4a7f030dbc85c0)

The failure was that Solr, through Hadoop's `ZKDelegationTokenSecretManager`, could not create a znode because it already existed. There is an existence check before the create, but it is racy when multiple Solr instances start up at the same time - https://github.com/apache/hadoop/blame/rel/release-3.3.6/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L270

There is probably a fix in `ZKDelegationTokenSecretManager` that would avoid the race condition, but making Solr startup more serial in tests worked.

```
236 ERROR (jetty-launcher-8-thread-1) [n:127.0.0.1:56203_solr] o.a.s.s.CoreContainerProvider Could not start Solr. Check solr/home property and the logs => java.lang.RuntimeException: Could not start class org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager: java.io.IOException: Could not create namespace
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.init(DelegationTokenManager.java:149)
java.lang.RuntimeException: Could not start class org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager: java.io.IOException: Could not create namespace
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.init(DelegationTokenManager.java:149) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.initTokenManager(DelegationTokenAuthenticationHandler.java:163) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.init(DelegationTokenAuthenticationHandler.java:131) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.initializeAuthHandler(AuthenticationFilter.java:194) ~[hadoop-auth-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.initializeAuthHandler(DelegationTokenAuthenticationFilter.java:215) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.solr.security.hadoop.HadoopAuthFilter.initializeAuthHandler(HadoopAuthFilter.java:124) ~[main/:?]
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:180) ~[hadoop-auth-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.init(DelegationTokenAuthenticationFilter.java:181) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.solr.security.hadoop.HadoopAuthFilter.init(HadoopAuthFilter.java:75) ~[main/:?]
    at org.apache.solr.security.hadoop.HadoopAuthPlugin.init(HadoopAuthPlugin.java:135) ~[main/:?]
    at org.apache.solr.core.CoreContainer.initializeAuthenticationPlugin(CoreContainer.java:569) ~[solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.core.CoreContainer.reloadSecurityProperties(CoreContainer.java:1185) ~[solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.core.CoreContainer.loadInternal(CoreContainer.java:854) ~[solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:763) ~[solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.servlet.CoreContainerProvider.createCoreContainer(CoreContainerProvider.java:427) ~[solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.servlet.CoreContainerProvider.init(CoreContainerProvider.java:246) [solr-core-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:405) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.setStarted(AbstractLifeCycle.java:253) [jetty-util-10.0.16.jar:10.0.16]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:94) [jetty-util-10.0.16.jar:10.0.16]
    at org.apache.solr.embedded.JettySolrRunner.retryOnPortBindFailure(JettySolrRunner.java:614) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.embedded.JettySolrRunner.start(JettySolrRunner.java:552) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.embedded.JettySolrRunner.start(JettySolrRunner.java:523) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.cloud.MiniSolrCloudCluster.startJettySolrRunner(MiniSolrCloudCluster.java:508) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at org.apache.solr.cloud.MiniSolrCloudCluster.lambda$new$0(MiniSolrCloudCluster.java:320) [solr-test-framework-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:294) [solr-solrj-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT a3945a2c3710b1a355abdea7a2e63b5353ad0723 [snapshot build, details omitted]]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Could not create namespace
    at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:275) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.init(DelegationTokenManager.java:146) ~[hadoop-common-3.3.6.jar:?]
    ... 28 more
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /solr/security/zkdtsm/ZKDTSMRoot
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:125) ~[zookeeper-3.9.0.jar:3.9.0]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:53) ~[zookeeper-3.9.0.jar:3.9.0]
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1450) ~[zookeeper-3.9.0.jar:3.9.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl$18.call(CreateBuilderImpl.java:1223) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl$18.call(CreateBuilderImpl.java:1193) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93) ~[curator-client-5.2.0.jar:?]
    at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1190) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:605) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:595) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:573) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl$4.forPath(CreateBuilderImpl.java:461) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.curator.framework.imps.CreateBuilderImpl$4.forPath(CreateBuilderImpl.java:391) ~[curator-framework-5.2.0.jar:5.2.0]
    at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:272) ~[hadoop-common-3.3.6.jar:?]
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.init(DelegationTokenManager.java:146) ~[hadoop-common-3.3.6.jar:?]
    ... 28 more
```
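For anyone following along, the check-then-create race described in this comment can be sketched in plain Java. This is only an illustration under stated assumptions: `FakeZk`, its `NodeExistsException`, and the `betweenCheckAndCreate` hook are hypothetical in-memory stand-ins, not the ZooKeeper, Curator, or Hadoop APIs, and the tolerant variant is one possible way to avoid the failure, not what either commit in this PR does.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class NamespaceRaceSketch {
    // Path taken from the NodeExists error in the log.
    static final String PATH = "/solr/security/zkdtsm/ZKDTSMRoot";

    /** Hypothetical stand-in for ZooKeeper's NodeExistsException. */
    static final class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** Hypothetical in-memory stand-in for a znode store. */
    static final class FakeZk {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();

        boolean exists(String path) {
            return nodes.contains(path);
        }

        void create(String path) throws NodeExistsException {
            // Set.add returns false when the path is already present.
            if (!nodes.add(path)) {
                throw new NodeExistsException(path);
            }
        }
    }

    /** Simulates a second instance creating the same node; it ignores "already exists". */
    static Runnable otherInstance(FakeZk zk, String path) {
        return () -> {
            try {
                zk.create(path);
            } catch (NodeExistsException ignored) {
                // The node is already there, which is all the other instance needs.
            }
        };
    }

    /** Check-then-create: throws if another instance creates the node after the check. */
    static void createRacy(FakeZk zk, String path, Runnable betweenCheckAndCreate)
            throws NodeExistsException {
        if (!zk.exists(path)) {
            betweenCheckAndCreate.run(); // the window in which the race is lost
            zk.create(path);
        }
    }

    /** Tolerant variant: a concurrent create by another instance counts as success. */
    static void createTolerant(FakeZk zk, String path, Runnable betweenCheckAndCreate) {
        if (!zk.exists(path)) {
            betweenCheckAndCreate.run();
            try {
                zk.create(path);
            } catch (NodeExistsException e) {
                // Someone else created the namespace first; nothing left to do.
            }
        }
    }

    /** Returns true when the racy variant fails with NodeExistsException. */
    static boolean racyCreateFails() {
        FakeZk zk = new FakeZk();
        try {
            createRacy(zk, PATH, otherInstance(zk, PATH));
            return false;
        } catch (NodeExistsException e) {
            return true;
        }
    }

    /** Returns true when the tolerant variant completes and the node exists. */
    static boolean tolerantCreateSucceeds() {
        FakeZk zk = new FakeZk();
        createTolerant(zk, PATH, otherInstance(zk, PATH));
        return zk.exists(PATH);
    }

    public static void main(String[] args) {
        System.out.println("racy create failed: " + racyCreateFails());
        System.out.println("tolerant create succeeded: " + tolerantCreateSucceeds());
    }
}
```

The hook forces the losing interleaving deterministically instead of relying on thread timing, which is why the sketch does not need real concurrency to show the failure mode.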
