On 13/07/2021 12:26, Colvin Cowie wrote:
> What version of Solr are you on?

We observed this on 7.7.

I assumed that SOLR_WAIT_FOR_ZK only applied to embedded ZK instances and not to external ensembles. But going by your explanation, I'll have to revisit that. Thanks for pointing that out.

Worth a shot. I'll see if I can backport this to 7.7.

> I'm not familiar with ZkContainer, it looks to me like the
> SolrDispatchFilter loadNodeConfig(...) will already have been called at the
> point ZkContainer initZooKeeper(...) is called, so unless ZK goes down
> between the two calls, the timeout in ZkContainer should be immaterial
> because a successful connection was already made, so setting
> SOLR_WAIT_FOR_ZK should be sufficient?

Hmm, according to the stack trace we're seeing, Solr is creating a ZK client from within ZkContainer::initZooKeeper. SolrDispatchFilter::loadNodeConfig does not occur in the stack trace. But maybe the order of operations is different between 7.7 and 8.x.

Here's the full stack trace:

org.apache.solr.common.SolrException.log(SolrException.java:159)|null:org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper PASWP01F.dns20.socgen:44011,PASWP01M.dns20.socgen:44011,PASWP04M.dns20.socgen:44011 within 30000 ms
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:201)
    at org.apache.solr.cloud.ZkController.<init>(ZkController.java:334)
    at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:114)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:570)
    at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:253)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:173)
    at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:136)
    at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:750)
    at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734)
    at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:368)
    at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497)
    at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
    at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:852)
    at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:278)
    at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:545)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
    at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:192)
    at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:505)
    at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:151)
    at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:180)
    at org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:453)
    at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:64)
    at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:610)
    at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:529)
    at org.eclipse.jetty.util.Scanner.scan(Scanner.java:392)
    at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:313)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:150)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:579)
    at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:240)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:138)
    at org.eclipse.jetty.server.Server.start(Server.java:415)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
    at org.eclipse.jetty.server.Server.doStart(Server.java:382)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1572)
    at org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1512)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1511)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.eclipse.jetty.start.Main.invokeMain(Main.java:220)
    at org.eclipse.jetty.start.Main.start(Main.java:490)
    at org.eclipse.jetty.start.Main.main(Main.java:77)
Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper PASWP01F.dns20.socgen:44011,PASWP01M.dns20.socgen:44011,PASWP04M.dns20.socgen:44011 within 30000 ms
    at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:250)
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:193)
    ... 53 more
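
For reference, a minimal sketch of how the SOLR_WAIT_FOR_ZK setting discussed above would be applied. It assumes the bin/solr start script in use reads the variable from solr.in.sh and that the value is in seconds; the value 60 is only an illustration, and I have not verified that it changes the 30000 ms timeout seen in the ZkContainer code path on 7.7.

```shell
# solr.in.sh fragment (sketch, not verified on 7.7).
# Assumption: bin/solr picks this up and waits up to this many
# seconds for ZooKeeper before giving up at startup.
SOLR_WAIT_FOR_ZK="60"
```

If the trace above still reports "within 30000 ms" after a restart with this set, the timeout is coming from somewhere else.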