I've also seen timeouts with zkCli.sh from Solr 8.4 when connecting to three ZK
nodes where the first is not accessible. Solr 8.4 ships ZK 3.5.5, while 7.x has ZK 3.4.x.
Jan Høydahl
> On 10 Jan 2020, at 17:44, Markus Jelsma wrote:
>
> Hello,
>
> I have multiple collections, one on 7.5.0 and the rest on 8.3.1. They all
> share the same ZK ensemble and have the same ZK connection string. The first
> ZK address in the connection string is not reachable (it seems to be
> firewalled); the rest are accessible.
>
> The 7.5.0 nodes do not appear to have problems with a partially accessible ZK
> ensemble. They log a simple warning, but the cores on the nodes keep starting
> up nicely.
>
> I have trouble starting up 8.x nodes because they time out when connecting to
> ZK. The logs are filled with:
>
> 2020-01-10 16:33:33.146 WARN (qtp1620948294-21) [ ]
> o.a.s.h.a.ZookeeperStatusHandler Failed talking to zookeeper bad_node1:2181
> => org.apache.solr.common.SolrException: Failed talking to Zookeeper
> 89.188.14.28:2181
>     at org.apache.solr.handler.admin.ZookeeperStatusHandler.getZkRawResponse(ZookeeperStatusHandler.java:245)
>
> And I get this one for one of the cores on a restarted node:
>
> 2020-01-10 16:31:11.752 ERROR
> (searcherExecutor-12-thread-1-processing-n:s2.io:8983_solr
> x:documents_shard2_replica_t19 c:documents s:shard2 r:core_node20)
> [c:documents s:shard2 r:core_node20 x:documents_shard2_replica_t19]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>     at org.apache.solr.handler.component.SearchHandler.initComponents(SearchHandler.java:183)
>
> This one is probably preventing the core from being loaded properly. On
> the same node, however, there is another shard of the same collection that
> did start up normally, as did the other cores on the node.
>
> Is this a known 8.x problem? I can work around it by temporarily removing the
> bad node address from the ZK connection string, but that's all.
>
> Thanks,
> Markus
>
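
The workaround described above (temporarily dropping the unreachable address from the ZK connection string) could be preceded by a quick pre-flight check. What follows is only a minimal Python sketch, not anything shipped with Solr or ZooKeeper: the function names are made up for illustration, it assumes plain host:port entries (no IPv6 literals) with an optional chroot suffix, and a successful TCP connect only shows that the port is open, not that the ZK server behind it is healthy.

```python
import socket


def split_zk_hosts(conn_str):
    """Split 'h1:2181,h2:2181/chroot' into ([(host, port), ...], '/chroot')."""
    hosts_part, slash, chroot = conn_str.partition("/")
    hosts = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            # No port given: fall back to the default ZK client port.
            host, port = entry, "2181"
        hosts.append((host, int(port)))
    return hosts, (slash + chroot if slash else "")


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def reachable_conn_str(conn_str, timeout=2.0):
    """Rebuild the connection string from the hosts that accept a connection."""
    hosts, chroot = split_zk_hosts(conn_str)
    good = [f"{h}:{p}" for h, p in hosts if is_reachable(h, p, timeout)]
    return ",".join(good) + chroot
```

Calling reachable_conn_str("bad_node1:2181,zk2:2181,zk3:2181/solr") (zk2, zk3 and the /solr chroot are placeholder names) would return the string without any address that refuses or times out, which could then be used as the temporary connection string.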