Re: Solr 8.x Startup problems when ZK is partially unavailable

2020-01-10 Thread Jan Høydahl
I’ve also seen timeouts with the zkCli.sh of Solr 8.4 when connecting to three 
ZK servers of which the first is not accessible. Solr 8.4 ships with ZK 3.5.5, 
while 7.x has ZK 3.4.x.

Jan Høydahl

> On 10 Jan 2020, at 17:44, Markus Jelsma wrote:
> 
> Hello,
> 
> I have multiple collections, one on 7.5.0 and the rest on 8.3.1. They all 
> share the same ZK ensemble and have the same ZK connection string. The first 
> ZK address in the connection string is not reachable (it seems to be 
> firewalled); the rest are accessible.
> 
> The 7.5.0 nodes do not appear to have problems with a partially accessible ZK 
> ensemble. They log a simple warning, but the cores on those nodes keep 
> starting up nicely.
> 
> I have trouble starting up the 8.x nodes because they time out when 
> connecting to ZK. The logs are filled with:
> 
> 2020-01-10 16:33:33.146 WARN  (qtp1620948294-21) [   ] o.a.s.h.a.ZookeeperStatusHandler Failed talking to zookeeper bad_node1:2181 => org.apache.solr.common.SolrException: Failed talking to Zookeeper 89.188.14.28:2181
>     at org.apache.solr.handler.admin.ZookeeperStatusHandler.getZkRawResponse(ZookeeperStatusHandler.java:245)
> 
> And I get this one for one of the cores on a restarted node:
> 
> 2020-01-10 16:31:11.752 ERROR (searcherExecutor-12-thread-1-processing-n:s2.io:8983_solr x:documents_shard2_replica_t19 c:documents s:shard2 r:core_node20) [c:documents s:shard2 r:core_node20 x:documents_shard2_replica_t19] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>     at org.apache.solr.handler.component.SearchHandler.initComponents(SearchHandler.java:183)
> 
> This one is probably preventing the core from being loaded properly. On the 
> same node, however, there is another shard of the same collection that did 
> start up normally, as did the other cores on the node.
> 
> Is this a known 8.x problem? I can work around it by temporarily removing the 
> bad node address from the ZK connection string, but that's all.
> 
> Thanks,
> Markus
> 
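
To check which ensemble members answer at all, here is a rough Python sketch. 
As far as I know, ZookeeperStatusHandler (the class in the quoted warning) 
queries each ZooKeeper host separately with four-letter commands over a plain 
socket, and the sketch probes the hosts in a similar way. The function names 
and the 2-second timeout are my own choices for illustration, not anything 
taken from Solr. Note that ZK 3.5 only answers four-letter commands listed in 
4lw.commands.whitelist, so a reachable server may reply with a refusal message 
instead of "imok".

```python
import socket


def four_letter_word(host, port, command="ruok", timeout=2.0):
    """Send a ZooKeeper four-letter command and return the raw text reply.

    Hypothetical helper: the timeout covers both the connect and each read.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def probe_ensemble(servers, timeout=2.0):
    """Return {"host:port": reply or error text} for each (host, port) pair."""
    results = {}
    for host, port in servers:
        key = f"{host}:{port}"
        try:
            reply = four_letter_word(host, port, "ruok", timeout)
            results[key] = reply.strip() or "empty reply"
        except OSError as exc:  # refused, timed out, unresolvable, ...
            results[key] = f"unreachable: {exc}"
    return results
```

Calling probe_ensemble with every (host, port) pair from the connection string 
should show quickly whether only the first address is affected.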


Solr 8.x Startup problems when ZK is partially unavailable

2020-01-10 Thread Markus Jelsma
Hello,

I have multiple collections, one on 7.5.0 and the rest on 8.3.1. They all share 
the same ZK ensemble and have the same ZK connection string. The first ZK 
address in the connection string is not reachable (it seems to be firewalled); 
the rest are accessible.

The 7.5.0 nodes do not appear to have problems with a partially accessible ZK 
ensemble. They log a simple warning, but the cores on those nodes keep starting 
up nicely.

I have trouble starting up the 8.x nodes because they time out when connecting 
to ZK. The logs are filled with:

2020-01-10 16:33:33.146 WARN  (qtp1620948294-21) [   ] o.a.s.h.a.ZookeeperStatusHandler Failed talking to zookeeper bad_node1:2181 => org.apache.solr.common.SolrException: Failed talking to Zookeeper 89.188.14.28:2181
    at org.apache.solr.handler.admin.ZookeeperStatusHandler.getZkRawResponse(ZookeeperStatusHandler.java:245)

And I get this one for one of the cores on a restarted node:

2020-01-10 16:31:11.752 ERROR (searcherExecutor-12-thread-1-processing-n:s2.io:8983_solr x:documents_shard2_replica_t19 c:documents s:shard2 r:core_node20) [c:documents s:shard2 r:core_node20 x:documents_shard2_replica_t19] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.initComponents(SearchHandler.java:183)

This one is probably preventing the core from being loaded properly. On the 
same node, however, there is another shard of the same collection that did 
start up normally, as did the other cores on the node.

Is this a known 8.x problem? I can work around it by temporarily removing the 
bad node address from the ZK connection string, but that's all.

Thanks,
Markus
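
The workaround mentioned above (dropping the unreachable address from the ZK 
connection string) could be scripted roughly as follows. This is only a sketch 
under several assumptions: the helper functions are hypothetical, zk2, zk3 and 
the /solr chroot in the usage note are made-up names, IPv6 literals are not 
handled, and a successful TCP connect says nothing about whether the server is 
a healthy ensemble member.

```python
import socket

DEFAULT_ZK_PORT = 2181  # ZooKeeper's default client port


def parse_zk_connection_string(conn_str):
    """Split 'h1:2181,h2:2181/chroot' into ([(host, port), ...], chroot)."""
    hosts_part, slash, chroot = conn_str.partition("/")
    chroot = slash + chroot  # '' when the string has no chroot suffix
    servers = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if host:
            servers.append((host, int(port)))
        else:  # no explicit port given
            servers.append((entry, DEFAULT_ZK_PORT))
    return servers, chroot


def tcp_reachable(host, port, timeout=2.0):
    """True if a TCP connection can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_connection_string(conn_str, is_reachable=tcp_reachable):
    """Rebuild the connection string with only the reachable servers."""
    servers, chroot = parse_zk_connection_string(conn_str)
    alive = [f"{h}:{p}" for h, p in servers if is_reachable(h, p)]
    if not alive:
        raise RuntimeError("no ZooKeeper server in the string is reachable")
    return ",".join(alive) + chroot
```

For example, reachable_connection_string("bad_node1:2181,zk2:2181,zk3:2181/solr") 
would return the same string without any address that refuses or times out on 
connect, and the result could be passed to Solr as the ZK host setting until 
the firewall issue is resolved.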