Hi Reej,

You can't change this behavior; this is how ZooKeeper works. You need at least *(2*N + 1)* nodes in the ZooKeeper ensemble if you want to tolerate *N* ZooKeeper node failures.
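To make the arithmetic concrete, here is a small illustrative sketch in plain Java with no ZooKeeper or Solr dependency. The class and method names are made up for this example and are not part of any ZooKeeper API; the code only encodes the majority rule stated above.

```java
// Illustrative only: QuorumMath and its methods are hypothetical names,
// not part of ZooKeeper or SolrJ.
public class QuorumMath {

    /** Smallest strict majority of an ensemble with the given number of nodes. */
    public static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of node failures after which a majority is still alive. */
    public static int toleratedFailures(int ensembleSize) {
        return (ensembleSize - 1) / 2;
    }

    /** Minimum ensemble size needed to tolerate the given number of failures (2N + 1). */
    public static int nodesNeeded(int failures) {
        return 2 * failures + 1;
    }

    public static void main(String[] args) {
        // Compare a few common ensemble sizes.
        for (int n : new int[] {3, 4, 5}) {
            System.out.printf("ensemble=%d quorum=%d toleratedFailures=%d%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
        System.out.println("nodes needed to tolerate 2 failures: " + nodesNeeded(2));
    }
}
```

For a 3-node ensemble this gives a quorum of 2 and a tolerance of 1 failure, so losing 2 nodes leaves no majority. A 4-node ensemble tolerates no more failures than a 3-node one, which is why odd sizes are the usual choice.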
This is also well documented in the Solr Reference Guide:
https://solr.apache.org/guide/8_8/setting-up-an-external-zookeeper-ensemble.html

"For a ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. *To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines*."

This restriction is by design, so that ZooKeeper functions correctly. If ZooKeeper could respond to requests with just one node alive, a network partition could leave isolated nodes each serving requests on their own, and the cluster could no longer be guaranteed to be partition tolerant in all cases.

Having said that, if you still want ZooKeeper to keep working even when 2 nodes are down (which is unlikely in most cases), increase your ZooKeeper ensemble to 5 nodes.

Thanks,
Vinay

On Tue, Jan 25, 2022 at 2:15 PM Reej Nayagam <[email protected]> wrote:

> Hi All,
>
> We are using solr 8.82 cloud setup with zk ensemble 3.6.3 (3 zk servers).
> We have a HA monitoring system which will ping every 5 mins to check if the
> solr URL or shards are available and ping is success. If there is a failure
> (if solr servers are down) it'll flag to switch the searching from using
> solr to DB search.
>
> Now we have an issue we are connecting to solr passing the zk hosts. Now
> if 2 of the zookeepers are down and if we are calling the
> solrping.process(client), it's processing for so
> long as all the zk are down and going to a stuck thread and slowing down
> the weblogic application server.
>
> This works fine when the majority of the
> zookeepers are up and running
>
> Method Snippet
>
> private int pingRepo(zkHostList, corename)
> {
>     SolrClient client = SolrConnectionUtil.getSolrClient(zkHostList, "ping");
>     ((CloudSolrClient) client).setDefaultCollection(corename);
>     SolrPing ping = new SolrPing();
>     SolrPingResponse resp;
>     try {
>         resp = ping.process(client);
>         return resp.getQTime();
>     }
>     catch (Exception e)
>     {}
>     return -1;
> }
>
> Can anyone suggest how to handle this, if the zookeepers are down. Thank
> you
>
> Regards
> Reej
>
> Sent from my iPhone
