Thank you Vinay

On Tue, 25 Jan 2022 at 7:19 PM, Vinay Rajput <[email protected]> wrote:
> Hi Reej,
>
> You can't change this behavior; this is how ZooKeeper works. You need at
> least *(2*N + 1)* nodes in the ZooKeeper cluster if you want to tolerate
> *N* ZooKeeper node failures.
>
> This is also well documented in the Solr ref guide:
>
> https://solr.apache.org/guide/8_8/setting-up-an-external-zookeeper-ensemble.html
> "For a ZooKeeper service to be active, there must be a majority of
> non-failing machines that can communicate with each other. *To create a
> deployment that can tolerate the failure of F machines, you should count on
> deploying 2xF+1 machines*."
>
> This restriction is by design, so that ZooKeeper functions properly.
> If ZooKeeper could respond to requests with just one node alive, the
> cluster could no longer be guaranteed to be partition tolerant in
> all cases.
>
> Having said that, if you still want ZooKeeper to keep working even when 2
> nodes are down (somewhat unlikely in most cases), then increase your
> ZooKeeper cluster to 5 nodes.
>
> Thanks,
> Vinay
>
> On Tue, Jan 25, 2022 at 2:15 PM Reej Nayagam <[email protected]> wrote:
>
> > Hi All,
> >
> > We are using a Solr 8.82 cloud setup with a ZooKeeper 3.6.3 ensemble
> > (3 zk servers). We have an HA monitoring system which pings every 5 mins
> > to check that the Solr URL or shards are available and that the ping is
> > successful. If there is a failure (if the Solr servers are down), it’ll
> > flag to switch the searching from using Solr to DB search.
> >
> > Now we have an issue: we connect to Solr by passing the zk hosts. If 2 of
> > the ZooKeepers are down and we call solrping.process(client), it keeps
> > processing for a very long time as the zk are down, goes into a stuck
> > thread, and slows down the WebLogic application server.
> >
> > This works fine when the majority of the ZooKeepers are up and running.
> >
> > Method Snippet
> >
> > private int pingRepo(zkHostList, corename)
> > {
> >   SolrClient client = SolrConnectionUtil.getSolrClient(zkHostList, "ping");
> >   ((CloudSolrClient) client).setDefaultCollection(corename);
> >   SolrPing ping = new SolrPing();
> >   SolrPingResponse resp;
> >   try {
> >     resp = ping.process(client);
> >     return resp.getQTime();
> >   }
> >   catch (Exception e)
> >   {}
> >   return -1;
> > }
> >
> > Can anyone suggest how to handle this if the ZooKeepers are down? Thank
> > you
> >
> > Regards
> > Reej
> >
> > Sent from my iPhone
>
-- 
*Thanks,*
*Reej*
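For reference, the 2N + 1 rule quoted from the Solr ref guide can be checked with a few lines of arithmetic. This is only an illustrative sketch: the class and method names (ZkQuorum, ensembleSizeFor, and so on) are made up for this note and are not part of ZooKeeper or SolrJ.

```java
// Hypothetical helper illustrating the majority rule from the thread.
// None of these names come from ZooKeeper or SolrJ.
final class ZkQuorum {

    /** Smallest ensemble size that tolerates the given number of failed nodes (2F + 1). */
    static int ensembleSizeFor(int failures) {
        return 2 * failures + 1;
    }

    /** Number of nodes that must be up to form a strict majority. */
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of node failures an ensemble of this size can survive. */
    static int toleratedFailures(int ensembleSize) {
        return (ensembleSize - 1) / 2;
    }

    /** True if the nodes still up form a majority of the ensemble. */
    static boolean hasQuorum(int ensembleSize, int nodesUp) {
        return nodesUp >= quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        // The setup in the thread: 3 servers, 2 of them down, so 1 is up.
        System.out.println("3 nodes, 1 up, quorum: " + hasQuorum(3, 1));
        // The suggested fix: 5 servers, 2 of them down, so 3 are up.
        System.out.println("5 nodes, 3 up, quorum: " + hasQuorum(5, 3));
        System.out.println("nodes needed to tolerate 2 failures: " + ensembleSizeFor(2));
    }
}
```

With 3 servers the ensemble survives only one failure, which matches the behavior described above: the ping works while a majority is up and hangs once 2 of the 3 are down.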

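On the stuck-thread part of the question, one possible approach (not something proposed in the thread itself) is to run the ping on a separate worker thread and give up after a fixed wait, so the calling WebLogic thread is never blocked longer than the timeout. The sketch below uses only java.util.concurrent; BoundedPing and pingWithTimeout are hypothetical names, and the Callable stands in for a call such as the pingRepo method above. It assumes the wrapped call either finishes or responds to interruption; a call that ignores interrupts would keep occupying a worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper: bounds how long a health check may block the caller.
final class BoundedPing {

    // Small fixed pool of daemon threads, so hung pings cannot pile up
    // without limit or keep the JVM alive on shutdown.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "bounded-ping");
        t.setDaemon(true);
        return t;
    });

    /** Runs the ping and returns its result, or -1 on failure or timeout. */
    static int pingWithTimeout(Callable<Integer> ping, long timeoutMillis) {
        Future<Integer> future = POOL.submit(ping);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; the caller moves on
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            future.cancel(true);
            return -1;
        } catch (ExecutionException e) {
            return -1; // the ping itself threw
        }
    }

    public static void main(String[] args) {
        // A ping that answers quickly returns its own value.
        System.out.println(pingWithTimeout(() -> 7, 1000));
        // A ping that hangs is abandoned after 200 ms and reported as -1.
        System.out.println(pingWithTimeout(() -> { Thread.sleep(5000); return 7; }, 200));
    }
}
```

In the monitoring code this would look roughly like pingWithTimeout(() -> pingRepo(zkHostList, corename), 10000), where the 10-second limit is an arbitrary example value. A result of -1 can then be treated the same way as a failed ping, triggering the switch to DB search.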