In your setup, the load balancer prevents single points of failure. Since you're pinging a URL, what happens if that node dies or is turned off? Your PHP program has no way of knowing what to do, but the load balancer does.
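To make that concrete, here is a minimal Python sketch (not PHP, and not anything Solr ships) of the failover a load balancer gives you over a single hard-coded URL. Everything in it is made up for illustration: the node URLs, `query_with_failover`, and the `send` callable. `send` stands in for whatever HTTP call your program makes and is assumed to raise ConnectionError when a node is unreachable.

```python
import random
from typing import Callable, Sequence


class AllNodesDown(Exception):
    """Raised when no node in the list answered."""


def query_with_failover(nodes: Sequence[str], send: Callable[[str], str]) -> str:
    """Try the nodes in random order and return the first successful response.

    `send` is a hypothetical callable that issues the request to one base URL
    and raises ConnectionError if that node cannot be reached.
    """
    order = list(nodes)
    random.shuffle(order)  # spread the load instead of always hitting node 1
    for base_url in order:
        try:
            return send(base_url)
        except ConnectionError:
            continue  # this node is down; move on to the next one
    raise AllNodesDown(f"none of the {len(order)} nodes responded")


# Hypothetical node list; the host names and collection name are placeholders.
NODES = [
    "http://solr1:8983/solr/mycollection",
    "http://solr2:8983/solr/mycollection",
]


def fake_send(base_url: str) -> str:
    """Stand-in for a real HTTP request; pretends that solr1 is dead."""
    if "solr1" in base_url:
        raise ConnectionError(base_url)
    return f"response from {base_url}"


if __name__ == "__main__":
    print(query_with_failover(NODES, fake_send))
```

A real load balancer also does health checks and takes dead nodes out of rotation, so treat this only as an outline of the idea.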
Your understanding of Zookeeper's role reflects a common misconception. Zookeeper keeps track of the topology of the collections: which nodes are up, which are down, etc. It does _not_ have anything to do with distributing queries or updates. Imagine a 1,000 node collection. If each and every request had to go through Zookeeper, that would be a bottleneck. Instead, when a node's state changes, it informs Zookeeper, which in turn informs all the other Solr nodes that care.

It looks like this:
- a node starts up.
- as each replica comes up, it informs Zookeeper that it is now "active".
- for each collection with any replica on that node, a "watch" is set on the collection's state.json node in Zookeeper.
- every time that state.json node changes, Zookeeper notifies the node.
- eventually everything has started, all the state changes have been broadcast, and Zookeeper just sits there.
- periodically Zookeeper pings each Solr node, and if one has gone away it informs all the Solr nodes that this node is dead; each Solr node then updates its snapshot of the cluster's topology.

A query comes in to a Solr node and this is what happens:
- the Solr node looks in its Zookeeper information to see where all the replicas for the collection are.
- Solr picks one replica from each shard and sends the subquery to them.
- Solr assembles the response from the subrequests.
- Solr sends the response to the client.

Note that Zookeeper isn't involved at all. In fact, Zookeeper can go away completely and each Solr node will work from its last snapshot of the topology of the network and answer _queries_. Updates will fail completely if Zookeeper falls below quorum, but Zookeeper isn't handling the _update_. It's still Solr knowing that Zookeeper is below quorum and refusing to process an update.

There's more going on of course, but that's the general outline.
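The outline above can be mimicked with a toy Python simulation. Nothing here is Solr or Zookeeper code: `FakeZookeeper`, `SolrNode`, `plan_query`, and the shape of the state dictionary are all assumptions made up for the sketch. It only illustrates the idea that a node keeps a local snapshot that is refreshed through a watch, and plans a query from that snapshot without asking Zookeeper.

```python
import copy
import random


class FakeZookeeper:
    """Toy stand-in: holds the collection state and notifies watchers on change."""

    def __init__(self, state):
        self.state = state
        self.watchers = []

    def watch(self, callback):
        self.watchers.append(callback)
        callback(self.state)  # hand over the current state once

    def set_state(self, state):
        self.state = state
        for callback in self.watchers:
            callback(state)  # broadcast the change to every watcher


class SolrNode:
    """Toy node that answers from its own snapshot of the topology."""

    def __init__(self, name, zk):
        self.name = name
        self.snapshot = {}
        zk.watch(self._on_state_change)

    def _on_state_change(self, state):
        # Keep a private copy so later queries never need to contact Zookeeper.
        self.snapshot = copy.deepcopy(state)

    def plan_query(self):
        """Pick one active replica per shard, using only the local snapshot."""
        plan = {}
        for shard, replicas in self.snapshot.items():
            active = [r["node"] for r in replicas if r["state"] == "active"]
            if not active:
                raise RuntimeError(f"no active replica for {shard}")
            plan[shard] = random.choice(active)
        return plan


# Hypothetical two-shard collection, before and after a topology change.
STATE_V1 = {
    "shard1": [{"node": "nodeA", "state": "active"}, {"node": "nodeB", "state": "down"}],
    "shard2": [{"node": "nodeB", "state": "down"}, {"node": "nodeC", "state": "active"}],
}
STATE_V2 = {
    "shard1": [{"node": "nodeA", "state": "down"}, {"node": "nodeB", "state": "active"}],
    "shard2": [{"node": "nodeB", "state": "active"}, {"node": "nodeC", "state": "down"}],
}

if __name__ == "__main__":
    zk = FakeZookeeper(STATE_V1)
    node = SolrNode("nodeA", zk)
    print(node.plan_query())  # planned from the first snapshot
    zk.set_state(STATE_V2)    # the watch fires and the snapshot is replaced
    print(node.plan_query())
```

Note that `plan_query` never touches the `FakeZookeeper` object, which is the point: if it disappeared, the node would keep planning from its last snapshot.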
Since you're using PHP, your program doesn't know about Zookeeper. All it has is a URL so, as I mentioned above, if that node goes away your PHP program is stuck, because it is not Zookeeper-aware.

If you were using "CloudSolrClient" in SolrJ, it _is_ Zookeeper-aware and you would not need a load balancer. But again, that's because it knows the cluster topology (it registers its own watchers) and can "do the right thing" if something goes away. Zookeeper is still not directly involved in processing queries or updates.

Best,
Erick

On Fri, Jun 29, 2018 at 7:31 PM, Sushant Vengurlekar
<svengurle...@curvolabs.com> wrote:
> Thanks for your reply. I have a follow up question. Why is a load balancer
> needed? Isn't that the job of zookeeper to loadbalance queries across solr
> nodes?
>
> I was under the impression that you send query to zookeeper and it handles
> the rest and sends the response back. Can you please enlighten me on that
> one.
>
> Thank you
>
> On Fri, Jun 29, 2018 at 7:19 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> You send your queries and updates directly to Solr's collection e.g.
>> http://host:port/solr/<your_collection_name>. You can use any Solr node for
>> this request. If the node does not have the collection being queried then
>> the request will be forwarded internally to a Solr instance which has that
>> collection.
>>
>> ZooKeeper is used by Solr's Java client to look up the list of Solr nodes
>> having the collection being queried. But if you are using PHP then you can
>> probably keep a list of Solr nodes in configuration and randomly choose
>> one. A better implementation would be to setup a load balancer and put all
>> Solr nodes behind it and query the load balancer URL in your application.
>>
>> On Sat, Jun 30, 2018 at 7:31 AM Sushant Vengurlekar <
>> svengurle...@curvolabs.com> wrote:
>>
>> > I have a question regarding querying in solrcloud.
>> >
>> > I am working on php code to query solrcloud for search results. Do I send
>> > the query to zookeeper or send it to a particular solr node? How does the
>> > querying process work in general.
>> >
>> > Thank you
>> >
>>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>