On 12/14/2016 7:36 AM, GW wrote:
> I understand accessing solr directly. I'm doing REST calls to a single
> machine.
>
> If I have a cluster of five servers and say three Apache servers, I can
> round robin the REST calls to all five in the cluster?
>
> I guess I'm going to find out. :-) If so I might be better off just
> running Apache on all my solr instances.
If you're running SolrCloud (which uses zookeeper) then sending multiple
query requests to any node will load balance the requests across all
replicas for the collection.  This is an inherent feature of SolrCloud.
Indexing requests will be forwarded to the correct place.

The node you're sending to is a potential single point of failure, which
you can eliminate by putting a load balancer in front of Solr that
connects to at least two of the nodes.  As I just mentioned, SolrCloud
will do further load balancing to all nodes which are capable of serving
the requests.

I use haproxy for a load balancer in front of Solr.  I'm not running in
Cloud mode, but a load balancer would also work for Cloud, and is
required for high availability when your client only connects to one
server and isn't cloud aware.

http://www.haproxy.org/

Solr includes a cloud-aware Java client that talks to zookeeper and
always knows the state of the cloud.  This eliminates the requirement
for a load balancer, but using that client would require that you write
your website in Java.  The PHP clients are third-party software, and as
far as I know, are not cloud-aware.

https://wiki.apache.org/solr/IntegratingSolr#PHP

Some advantages of using a Solr client over creating HTTP requests
yourself:  The code is easier to write, and to read.  You generally do
not need to worry about making sure that your requests are properly
escaped for URLs, XML, JSON, etc.  The response to the requests is
usually translated into data structures appropriate to the language --
your program probably doesn't need to know how to parse XML or JSON.

Thanks,
Shawn
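For what it's worth, here is a minimal sketch of the kind of haproxy
config I mean -- round-robin across two Solr nodes with a health check.
The hostnames, ports, and check URL are placeholders for illustration,
not values from this thread; adjust them for your own cluster:

```
# Minimal haproxy sketch (placeholder hostnames/ports).
# Clients connect to haproxy; haproxy balances across the Solr nodes,
# and SolrCloud does its own further balancing behind that.
frontend solr_front
    mode http
    bind *:8983
    default_backend solr_back

backend solr_back
    mode http
    balance roundrobin
    # Mark a node down if the check URL stops responding.
    option httpchk GET /solr/admin/info/system
    server solr1 solr1.example.com:8983 check
    server solr2 solr2.example.com:8983 check
```

With something like this in place, losing either Solr node (or taking it
down for maintenance) doesn't take your search down with it.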