On 12/14/2016 7:36 AM, GW wrote:
> I understand accessing solr directly. I'm doing REST calls to a single
> machine.
>
> If I have a cluster of five servers and say three Apache servers, I can
> round robin the REST calls to all five in the cluster?
>
> I guess I'm going to find out. :-)  If so I might be better off just
> running Apache on all my solr instances.

If you're running SolrCloud (which uses ZooKeeper), then query requests
sent to any node are load balanced across all replicas for the
collection.  This is an inherent feature of SolrCloud.  Indexing
requests are forwarded to the correct shard leader.

The node you're sending to is a potential single point of failure, which
you can eliminate by putting a load balancer in front of Solr that
connects to at least two of the nodes.  As I just mentioned, SolrCloud
will do further load balancing to all nodes which are capable of serving
the requests.

I use HAProxy as a load balancer in front of Solr.  I'm not running in
Cloud mode, but a load balancer also works for Cloud, and is required
for high availability when your client only connects to one server and
isn't cloud-aware.
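
To make the failover idea concrete, here is a rough client-side sketch
in Python.  It is not a substitute for HAProxy (there are no health
checks and no connection pooling), and the class name, the hostnames,
and the injectable fetch function are all made up for illustration:

```python
import itertools
import urllib.request


def http_get(url, timeout=5.0):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class RoundRobinSolr:
    """Hypothetical helper: rotate requests across Solr nodes and
    move on to the next node when one fails."""

    def __init__(self, base_urls, fetch=http_get):
        if not base_urls:
            raise ValueError("at least one Solr base URL is required")
        self.base_urls = list(base_urls)
        self.fetch = fetch  # injectable so it can be tested offline
        self._next = itertools.cycle(range(len(self.base_urls)))

    def get(self, path_and_query):
        """Try each node at most once, starting with the next in rotation."""
        last_error = None
        for _ in range(len(self.base_urls)):
            base = self.base_urls[next(self._next)]
            try:
                return self.fetch(base + path_and_query)
            except OSError as exc:
                # URLError, HTTPError and socket timeouts are all OSErrors,
                # so HTTP error responses also trigger a retry elsewhere.
                last_error = exc
        raise ConnectionError("no Solr node responded") from last_error
```

With two nodes configured, something like
RoundRobinSolr(["http://solr1:8983/solr", "http://solr2:8983/solr"])
would alternate between them and still answer if one of them is down.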

http://www.haproxy.org/

Solr includes a cloud-aware Java client (CloudSolrClient, part of SolrJ)
that talks to ZooKeeper and always knows the state of the cloud.  This
eliminates the requirement for a load balancer, but using that client
would require that you write your website in Java.

The PHP clients are third-party software, and as far as I know, are not
cloud-aware.

https://wiki.apache.org/solr/IntegratingSolr#PHP

Some advantages of using a Solr client over creating HTTP requests
yourself:  The code is easier to write, and to read.  You generally do
not need to worry about making sure that your requests are properly
escaped for URLs, XML, JSON, etc.  The response to the requests is
usually translated into data structures appropriate to the language --
your program probably doesn't need to know how to parse XML or JSON.
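
As a small illustration of the work a client library takes off your
hands, this is roughly what building a query URL and unpacking the
response looks like by hand in Python.  The function names, base URL,
and collection name are hypothetical, and this covers only URL escaping
and JSON parsing, not escaping of special characters in Solr's own
query syntax:

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, rows=10):
    """Build a /select URL with the query parameters URL-escaped."""
    params = {"q": query, "rows": rows, "wt": "json"}
    # urlencode escapes quotes, colons, brackets and spaces for us.
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def parse_docs(body):
    """Pull the list of matching documents out of a JSON response body."""
    data = json.loads(body)
    return data["response"]["docs"]
```

A query such as title:"solid state" AND price:[10 TO 20] contains
several characters that are not safe in a URL, which is exactly the
kind of detail a client library handles without you thinking about it.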

Thanks,
Shawn
