For a Java app, using SolrJ with a persistent connection pool is probably best. I've used Varnish in front of Solr with PHP clients and had a good experience, but for Java, connecting directly is likely best.
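
On the health check concern in the quoted message: if you do go the proxy route, one option is to point the check at a specific collection rather than at GET /solr/. Below is a rough sketch using only the JDK's HTTP client. It assumes the collection exposes Solr's ping handler at /admin/ping (I believe this is available by default, but verify it on your setup); the class name, the base URL, and the collection name "products" are made up for illustration. Also note that in SolrCloud a node may forward the request to another node hosting the collection, so treat this as "the collection is reachable through this node", not as proof that a local replica is healthy.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: checks one collection instead of the bare Solr instance.
public class CollectionPing {

    // Builds e.g. http://localhost:8983/solr/products/admin/ping?wt=json
    static URI pingUri(String solrBaseUrl, String collection) {
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        return URI.create(base + "/" + collection + "/admin/ping?wt=json");
    }

    // Returns true only when the ping request answers with HTTP 200.
    // Any I/O failure or timeout is treated as "not healthy".
    static boolean isHealthy(HttpClient client, String solrBaseUrl, String collection) {
        HttpRequest request = HttpRequest.newBuilder(pingUri(solrBaseUrl, collection))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass your own base URL and collection as arguments.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8983/solr";
        String collection = args.length > 1 ? args[1] : "products";
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        System.out.println(pingUri(baseUrl, collection) + " healthy="
                + isHealthy(client, baseUrl, collection));
    }
}
```

Most proxies can run the same check natively by setting the health check path to /solr/<collection>/admin/ping, so the Java version is mainly useful for an orchestration probe or for trying the idea out.
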
On Wed, Jun 2, 2021 at 6:07 AM Ixai Lanzagorta <[email protected]> wrote:
>
> Hi, I'm trying to understand the difference between doing load balancing
> via an HTTP proxy vs. using SolrJ's CloudSolrClient. I created a test
> cluster (3x SolrCloud 8.8 nodes, 3x ZooKeeper nodes) and then tested a few
> things:
>
> 1) Configure a proxy to do the load balancing. I figured that:
> - I can delegate health checks to both the proxy, and my container
> orchestration.
> - I can connect to the cluster with SolrJ using the HttpSolrClient with the
> proxy URL.
>
> My concern is that, since health checks are done on the Solr instance (e.g.
> GET /solr/), and not a specific collection, the proxy could redirect a
> request to a healthy node with a faulty collection. Is this a real concern?
>
> 2) Alternatively, I could use CloudSolrClient and configure either the list
> of `solrBaseUrls` or `zkHosts`.
>
> The constraint here is that, when using CloudSolrClient, the SolrJ client
> gets back a list of resolved IP addresses from the SolrCloud cluster or
> ZooKeeper ensemble. The client must be able to reach those resolved IP
> addresses or the connection will fail. Therefore, either the client must
> live in the same network as the servers (subnet, VPN, etc.), or the servers
> must be publicly accessible.
>
> I'm new to Solr, so I wonder if there's any other specifics or alternatives
> that I'm not considering. Are there any particular reasons why you'd
> recommend one setup over the other?
>
> Any insight is appreciated,
> Ixai
