Re: Solrcloud load balancing / failover

2020-12-26 Thread Dominique Bejean
Hi,
Thank you for your response.
Dominique

On Tue, Dec 15, 2020 at 08:06, Shalin Shekhar Mangar
wrote:

> No, the load balancing is based on random selection of replicas and
> CPU is not consulted. There are limited ways to influence the replica
> selection, see
> https://lucene.apache.org/solr/guide/8_4/distributed-requests.html#shards-preference-parameter
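>
> For example, to prefer PULL replicas and fall back to TLOG replicas
> (an illustrative request; the preference values follow the examples on
> that page, the collection name is made up):
>
>   http://solr-host:8983/solr/mycollection/select?q=*:*&shards.preference=replica.type:PULL,replica.type:TLOG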
>
> If a replica fails then the query fails and an error is returned. I
> think (but I am not sure) that SolrJ retries the request on some
> specific errors in which case a different replica may be selected and
> the request may succeed.
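>
> For what it's worth, a minimal SolrJ sketch of setting that preference
> on a query (ZooKeeper host and collection name are made up):
>
>   import java.util.Collections;
>   import java.util.Optional;
>   import org.apache.solr.client.solrj.SolrQuery;
>   import org.apache.solr.client.solrj.impl.CloudSolrClient;
>
>   CloudSolrClient client = new CloudSolrClient.Builder(
>       Collections.singletonList("zk1:2181"), Optional.empty()).build();
>   SolrQuery query = new SolrQuery("*:*");
>   // Restrict the otherwise random choice to PULL replicas when possible.
>   query.set("shards.preference", "replica.type:PULL");
>   client.query("mycollection", query);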
>
> IMO, these are two weak areas of Solr right now. Suggestions/patches
> are welcome :-)
>
> On 12/11/20, Dominique Bejean  wrote:
> > Hi,
> >
> > Is there any load balancing in SolrCloud based on the CPU load of the
> > Solr nodes?
> >
> > If a replica of a shard fails to handle a query, is the query sent to
> > another replica so that it can still be completed?
> >
> > Regards
> >
> > Dominique
> >
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: solrcloud with EKS kubernetes

2020-12-26 Thread Jonathan Tan
Hi Abhishek,

Merry Christmas to you too!
I think it's really a question regarding your indexing speed NFRs.

Have you had a chance to take a look at your IOPS & write bytes/second
graphs for that host & PVC?

I'd suggest that's the first thing to go look at, so that you can find out
whether you're actually IOPS bound or not.
If you are, then it becomes a question of *how* you're indexing, and
whether that can be "slowed down" or not.
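
If it turns out you are IOPS bound, one way to "slow down" indexing
without touching Solr itself is to throttle on the client side: batch
the documents and cap how fast the batches are sent. A rough SolrJ
sketch (collection name, batch size and pause are made-up numbers to
tune against your graphs; docs is whatever your pipeline produces):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;

  SolrClient client =
      new HttpSolrClient.Builder("http://solr-host:8983/solr").build();
  List<SolrInputDocument> batch = new ArrayList<>();
  for (SolrInputDocument doc : docs) {
    batch.add(doc);
    if (batch.size() >= 500) {
      client.add("mycollection", batch);  // one update request per batch
      batch.clear();
      Thread.sleep(200);                  // crude cap on write throughput
    }
  }
  if (!batch.isEmpty()) {
    client.add("mycollection", batch);
  }
  // rely on autoCommit for commits rather than committing per batch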



On Thu, Dec 24, 2020 at 5:55 PM Abhishek Mishra wrote:

> Hi Jonathan,
> Merry Christmas.
> Thanks for the suggestion. To manage IOPS, can we do something like
> rate limiting?
>
> Regards,
> Abhishek
>
>
> On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan  wrote:
>
> > Hi Abhishek,
> >
> > We're running Solr Cloud 8.6 on GKE.
> > 3-node cluster, running 4 CPUs (configured) and 8gb of min & max JVM
> > heap, all with anti-affinity so they never exist on the same node (see
> > the sketch just below).
> > It's got 2 collections of ~13 documents each, 6 shards, 3 replicas
> > each; disk usage on each node is ~54gb (we've got all the shards
> > replicated to all nodes)
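> >
> > (That anti-affinity is just the standard podAntiAffinity rule in the
> > pod spec; a minimal sketch, assuming the Solr pods carry a label like
> > app: solr:)
> >
> >   affinity:
> >     podAntiAffinity:
> >       requiredDuringSchedulingIgnoredDuringExecution:
> >       - labelSelector:
> >           matchLabels:
> >             app: solr
> >         topologyKey: kubernetes.io/hostname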
> >
> > We're also using a 200gb zonal SSD, which *has* been necessary just so
> > that we've got the right IOPS & bandwidth. (That's approximately 6000
> > IOPS for read & write each, and 96MB/s for read & write each.)
> >
> > Various lessons learnt...
> > You definitely don't want them ever on the same Kubernetes node. From a
> > resilience perspective, yes, but also when one Solr node gets busy, they
> > tend to all get busy, so now you'll have resource contention. Recovery
> > can also get very busy and resource intensive, and again, sitting on the
> > same node is problematic. We also saw the need to move to SSDs because
> > of how IOPS bound we were.
> >
> > Did I mention use SSDs? ;)
> >
> > Good luck!
> >
> > On Mon, Dec 14, 2020 at 5:34 PM Abhishek Mishra wrote:
> >
> > > Hi Houston,
> > > Sorry for the late reply. Each shard is around 9GB in size.
> > > Yes, we are giving the pods enough resources. We are currently
> > > using c5.4xlarge.
> > > Xms and Xmx are both 16GB. The machine has 32GB of RAM and 16 cores.
> > > No, I haven't run it outside Kubernetes, but I do have colleagues who
> > > did the same on 7.2 and didn't face any issues with it.
> > > The storage volume is gp2, 50GB.
> > > It's not the search queries where we are facing inconsistencies or
> > > timeouts; it seems some internal admin APIs sometimes have issues.
> > > Adding a new replica to the cluster sometimes results in
> > > inconsistencies, e.g. recovery can take more than an hour.
> > >
> > > Regards,
> > > Abhishek
> > >
> > > On Thu, Dec 10, 2020 at 10:23 AM Houston Putman <houstonput...@gmail.com> wrote:
> > >
> > > > Hello Abhishek,
> > > >
> > > > It's really hard to provide any advice without knowing any
> information
> > > > about your setup/usage.
> > > >
> > > > Are you giving your Solr pods enough resources on EKS?
> > > > Have you run Solr in the same configuration outside of Kubernetes
> > > > in the past without timeouts?
> > > > What type of storage volumes are you using to store your data?
> > > > Are you using headless services to connect your Solr nodes, or
> > > > ingresses?
> > > >
> > > > If this is the first time that you are using this data + Solr
> > > > configuration, maybe it's just that your data within Solr isn't
> > > > optimized for the type of queries that you are doing.
> > > > If you have run it successfully in the past outside of Kubernetes,
> > > > then I would look at the resources that you are giving your pods
> > > > and the storage volumes that you are using.
> > > > If you are using Ingresses, that might be causing slow connections
> > > > between nodes, or between your client and Solr.
> > > >
> > > > - Houston
> > > >
> > > > On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra wrote:
> > > >
> > > > > Hello guys,
> > > > > We are facing some issues (like timeouts, etc.) which are very
> > > > > inconsistent. Could they be related to EKS by any chance? We are
> > > > > using Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS?
> > > > >
> > > > > Regards,
> > > > > Abhishek
> > > > >
> > > >
> > >
> >
>