Hey Houston,

Thanks for putting this together. It's a really cool direction for the operator.

A few quick questions about the proposal:

1. Does this SIP include one or more OOTB implementations for the UtilizeSelectionRequest interface? What heuristics might those work off of?

2. What does "managing it themselves" mean in the "Solr Operator Interfaces" section? (Used in reference to those who set autoscaleReplicas.hpa.create to "false".) Is that flag just meant to control CRUD operations on the HPA itself, or does it also govern the UTILIZENODE/REPLACENODE calls that the operator might make on the user's behalf?

3. Will the operator only drain nodes prior to pod-shutdown when the HPA is in use? Or might it do that even for users who aren't using an HPA, as a response to statefulset size changes, etc.? (And is that in or out of scope for this SIP?)

Hopefully those questions make sense and don't betray that I'm out of my depth haha! Thanks in advance for clarifying!

Best,

Jason

On Wed, Apr 5, 2023 at 9:39 AM Radu Gheorghe <radu.gheor...@sematext.com> wrote:
>
> Hi Houston,
>
> Thanks a lot for putting this together! I'd like to help with Solr
> Operator. Though I have limited availability in the following two months,
> maybe I can still be useful with a few things.
>
> Some comments regarding the SIP:
> - I think that in general it sounds like a good plan. I don't want to get
> in the way instead of helping :)
> - I think it tackles one of the three common use-cases that I've seen for
> autoscaling:
> 1) *AutoAddReplicas*: mostly for enterprise search, some people want to
> expand on query throughput. Combining that with autoscaling sounds very
> appealing.
> 2) *Rotate indices on autoscaling events*, which should work well for
> time-series data. This is what we presented last year at BBuzz and KubeCon
> for Elasticsearch/OpenSearch
> <https://sematext.com/blog/kubernetes-elasticsearch-autoscaling/>.
> The gzip -9 version of it is that you'll probably want to create a new index with
> the right number of shards after scaling out (or back in) to ensure that
> the write workload (which tends to be dominant) is evenly balanced. You may
> or may not want to rebalance previous shards, based on how often you go
> back and forth.
> 3) *Rebalance existing shards as you add/remove nodes*. Which is what this
> SIP tackles, if I'm getting it right.
>
> If I understand correctly, these three don't exclude each other, so I
> wouldn't bother changing this SIP to account for the other use-cases, but I
> think it's nice to have them in mind or discuss them in case anyone has any
> ideas.
>
> With regards to UTILIZENODE&REPLACENODE, I think they will work OK but I
> wonder if a general REBALANCESHARDS command will work better? Or maybe it's
> just because I'm thinking of Elasticsearch/OpenSearch. But it seems like a
> more "general" approach.
>
> If REBALANCESHARDS sounds like a good idea, I'm thinking it could be per
> collection or for the whole cluster, I'm not sure what's best. My initial
> thought is that per cluster is what we need, but on the other hand per
> collection is easier to implement (just assign shards of that collection,
> and if the number of shards doesn't divide by the number of nodes, just
> assign to the node with fewer replicas or maybe piggyback replica placement
> plugins?) and it's easier to stop/recover when something goes wrong. Plus,
> it's more opinionated in the sense that you'll want to have the current
> (and future) number of nodes be a divisor of your number of shards. And
> then maybe the Operator could have some config options on the steps that
> you'll want to take. For example, I know I have 12 shards in total per
> collection, I want 2,3,4,6 and 12-node configurations.
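
To make the per-collection idea in the paragraph above concrete, here is a minimal sketch. It is only an illustration under stated assumptions: REBALANCESHARDS is a proposed command, not an existing Solr API, and the function names (valid_node_counts, assign_shards), the shard names and the node names are all hypothetical. It shows the two pieces described: which node counts divide the shard count evenly, and a greedy "assign to the node with fewer replicas" placement for when they don't.

```python
def valid_node_counts(num_shards: int) -> list[int]:
    """Return the node counts (2 or more) that divide num_shards evenly."""
    return [n for n in range(2, num_shards + 1) if num_shards % n == 0]


def assign_shards(shards: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Greedily place each shard on the node currently holding the fewest shards.

    Hypothetical helper: this ignores replica placement plugins, shard size
    and existing placement, all of which a real implementation would consider.
    """
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for shard in shards:
        # Ties go to the earliest node in the list, so this behaves like round-robin.
        target = min(nodes, key=lambda n: len(placement[n]))
        placement[target].append(shard)
    return placement


if __name__ == "__main__":
    shards = [f"shard{i}" for i in range(1, 13)]
    print(valid_node_counts(len(shards)))
    for node, held in assign_shards(shards, ["node1", "node2", "node3", "node4", "node5"]).items():
        print(node, held)
```

With 12 shards the first function yields exactly the 2, 3, 4, 6 and 12-node configurations from the example. With a node count that is not a divisor (5 here), the greedy placement leaves some nodes holding one more shard than others, which is the imbalance the divisor constraint avoids.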
>
> Please let me know if you have any thoughts/questions/reactions :)
>
> Best regards,
> Radu
> --
> Elasticsearch/OpenSearch & Solr Consulting, Production Support & Training
> Sematext Cloud - Full Stack Observability
> https://sematext.com/ <http://sematext.com/>
>
>
> On Thu, Mar 30, 2023 at 10:54 PM Houston Putman <hous...@apache.org> wrote:
>
> > Hello everyone,
> >
> > This is kind of a long-time coming, but I've finally created a SIP for
> > autoscaling Solr Nodes on Kubernetes using the Solr Operator.
> >
> >
> > https://cwiki.apache.org/confluence/display/SOLR/SIP-17%3A+Node+Autoscaling+via+Kubernetes
> >
> > There are still some details that need to be ironed out, but hopefully we
> > can finalize everything relatively soon and try to get this out in Solr
> > 9.3/9.4 and the Solr Operator v0.8.0.
> >
> > I've talked with quite a few people about this, so hopefully we can get a
> > good amount of turn-out to get this implemented! And if anyone is
> > interested in helping with the Solr Operator parts, I'd be very happy to
> > mentor. It's not going to be the most straightforward code, but you will
> > definitely be ramped up on contributing to the operator by the end!
> >
> > Please let me know if I can answer any questions!
> >
> > - Houston
> >