Hello team,

Noob warning: today I learned what SIP means, with SIP-17 and SIP-18 being very interesting reads.
https://cwiki.apache.org/confluence/display/SOLR/Solr+Improvement+Proposals

Too many telephone references. Sorry for the interruption.

Alejandro Arrieta

On Thu, Apr 20, 2023 at 5:27 PM Houston Putman <hous...@apache.org> wrote:

> Thanks for the questions Jason!
>
> > So the general idea is that we'd add a Solr contrib/module, and that
> > module would have a dep on some sort of Kubernetes client so it could
> > manage certain Solr entities (e.g. security.json, configsets, etc.) as
> > Kubernetes resources (configmaps, etc.). Am I understanding that
> > right?
>
> Yes, absolutely. And possibly other things, like leveraging Kubernetes'
> secrets management to manage credentials for users. (Auto-import
> BasicAuth secrets with certain labels, integrate with Kubernetes
> ServiceAccounts, etc.)
>
> But yeah, generally the idea is to use Kubernetes state instead of
> ZooKeeper state for certain features.
>
> > One place there might be room for improvement in the writeup so far is
> > around the motivation/value-prop for some of these Solr->Kubernetes
> > integrations. The value in some integrations (e.g.
> > KubernetesSSLCredentialsProvider) is relatively self-evident I think,
> > but others are a little less clear and could use spelled out
> > explicitly IMO. e.g. What's the benefit of storing security.json or
> > configsets in Kubernetes configmaps over ZooKeeper?
>
> This is a great question.
>
> Generally Solr has fairly good tool support for managing various things
> in ZooKeeper.
>
> The "zkCli.sh" script and various "bin/solr" commands allow users to
> easily manage their ZooKeeper state to set up Solr to run the way they
> need it to. This works very well for users running Solr on bare metal
> and manually running these commands.
>
> However, running these commands in Kubernetes is not very convenient,
> and it does not really jibe with Kubernetes' idempotent model.
> Basically there isn't a good or easy way to run the solr/zk setup
> commands before a SolrCloud is created.
> And when we do it in things like an "initContainer", the commands have
> to be run every time a Solr process is started (or restarted). This
> isn't really convenient, and it adds complexity that makes running Solr
> on Kubernetes much less appealing.
>
> Another thing is state management. So let's say that the Solr Operator
> wants to enable auth by default when running Solr. It has to create a
> security.json for Solr to use, and generate passwords and secrets for
> users to use. However, it also needs to set up a user & password for
> itself (the operator) to use to interact with the cluster. But that's
> ok, it does it, and it can easily upload this file to ZooKeeper in the
> initContainer if no security.json already exists.
>
> However, we need to allow users to update this file themselves to add
> more users, and do other stuff. So basically we can't let the Solr
> Operator make any changes to this file. So even if a user decides that
> they want to change the security.json secret they passed in to the
> SolrCloud, the operator can't make that change happen, since it can't
> overwrite what already exists in ZooKeeper. This will always be a
> problem when there are two "sources of truth". One has to be
> prioritized.
>
> If we allow the security.json to be loaded from a Kubernetes secret,
> then the secret that the user provides is the single source of truth.
> So even if the security.json is changed through the security UI, the
> changes will be reflected in the Kubernetes secret. So users can be
> free to overwrite that secret if they want to, given that everyone
> knows it's the current accepted state of the security.json file.
>
> The exact same issues exist with configSets. Many Solr Operator users
> want to manage their configSets through Kubernetes ConfigMaps, just
> like they manage their SolrClouds. It makes sense: keep all of your
> Solr infra managed together.
> However, the operator cannot keep the configSets managed in ZooKeeper
> updated with the configSets managed via Kube ConfigMaps. It's two
> sources of truth.
>
> *TLDR*: Solr has many command line utilities that work well to set up
> Solr when it's running on bare metal or a VM. However, these solutions
> do not work well in a cloud system like Kubernetes. If we try to make
> these things easier to set up in Kubernetes, it ultimately results in 2
> sources of truth (Kubernetes and ZooKeeper). If we make plugins that
> load these settings from Kubernetes instead of ZooKeeper, we are back
> down to 1 source of truth. And this single source of truth (obviously)
> works well in Kubernetes, because these are native Kubernetes
> resources.
>
> - Houston
>
> On Tue, Apr 11, 2023 at 2:36 PM Jason Gerlowski <gerlowsk...@gmail.com>
> wrote:
>
> > Hi Houston,
> >
> > So the general idea is that we'd add a Solr contrib/module, and that
> > module would have a dep on some sort of Kubernetes client so it could
> > manage certain Solr entities (e.g. security.json, configsets, etc.) as
> > Kubernetes resources (configmaps, etc.). Am I understanding that
> > right?
> >
> > > Please let me know if I can explain more, or how I can make the SIP
> > > page better.
> >
> > One place there might be room for improvement in the writeup so far is
> > around the motivation/value-prop for some of these Solr->Kubernetes
> > integrations. The value in some integrations (e.g.
> > KubernetesSSLCredentialsProvider) is relatively self-evident I think,
> > but others are a little less clear and could use spelled out
> > explicitly IMO. e.g. What's the benefit of storing security.json or
> > configsets in Kubernetes configmaps over ZooKeeper?
> >
> > Best,
> >
> > Jason
> >
> > On Wed, Apr 5, 2023 at 12:45 PM Houston Putman <hous...@apache.org>
> > wrote:
> >
> > > Hey everyone,
> > >
> > > This is a new SIP, not a duplicate of SIP-17 (Autoscaling on
> > > Kubernetes), and completely unrelated.
> > >
> > > Basically there is a lot of very messy logic we do in the Solr
> > > Operator to bootstrap security and manage various things. This logic
> > > must exist because Solr has no idea that Kubernetes exists.
> > > If we can use Kubernetes APIs to pull in information, instead of
> > > relying on the Solr Operator to inject that information in hacky
> > > ways, the user experience on Kubernetes is going to get many times
> > > better for users wanting to secure their SolrClouds. This will also
> > > help us use authorization by default (which we always preach) via
> > > the Solr Operator.
> > >
> > > This SIP is not very filled out because I'm still thinking on
> > > various aspects. But in general, we can attack the different plugins
> > > one-by-one and the SIP can evolve throughout the process. This SIP
> > > is very easy to break up, which is nice.
> > >
> > > Please let me know if I can explain more, or how I can make the SIP
> > > page better.
> > >
> > > - Houston
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > For additional commands, e-mail: dev-h...@solr.apache.org
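
For anyone skimming the thread, the "two sources of truth" problem Houston describes can be shown with a toy model. This is only a sketch under loud assumptions: plain Python dicts stand in for ZooKeeper and for a Kubernetes Secret, the function names are made up for illustration, and no real Solr, ZooKeeper, or Kubernetes API is involved. It only models the initContainer-style "upload security.json if none exists" step versus reading the Secret directly.

```python
# Toy model only: dicts stand in for ZooKeeper and a Kubernetes Secret.
# All function names here are hypothetical; nothing below is a real API.


def bootstrap_if_absent(zookeeper: dict, secret: dict) -> None:
    """initContainer-style setup: upload security.json only if ZK has none."""
    if "security.json" not in zookeeper:
        zookeeper["security.json"] = secret["security.json"]


def effective_security_two_sources(zookeeper: dict, secret: dict) -> str:
    """Solr reads ZooKeeper; the Secret only matters on the first bootstrap."""
    bootstrap_if_absent(zookeeper, secret)
    return zookeeper["security.json"]


def effective_security_single_source(secret: dict) -> str:
    """Assumed plugin behaviour: Solr reads the Secret directly."""
    return secret["security.json"]


zookeeper = {}
secret = {"security.json": '{"users": ["operator"]}'}

# First start: both approaches see the same security.json.
first_zk = effective_security_two_sources(zookeeper, secret)
first_k8s = effective_security_single_source(secret)

# The user edits the Secret, then the Solr pod restarts.
secret["security.json"] = '{"users": ["operator", "alice"]}'
after_zk = effective_security_two_sources(zookeeper, secret)
after_k8s = effective_security_single_source(secret)

print(first_zk == first_k8s)  # True
print(after_zk == after_k8s)  # False: ZooKeeper still holds the old file
print(after_zk)               # {"users": ["operator"]}
```

In the two-source version the user's edit to the Secret is silently ignored after the first start, because the bootstrap step must not overwrite what is already in ZooKeeper. In the single-source version there is nothing to reconcile.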