Summary: I've been contemplating a simple enhancement to how SolrCloud resolves files in a configSet: when a file isn't in ZooKeeper, fall back to the same-named configSet on the file system (which SolrCloud normally ignores today). A further fallback to _default on the file system could be useful as well. ZK would remain the only mutable space: if you edit a schema or configOverlay.json or whatever, the change goes there.
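To make the lookup order concrete, here is a minimal sketch of it in Python. This is not Solr code: the function name, the in-memory dict standing in for ZooKeeper, and the `<configsets dir>/<name>/conf/<file>` layout are all assumptions made for illustration, and the sketch does no path sanitizing.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_file(
    zk_files: Mapping[str, bytes],
    fs_configsets_dir: Path,
    configset: str,
    filename: str,
    fall_back_to_default: bool = True,
) -> Optional[bytes]:
    """Return the file's bytes from the first layer that has it, else None.

    Hypothetical helper: zk_files is a dict keyed by "<configset>/<filename>"
    that stands in for the ZK-resident configSet.
    """
    # Layer 1: ZooKeeper wins whenever it has the file; it is also the only
    # layer that edits (schema, overlay, params) would ever be written to.
    zk_key = f"{configset}/{filename}"
    if zk_key in zk_files:
        return zk_files[zk_key]

    # Layer 2: the same-named configSet on the local file system.
    # Layer 3 (optional): the _default configSet on the file system.
    fs_names = [configset]
    if fall_back_to_default and configset != "_default":
        fs_names.append("_default")
    for name in fs_names:
        # Assumed on-disk layout: <configsets dir>/<name>/conf/<filename>
        candidate = fs_configsets_dir / name / "conf" / filename
        if candidate.is_file():
            return candidate.read_bytes()

    # Not found in any layer.
    return None
```

With this ordering, a file baked into the image is only ever seen when ZK has no file of that name, so anything already in ZK keeps behaving exactly as it does today.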
My primary motivation is to make upgrades to plugins, configs, or Solr itself easier in some scenarios (certainly not all!). Imagine that you've got configOverlay.json (with some handlers defined), params.json, and schema.xml in ZK, and solrconfig.xml on the file system, plus a partial XML file of schema field types that is "xi:include"-ed by schema.xml. Assume that a custom Solr Docker image is used that includes custom plugins and has this configSet baked in. One day you add some new token filters, add a new Lucene merge policy, and remove an outdated update request processor. You make the plugin code changes and the xi:included field type changes, edit solrconfig.xml, build all of this into your latest company Solr Docker image, and deploy it using Kubernetes. Those changes can be safe to deploy without touching any ZK-resident configSet. Other changes might not be (e.g. removing a field type that is still referenced, or changing text analysis so incompatibly that a re-index is required), but my point is that some are, and this would make them easier.

An additional motivation is storing large, relatively static common resources on the file system. Where I work, I've got over a gig of them :-). This can be worked around with solr.allow.unsafe.resourceloading=true but... it'd be nice to not have to resort to that.

Another benefit would be to make it easier to separate one's own configuration from that of the _default configSet you took from Solr when starting a new project. Resolving differences and then doing Solr upgrades was a common task for me, both as a consultant and for my own Solr upgrades. Granted, this is possible today, but perhaps if this overlay were emphasized/embraced more, it would lead to this outcome. It's still a problem that a bare-bones solrconfig.xml & schema.xml are either too bare-bones or say too much, and it's a separate issue for Solr to improve that.
A probably secondary, related issue: if the SolrCloud configSet ZK node were optional instead of required (so that a missing node means the configSet is entirely on the file system), it would bring other benefits. It would allow users to use the "file store" or some network-mounted storage (NFS) as the configSet location. It would also accelerate local experimentation with SolrCloud in Docker by eliminating an annoying configSet deployment step. The biggest PITA anyone notices when first exploring SolrCloud is that configs are fundamentally not on the file system despite you seeing them there; it's all in ZK. And there's no super convenient way to edit the configuration, not even a web UI.

I plan to file an issue of course, but I think this deserved a dev list discussion. I know the new package manager could help with my primary motivating use-case, but I think there are too many obstacles there, at least at present. A file system fallback is a simple thing by comparison.

Question: Does the k8s Solr Operator do anything to make configSet & plugin upgrades better?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
