I'm not familiar with these classes, but I'm not particularly fond of
anything that pushes us toward requiring a *third* installation for
initial use (ZooKeeper already being #2). That said, we really need a
good replacement for autoscaling, and large installs might reasonably want
to offload any non-query work. Ideally, we would have a smooth transition
so that users can easily follow this path:

   1. Single cloud node, embedded ZooKeeper, local management (mostly
   unused, maybe not loaded)
   2. A few cloud nodes (2-5), embedded ZooKeeper, local management
   3. Moderate cloud (6-12 nodes), embedded ZooKeeper on a subset of
   nodes, local management
   4. Large cloud (13-25 nodes), external ZooKeeper, local management
   5. >25 nodes, external ZooKeeper, management local or external
   6. >100 nodes, external ZooKeeper and external management recommended

Thus folks running moderate-sized deployments don't need to bother with
installing anything other than Solr. Somewhere along that scale they would
likely start using TLOG and then TLOG/PULL replica setups as well. Ideally
we would have a clear path to make these transitions with minimal downtime.
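
To make the ladder above concrete, here is a rough sketch of it as a lookup
from node count to suggested setup. This is illustration only: the class,
enum, and method names are hypothetical and nothing like this exists in
Solr, and the thresholds are just the ballpark numbers from the list, not
tested limits.

```java
// Hypothetical sketch: maps a SolrCloud node count to the ZooKeeper and
// management setup suggested in the list above. Not a Solr API.
public class DeploymentLadder {

    enum ZkMode { EMBEDDED, EMBEDDED_ON_SUBSET, EXTERNAL }

    enum ManagementMode { LOCAL, LOCAL_OR_EXTERNAL, EXTERNAL_RECOMMENDED }

    // Steps 1-2: embedded; step 3: embedded on a subset; steps 4-6: external.
    static ZkMode zkFor(int nodes) {
        requirePositive(nodes);
        if (nodes <= 5) return ZkMode.EMBEDDED;
        if (nodes <= 12) return ZkMode.EMBEDDED_ON_SUBSET;
        return ZkMode.EXTERNAL;
    }

    // Steps 1-4: local; step 5: either; step 6: external recommended.
    static ManagementMode managementFor(int nodes) {
        requirePositive(nodes);
        if (nodes <= 25) return ManagementMode.LOCAL;
        if (nodes <= 100) return ManagementMode.LOCAL_OR_EXTERNAL;
        return ManagementMode.EXTERNAL_RECOMMENDED;
    }

    private static void requirePositive(int nodes) {
        if (nodes < 1) {
            throw new IllegalArgumentException("need at least one node");
        }
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 10, 20, 50, 200}) {
            System.out.println(n + " nodes: zk=" + zkFor(n)
                    + ", management=" + managementFor(n));
        }
    }
}
```

The point is only that each step changes one thing at a time (where
ZooKeeper runs, then where management runs), which is what would make the
transitions tractable.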

So if we can fit what these classes do into that dream, great. If they
point elsewhere, meh.

Note: of course, none of this has anything to do with "user-managed" Solr
(a.k.a. legacy Solr), which is managed manually by users and doesn't use
ZooKeeper.


On Mon, Jun 19, 2023, 4:42 PM David Smiley <dsmi...@apache.org> wrote:

> I noticed the SolrCloudManager concept added some time ago brought about to
> abstract away SolrCloud in the context of doing simulated experiments on
> auto-scaling.  Essentially -- need to simulate SolrCloud and not actually
> use a real SolrCloud.  But that need and code went away in 9.0...
> nonetheless SolrCloudManager and its friends (like DistributedStateManager)
> are still around.  I could imagine someone advocating for them
> nonetheless.  But the present state is very half-implemented as there is
> code all over the place that assumes ZooKeeper (e.g. uses SolrZkClient or
> ZkStateReader) instead of some of these abstractions.  I think there is a
> need to set a direction here -- do we embrace abstracting SolrCloud within
> Solr or do we revert this stuff as needless indirection / concepts.
>
> I think there's lots of room to debate / review the particulars of
> SolrCloudManager and friends if we do want to keep it.
> DistributedQueueFactory isn't even used anymore.  NodeStateProvider is only
> for AttributeFactory; not very obvious.  DistribStateManager is essentially
> SolrZkClient but nonetheless still references ZK classes.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
