Re: SolrCloudManager... DistribStateManager...

Gus Heck Mon, 26 Jun 2023 11:23:00 -0700

Yeah, looking back at the original email, I read too quickly and
misunderstood the juxtaposition of "manager" and "extracting". Still, the
current effort to make embedded zk a supported production config (not sure
where that stands), plus some form of, or tweak to, the server roles stuff,
would get us through step 4 of that list. And the (misunderstood) idea of
management code (possibly along with the UI) being extractable to a
management server for large installs, so that there's no chance of the
balancing calculations and other such needs bogging down the servers doing
the serving/indexing, seems interesting. It also has the effect of
potentially isolating cluster-wide operations away from individual query
endpoints.

This is part of where I was going when I extracted the startup mechanisms
into a ServletContextListener. Breaking out admin into its own servlet (or
two servlets, cluster-admin and node-admin, the latter only containing
endpoints for actions applicable to the current node) would be a way to
make this easy. There's not really a good reason for the query and update
paths to be passing through an "if this is an admin request, do something
else" check. A precursor to that, in my mind, would be to pull out auth
into a separate filter that can be reused across contexts.
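
To make the shape of that concrete, here is a rough sketch in plain Java. It
deliberately does not use the servlet API or any Solr class: Request, Handler,
withAuth, buildContexts, dispatch, the path prefixes and the user names are
all made up for illustration. The point is only that routing by context
happens once, up front, and that a single auth wrapper is reused in front of
every context, so the query path never has to ask "is this an admin request?".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Solr or the servlet API.
public class AdminSplitSketch {
    /** Stand-in for an HTTP request: a path plus an optional user name. */
    record Request(String path, String user) {}

    /** Stand-in for a servlet: turns a request into a response string. */
    interface Handler {
        String handle(Request req);
    }

    /** Reusable auth step: wraps any handler, so no handler checks auth itself. */
    static Handler withAuth(Set<String> allowedUsers, Handler next) {
        return req -> req.user() != null && allowedUsers.contains(req.user())
                ? next.handle(req)
                : "401";
    }

    /** One context per concern, each behind the same auth wrapper. */
    static Map<String, Handler> buildContexts(Set<String> queryUsers, Set<String> adminUsers) {
        Map<String, Handler> contexts = new LinkedHashMap<>();
        contexts.put("/cluster-admin", withAuth(adminUsers, req -> "cluster-admin"));
        contexts.put("/node-admin", withAuth(adminUsers, req -> "node-admin"));
        contexts.put("/query", withAuth(queryUsers, req -> "query"));
        return contexts;
    }

    /** Routing happens once, by path prefix, before any handler runs. */
    static String dispatch(Map<String, Handler> contexts, Request req) {
        for (Map.Entry<String, Handler> entry : contexts.entrySet()) {
            if (req.path().startsWith(entry.getKey())) {
                return entry.getValue().handle(req);
            }
        }
        return "404";
    }

    public static void main(String[] args) {
        Map<String, Handler> contexts = buildContexts(Set.of("reader", "ops"), Set.of("ops"));
        System.out.println(dispatch(contexts, new Request("/query/select", "reader")));
        System.out.println(dispatch(contexts, new Request("/cluster-admin/balance", "reader")));
        System.out.println(dispatch(contexts, new Request("/node-admin/status", "ops")));
    }
}
```

In a real webapp the wrapper would presumably be a servlet Filter mapped to
each context and the handlers would be separate servlets; the sketch only
shows the separation of concerns, not how Solr would wire it.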


-Gus

On Tue, Jun 20, 2023 at 12:05 PM Eric Pugh <[email protected]>
wrote:

> I think this is a very interesting progression... It's a really nice
> mental model of "what tools should I reach for when?"
>
> > On Jun 20, 2023, at 11:41 AM, Gus Heck <[email protected]> wrote:
> >
> > I'm not familiar with these classes, but I'm not particularly fond of
> > anything that leads us in a direction of requiring a *third* installation
> > for initial use (ZooKeeper being #2 already). That said, we really need a
> > good replacement for autoscaling, and large installs might reasonably want
> > to offload any non-query work. Ideally, we would have a smooth transition
> > so that users can easily follow this path:
> >
> >   1. Single Cloud Node, embedded zookeeper, local management (mostly
> >   unused, maybe not loaded)
> >   2. A few Cloud nodes (2-5), embedded zookeeper, local management
> >   3. Moderate cloud (6-12 nodes), embedded zookeeper on subset of nodes,
> >   local management
> >   4. Large cloud (13-25 nodes), external zookeeper, local management
> >   5. > 25 nodes, external zookeeper, management local or external
> >   6. > 100 nodes, recommended external zk and management
> >
> > Thus folks doing moderate stuff don't need to bother with installing
> > anything other than Solr. Somewhere along that scale they would likely
> > start using tlog and then tlog/pull setups as well. Ideally we would
> > have a clear path to make these transitions with minimal downtime.
> >
> > So if we can fit what these classes do into that dream, great. If they
> > point elsewhere, meh.
> >
> > Note: of course none of this has anything to do with "user-managed" Solr
> > (a.k.a. legacy Solr), which is managed manually by users and doesn't have
> > zk.
> >
> >
> > On Mon, Jun 19, 2023, 4:42 PM David Smiley <[email protected]> wrote:
> >
> >> I noticed the SolrCloudManager concept, added some time ago, was brought
> >> about to abstract away SolrCloud in the context of doing simulated
> >> experiments on auto-scaling.  Essentially -- there was a need to simulate
> >> SolrCloud and not actually use a real SolrCloud.  But that need and code
> >> went away in 9.0... nonetheless SolrCloudManager and its friends (like
> >> DistribStateManager) are still around.  I could imagine someone
> >> advocating for them nonetheless.  But the present state is very
> >> half-implemented, as there is code all over the place that assumes
> >> ZooKeeper (e.g. uses SolrZkClient or ZkStateReader) instead of some of
> >> these abstractions.  I think there is a need to set a direction here --
> >> do we embrace abstracting SolrCloud within Solr, or do we revert this
> >> stuff as needless indirection / concepts?
> >>
> >> I think there's lots of room to debate / review the particulars of
> >> SolrCloudManager and friends if we do want to keep it.
> >> DistributedQueueFactory isn't even used anymore.  NodeStateProvider is
> >> only for AttributeFactory; not very obvious.  DistribStateManager is
> >> essentially SolrZkClient but nonetheless still references ZK classes.
> >>
> >> ~ David Smiley
> >> Apache Lucene/Solr Search Developer
> >> http://www.linkedin.com/in/davidwsmiley
> >>
>
> _______________________
> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
> http://www.opensourceconnections.com <
> http://www.opensourceconnections.com/> | My Free/Busy <
> http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <
> https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless of
> whether attachments are marked as such.
>
>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
