I'm not sure we should work around an imperfect design by adding
throttling. Plus, this throttling would be at the cluster level, while the
issue with distributed mode shows up when concurrent updates are done for
the same collection.

I agree that not having a single node do all the updates looks much better
on paper. But unfortunately, it doesn't seem to be so easy in practice.
AFAIK distributed mode hasn't been used by anyone at scale yet, and many
changes are required before we can consider it stable.

For example, to take this specific issue of deleting a big collection:
there is no reason for each node to remove its own replicas from the
collection state in ZooKeeper. We want the whole collection to be removed,
so why not just remove it from ZooKeeper with a single operation? I think
some admin operations must be redesigned; we can't assume that what scales
well with the overseer will also scale well in "no-overseer" mode.
Unfortunately, I don't have enough experience running distributed updates
to say whether other major problems exist.
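
To make the delete example above concrete, here is a rough sketch only
(not Solr's actual delete path; the /collections/<name> layout and the
helper names are assumptions for illustration). A single coordinating node
could drop the whole collection subtree in one pass, instead of every
replica issuing its own check-and-write update against state.json:

    import java.util.List;
    import org.apache.zookeeper.ZooKeeper;

    class CollectionStateCleanup {
      // Hypothetical: remove /collections/<name> and everything under it
      // from one node, rather than N per-replica state updates.
      static void deleteCollectionZnodes(ZooKeeper zk, String collection) throws Exception {
        deleteRecursively(zk, "/collections/" + collection);
      }

      static void deleteRecursively(ZooKeeper zk, String path) throws Exception {
        List<String> children = zk.getChildren(path, false);
        for (String child : children) {
          deleteRecursively(zk, path + "/" + child);
        }
        zk.delete(path, -1); // version -1 skips the check; the subtree is going away anyway
      }
    }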

Making it the default in Solr 10 would mean we are confident it works well.
Personally, I'm not confident yet.

Best,
Pierre

On Mon, Sep 29, 2025 at 3:36 PM, David Smiley <[email protected]> wrote:

> About the particular scenario:  couldn't we just limit the number of
> in-flight requests?  Cluster admin commands use HttpShardHandler, and it
> has an executor defaulting to Integer.MAX_VALUE.  This is configurable in
> solr.xml with "maximumPoolSize" but I'm not certain if that same instance
> of ShardHandler is also used for cluster admin commands.  We ought to have
> separately configurable ones.
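>
> Purely as a sketch (plain JDK executors, not the actual
> HttpShardHandlerFactory wiring, and the numbers are made up), the
> difference amounts to bounding the pool instead of letting it grow toward
> Integer.MAX_VALUE:
>
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>     import java.util.concurrent.LinkedBlockingQueue;
>     import java.util.concurrent.ThreadPoolExecutor;
>     import java.util.concurrent.TimeUnit;
>
>     class AdminRequestPools {
>       // Effectively unbounded: a cached pool can grow toward Integer.MAX_VALUE threads.
>       static final ExecutorService UNBOUNDED = Executors.newCachedThreadPool();
>
>       // Bounded alternative: never more than 32 worker threads; excess work
>       // waits in the queue and is rejected only once the queue is full too.
>       static final ExecutorService BOUNDED = new ThreadPoolExecutor(
>           4, 32, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1024));
>     }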
>
> I don't think this should prevent shipping a system that is objectively way
> simpler than the Overseer.  Solr 10 will have both modes, no matter what
> the default is.  Changing the default makes it easier to remove it in Solr
> 11.  The impact on ease of understanding SolrCloud in 11 will be amazing!
> It can't be seen now because we can't remove the needless ZNodeProps admin
> command conversions that exist simply because we put commands into ZK
> queues.
>
> ~ David
>
> On Mon, Sep 29, 2025 at 9:00 AM Pierre Salagnac <[email protected]>
> wrote:
>
> > Indeed, we're having some issues with "no-overseer" mode and we decided
> > to hold off deployment of this feature for now.
> >
> > This is mostly because of a design flaw in distributed cluster updates.
> > When a collection reaches a given size, we end up having many "writers"
> > that want to update the state.json of this collection concurrently. Since
> > they all do Zookeeper "check-and-write" operations using the version
> > number, there is no data corruption, but this puts a huge load on the
> > cluster.
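> >
> > Roughly, each writer does something like the following (plain ZooKeeper
> > API, heavily simplified; this is not the actual Solr code path, and the
> > helper name is made up):
> >
> >     import org.apache.zookeeper.KeeperException;
> >     import org.apache.zookeeper.ZooKeeper;
> >     import org.apache.zookeeper.data.Stat;
> >
> >     class StateJsonWriter {
> >       // One node's conditional update to the shared state.json znode.
> >       static void removeMyReplica(ZooKeeper zk, String statePath) throws Exception {
> >         Stat stat = new Stat();
> >         byte[] current = zk.getData(statePath, false, stat); // read state + its version
> >         byte[] updated = withReplicaRemoved(current);        // hypothetical local edit
> >         try {
> >           zk.setData(statePath, updated, stat.getVersion()); // write only if version unchanged
> >         } catch (KeeperException.BadVersionException e) {
> >           // another writer got there first: re-read and retry, multiplying the ZK load
> >         }
> >       }
> >
> >       static byte[] withReplicaRemoved(byte[] stateJson) {
> >         return stateJson; // placeholder for the actual JSON edit
> >       }
> >     }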
> >
> > This issue has a major impact when deleting a large collection (1000+
> > cores). The collection API sends 1000+ UNLOAD core admin requests that
> > are processed concurrently, and each one wants to update the 'state.json'
> > file of the collection. This triggers a tidal wave of write operations
> > that are rejected by ZooKeeper.
> >
> > This mode probably works pretty well and scales better when there are
> > many small collections in the cluster. Updates to these collections can
> > then be done concurrently since the probability of having write conflicts
> > is low. That's the opposite for big collections, where it seems the
> > overseer scales better.
> >
> > Best,
> > Pierre
> >
> >
> > On Fri, Sep 26, 2025 at 3:37 PM, Jason Gerlowski <[email protected]> wrote:
> >
> > > In the latest Virtual Community Meetup, Pierre and Bruno shared that
> > > they recently enabled "no-overseer" mode (really need a better name!)
> > > at their workplace in production and were hitting some stability
> > > issues.  Their anecdotes give me at least some "pause" about whether
> > > this is really ready to be the default, or whether it needs a bit more
> > > time to bake.
> > >
> > > Any chance either of them are watching this thread and can provide an
> > > update or their own 2c here?
> > >
> > > Best,
> > >
> > > Jason
> > >
> > > On Wed, Aug 20, 2025 at 6:35 PM David Smiley <[email protected]> wrote:
> > > >
> > > > Update:  I proposed the Solr version go into cluster properties[1]
> > > > on ZK initialization but Houston pushed back on that approach.  The
> > > > other approach is to rely on the "least Solr version" as returned by
> > > > the Solr version being added to live nodes[2].  However that's very
> > > > dynamic and I'm concerned about an old Solr version somehow joining
> > > > an existing cluster.  Rather than simply tell users "don't do that"
> > > > (which we should do anyway), I'm inclined to have Solr fail to join
> > > > an existing cluster if doing so would *lower* the least Solr version.
> > > > Perhaps ignoring the final patch version.  I plan on updating that PR
> > > > accordingly.  The PR would have to go into 9.x too, and thus such
> > > > logic can't be enforced for older Solr versions.  Regardless, an
> > > > upgrading user can choose the setting.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/SOLR-17664
> > > > [2] https://issues.apache.org/jira/browse/SOLR-17620
> > > >
> > > > On Tue, Jul 22, 2025 at 1:12 AM David Smiley <[email protected]> wrote:
> > > >
> > > > > On the epic of seeking the _eventual_ demise of the Overseer, I'm
> > > > > seeking to make it disabled for *new* SolrCloud clusters in Solr 10.
> > > > > -- https://issues.apache.org/jira/browse/SOLR-17293
> > > > > The epic: https://issues.apache.org/jira/browse/SOLR-14927 (oddly no
> > > > > SIP but it has a doc anyway).  I think it's sufficiently ready for
> > > > > the great majority of SolrCloud clusters.  A cluster with a
> > > > > collection containing a thousand+ replicas might pose a performance
> > > > > concern on start/restart events due to independent replica state
> > > > > updates.  Of course, with such a change, there will be a section in
> > > > > the upgrade page in the ref guide to advise users who may opt to
> > > > > make an explicit choice.
> > > > >
> > > > > I don't love that the new mode doesn't have an elegant/clear way to
> > > > > refer to it.  The best I've come up with is to say what it *isn't*
> > > > > -- it *isn't* the Overseer.  "The Overseer is disabled".  Awkwardly
> > > > > there are two undocumented solr.xml booleans, both a mouthful:
> > > > > distributedClusterStateUpdates and
> > > > > distributedCollectionConfigSetExecution.  I propose instead that a
> > > > > single boolean cluster property be defined named "overseer" or
> > > > > "overseerEnabled".
> > > > > FYI the existing known cluster properties are defined here:
> > > > >  org.apache.solr.common.cloud.ZkStateReader#KNOWN_CLUSTER_PROPS
> > > > > Even if such a boolean is agreeable... it raises the question of
> > > > > what should become of the "overseer" node role.
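> > > > >
> > > > > Just to illustrate (the property name here is only the proposal
> > > > > above, nothing that exists today), flipping such a flag would
> > > > > presumably look like any other cluster property update via SolrJ:
> > > > >
> > > > >   import org.apache.solr.client.solrj.SolrClient;
> > > > >   import org.apache.solr.client.solrj.request.CollectionAdminRequest;
> > > > >
> > > > >   class OverseerToggle {
> > > > >     // Hypothetical: set the proposed "overseerEnabled" cluster property to false.
> > > > >     static void disableOverseer(SolrClient client) throws Exception {
> > > > >       CollectionAdminRequest.setClusterProperty("overseerEnabled", "false")
> > > > >           .process(client);
> > > > >     }
> > > > >   }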
> > > > >
> > > > > ~ David Smiley
> > > > > Apache Lucene/Solr Search Developer
> > > > > http://www.linkedin.com/in/davidwsmiley
> > > > >
> > >
> >
>
