On Fri, Oct 10, 2025 at 5:08 AM Ilan Ginzburg <[email protected]> wrote:

> Actually the current lock level - if a lock is needed - should not be
> REPLICA but SHARD due to the isUnique flag that can lead to updating other
> replicas of the shard.
>

Ugh.  It's a shame to lock on account of the possibility of
BALANCESHARDUNIQUE.  Perhaps whether to lock should be a parameter/option of
the command, or that command should carry advice on what *not* to do while
it runs.  I think it's fairly obvious that one should not manipulate the
specific replica property involved in that command during the execution of
that command.


> The actual cluster state update does not need locks. In Overseer it is
> handled by a single thread and in distributed mode it uses CAS so all
> updates end up being serialized.
> I would therefore tend to agree that setting a property on a replica will
> not have bad interactions with other concurrent Collection API commands.
>

Thanks for confirming.
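
To illustrate why CAS alone serializes such updates, here is a minimal
sketch.  It is not Solr's code: CasStateSketch, its State class, and the
property names are hypothetical, and an AtomicReference stands in for a
versioned state node.  Two writers set different properties concurrently;
a writer that loses the race re-reads the state and retries, so no update
is lost.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration only; none of these names exist in Solr. */
final class CasStateSketch {
  /** Immutable snapshot: a version number plus replica properties. */
  static final class State {
    final int version;
    final Map<String, String> props;

    State(int version, Map<String, String> props) {
      this.version = version;
      this.props = Map.copyOf(props);
    }
  }

  // Stands in for the versioned state node; starts empty at version 0.
  private final AtomicReference<State> node =
      new AtomicReference<>(new State(0, Map.of()));

  /** Set one property with compare-and-set, retrying when another writer wins. */
  State setProperty(String key, String value) {
    while (true) {
      State current = node.get();
      Map<String, String> next = new HashMap<>(current.props);
      next.put(key, value);
      State proposed = new State(current.version + 1, next);
      if (node.compareAndSet(current, proposed)) {
        return proposed;
      }
      // Lost the race: loop to re-read the latest state and reapply the change.
    }
  }

  State read() {
    return node.get();
  }

  /** Two writers update different properties concurrently; returns the final state. */
  static State runDemo(int updatesPerWriter) {
    CasStateSketch state = new CasStateSketch();
    Thread a = new Thread(() -> {
      for (int i = 0; i < updatesPerWriter; i++) {
        state.setProperty("replica1.propA", "v" + i);
      }
    });
    Thread b = new Thread(() -> {
      for (int i = 0; i < updatesPerWriter; i++) {
        state.setProperty("replica2.propB", "v" + i);
      }
    });
    a.start();
    b.start();
    try {
      a.join();
      b.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("demo interrupted", e);
    }
    return state.read();
  }
}
```

Calling CasStateSketch.runDemo(1000) should end with a version equal to the
total number of updates and with both properties present, which is the
sense in which concurrent property updates don't need a lock to avoid
clobbering each other.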


> Notes on waitForFinalState: due to the async nature of ZooKeeper watches,
> when a wait completes on a Solr node it doesn't mean the state update is
> also visible on other Solr nodes. In some cases with commands running
> across multiple nodes, some of the nodes might not have seen the updates
> which could make them fail (for example creating a core on a remote Solr
> node for a replica that is not yet visible there?).
>

Understood.  With the _stateVer_ protocol between CloudSolrClient and the
server, that is somewhat solved, but that mechanism has room for
improvement.  I have a draft plan/notes to do so.  Anyway, today with
waitForFinalState=false (the default), a replica has to sync with the
leader and this takes time; potentially minutes.

> Also, always waiting for state can be inefficient when doing multiple state
> changes. For example creating multiple replicas during collection creation.
> We don't want to wait for each replica separately which currently happens
> if WAIT_FOR_FINAL_STATE is set to true in the collection creation message,
> and with the PR will happen if WAIT_FOR_FINAL_STATE is *not set* to false
> in the collection creation message, and if the notion of waiting for state
> is completely removed as suggested in a PR comment
> <https://github.com/apache/solr/pull/3684/files#r2377582861> then creating
> a collection will be slower. We likely want to group the replica creation
> for a new collection always, and then wait (or not wait) for all of them to
> be visible.
>

Very good point!  I'll follow up on that PR to consider this.
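
For reference, the shape of "group the replica creations, then wait once"
could look roughly like the sketch below.  Everything in it is a
placeholder of mine, not the PR's code: createReplica merely sleeps to
simulate the delay until a replica's state becomes visible.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

/** Hypothetical illustration only; none of these names exist in Solr. */
final class GroupedReplicaWaitSketch {
  /** Placeholder for "create a replica and wait until its state is visible". */
  static String createReplica(String name, long visibleAfterMillis) {
    try {
      Thread.sleep(visibleAfterMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return name + ":active";
  }

  /** Start every creation first, then wait once for the whole group. */
  static List<String> createAllThenWait(List<String> names, long visibleAfterMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.size()));
    try {
      // Collecting to a list forces all tasks to be submitted before any join.
      List<CompletableFuture<String>> pending = names.stream()
          .map(n -> CompletableFuture.supplyAsync(
              () -> createReplica(n, visibleAfterMillis), pool))
          .collect(Collectors.toList());
      // Single wait phase: total time is roughly the slowest replica, not the sum.
      return pending.stream()
          .map(CompletableFuture::join)
          .collect(Collectors.toList());
    } finally {
      pool.shutdown();
    }
  }
}
```

Waiting per replica inside the loop would instead cost roughly the sum of
the individual delays, which is the slowdown you describe for collection
creation.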

~ David
