Actually, if a lock is needed at all, the lock level should be SHARD rather
than the current REPLICA, because the isUnique flag can lead to updating
other replicas of the shard.

The actual cluster state update does not need locks. In Overseer it is
handled by a single thread and in distributed mode it uses CAS so all
updates end up being serialized.
I would therefore tend to agree that setting a property on a replica will
not have bad interactions with other concurrent Collection API commands.
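
To illustrate the CAS point, here is a minimal sketch of a versioned
read-modify-write loop with retry. It is a toy in-memory stand-in for the
collection state znode, not Solr or ZooKeeper code: CasStateSketch,
Versioned, compareAndSet and setReplicaProperty are all hypothetical names
chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy stand-in for a versioned state znode; NOT the ZooKeeper or Solr API. */
class CasStateSketch {
    /** Snapshot of the properties plus the version they were read at. */
    static final class Versioned {
        final Map<String, String> props;
        final int version;

        Versioned(Map<String, String> props, int version) {
            this.props = props;
            this.version = version;
        }
    }

    private Map<String, String> props = new HashMap<>();
    private int version = 0;

    public synchronized Versioned read() {
        return new Versioned(new HashMap<>(props), version);
    }

    /** Writes only if nobody else has written since expectedVersion was read. */
    public synchronized boolean compareAndSet(int expectedVersion, Map<String, String> newProps) {
        if (expectedVersion != version) {
            return false; // stale basis: the caller must re-read and retry
        }
        props = new HashMap<>(newProps);
        version++;
        return true;
    }

    /** Read, apply the change to a copy, write conditionally; retry on conflict. */
    public void update(UnaryOperator<Map<String, String>> change) {
        while (true) {
            Versioned current = read();
            Map<String, String> next = change.apply(new HashMap<>(current.props));
            if (compareAndSet(current.version, next)) {
                return;
            }
        }
    }

    /** Hypothetical helper: the key layout is an assumption made for this sketch. */
    public void setReplicaProperty(String replica, String name, String value) {
        update(m -> {
            m.put(replica + "." + name, value);
            return m;
        });
    }
}
```

Because every write is conditional on the version it was based on, concurrent
writers cannot silently overwrite each other. A losing writer re-reads and
reapplies its change, so the updates are effectively serialized without an
explicit lock.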

Notes on waitForFinalState: due to the async nature of ZooKeeper watches,
when a wait completes on a Solr node it doesn't mean the state update is
also visible on other Solr nodes. In some cases with commands running
across multiple nodes, some of the nodes might not have seen the updates
which could make them fail (for example creating a core on a remote Solr
node for a replica that is not yet visible there?).
Also, always waiting for state can be inefficient when doing multiple state
changes, for example when creating multiple replicas during collection
creation. We don't want to wait for each replica separately. That currently
happens if WAIT_FOR_FINAL_STATE is set to true in the collection creation
message, and with the PR it will happen unless WAIT_FOR_FINAL_STATE is
explicitly set to false there. If the notion of waiting for state is removed
completely, as suggested in a PR comment
<https://github.com/apache/solr/pull/3684/files#r2377582861>, then creating
a collection will be slower. We likely want to always group the replica
creation for a new collection, and then wait (or not wait) for all of them
to be visible.
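
To make the grouping idea concrete, here is a rough sketch that contrasts
waiting after each replica with submitting all replicas first and waiting
once. It only simulates the visibility delay with a sleep. BatchedWaitSketch,
submitReplica, createOneByOne and createBatched are hypothetical names, not
anything in Solr or the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Simulation only: the "state update becomes visible" step is just a sleep. */
class BatchedWaitSketch {
    /** Hypothetical: submit one replica; the future completes once it is "visible". */
    static CompletableFuture<String> submitReplica(String name, ExecutorService pool, long delayMs) {
        return CompletableFuture.supplyAsync(() -> {
            sleep(delayMs);
            return name;
        }, pool);
    }

    /** Waits after every replica, so the total time is roughly n * delayMs. */
    static List<String> createOneByOne(List<String> names, ExecutorService pool, long delayMs) {
        List<String> visible = new ArrayList<>();
        for (String name : names) {
            visible.add(submitReplica(name, pool, delayMs).join());
        }
        return visible;
    }

    /** Submits everything first, then waits once for the whole group. */
    static List<String> createBatched(List<String> names, ExecutorService pool, long delayMs) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String name : names) {
            futures.add(submitReplica(name, pool, delayMs));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        List<String> visible = new ArrayList<>();
        for (CompletableFuture<String> f : futures) {
            visible.add(f.join()); // already complete, returns immediately
        }
        return visible;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With a pool that has at least as many threads as replicas, the batched
variant waits about one delay in total rather than one delay per replica,
which is the saving argued for above.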

Ilan


On Fri, Oct 10, 2025 at 1:27 AM David Smiley <[email protected]> wrote:

> For the SolrCloud experts here... I'm looking at this LockLevel:
>
> org.apache.solr.common.params.CollectionParams.CollectionAction#ADDREPLICAPROP
> I'm doubtful that replica property manipulation needs a mutual exclusion
> lock against other things in the cluster.  I believe a collection's state
> is modified in a CAS manner with a retry if the basis of the state version
> (the znode version of the collection's state) is different.
> Adding/removing properties seems simple/atomic enough so as to not need a
> lock level.  Am I missing something?
>
> This is in the context of a PR that will change waitForFinalState in order
> to help a test:
> https://github.com/apache/solr/pull/3684
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
