Actually the current lock level - if a lock is needed - should not be REPLICA but SHARD due to the isUnique flag that can lead to updating other replicas of the shard.
The actual cluster state update does not need locks. In Overseer it is handled by a single thread, and in distributed mode it uses CAS, so all updates end up being serialized. I would therefore tend to agree that setting a property on a replica will not have bad interactions with other concurrent Collection API commands.

Notes on waitForFinalState: due to the async nature of ZooKeeper watches, when a wait completes on a Solr node it doesn't mean the state update is also visible on other Solr nodes. In some cases with commands running across multiple nodes, some of the nodes might not have seen the updates, which could make them fail (for example creating a core on a remote Solr node for a replica that is not yet visible there?).

Also, always waiting for state can be inefficient when doing multiple state changes, for example when creating multiple replicas during collection creation. We don't want to wait for each replica separately. That currently happens if WAIT_FOR_FINAL_STATE is set to true in the collection creation message, and with the PR it will happen if WAIT_FOR_FINAL_STATE is *not set* to false in the collection creation message. If the notion of waiting for state is completely removed, as suggested in a PR comment <https://github.com/apache/solr/pull/3684/files#r2377582861>, then creating a collection will be slower. We likely want to always group the replica creation for a new collection, and then wait (or not wait) for all of them to be visible.

Ilan

On Fri, Oct 10, 2025 at 1:27 AM David Smiley <[email protected]> wrote:

> For the SolrCloud experts here... I'm looking at this LockLevel:
>
> org.apache.solr.common.params.CollectionParams.CollectionAction#ADDREPLICAPROP
> I'm doubtful that replica property manipulation needs a mutual exclusion
> lock against other things in the cluster. I believe a collection's state
> is modified in a CAS manner with a retry if the basis of the state version
> (the znode version of the collection's state) is different.
> Adding/removing properties seems simple/atomic enough so as to not need a
> lock level. Am I missing something?
>
> This is in the context of a PR that will change waitForFinalState in order
> to help a test:
> https://github.com/apache/solr/pull/3684
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
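
For anyone following the thread who wants the CAS-with-retry idea spelled out, here is a minimal sketch. It is an illustration only: VersionedStore and CasUpdater are hypothetical in-memory stand-ins invented for this example, not Solr or ZooKeeper classes, and the real code path in Solr is certainly more involved. The only point is that a writer whose base version is stale gets rejected and has to re-read and re-apply its change, so concurrent updates end up serialized without a lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical in-memory stand-in for a versioned state node; not a Solr or ZooKeeper class. */
final class VersionedStore {
    /** Immutable view of the properties plus the version they were read at. */
    static final class Snapshot {
        final Map<String, String> data;
        final int version;

        Snapshot(Map<String, String> data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    private Map<String, String> data = new HashMap<>();
    private int version = 0;

    synchronized Snapshot read() {
        return new Snapshot(Map.copyOf(data), version);
    }

    /** Writes only if the caller's base version is still the current one. */
    synchronized boolean compareAndSet(int expectedVersion, Map<String, String> newData) {
        if (expectedVersion != version) {
            return false; // another writer got in first; caller must re-read
        }
        data = new HashMap<>(newData);
        version++;
        return true;
    }
}

final class CasUpdater {
    /**
     * Reads the current state, applies the change to a copy, and attempts a
     * conditional write; repeats from the read if the write is rejected.
     * Returns the number of attempts it took.
     */
    static int update(VersionedStore store, UnaryOperator<Map<String, String>> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            VersionedStore.Snapshot base = store.read();
            Map<String, String> updated = change.apply(new HashMap<>(base.data));
            if (store.compareAndSet(base.version, updated)) {
                return attempts;
            }
        }
    }
}
```

In this sketch, a call such as CasUpdater.update(store, m -> { m.put("property.a", "true"); return m; }) succeeds on the first attempt when nobody else writes in between, and simply loops when its base version has gone stale. Note that this only covers serializing the write itself; it says nothing about when other nodes see the result, which is the separate waitForFinalState question above.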
