I replied in JIRA, noting that replicas are already created in parallel:
https://issues.apache.org/jira/browse/SOLR-17712?focusedCommentId=18029152&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-18029152

On Sun, Oct 12, 2025 at 2:33 PM Ilan Ginzburg <[email protected]> wrote:

> I forgot in my previous message: credit to Pierre Salagnac for noting the
> inefficiency of current collection creation and the need to create all
> replicas at once.
>
> On Sat, Oct 11, 2025 at 12:23 AM David Smiley <[email protected]> wrote:
>
> > On Fri, Oct 10, 2025 at 5:08 AM Ilan Ginzburg <[email protected]> wrote:
> >
> > > Actually the current lock level - if a lock is needed - should not be
> > > REPLICA but SHARD due to the isUnique flag that can lead to updating
> > > other replicas of the shard.
> >
> > Ugh. It's a shame to lock on account of the possibility of
> > BALANCESHARDUNIQUE. Perhaps to lock or not should be a parameter/option
> > of the command, or that command has advise on what *not* to do when
> > calling it. I think it's fairly obvious that one should not manipulate
> > the specific replica property involved in that command during the
> > execution of that command.
> >
> > > The actual cluster state update does not need locks. In Overseer it is
> > > handled by a single thread and in distributed mode it uses CAS so all
> > > updates end up being serialized.
> > > I would therefore tend to agree that setting a property on a replica
> > > will not have bad interactions with other concurrent Collection API
> > > commands.
> >
> > Thanks for confirming.
> >
> > > Notes on waitForFinalState: due to the async nature of ZooKeeper
> > > watches, when a wait completes on a Solr node it doesn't mean the state
> > > update is also visible on other Solr nodes. In some cases with commands
> > > running across multiple nodes, some of the nodes might not have seen
> > > the updates which could make them fail (for example creating a core on
> > > a remote Solr node for a replica that is not yet visible there?).
> >
> > Understood. With the _stateVer_ protocol between CloudSolrClient and the
> > server, that is somewhat solved, but that mechanism has room for
> > improvement. I have a draft plan/notes to do so. Anyway, today with
> > waitForFinalState=false (the default), a replica has to sync with the
> > leader and this takes time; potentially minutes.
> >
> > > Also, always waiting for state can be inefficient when doing multiple
> > > state changes. For example creating multiple replicas during collection
> > > creation.
> > > We don't want to wait for each replica separately which currently
> > > happens if WAIT_FOR_FINAL_STATE is set to true in the collection
> > > creation message, and with the PR will happen if WAIT_FOR_FINAL_STATE
> > > is *not set* to false in the collection creation message, and if the
> > > notion of waiting for state is completely removed as suggested in a PR
> > > comment <https://github.com/apache/solr/pull/3684/files#r2377582861>
> > > then creating a collection will be slower. We likely want to group the
> > > replica creation for a new collection always, and then wait (or not
> > > wait) for all of them to be visible.
> >
> > Very good point! I'll follow up on that PR to consider this.
> >
> > ~ David
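
For illustration only, the "group the replica creation, then wait (or not wait) for all of them" idea from the quoted thread could be sketched roughly as below. This is not Solr code: `GroupedReplicaCreation`, `createReplica`, `createAll` and the `waitForAll` flag are hypothetical names, and the sleep merely stands in for the time a state change takes to become visible.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

/** Illustrative only: none of these names are Solr APIs. */
class GroupedReplicaCreation {

    /** Hypothetical stand-in for creating one replica and seeing it become active. */
    static String createReplica(String name) {
        try {
            Thread.sleep(50); // pretend the state change takes a while to become visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name + ":active";
    }

    /** Submits every creation first, then waits at most once for the whole group. */
    static List<String> createAll(List<String> names, boolean waitForAll) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.size()));
        try {
            // Start all creations before waiting on any of them.
            List<CompletableFuture<String>> pending = names.stream()
                .map(n -> CompletableFuture.supplyAsync(() -> createReplica(n), pool))
                .collect(Collectors.toList());
            if (!waitForAll) {
                return List.of(); // caller chose not to wait; submitted tasks still finish
            }
            // Single wait for the group instead of one wait per replica.
            return pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        } finally {
            pool.shutdown(); // lets already-submitted tasks run to completion
        }
    }

    public static void main(String[] args) {
        System.out.println(createAll(List.of("replica1", "replica2", "replica3"), true));
    }
}
```

In this sketch the total wait is roughly that of the slowest replica rather than the sum over all replicas, which is the inefficiency the thread attributes to waiting for each replica separately.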
