I replied in JIRA:
https://issues.apache.org/jira/browse/SOLR-17712?focusedCommentId=18029152&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-18029152
noting that replicas are already created in parallel.

On Sun, Oct 12, 2025 at 2:33 PM Ilan Ginzburg <[email protected]> wrote:

> I forgot in my previous message: credit to Pierre Salagnac for noting the
> inefficiency of current collection creation and the need to create all
> replicas at once.
>
> On Sat, Oct 11, 2025 at 12:23 AM David Smiley <[email protected]> wrote:
>
> > On Fri, Oct 10, 2025 at 5:08 AM Ilan Ginzburg <[email protected]> wrote:
> >
> > > Actually the current lock level - if a lock is needed - should not be
> > > REPLICA but SHARD due to the isUnique flag that can lead to updating
> > > other replicas of the shard.
> > >
> >
> > Ugh.  It's a shame to lock on account of the possibility of
> > BALANCESHARDUNIQUE.  Perhaps whether to lock should be a parameter/option
> > of the command, or the command should come with advice on what *not* to
> > do when calling it.  I think it's fairly obvious that one should not
> > manipulate the specific replica property involved in that command during
> > the execution of that command.
> >
> >
> > > The actual cluster state update does not need locks. In Overseer it is
> > > handled by a single thread and in distributed mode it uses CAS so all
> > > updates end up being serialized.
> > > I would therefore tend to agree that setting a property on a replica
> > > will not have bad interactions with other concurrent Collection API
> > > commands.
> > >
> >
> > Thanks for confirming.
> >
> >
> > > Notes on waitForFinalState: due to the async nature of ZooKeeper
> > > watches, when a wait completes on a Solr node it doesn't mean the state
> > > update is also visible on other Solr nodes. In some cases with commands
> > > running across multiple nodes, some of the nodes might not have seen
> > > the updates which could make them fail (for example creating a core on
> > > a remote Solr node for a replica that is not yet visible there?).
> > >
> >
> > Understood.  With the _stateVer_ protocol between CloudSolrClient and the
> > server, that is somewhat solved, but that mechanism has room for
> > improvement.  I have a draft plan/notes to do so.  Anyway, today with
> > waitForFinalState=false (the default), a replica has to sync with the
> > leader and this takes time; potentially minutes.
> >
> > > Also, always waiting for state can be inefficient when doing multiple
> > > state changes. For example creating multiple replicas during collection
> > > creation. We don't want to wait for each replica separately, which
> > > currently happens if WAIT_FOR_FINAL_STATE is set to true in the
> > > collection creation message, and with the PR will happen if
> > > WAIT_FOR_FINAL_STATE is *not set* to false in the collection creation
> > > message. And if the notion of waiting for state is completely removed,
> > > as suggested in a PR comment
> > > <https://github.com/apache/solr/pull/3684/files#r2377582861>, then
> > > creating a collection will be slower. We likely want to group the
> > > replica creation for a new collection always, and then wait (or not
> > > wait) for all of them to be visible.
> > >
> >
> > Very good point!  I'll follow up on that PR to consider this.
> >
> > ~ David
> >
>
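
To illustrate the point quoted above that CAS-based updates end up serialized without a lock, here is a minimal sketch. It is not Solr code: the class and method names are hypothetical, and an in-memory AtomicReference stands in for the versioned conditional write that distributed mode would perform against ZooKeeper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not Solr's implementation.
class CasStateSketch {
    // Stand-in for the shared cluster state: replica name -> property value.
    static final AtomicReference<Map<String, String>> STATE =
        new AtomicReference<>(Map.of());

    // Read the current state, build a modified copy, and publish it only if
    // nobody else published in between. On a lost race, re-read and retry.
    static void setProperty(String replica, String value) {
        while (true) {
            Map<String, String> current = STATE.get();
            Map<String, String> updated = new HashMap<>(current);
            updated.put(replica, value);
            if (STATE.compareAndSet(current, Map.copyOf(updated))) {
                return;
            }
        }
    }

    // Starts one writer thread per replica and returns how many properties
    // ended up in the final state.
    static int runConcurrentUpdates(int writers) {
        STATE.set(Map.of());
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            final String replica = "replica" + i;
            threads[i] = new Thread(() -> setProperty(replica, "true"));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return STATE.get().size();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentUpdates(8));
    }
}
```

Every writer's property survives because a writer that loses the race retries against the newer state instead of overwriting it, which is why no explicit lock is needed for the update itself.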

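The "group the replica creation, then wait (or not wait) for all of them" idea from the quoted text can be sketched as follows. This is only an illustration under stated assumptions: createReplicaAsync and createAll are hypothetical names, and a CompletableFuture that completes immediately stands in for a replica becoming visible.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical sketch, not Solr's implementation.
class BatchedReplicaWaitSketch {
    // Pretends to create a replica asynchronously; the future completes
    // once the (simulated) replica has reached its final state.
    static CompletableFuture<String> createReplicaAsync(String name) {
        return CompletableFuture.supplyAsync(() -> name + ":active");
    }

    // Submits all replica creations first, then waits at most once for the
    // whole group instead of waiting after each individual creation.
    static List<String> createAll(List<String> names, boolean waitForFinalState) {
        List<CompletableFuture<String>> pending = names.stream()
            .map(BatchedReplicaWaitSketch::createReplicaAsync)
            .collect(Collectors.toList());
        if (!waitForFinalState) {
            return List.of(); // fire and forget: the caller does not wait
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return pending.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(createAll(List.of("r1", "r2", "r3"), true));
    }
}
```

With a single wait on the group, the total wait is bounded by the slowest replica rather than by the sum of all of them.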