Would per-replica state (PRS) help with that? PRS slices state by replica, not
by collection, but it should allow finer-grained locking.

https://searchscale.com/blog/prs/
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 16, 2024, at 9:03 AM, David Smiley <dsmi...@apache.org> wrote:
> 
> At work, in a scenario where a node starts with thousands of cores for
> thousands of collections, we've seen that core registration can
> bottleneck on ZkStateReader.forceUpdateCollection(collection), which
> synchronizes on getUpdateLock, a global lock (not per-collection).  I
> don't know the history or strategy behind that lock, but it's a
> code smell to see a global lock used in a circumstance that is
> scoped to one collection.  I suspect it's there because ClusterState
> is immutable and encompasses basically all state.  If it were instead a
> cache that can be snapshotted (for consumers that require an immutable
> state to act on), we could probably make getUpdateLock go away.  *If*
> a collection's state needs to be locked (and I doubt that it
> does, so long as cache insertion is done properly / exclusively), we
> could have a lock just for the collection.
> 
> Any concerns with this idea?
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

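The per-collection lock idea from the quoted message could look roughly like the sketch below. This is only an illustration under stated assumptions, not Solr's actual ZkStateReader code: the class PerCollectionStateCache, its methods, and the use of a String as a stand-in for a collection's state are all hypothetical. It shows a refresh that holds only the lock for the collection being updated, plus an immutable snapshot for consumers that need a stable view.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch of a snapshottable state cache; not Solr code. */
class PerCollectionStateCache {
  // Collection name -> latest known state (String is a stand-in type).
  private final ConcurrentHashMap<String, String> states = new ConcurrentHashMap<>();
  // One lock per collection, created lazily on first use.
  private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  /**
   * Refreshes a single collection while holding only that collection's lock,
   * so refreshes of different collections do not block each other.
   * The supplier stands in for a (slow) read from ZooKeeper and must not return null.
   */
  String forceUpdate(String collection, Supplier<String> fetchState) {
    ReentrantLock lock = locks.computeIfAbsent(collection, k -> new ReentrantLock());
    lock.lock();
    try {
      String fresh = fetchState.get();
      states.put(collection, fresh);
      return fresh;
    } finally {
      lock.unlock();
    }
  }

  /** Returns an immutable copy; later updates do not change it. */
  Map<String, String> snapshot() {
    return Map.copyOf(states);
  }
}
```

Whether the per-collection lock is needed at all is the open question in the thread; if each update were a single atomic cache insertion, ConcurrentHashMap.compute alone might be enough.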