Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

via GitHub Wed, 20 Mar 2024 05:40:58 -0700


benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009457766


   > How do we control the risk that a massive merge with KNN vectors soaks up 
all available concurrency from the shared Executor for intra-merge concurrency 
(all threads doing HNSW merging) and then starves smaller merges that would 
finish quickly?
   
   Intra-merge threads do not count against `maxThreadCount` and are not 
tracked in `mergeThreads`. Additionally, the intra-merge pool only ever allows 
up to `maxThreadCount - mergeThreads.size() - 1` threads to run on it.
   
   With this change, it is possible that a chunk of delegated work ties up an 
intra-merge thread and counts against that parallelism, but it wouldn't block 
other `mergeThreads` from running. Those other `mergeThreads` just might have 
to run the work themselves instead of handing it off to the intra-merge thread pool.
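   
   As a rough illustration of the cap and fallback described above, here is a 
minimal sketch. It is not the code in this PR: the class name 
`CappedIntraMergeExecutor`, the `limit` helper, and the caller-runs fallback 
are assumptions made up for the example, using only `java.util.concurrent`.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical sketch of a capped intra-merge executor; not Lucene's actual class. */
final class CappedIntraMergeExecutor implements Executor {
  private final Executor pool;        // shared intra-merge pool (assumed)
  private final IntSupplier limit;    // re-evaluated on every submission
  private final AtomicInteger active = new AtomicInteger();

  CappedIntraMergeExecutor(Executor pool, IntSupplier limit) {
    this.pool = pool;
    this.limit = limit;
  }

  /** The cap from the comment: maxThreadCount - mergeThreads.size() - 1, floored at 0. */
  static int limit(int maxThreadCount, int mergeThreadCount) {
    return Math.max(0, maxThreadCount - mergeThreadCount - 1);
  }

  @Override
  public void execute(Runnable task) {
    if (active.incrementAndGet() > limit.getAsInt()) {
      // Over the cap: the submitting merge thread runs the work itself.
      active.decrementAndGet();
      task.run();
      return;
    }
    try {
      pool.execute(() -> {
        try {
          task.run();
        } finally {
          active.decrementAndGet(); // release the slot when the task finishes
        }
      });
    } catch (RuntimeException e) {
      active.decrementAndGet(); // submission was rejected; release the slot
      throw e;
    }
  }
}
```

   With a supplier such as 
`() -> CappedIntraMergeExecutor.limit(maxThreadCount, mergeThreads.size())`, 
a large merge can occupy at most that many pool threads, and any other merge 
that finds the pool full still makes progress on its own thread.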
   
   I do think there is some future work here to unify thread tracking across 
all the inter- and intra-merge threads, but all the custom logic in `mergeThreads` 
and how we create and track them just seemed like too much to figure out in 
addition to adding intra-merge parallelism.
   
   > Maybe we don't need the IO write rate limiter anymore?
   
   I honestly don't know. There is still logic in the CMS that determines if we 
are on a spinning disk vs. SSD, which is sort of crazy to me :D.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


