benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009457766
> How do we control the risk that a massive merge with KNN vectors soaks up all available concurrency from the shared Executor for intra-merge concurrency (all threads doing HNSW merging) and then starves smaller merges that would finish quickly?

Intra-merge threads do not count against `maxThreadCount` and are not tracked in `mergeThreads`. Additionally, the intra-merge executor only allows up to `maxThreadCount - mergeThreads.size() - 1` threads to run on its pool at any one time.

With this change, it is possible that a delegated chunk of work occupies an intra-merge thread and counts against that parallelism, but it wouldn't block other `mergeThreads` from running. Those other `mergeThreads` might just have to do the work on their own thread instead of handing it off to the intra-merge thread pool.

I do think there is some future work here to unify thread tracking across all the inter- and intra-merge threads, but the custom logic in `mergeThreads` (how we create them, track them, etc.) seemed like too much to figure out in addition to adding intra-merge parallelism.

> Maybe we don't need the IO write rate limiter anymore?

I honestly don't know. There is still logic in the CMS that determines whether we are on a spinning disk or an SSD, which is sort of crazy to me :D.
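
To make the cap and the fallback concrete, here is a rough sketch of the idea, not the code in this PR. `IntraMergeSketch`, `BoundedExecutor`, and `intraMergeLimit` are hypothetical names invented for illustration, and flooring the limit at zero is my assumption. The executor hands a task to the shared pool only while fewer than `maxThreadCount - mergeThreads.size() - 1` tasks are active there; otherwise the calling merge thread runs the task itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Illustrative sketch only; all names here are hypothetical, not Lucene APIs. */
class IntraMergeSketch {

  /** maxThreadCount - mergeThreads.size() - 1, floored at zero (the floor is an assumption). */
  static int intraMergeLimit(int maxThreadCount, int activeMergeThreads) {
    return Math.max(0, maxThreadCount - activeMergeThreads - 1);
  }

  /** Delegates to the shared pool while under the limit; otherwise runs on the caller. */
  static final class BoundedExecutor implements Executor {
    private final Executor pool;
    private final IntSupplier limit;
    private final AtomicInteger active = new AtomicInteger();

    BoundedExecutor(Executor pool, IntSupplier limit) {
      this.pool = pool;
      this.limit = limit;
    }

    @Override
    public void execute(Runnable task) {
      int max = limit.getAsInt();
      while (true) {
        int current = active.get();
        if (current >= max) {
          // No slot free: the merge thread does the work itself, so it is never blocked.
          task.run();
          return;
        }
        if (active.compareAndSet(current, current + 1)) {
          break; // slot reserved on the shared pool
        }
      }
      try {
        pool.execute(() -> {
          try {
            task.run();
          } finally {
            active.decrementAndGet(); // release the slot when the task finishes
          }
        });
      } catch (RuntimeException e) {
        active.decrementAndGet(); // the pool rejected the task; release the slot
        throw e;
      }
    }
  }

  public static void main(String[] args) {
    // A stand-in "pool" that only queues tasks, to keep the demo deterministic.
    List<Runnable> queued = new ArrayList<>();
    // maxThreadCount = 4 and 2 running merge threads leave 1 intra-merge slot.
    BoundedExecutor executor = new BoundedExecutor(queued::add, () -> intraMergeLimit(4, 2));
    executor.execute(() -> System.out.println("handed to pool"));
    executor.execute(() -> System.out.println("ran on caller"));
    queued.forEach(Runnable::run);
  }
}
```

Because the limit is read on every `execute` call, it shrinks as more merge threads start, which is why a large HNSW merge can only consume whatever headroom is left at that moment rather than crowding out other merges.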