Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2032358113

@mikemccand oh dang, I haven't been doing that. Thanks for picking up my slack!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2032311146

It looks like this awesome change was backported for 9.11.0? I'll add the milestone. So hard to remember to set the milestones on our issues/PRs...
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
dnhatn commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2016724945

Elasticsearch CI has identified an issue related to this change: PerFieldDocValuesFormat and PerFieldPostingsFormat mutate and reset the fieldInfos of the mergeState while executing a merge. Consequently, other ongoing merges may fail to access some fieldInfos.

https://github.com/apache/lucene/blob/e2788336d49c56050b3cea5944008bcffe35dcb1/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldDocValuesFormat.java#L150-L158
https://github.com/apache/lucene/blob/04f335ad79cae019d1a552e41e0029d061d8e3d9/lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldPostingsFormat.java#L196-L214
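The hazard described above is a shared-mutable-state problem. A minimal, hypothetical sketch (not the actual Lucene code; `MergeStateSketch` and its fields are invented for illustration) of how one task's temporary reset of shared fieldInfos can make a field invisible to a concurrently running task:

```java
import java.util.List;

// Illustration only (not Lucene code): a shared "merge state" whose field list
// one task narrows and later restores. Any reader interleaved in between sees
// the narrowed view and misses fields.
public class SharedStateHazard {
    static class MergeStateSketch {
        List<String> fieldInfos = List.of("id", "title", "vector");
    }

    // Simulates: task A narrows fieldInfos, task B reads, task A restores.
    static boolean interleavedReaderSees(String field) {
        MergeStateSketch state = new MergeStateSketch();
        List<String> saved = state.fieldInfos;
        state.fieldInfos = List.of("vector");             // task A: temporary reset
        boolean found = state.fieldInfos.contains(field); // task B: interleaved read
        state.fieldInfos = saved;                         // task A: restores, too late for task B
        return found;
    }

    public static void main(String[] args) {
        System.out.println(interleavedReaderSees("title")); // false: field vanished mid-merge
    }
}
```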
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent merged PR #13190: URL: https://github.com/apache/lucene/pull/13190
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2011813580

> Nightly benchy hardwires the maxMergeCount=16, maxThreadCount=12

Do you remember how you came up with these numbers? Is there some reasoning behind them, or do they come from experimentation?

I'm also curious if you know where the current `maxThreadCount = max(1, min(4, coreCount / 2))` is coming from. Is the `4` trying to approximate the number of tiers of `TieredMergePolicy` (max seg size = 5GB, 500MB, 50MB, 5MB ~= 2MB = min seg size) in order to allow one merge on each tier to run concurrently?
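For reference, the default being asked about can be sketched in one line (a paraphrase of the formula quoted above; the class and method names here are invented):

```java
public class DefaultMergeThreads {
    // Hypothetical paraphrase of CMS's auto-detected default:
    // max(1, min(4, coreCount / 2)). Note the hard ceiling of 4 regardless
    // of how many cores the machine has.
    static int defaultMaxThreadCount(int coreCount) {
        return Math.max(1, Math.min(4, coreCount / 2));
    }

    public static void main(String[] args) {
        System.out.println(defaultMaxThreadCount(1));   // 1
        System.out.println(defaultMaxThreadCount(4));   // 2
        System.out.println(defaultMaxThreadCount(8));   // 4
        System.out.println(defaultMaxThreadCount(128)); // 4 (still capped)
    }
}
```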
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2011765808

> Thanks for pushing on this change, I like it. The fact that the extra merge concurrency may not starve merging from threads is good. I'm curious how the nightly benchmarks will react to it, given the high number of CPU cores of beast3.

Nightly benchy is still forcing its own thread pool into the HNSW indexing Codec component (`Lucene99Hnsw*VectorsFormat`) -- once this change is merged, I'll remove that so we switch to this awesome default approach. But maybe we won't see too much change in the nightly indexing throughput since it's already doing intra-merge concurrency "itself".

> One question on my mind is whether this change should make us update the default number of merging threads that `ConcurrentMergeScheduler` configures (in a follow-up issue/PR).

+1 to explore this. Nightly benchy hardwires `maxMergeCount=16`, `maxThreadCount=12`. Maybe they should be higher :) But also note that the nightly benchy does not wait for merges on `IndexWriter.close`, so it's not really a fair test of merge performance.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1533418454

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
@@ -445,8 +456,15 @@ private static String rateToString(double mbPerSec) {
   }
 
   @Override
-  public void close() {
-    sync();
+  public void close() throws IOException {
+    super.close();
```

Review Comment: nit: should we try to close as cleanly as possible in the event of an exception, e.g. doing something like below

```java
IOUtils.close(
    this::sync,
    super::close,
    intraMergeExecutor == null ? null : intraMergeExecutor::shutdown);
```
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010582889

@jpountz added comments to the creation of the rate limiter in CMS and to the limiter itself. Also updated the close interaction.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010548779

@benwtrent FYI I left a few minor comments: on `ConcurrentMergeScheduler#close`, and about adding a comment for the limitation you identified that a merge may not get throttled if the sum of bytes written exceeds the rate but none of the merge threads for the same merge exceeds the rate on its own.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010260555

If y'all are good with it, I will merge this tomorrow.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1532364018

## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:

```diff
@@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
       throw new MergePolicy.MergeAbortedException("Merge aborted.");
     }
 
-    double rate = mbPerSec; // read from volatile rate once.
-    double secondsToPause = (bytes / 1024. / 1024.) / rate;
-
-    // Time we should sleep until; this is purely instantaneous
-    // rate (just adds seconds onto the last time we had paused to);
-    // maybe we should also offer decayed recent history one?
-    long targetNS = lastNS + (long) (1000000000 * secondsToPause);
-
-    long curPauseNS = targetNS - curNS;
-
-    // We don't bother with thread pausing if the pause is smaller than 2 msec.
-    if (curPauseNS <= MIN_PAUSE_NS) {
-      // Set to curNS, not targetNS, to enforce the instant rate, not
-      // the "averaged over all history" rate:
-      lastNS = curNS;
+    final double rate = mbPerSec; // read from volatile rate once.
+    final double secondsToPause = (bytes / 1024. / 1024.) / rate;
+
+    AtomicLong curPauseNSSetter = new AtomicLong();
+    lastNS.updateAndGet(
+        last -> {
+          // Time we should sleep until; this is purely instantaneous
+          // rate (just adds seconds onto the last time we had paused to);
+          // maybe we should also offer decayed recent history one?
+          long targetNS = last + (long) (1000000000 * secondsToPause);
+          long curPauseNS = targetNS - curNS;
+          // We don't bother with thread pausing if the pause is smaller than 2 msec.
+          if (curPauseNS <= MIN_PAUSE_NS) {
+            // Set to curNS, not targetNS, to enforce the instant rate, not
+            // the "averaged over all history" rate:
+            curPauseNSSetter.set(0);
+            return curNS;
+          }
```

Review Comment:
> > is giving us here, ensuring we accurately account for all bytes written by N threads within a single merge
>
> That is not what it is doing. It is ensuring that the throttling timestamps stay in sync :/. Bytes being throttled are still per RateLimitedIndexOutput, which means they are per thread.

Ahh gotchya. Each `RateLimitedIndexOutput` tracks its own `bytesSinceLastPause` and then invokes `pause` with its own private byte count, OK.

So, yes, with the current approach we rate limit MB/sec per thread, not per merge. I think that's fine. Best effort!
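The per-thread behavior agreed on above can be sketched as a tiny instant-rate limiter (a hypothetical simplification of `MergeRateLimiter#maybePause`; the class and method names here are invented, and the real bookkeeping is more involved):

```java
// Minimal sketch of per-thread instant-rate pausing: each writer thread calls
// maybePauseNanos with only its own byte count, so the target MB/sec is
// enforced per thread, not per merge.
public class InstantRateSketch {
    static final long MIN_PAUSE_NS = 2_000_000; // skip pauses under 2 msec

    private long lastNS;

    InstantRateSketch(long startNS) {
        this.lastNS = startNS;
    }

    /** Returns how long the caller should sleep, in nanoseconds. */
    synchronized long maybePauseNanos(long bytes, double mbPerSec, long curNS) {
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        // Purely instantaneous rate: add seconds onto the last pause time.
        long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
        long curPauseNS = targetNS - curNS;
        if (curPauseNS <= MIN_PAUSE_NS) {
            lastNS = curNS; // enforce the instant rate, not the historical average
            return 0;
        }
        lastNS = targetNS;
        return curPauseNS;
    }

    public static void main(String[] args) {
        InstantRateSketch limiter = new InstantRateSketch(0);
        // Writing 10 MB at 10 MB/sec right at t=0 should ask for a 1-second pause.
        System.out.println(limiter.maybePauseNanos(10 * 1024 * 1024, 10.0, 0)); // 1000000000
    }
}
```

Since each thread keeps its own byte count in the real code, N threads can each write up to the limit before any of them pauses, which is exactly the per-thread (not per-merge) effect discussed above.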
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009561079

> Maybe we don't need the IO write rate limiter anymore?

Nevermind, I see that you added more details on #13193.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009512368

> Maybe we don't need the IO write rate limiter anymore?

I'm curious what you have in mind. Are you considering throttling merges purely based on merge concurrency then? E.g. slow indexing -> single merge at a time, heavy indexing -> many parallel merges? This wouldn't sound crazy to me as I don't recall seeing indexing/merging saturate I/O.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009500202

I re-ran `IndexGeoNames` with the latest commits on this PR. I didn't override CMS settings at all & I allowed merging to occur while indexing. This was done on my MacBook, so the numbers might be a little fuzzy. I indexed with 4 threads, and did the batching.

RamBufferSize: 12MB

Baseline:
```
8221789: 28.378 sec
forceMerge took 18856 ms
```
Candidate:
```
8221789: 27.428 sec
forceMerge took 6581 ms
```

RamBufferSize: 1MB

Baseline:
```
8221789: 49.209 sec
forceMerge took 17235 ms
```
Candidate:
```
8221789: 46.078 sec
forceMerge took 4638 ms
```
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009418841

> Selfishly, I wish for the second option as it's the simplest.

+1 to keep with the second approach. It is best effort, and the "instant vs burst-bucket" bug above is yet more weirdness about it :)

Another thought on the "instant vs burst-bucket" bug: I suppose in some cases, e.g. bulk merging of stored fields (no deletions, so we are just copying byte blocks), there is very little CPU and tons of IO, and in cases like that the instant and burst-bucket approaches would be essentially the same (both wind up throttling based on the instant rate since the burst bucket is quickly exhausted).

I'll open a spinoff ...
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009457766

> How do we control the risk that a massive merge with KNN vectors soaks up all available concurrency from the shared Executor for intra-merge concurrency (all threads doing HNSW merging) and then starves smaller merges that would finish quickly?

Intra-merge threads do not count against the `maxThreadCount` and are not tracked in `mergeThreads`. Additionally, intra-merge work only ever gets up to `maxThreadCount - mergeThreads.size() - 1` threads on its pool. With this change, it is possible that a chunk of delegated work starves an intra-merge thread and counts against that parallelism, but it wouldn't block other `mergeThreads` from running. Those other `mergeThreads` just might have to run the work on themselves instead of off of the intra-merge thread pool.

I do think there is some future work here to unify thread tracking between all the merge threads, but all the custom logic around `mergeThreads` and how we create, track, etc. just seemed like too much to figure out in addition to adding intra-merge parallelism.

> Maybe we don't need the IO write rate limiter anymore?

I honestly don't know. There is still logic in the CMS that determines if we are on a spinning disk vs. SSD, which is sort of crazy to me :D.
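The "run the work on themselves" fallback described above can be sketched as a capped caller-runs executor (an illustration under assumed names, not the actual CMS implementation):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Sketch: delegate up to `maxIntraMergeThreads` tasks to a shared pool; once
// the cap is reached, run the task on the calling merge thread instead of
// blocking it, so merges always make progress.
public class CappedCallerRunsExecutor {
    private final ExecutorService delegate;
    private final Semaphore permits;

    CappedCallerRunsExecutor(ExecutorService delegate, int maxIntraMergeThreads) {
        this.delegate = delegate;
        this.permits = new Semaphore(maxIntraMergeThreads);
    }

    void execute(Runnable task) {
        if (permits.tryAcquire()) {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot for the next intra-merge task
                }
            });
        } else {
            task.run(); // pool saturated: caller thread does the work itself
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // With a cap of 0, every task falls back to the calling thread.
        CappedCallerRunsExecutor exec = new CappedCallerRunsExecutor(pool, 0);
        exec.execute(() -> System.out.println("ran on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

The key property being illustrated: saturation degrades intra-merge parallelism but never blocks the outer merge thread, matching the behavior described above.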
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009439976

I opened https://github.com/apache/lucene/issues/13193 specifically about the "instant vs burst bucket" IO rate limiting approaches.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009413290

Phew, it took me two hours of strongly caffeinated time to catch up on this! Thank you @benwtrent. What an awesome change, modernizing Lucene's merge concurrency so that intra-merge concurrency (just HNSW, but maybe other index parts soon) is enabled by default.

(Separately, the nightly benchy had a silly off-by-one bug preventing it from resuming after you reverted the previous PR ... I think I've fixed that bug now, and kicked off a one-off nightly benchy run that will hopefully finish late today.)

Some high level questions (sorry if these were already asked/answered):

* How do we control the risk that a massive merge with KNN vectors soaks up all available concurrency from the shared Executor for intra-merge concurrency (all threads doing HNSW merging) and then starves smaller merges that would finish quickly?

* Maybe we don't need the IO write rate limiter anymore? It's a tricky setting because the `mbPerSec` you set is then multiplied by the number of concurrent merges that are running. It is a per-merge setting, not a global setting, so it's kinda trappy today.

Finally, I worry that the rate limiter's simplistic "instantaneous" measure makes it effectively super buggy (there is a TODO about this), because merging that does a lot of CPU work and then writes a lot of bytes will be throttled to far below the target `mbPerSec` in aggregate. A better solution might be something like the "burst IOPS bucket" that AWS (and likely other cloud providers) offer ("Burst IOPS is a feature of Amazon Web Services (AWS) EBS volume types that allows applications to store unused IOPS in a burst bucket and then drain them when needed" -- thank you Gemini for the summary). This is clearly not a blocker for this issue, but we really should (separately) fix it. I'll open a follow-on issue about this ... but to me it's another reason to maybe remove this feature entirely.
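The "burst bucket" idea mentioned above is the classic token bucket. A minimal sketch (class and method names are invented; this is a concept illustration, not a proposal for Lucene's API):

```java
// Token-bucket sketch: unused write budget accumulates (up to burstBytes)
// while the merge is CPU-bound, then drains when a flood of bytes arrives,
// so bursty writers are not throttled below the average target rate.
public class BurstBucket {
    private final double bytesPerSec;
    private final long burstBytes;
    private double available;
    private long lastNS;

    BurstBucket(double bytesPerSec, long burstBytes, long nowNS) {
        this.bytesPerSec = bytesPerSec;
        this.burstBytes = burstBytes;
        this.available = burstBytes; // start with a full bucket
        this.lastNS = nowNS;
    }

    /** Nanoseconds the caller should pause before writing {@code bytes}. */
    synchronized long acquire(long bytes, long nowNS) {
        // Refill proportionally to elapsed time, capped at the burst size.
        available = Math.min(burstBytes, available + (nowNS - lastNS) / 1e9 * bytesPerSec);
        lastNS = nowNS;
        available -= bytes;
        if (available >= 0) {
            return 0; // budget left in the bucket: no pause
        }
        return (long) (-available / bytesPerSec * 1e9); // pause until the deficit refills
    }

    public static void main(String[] args) {
        BurstBucket bucket = new BurstBucket(1000.0, 1000, 0);
        System.out.println(bucket.acquire(500, 0));  // 0: within the burst budget
        System.out.println(bucket.acquire(1000, 0)); // 500000000: 0.5s to repay the deficit
    }
}
```

Compared with the instant-rate limiter, a CPU-heavy stretch followed by a write flood drains the saved-up budget first instead of pausing immediately, which is exactly the behavior the comment above argues for.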
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1531945845

## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:

(quoting the same `maybePause` diff hunk shown above)

Review Comment: Sorry, I'm trying to catch up here and will probably ask a bunch of stupid questions :) Thank you @benwtrent for persisting in this hairy logic!

> (I'm considering making `pause` synchronized rather than `maybePause` so that `System.nanoTime()` is computed within the lock and the pausing logic accounts for the fact that some time may have been spent waiting on the lock already.)

Couldn't we move the `curNS = System.nanoTime()` inside the locked/updateAndGet'd part of `maybePause`? I like the thread safety that the `updateAndGet` is giving us here, ensuring we accurately account for all bytes written by N threads within a single merge. Also, I don't expect the added sync required here will hurt performance much: Lucene is doing much work to produce these bytes being written already, so conflict should be rare-ish.

Also, if we make `pause` sync'd, it will cause scary looking `jstack` thread dumps? Making it look like one thread is sleeping while holding a lock and blocking other threads (which indeed is what it'd be doing). Versus the thread stacks we'd see w/ the current approach that make it quite clear that all N threads are intentionally stalling in `mergeProgress.pauseNanos`? It would reduce the horror reaction we'd see peeking at thread dumps maybe ...
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1531964164

## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:

(quoting the same `maybePause` diff hunk shown above)

Review Comment:
> Couldn't we move the `curNS = System.nanoTime()` inside the locked/updateAndGet'd part of `maybePause`?

Yes, we could; otherwise, if there are conflicts between threads (thus having to do the `updateAndGet` more than once), the `curNS` gets further in the past.

> is giving us here, ensuring we accurately account for all bytes written by N threads within a single merge

That is not what it is doing. It is ensuring that the throttling timestamps stay in sync :/. Bytes being throttled are still per `RateLimitedIndexOutput`, which means they are per thread.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009404137

@mikemccand @jpountz So, thinking about this more as I fell asleep. This is how throttling works as it is in this PR:

- Throttling is per thread. Meaning, an intra-merge thread only gets throttled once the bytes it has written reach the rate limit set.
- Consequently, we may get to `rateLimitBytes*numIntraMergeIO` bytes before a single thread gets throttled. This makes us throttle less often given how many bytes are actually written.

Maybe this is OK as the throttling logic could "catch up" if things continue to get backed up? Throttling has always been a "best effort" thing anyways. Even if we somehow made the RateLimiter throttle every thread (via some global semaphore or something...), we would still only throttle once one of the multiple threads hit the byte throttling limit.

IMO, either we:

- account for bytes globally (per directory) & throttle globally (global lock that pauses all threads), or
- accept that throttling is per thread and bytes used are measured in individual threads.

Selfishly, I wish for the second option as it's the simplest.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1531735553 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -446,7 +454,13 @@ private static String rateToString(double mbPerSec) { @Override public void close() { -sync(); +try { Review Comment: should we call `super.close()` as well, to close `MergeScheduler`'s `SameThreadExecutorService` which `ConcurrentMergeScheduler` uses for small merges? It's not a big deal not to close it, but it would help catch if we ever send tasks to this executor after closing. ## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java: ## @@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedE throw new MergePolicy.MergeAbortedException("Merge aborted."); } -double rate = mbPerSec; // read from volatile rate once. -double secondsToPause = (bytes / 1024. / 1024.) / rate; - -// Time we should sleep until; this is purely instantaneous -// rate (just adds seconds onto the last time we had paused to); -// maybe we should also offer decayed recent history one? -long targetNS = lastNS + (long) (10 * secondsToPause); - -long curPauseNS = targetNS - curNS; - -// We don't bother with thread pausing if the pause is smaller than 2 msec. -if (curPauseNS <= MIN_PAUSE_NS) { - // Set to curNS, not targetNS, to enforce the instant rate, not - // the "averaged over all history" rate: - lastNS = curNS; +final double rate = mbPerSec; // read from volatile rate once. +final double secondsToPause = (bytes / 1024. / 1024.) / rate; + +AtomicLong curPauseNSSetter = new AtomicLong(); +lastNS.updateAndGet( Review Comment: > the output isn't really throttled until a single thread exceeds the limit. This limitation feels ok to me, let's just add a comment about it? 
Intuitively, the write rate at merge time is rather bursty, so if the sum of the bytes written by all threads running this merge exceeds the limit, then there would often be one thread that exceeds the limit on its own as well. ## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java: ## @@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedE throw new MergePolicy.MergeAbortedException("Merge aborted."); } -double rate = mbPerSec; // read from volatile rate once. -double secondsToPause = (bytes / 1024. / 1024.) / rate; - -// Time we should sleep until; this is purely instantaneous -// rate (just adds seconds onto the last time we had paused to); -// maybe we should also offer decayed recent history one? -long targetNS = lastNS + (long) (1000000000 * secondsToPause); - -long curPauseNS = targetNS - curNS; - -// We don't bother with thread pausing if the pause is smaller than 2 msec. -if (curPauseNS <= MIN_PAUSE_NS) { - // Set to curNS, not targetNS, to enforce the instant rate, not - // the "averaged over all history" rate: - lastNS = curNS; +final double rate = mbPerSec; // read from volatile rate once. +final double secondsToPause = (bytes / 1024. / 1024.) / rate; + +AtomicLong curPauseNSSetter = new AtomicLong(); +lastNS.updateAndGet( +last -> { + // Time we should sleep until; this is purely instantaneous + // rate (just adds seconds onto the last time we had paused to); + // maybe we should also offer decayed recent history one? + long targetNS = last + (long) (1000000000 * secondsToPause); + long curPauseNS = targetNS - curNS; + // We don't bother with thread pausing if the pause is smaller than 2 msec. + if (curPauseNS <= MIN_PAUSE_NS) { +// Set to curNS, not targetNS, to enforce the instant rate, not +// the "averaged over all history" rate: +curPauseNSSetter.set(0); +return curNS; + } Review Comment: It's hard for me to reason about this as well. 
I'm wondering about keeping `maybePause` as-is and making `MergeRateLimiter#pause` synchronized, essentially trying to make the pausing logic behave as if threads were writing bytes sequentially rather than in parallel. (I'm considering making `pause` synchronized rather than `maybePause` so that `System.nanoTime()` is computed within the lock and the pausing logic accounts for the fact that some time may have been spent waiting on the lock already.)
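A minimal sketch of what that suggestion could look like — this is not the actual Lucene implementation, just a simplified limiter with the suggested `synchronized pause()` so concurrent writers are accounted for as if they wrote sequentially:

```java
import java.util.concurrent.TimeUnit;

// Sketch only: single-threaded pacing math guarded by a lock on pause().
public class SequentialRateLimiterSketch {
    private static final long MIN_PAUSE_NS = TimeUnit.MILLISECONDS.toNanos(2);

    private final double mbPerSec;
    private long lastNS = System.nanoTime();

    public SequentialRateLimiterSketch(double mbPerSec) {
        this.mbPerSec = mbPerSec;
    }

    /** Returns how long we actually paused, in nanoseconds. */
    public synchronized long pause(long bytes) throws InterruptedException {
        // nanoTime() is read inside the lock, so time already spent waiting
        // for the lock counts toward the pause this thread owes.
        long curNS = System.nanoTime();
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
        long curPauseNS = targetNS - curNS;
        // Skip tiny pauses, enforcing the instant rate rather than the
        // "averaged over all history" rate:
        if (curPauseNS <= MIN_PAUSE_NS) {
            lastNS = curNS;
            return 0;
        }
        lastNS = targetNS;
        TimeUnit.NANOSECONDS.sleep(curPauseNS);
        return curPauseNS;
    }

    public static void main(String[] args) throws Exception {
        // At a generous 10 GB/s budget, a 1 KB write should never pause:
        SequentialRateLimiterSketch limiter = new SequentialRateLimiterSketch(10_000.0);
        System.out.println(limiter.pause(1024)); // 0
    }
}
```

Because `lastNS` only advances under the lock, two threads pausing concurrently are charged back-to-back rather than each against its own private history.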
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1531018293 ## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java: ## @@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedE throw new MergePolicy.MergeAbortedException("Merge aborted."); } -double rate = mbPerSec; // read from volatile rate once. -double secondsToPause = (bytes / 1024. / 1024.) / rate; - -// Time we should sleep until; this is purely instantaneous -// rate (just adds seconds onto the last time we had paused to); -// maybe we should also offer decayed recent history one? -long targetNS = lastNS + (long) (1000000000 * secondsToPause); - -long curPauseNS = targetNS - curNS; - -// We don't bother with thread pausing if the pause is smaller than 2 msec. -if (curPauseNS <= MIN_PAUSE_NS) { - // Set to curNS, not targetNS, to enforce the instant rate, not - // the "averaged over all history" rate: - lastNS = curNS; +final double rate = mbPerSec; // read from volatile rate once. +final double secondsToPause = (bytes / 1024. / 1024.) / rate; + +AtomicLong curPauseNSSetter = new AtomicLong(); +lastNS.updateAndGet( Review Comment: So, the limits are on a per-`RateLimitedIndexOutput` basis, not necessarily per thread. Since all `RateLimitedIndexOutput` instances constructed share the same rate-limiter, it might pause all threads? But I don't think so, as each thread could have its own `RateLimitedIndexOutput` which is tracking its own bytes locally, and while the combined total of all threads might exceed the throttling limit, the output isn't really throttled until a single thread exceeds the limit.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1531011510 ## lucene/core/src/java/org/apache/lucene/index/MergePolicy.java: ## @@ -136,13 +136,13 @@ public boolean isAborted() { */ public void pauseNanos(long pauseNanos, PauseReason reason, BooleanSupplier condition) throws InterruptedException { - if (Thread.currentThread() != owner) { + /* if (Thread.currentThread() != owner) { throw new RuntimeException( "Only the merge owner thread can call pauseNanos(). This thread: " + Thread.currentThread().getName() + ", owner thread: " + owner); - } + }*/ Review Comment: > I guess not, but now we need to make sure that the rate is applied across all threads rather than per-thread? Looking at the code, it seems like it was previously assumed that only one `createOutput` was called at a time for a merge. Now, it could be that more than one is called. I guess this means we need a `RateLimitingDirectory` that passes things in to the `RateLimitedIndexOutput` to ensure global rate limiting is controlled across all outputs. Is this what you are talking about? I am not sure that `pauseNanos` should directly know about other threads and pause them. It seems better to put this up at the directory level.
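A rough sketch of the directory-level idea being floated here. `SharedRateAccounting` is an invented name, not a Lucene class: every output created by the hypothetical `RateLimitingDirectory` would funnel its byte counts into one shared counter, so the pause decision reflects the combined write rate rather than any single output's local count:

```java
import java.util.concurrent.atomic.AtomicLong;

// Invented sketch: one shared byte counter per directory, consulted by all
// outputs. The thread that crosses the threshold pauses on behalf of the group.
public class SharedRateAccounting {
    private final AtomicLong bytesSinceLastPause = new AtomicLong();
    private final long pauseCheckBytes;

    public SharedRateAccounting(long pauseCheckBytes) {
        this.pauseCheckBytes = pauseCheckBytes;
    }

    /** Every output calls this on write; returns the bytes to pause for, or 0. */
    public long addBytesAndMaybeCharge(long bytes) {
        long total = bytesSinceLastPause.addAndGet(bytes);
        if (total > pauseCheckBytes && bytesSinceLastPause.compareAndSet(total, 0)) {
            return total; // this thread "wins" and would pause for the group's bytes
        }
        return 0; // either under the threshold, or another thread won the CAS
    }

    public static void main(String[] args) {
        SharedRateAccounting acct = new SharedRateAccounting(1000);
        // Two writers adding 600 bytes each: neither alone crosses the
        // threshold, but the shared counter does on the second write.
        long first = acct.addBytesAndMaybeCharge(600);
        long second = acct.addBytesAndMaybeCharge(600);
        System.out.println(first + " " + second); // 0 1200
    }
}
```

If the CAS loses a race, the bytes simply stay counted for the next check, so nothing is dropped — a best-effort property that matches how merge throttling is described elsewhere in the thread.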
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1530907881 ## lucene/core/src/java/org/apache/lucene/store/RateLimitedIndexOutput.java: ## @@ -28,30 +30,30 @@ public final class RateLimitedIndexOutput extends FilterIndexOutput { private final RateLimiter rateLimiter; /** How many bytes we've written since we last called rateLimiter.pause. */ - private long bytesSinceLastPause; + private final AtomicLong bytesSinceLastPause = new AtomicLong(0); Review Comment: Ah, yeah, I could remove all the concurrency checks here. I was just concerned because the above assertions tripped due to the threads. But you are correct, they are still ultimately only written to via one thread.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1530384809 ## lucene/core/src/java/org/apache/lucene/index/MergePolicy.java: ## @@ -136,13 +136,13 @@ public boolean isAborted() { */ public void pauseNanos(long pauseNanos, PauseReason reason, BooleanSupplier condition) throws InterruptedException { - if (Thread.currentThread() != owner) { + /* if (Thread.currentThread() != owner) { throw new RuntimeException( "Only the merge owner thread can call pauseNanos(). This thread: " + Thread.currentThread().getName() + ", owner thread: " + owner); - } + }*/ Review Comment: I guess not, but now we need to make sure that the rate is applied across all threads rather than per-thread? ## lucene/core/src/java/org/apache/lucene/store/RateLimitedIndexOutput.java: ## @@ -28,30 +30,30 @@ public final class RateLimitedIndexOutput extends FilterIndexOutput { private final RateLimiter rateLimiter; /** How many bytes we've written since we last called rateLimiter.pause. */ - private long bytesSinceLastPause; + private final AtomicLong bytesSinceLastPause = new AtomicLong(0); Review Comment: I'm still unclear why this needs to become an `AtomicLong`, if `bytesSinceLastPause` could be updated/read concurrently, then this would imply that multiple `writeXXX` methods would also be called concurrently, which doesn't make much sense as bytes would be written to disk in a non-deterministic order? ## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java: ## @@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedE throw new MergePolicy.MergeAbortedException("Merge aborted."); } -double rate = mbPerSec; // read from volatile rate once. -double secondsToPause = (bytes / 1024. / 1024.) / rate; - -// Time we should sleep until; this is purely instantaneous -// rate (just adds seconds onto the last time we had paused to); -// maybe we should also offer decayed recent history one? 
-long targetNS = lastNS + (long) (1000000000 * secondsToPause); - -long curPauseNS = targetNS - curNS; - -// We don't bother with thread pausing if the pause is smaller than 2 msec. -if (curPauseNS <= MIN_PAUSE_NS) { - // Set to curNS, not targetNS, to enforce the instant rate, not - // the "averaged over all history" rate: - lastNS = curNS; +final double rate = mbPerSec; // read from volatile rate once. +final double secondsToPause = (bytes / 1024. / 1024.) / rate; + +AtomicLong curPauseNSSetter = new AtomicLong(); +lastNS.updateAndGet( Review Comment: If we used a lock for this whole method, then this may help pause all threads that share the same rate limiter, and effectively apply the rate limit across all threads that run the same merge, rather than applying the rate limit on a per-thread basis?
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2007100589 @jpountz > the caller needs to manage synchronization themselves anyway if they want bytes to be written in the correct order? The MT safety isn't really around the bytes. I agree with your assessment there. But it's more around the throttling. The way I read the class is that different threads could be writing to different files, but both could experience throttling, and we should account for that.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2006570648 I understand that we need to make changes to account for the fact that multiple threads may be contributing to the same merge concurrently, but I would not expect `RateLimitedIndexOutput` to need to be thread-safe: the caller needs to manage synchronization themselves anyway if they want bytes to be written in the correct order?
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13190: URL: https://github.com/apache/lucene/pull/13190#discussion_r1528642636 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -281,11 +297,11 @@ public IndexOutput createOutput(String name, IOContext context) throws IOExcepti // Because rateLimiter is bound to a particular merge thread, this method should // always be called from that context. Verify this. -assert mergeThread == Thread.currentThread() -: "Not the same merge thread, current=" -+ Thread.currentThread() -+ ", expected=" -+ mergeThread; +/*assert mergeThread == Thread.currentThread() +: "Not the same merge thread, current=" ++ Thread.currentThread() ++ ", expected=" ++ mergeThread;*/ Review Comment: I don't know about this, it seems like a paranoid check that may not be necessary if RLIO were threadsafe? ## lucene/core/src/java/org/apache/lucene/store/RateLimitedIndexOutput.java: ## @@ -28,30 +30,30 @@ public final class RateLimitedIndexOutput extends FilterIndexOutput { private final RateLimiter rateLimiter; /** How many bytes we've written since we last called rateLimiter.pause. */ - private long bytesSinceLastPause; + private final AtomicLong bytesSinceLastPause = new AtomicLong(0); Review Comment: This obviously creates some overhead now as RLIO could be accessed via multiple threads. I will likely have to benchmark to see the impact. ## lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java: ## @@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedE throw new MergePolicy.MergeAbortedException("Merge aborted."); } -double rate = mbPerSec; // read from volatile rate once. -double secondsToPause = (bytes / 1024. / 1024.) / rate; - -// Time we should sleep until; this is purely instantaneous -// rate (just adds seconds onto the last time we had paused to); -// maybe we should also offer decayed recent history one? 
-long targetNS = lastNS + (long) (1000000000 * secondsToPause); - -long curPauseNS = targetNS - curNS; - -// We don't bother with thread pausing if the pause is smaller than 2 msec. -if (curPauseNS <= MIN_PAUSE_NS) { - // Set to curNS, not targetNS, to enforce the instant rate, not - // the "averaged over all history" rate: - lastNS = curNS; +final double rate = mbPerSec; // read from volatile rate once. +final double secondsToPause = (bytes / 1024. / 1024.) / rate; + +AtomicLong curPauseNSSetter = new AtomicLong(); +lastNS.updateAndGet( Review Comment: We would need to lock here or do an `updateAndGet`. `lastNS` is only updated in this particular method, so using atomics seemed better to me. ## lucene/core/src/java/org/apache/lucene/store/RateLimitedIndexOutput.java: ## @@ -62,30 +64,42 @@ public void writeBytes(byte[] b, int offset, int length) throws IOException { @Override public void writeInt(int i) throws IOException { -bytesSinceLastPause += Integer.BYTES; +bytesSinceLastPause.addAndGet(Integer.BYTES); checkRate(); out.writeInt(i); } @Override public void writeShort(short i) throws IOException { -bytesSinceLastPause += Short.BYTES; +bytesSinceLastPause.addAndGet(Short.BYTES); checkRate(); out.writeShort(i); } @Override public void writeLong(long i) throws IOException { -bytesSinceLastPause += Long.BYTES; +bytesSinceLastPause.addAndGet(Long.BYTES); checkRate(); out.writeLong(i); } private void checkRate() throws IOException { -if (bytesSinceLastPause > currentMinPauseCheckBytes) { - rateLimiter.pause(bytesSinceLastPause); - bytesSinceLastPause = 0; - currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); +AtomicLong localBytesSinceLastPause = new AtomicLong(0); +AtomicBoolean shouldPause = new AtomicBoolean(false); +bytesSinceLastPause.updateAndGet( +bytes -> { + if (bytes > currentMinPauseCheckBytes.get()) { +shouldPause.set(true); +currentMinPauseCheckBytes.set(rateLimiter.getMinPauseCheckBytes()); +localBytesSinceLastPause.set(bytes); +return 0; + } else 
{ +shouldPause.set(false); + } Review Comment: I figured if another thread reached this throttling and is paused & reset the counter, then this current thread shouldn't pause. Throttle tracking is effectively global and each thread uses the same localBytesSinceLastPause, so the work done by another thread could cause a current thread to pause. Or another thread pausing could allow this thread to continue work. ##
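The `updateAndGet` side-channel pattern being debated in these review comments can be reduced to a small standalone sketch (names invented; this is not the PR's code). A second `AtomicLong` smuggles an extra result out of the update lambda, which is part of why the reviewers found it hard to reason about: the lambda may be retried under contention, and the side channel only reflects the last (winning) attempt:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of returning two values from an atomic update: the new state goes
// through updateAndGet, and a per-call AtomicLong carries the extra result.
public class UpdateAndGetSideChannel {
    private static final AtomicLong lastNS = new AtomicLong();

    static long advance(long deltaNS) {
        AtomicLong resultSetter = new AtomicLong(); // side channel, one per call
        lastNS.updateAndGet(last -> {
            long target = last + deltaNS;
            // May run more than once if another thread races us; only the
            // attempt that wins the CAS leaves its value behind.
            resultSetter.set(target);
            return target;
        });
        return resultSetter.get();
    }

    public static void main(String[] args) {
        System.out.println(advance(100)); // 100
        System.out.println(advance(50));  // 150
    }
}
```

The side-channel value must be a pure function of the lambda's inputs for this to be safe, since retried attempts overwrite earlier writes.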
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2004024374 @dweiss @mikemccand I am currently iterating on how to best make `RateLimitedIndexOutput`, `MergePolicy`, and `MergeRateLimiter` thread safe. Right now, it is all assumed that the interactions with all of these happen on the same thread; this obviously breaks when we add intra-merge parallelism. I was hoping y'all had some thoughts on how these should all work together with intra-merge parallelism? Is it enough to make these classes threadsafe and remove assertions? Do we want to somehow figure out if the ultimate calling thread was a MergeThread? (This is possible, but will require some wrangling on the tp-executor to keep track of which thread belongs where...)
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-2003789175 I am going to revert the change and open a new PR to iterate on a fix. `RateLimitedIndexOutput` isn't threadsafe and our rate limiting assumes a single thread. With this commit, more than one thread could be writing, and that needs to be accounted for.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-2002000206 Ah, that is likely due to this change. I bet throttling is occurring, but in one of the other threadpool threads. I didn't override the thread construction, so maybe that is enough? I can look more closely next week.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
mikemccand commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-2001975632 This looks like a great change! ... Lucene finally catching up to modern concurrency :) I have not looked closely yet. But nightly benchy is upset, prolly related to this, when running `runFacetsBenchmarks.py`. I haven't looked closely yet, and will be offline mostly next couple days (sorry for the hit!), but here's the exception that popped out:
```
Exception in thread "main" org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:913)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:926)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4070)
    at perf.facets.IndexFacets.main(IndexFacets.java:127)
Caused by: java.lang.RuntimeException: Only the merge owner thread can call pauseNanos(). This thread: pool-1-thread-1, owner thread: Thread[#290,Lucene Merge Thread #12,5,main]
    at org.apache.lucene.index.MergePolicy$OneMergeProgress.pauseNanos(MergePolicy.java:142)
    at org.apache.lucene.index.MergeRateLimiter.maybePause(MergeRateLimiter.java:147)
    at org.apache.lucene.index.MergeRateLimiter.pause(MergeRateLimiter.java:92)
    at org.apache.lucene.store.RateLimitedIndexOutput.checkRate(RateLimitedIndexOutput.java:86)
    at org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:55)
    at org.apache.lucene.store.ByteBuffersDataOutput.copyTo(ByteBuffersDataOutput.java:339)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.writeBlock(Lucene90BlockTreeTermsWriter.java:1022)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.writeBlocks(Lucene90BlockTreeTermsWriter.java:760)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.pushTerm(Lucene90BlockTreeTermsWriter.java:1133)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.write(Lucene90BlockTreeTermsWriter.java:1089)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter.write(Lucene90BlockTreeTermsWriter.java:399)
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:95)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:205)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:236)
    at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:325)
    at org.apache.lucene.index.SegmentMerger.lambda$merge$0(SegmentMerger.java:147)
    at org.apache.lucene.search.TaskExecutor$TaskGroup.lambda$createTask$0(TaskExecutor.java:117)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at org.apache.lucene.index.ConcurrentMergeScheduler$CachedExecutor.lambda$execute$0(ConcurrentMergeScheduler.java:978)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
    Suppressed: java.lang.RuntimeException: Only the merge owner thread can call pauseNanos(). This thread: pool-1-thread-2, owner thread: Thread[#290,Lucene Merge Thread #12,5,main]
        at org.apache.lucene.index.MergePolicy$OneMergeProgress.pauseNanos(MergePolicy.java:142)
        at org.apache.lucene.index.MergeRateLimiter.maybePause(MergeRateLimiter.java:147)
        at org.apache.lucene.index.MergeRateLimiter.pause(MergeRateLimiter.java:92)
        at org.apache.lucene.store.RateLimitedIndexOutput.checkRate(RateLimitedIndexOutput.java:86)
        at org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:55)
        at org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:54)
        at org.apache.lucene.util.packed.DirectWriter.flush(DirectWriter.java:97)
        at org.apache.lucene.util.packed.DirectWriter.add(DirectWriter.java:83)
        at org.apache.lucene.codecs.lucene90.Lucene90DocValuesConsumer.writeValuesSingleBlock(Lucene90DocValuesConsumer.java:349)
        at org.apache.lucene.codecs.lucene90.Lucene90DocValuesConsumer.writeValues(Lucene90DocValuesConsumer.java:328)
        at org.apache.lucene.codecs.lucene90.Lucene90DocValuesConsumer.doAddSortedNumericField(Lucene90DocValuesConsumer.java:707)
        at
```
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent merged PR #13124: URL: https://github.com/apache/lucene/pull/13124
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1525307991 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -902,12 +932,52 @@ private static String getSegmentName(MergePolicy.OneMerge merge) { } static { -TestSecrets.setConcurrentMergeSchedulerAccess( -new ConcurrentMergeSchedulerAccess() { - @Override - public void setSuppressExceptions(ConcurrentMergeScheduler cms) { -cms.setSuppressExceptions(); - } -}); + TestSecrets.setConcurrentMergeSchedulerAccess(ConcurrentMergeScheduler::setSuppressExceptions); + } + + private class CachedExecutor implements Executor { Review Comment: Can we have some simple javadoc explaining what this executor is doing? E.g. it will use a spare thread if we are within the configured max thread count, and will run directly on the caller thread if no extra thread is available?
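The behavior that javadoc would describe can be sketched as a standalone executor (all names here are invented for illustration; this is not the actual `CachedExecutor` from the PR): hand tasks to a pool thread while under a configured budget, otherwise run them directly on the caller's thread so work always makes progress:

```java
import java.util.concurrent.Executor;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: fork to a pool thread when spare capacity exists, else caller-runs.
public class CallerRunsWhenSaturatedExecutor implements Executor {
    private final AtomicInteger activeCount = new AtomicInteger();
    private final int maxThreads;
    // Cached-style pool: threads are created on demand and reaped when idle.
    private final ThreadPoolExecutor pool =
        new ThreadPoolExecutor(0, 1024, 1L, TimeUnit.MINUTES, new SynchronousQueue<>());

    public CallerRunsWhenSaturatedExecutor(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    @Override
    public void execute(Runnable command) {
        if (activeCount.incrementAndGet() <= maxThreads) {
            pool.execute(() -> {
                try {
                    command.run();
                } finally {
                    activeCount.decrementAndGet();
                }
            });
        } else {
            // No spare thread budget: run synchronously on the caller thread.
            try {
                command.run();
            } finally {
                activeCount.decrementAndGet();
            }
        }
    }

    public void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        // With a budget of zero extra threads, the task must run on the caller:
        CallerRunsWhenSaturatedExecutor exec = new CallerRunsWhenSaturatedExecutor(0);
        Thread caller = Thread.currentThread();
        final boolean[] ranOnCaller = {false};
        exec.execute(() -> ranOnCaller[0] = Thread.currentThread() == caller);
        System.out.println(ranOnCaller[0]); // true
        exec.shutdown();
    }
}
```

The caller-runs fallback is what keeps merges from deadlocking when every pool slot is busy: the merge thread itself just does the work inline.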
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1525270702 ## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -130,19 +135,31 @@ MergeState merge() throws IOException { IOContext.READ, segmentWriteState.segmentSuffix); +TaskExecutor taskExecutor = new TaskExecutor(mergeState.intraMergeTaskExecutor); +List> mergingTasks = new ArrayList<>(); + if (mergeState.mergeFieldInfos.hasNorms()) { mergeWithLogging(this::mergeNorms, segmentWriteState, segmentReadState, "norms", numMerged); } mergeWithLogging(this::mergeTerms, segmentWriteState, segmentReadState, "postings", numMerged); Review Comment: Presumably, the worst-case scenario is not much worse than if you have many concurrent merges, so I think it's fine.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1525266457 ## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -130,19 +135,31 @@ MergeState merge() throws IOException { IOContext.READ, segmentWriteState.segmentSuffix); +TaskExecutor taskExecutor = new TaskExecutor(mergeState.intraMergeTaskExecutor); +List> mergingTasks = new ArrayList<>(); + if (mergeState.mergeFieldInfos.hasNorms()) { mergeWithLogging(this::mergeNorms, segmentWriteState, segmentReadState, "norms", numMerged); } mergeWithLogging(this::mergeTerms, segmentWriteState, segmentReadState, "postings", numMerged); Review Comment: OK, I added a commit that does this. The candidate merge time is now: `forceMerge took 4965 ms` vs. baseline (I re-ran to make sure) `forceMerge took 15783 ms` So, 3x faster? Seems pretty good. The major downside is that memory usage on merge might now increase, as we are potentially doing all this merge activity at the same time. If we are cool with that, this seems like a nice speed improvement.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1525217090 ## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -130,19 +135,31 @@ MergeState merge() throws IOException { IOContext.READ, segmentWriteState.segmentSuffix); +TaskExecutor taskExecutor = new TaskExecutor(mergeState.intraMergeTaskExecutor); +List> mergingTasks = new ArrayList<>(); + if (mergeState.mergeFieldInfos.hasNorms()) { mergeWithLogging(this::mergeNorms, segmentWriteState, segmentReadState, "norms", numMerged); } mergeWithLogging(this::mergeTerms, segmentWriteState, segmentReadState, "postings", numMerged); Review Comment: Can we have a parallel task that handles norms + terms so that the order is respected? ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -902,12 +932,57 @@ private static String getSegmentName(MergePolicy.OneMerge merge) { } static { -TestSecrets.setConcurrentMergeSchedulerAccess( -new ConcurrentMergeSchedulerAccess() { - @Override - public void setSuppressExceptions(ConcurrentMergeScheduler cms) { -cms.setSuppressExceptions(); + TestSecrets.setConcurrentMergeSchedulerAccess(ConcurrentMergeScheduler::setSuppressExceptions); + } + + private class ScaledExecutor implements Executor { + +private final AtomicInteger activeCount = new AtomicInteger(0); +private final ThreadPoolExecutor executor; + +public ScaledExecutor() { + this.executor = + new ThreadPoolExecutor(0, 1024, 1L, TimeUnit.MINUTES, new SynchronousQueue<>()); +} + +void shutdown() { + executor.shutdown(); +} + +@Override +public void execute(Runnable command) { + assert mergeThreads.contains(Thread.currentThread()) : "caller is not a merge thread"; Review Comment: I'd expect this assertion to no longer be valid, since SegmentMerger may fork into this executor for vectors, and then the task for vectors may want to further fork into a separate thread? 
So the caller may be either a MergeThread, or a thread from the wrapper thread pool? ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,55 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor implements Executor { Review Comment: nit: should it be called `CachedExecutor` now? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1995183835 @jpountz ok, that naive attempt failed as norms & terms apparently need to be merged in order (one or the other would fail due to missing files...). I am not sure if this is true of other things. But when I adjusted for this (latest commit) I did see a speed improvement. My test was over IndexGeoNames; I did no segment merging and flushed every 12MB to ensure I got many segments. I also set CMS to `8` merge threads.
Baseline forcemerge(1): 17793 ms
Candidate forcemerge(1): 13739 ms
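The adjustment described above — keeping the order-dependent steps together while other field types run in parallel — can be sketched with plain `java.util.concurrent`. This is a minimal illustration, not the PR's actual `SegmentMerger` code; the class name and the string "log" stand-ins for real per-field merges are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class GroupedMergeSketch {

  // Runs stand-in "merges" that just log their name. Norms and terms stay
  // ordered inside a single task; independent field types get their own tasks.
  static List<String> runMerge() throws Exception {
    ExecutorService intraMergeExecutor = Executors.newFixedThreadPool(4);
    List<String> log = Collections.synchronizedList(new ArrayList<>());
    List<Callable<Void>> tasks = new ArrayList<>();
    // norms and terms depend on each other's files, so keep them in one task
    tasks.add(() -> { log.add("norms"); log.add("terms"); return null; });
    tasks.add(() -> { log.add("docValues"); return null; });
    tasks.add(() -> { log.add("points"); return null; });
    try {
      for (Future<Void> future : intraMergeExecutor.invokeAll(tasks)) {
        future.get(); // rethrow any failure from a merge task
      }
    } finally {
      intraMergeExecutor.shutdown();
    }
    return log;
  }

  public static void main(String[] args) throws Exception {
    List<String> log = runMerge();
    // "norms" always precedes "terms", whatever the interleaving
    System.out.println(log.indexOf("norms") < log.indexOf("terms")); // prints true
  }
}
```

Lucene's own `TaskExecutor` plays the role of `invokeAll` here, additionally wrapping the checked `IOException` handling that the merge methods need.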
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1523620861
## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java: ## @@ -152,7 +154,25 @@ public Lucene99HnswVectorsFormat() { * @param beamWidth the size of the queue maintained during graph construction. */ public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) { -this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null); +super("Lucene99HnswVectorsFormat"); +if (maxConn <= 0 || maxConn > MAXIMUM_MAX_CONN) { + throw new IllegalArgumentException( + "maxConn must be positive and less than or equal to " + + MAXIMUM_MAX_CONN + + "; maxConn=" + + maxConn); +} +if (beamWidth <= 0 || beamWidth > MAXIMUM_BEAM_WIDTH) { + throw new IllegalArgumentException( + "beamWidth must be positive and less than or equal to " + + MAXIMUM_BEAM_WIDTH + + "; beamWidth=" + + beamWidth); +} +this.maxConn = maxConn; +this.beamWidth = beamWidth; +this.mergeExec = null; +this.numMergeWorkers = 1;
Review Comment: Yeah I agree, so for now we'll just fix it at 8?
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1994921965 > Yes exactly, something very simple, mostly to exercise intra-merge concurrency with more than just vectors.
Latest commit adds `TaskExecutor` actions to the merge so that different field types can be merged in parallel. I needed to use `TaskExecutor` as it abstracts out the exception handling (all these methods throw `IOException`). I still need to benchmark it with Lucene-util.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1994458100 Yes exactly, something very simple, mostly to exercise intra-merge concurrency with more than just vectors.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1994399018 @jpountz would you prefer something like the original patch from @dweiss ? I can submit the merging actions independently to the intra-merge executor. Anything more (like figuring out how to make postings merge truly parallel) will be quite a lift for me.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1994373820 Thanks for sharing performance numbers @benwtrent, very interesting. Also double checking if you saw my above comment: https://github.com/apache/lucene/pull/13124#pullrequestreview-1930418114, I'd have more confidence about the way merge concurrency is exposed if we took advantage of it in more than one place. If there's some difficulty with it, I'm happy with looking into it in a follow-up.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1994372130 @zhaih @jpountz I am going to create a separate issue around making HNSW worker slicing automatic. It will require a bunch of its own benchmarking and work and honestly seems orthogonal to this particular change as HNSW merging still works fine when statically setting the number of workers & using the CMS thread pool.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1523142360
## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java: ## @@ -152,7 +154,25 @@ public Lucene99HnswVectorsFormat() { * @param beamWidth the size of the queue maintained during graph construction. */ public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) { -this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null); +super("Lucene99HnswVectorsFormat"); +if (maxConn <= 0 || maxConn > MAXIMUM_MAX_CONN) { + throw new IllegalArgumentException( + "maxConn must be positive and less than or equal to " + + MAXIMUM_MAX_CONN + + "; maxConn=" + + maxConn); +} +if (beamWidth <= 0 || beamWidth > MAXIMUM_BEAM_WIDTH) { + throw new IllegalArgumentException( + "beamWidth must be positive and less than or equal to " + + MAXIMUM_BEAM_WIDTH + + "; beamWidth=" + + beamWidth); +} +this.maxConn = maxConn; +this.beamWidth = beamWidth; +this.mergeExec = null; +this.numMergeWorkers = 1;
Review Comment: > Oh I see, you haven't yet put in the change for auto figure out number of workers
I am wanting to do that separately. It will require its own benchmarking, etc. and this change is good even without choosing an automatic split for the hnsw graph merge.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1522601711 ## lucene/core/src/java/org/apache/lucene/index/MergeScheduler.java: ## @@ -52,6 +56,14 @@ public Directory wrapForMerge(OneMerge merge, Directory in) { return in; } + /** + * Provides an executor for parallelism during a single merge operation. By default, the method + * returns `null` indicating that there is no parallelism during a single merge operation. Review Comment: I think it's ok to force this method always return non-null? Since we already have provided a good default? ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java: ## @@ -152,7 +154,25 @@ public Lucene99HnswVectorsFormat() { * @param beamWidth the size of the queue maintained during graph construction. */ public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) { -this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null); +super("Lucene99HnswVectorsFormat"); +if (maxConn <= 0 || maxConn > MAXIMUM_MAX_CONN) { + throw new IllegalArgumentException( + "maxConn must be positive and less than or equal to " + + MAXIMUM_MAX_CONN + + "; maxConn=" + + maxConn); +} +if (beamWidth <= 0 || beamWidth > MAXIMUM_BEAM_WIDTH) { + throw new IllegalArgumentException( + "beamWidth must be positive and less than or equal to " + + MAXIMUM_BEAM_WIDTH + + "; beamWidth=" + + beamWidth); +} +this.maxConn = maxConn; +this.beamWidth = beamWidth; +this.mergeExec = null; +this.numMergeWorkers = 1; Review Comment: Can remove those 2? ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java: ## @@ -152,7 +154,25 @@ public Lucene99HnswVectorsFormat() { * @param beamWidth the size of the queue maintained during graph construction. 
*/ public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) { -this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null); +super("Lucene99HnswVectorsFormat"); +if (maxConn <= 0 || maxConn > MAXIMUM_MAX_CONN) { + throw new IllegalArgumentException( + "maxConn must be positive and less than or equal to " + + MAXIMUM_MAX_CONN + + "; maxConn=" + + maxConn); +} +if (beamWidth <= 0 || beamWidth > MAXIMUM_BEAM_WIDTH) { + throw new IllegalArgumentException( + "beamWidth must be positive and less than or equal to " + + MAXIMUM_BEAM_WIDTH + + "; beamWidth=" + + beamWidth); +} +this.maxConn = maxConn; +this.beamWidth = beamWidth; +this.mergeExec = null; +this.numMergeWorkers = 1;
Review Comment: Oh I see, you haven't yet put in the change for auto figure out number of workers
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1992172143 @jpountz OK, so I did some benchmarking to look at the impact of this change. 500k docs, flushing segments every 1MB and then force merging. For all of these, the number of workers is `8` and the threads available are `8`.

Baseline single threaded (no concurrency in HNSW):
```
Indexed: 467617ms
Force merge: 593037 ms
```
Baseline multi-threaded (executor separate from CMS):
```
Indexed: 173143ms
Force merge: 120341 ms
```
Candidate, CMS with 8 merge threads, sharing with intra-merge (what this PR is doing). This shows that we share the threads with the CMS background threads. There is a ton of merging work being done, so usually just 1 thread is being passed to the intra-merge executor. This is an expected result and working as intended. However, we see a speed-up in force-merge as no other merge activity is occurring and it can use all the provided CMS threads.
```
Indexed: 424924ms
Force merge: 121705 ms
```
To confirm that this is indeed the case and that indexing wasn’t slowed down due to some other weird overhead, I gave 2x as many threads to intra-merging. This effectively removes the limit that CMS is providing to intra-merges, and shows that indexing is now in line with the multi-threaded baseline, basically using 2x as many threads as configured for CMS (8 merge threads and 8 intra-merge threads).
```
Indexed: 171825ms
Force merge: 121886 ms
```
To make sure we don’t slow down too much compared to the single-threaded baseline, I gave 8 workers to the HNSW merge but only provided a SameThreadExecutorService. I did this to justify defaulting to a SameThreadExecutorService, even when HNSW has > 1 worker. Well within the runtime of baseline:
```
Indexed: 418529ms
Force merge: 524468 ms
```
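The "SameThreadExecutorService" fallback benchmarked above is essentially the trivial executor that runs each task on the calling thread. A minimal sketch (the class and method names here are illustrative, not Lucene's):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

public class SameThreadExecutorSketch {

  // The degenerate executor: every task runs on the calling thread, so
  // "8 workers" degrade gracefully to sequential, single-threaded merging.
  static final Executor SAME_THREAD = Runnable::run;

  // Returns the name of the thread the executor actually ran the task on.
  static String workerThreadName(Executor executor) throws InterruptedException {
    final String[] name = new String[1];
    final CountDownLatch done = new CountDownLatch(1);
    executor.execute(() -> {
      name[0] = Thread.currentThread().getName();
      done.countDown();
    });
    done.await();
    return name[0];
  }

  public static void main(String[] args) throws InterruptedException {
    // with the same-thread executor, the "worker" is the caller itself
    System.out.println(
        workerThreadName(SAME_THREAD).equals(Thread.currentThread().getName())); // prints true
  }
}
```

Because the executor never forks, its overhead versus plain sequential code is negligible, which is what the last benchmark pair above demonstrates.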
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
jpountz commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1521095208 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +941,58 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor implements Executor { + +private final AtomicInteger activeCount = new AtomicInteger(0); +private final ThreadPoolExecutor executor; + +public ScaledExecutor() { + this.executor = + new ThreadPoolExecutor( + 0, Math.max(1, maxThreadCount), 1L, TimeUnit.MINUTES, new SynchronousQueue<>()); +} + +void shutdown() { + executor.shutdown(); +} + +private void updatePoolSize() { + executor.setMaximumPoolSize(Math.max(1, maxThreadCount)); +} Review Comment: It feels like we don't actually need this since we already have control on the thread pool size via `maxThreadCount`. What about setting a high max thread count in the ctor and never updating it (ie. making it a "cached" executor rather than a "scaled" executor). ## lucene/core/src/java/org/apache/lucene/index/MergeScheduler.java: ## @@ -52,6 +53,14 @@ public Directory wrapForMerge(OneMerge merge, Directory in) { return in; } + /** + * Provides an executor for parallelism during a single merge operation. By default, the method + * returns `null` indicating that there is no parallelism during a single merge operation. + */ + public Executor getIntraMergeExecutor(OneMerge merge) { +return null; Review Comment: What about returning `SameThreadExecutor` by default instead? 
## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,68 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor implements Executor { + +private final AtomicInteger activeCount = new AtomicInteger(0); +private final ThreadPoolExecutor executor; + +public ScaledExecutor() { + this.executor = + new ThreadPoolExecutor( + 0, Math.max(1, maxThreadCount), 1L, TimeUnit.MINUTES, new SynchronousQueue<>()); +} + +void shutdown() { + executor.shutdown(); +} + +private void updatePoolSize() { + executor.setMaximumPoolSize(Math.max(1, maxThreadCount)); +} + +@Override +public void execute(Runnable command) { + assert mergeThreads.contains(Thread.currentThread()) : "caller is not a merge thread"; + Thread currentThread = Thread.currentThread(); + if (currentThread instanceof MergeThread mergeThread) { +execute(mergeThread, command); + } else { +command.run(); + } +} + +private void execute(MergeThread mergeThread, Runnable command) { + // don't do multithreaded merges for small merges + if (mergeThread.merge.estimatedMergeBytes < MIN_BIG_MERGE_MB * 1024 * 1024) { +command.run(); + } else { +final boolean isThreadAvailable; +// we need to check if a thread is available before submitting the task to the executor +// synchronize on CMS to get an accurate count of current threads +synchronized (ConcurrentMergeScheduler.this) { + int max = maxThreadCount - mergeThreads.size(); + int value = activeCount.get(); + if (value < max) { +activeCount.incrementAndGet(); +isThreadAvailable = true; + } else { +isThreadAvailable = false; + } +}
Review Comment: @benwtrent ++ There is a risk of letting bigger merges starve smaller merges from threads otherwise, and I'm not sure how we could fix it.
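The "cached" executor jpountz suggests — core size 0, a high fixed maximum, and a keep-alive so idle threads are reclaimed — can be sketched directly with `ThreadPoolExecutor`. This mirrors the constructor quoted in the review above; the surrounding class is a hypothetical stand-in for CMS.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedExecutorSketch {

  // Core size 0 + 1-minute keep-alive: idle threads die off instead of
  // pinning heap when many IndexWriters share a JVM. A high fixed maximum
  // means the pool never needs resizing when the scheduler's thread budget
  // changes; throttling is enforced outside the pool, before submission.
  static ThreadPoolExecutor newCachedExecutor() {
    return new ThreadPoolExecutor(0, 1024, 1L, TimeUnit.MINUTES, new SynchronousQueue<>());
  }

  public static void main(String[] args) throws InterruptedException {
    ThreadPoolExecutor executor = newCachedExecutor();
    System.out.println(executor.getPoolSize()); // prints 0: no threads until demanded
    CountDownLatch done = new CountDownLatch(1);
    executor.execute(done::countDown); // SynchronousQueue hands off: thread spawned on demand
    done.await();
    executor.shutdown();
  }
}
```

The `SynchronousQueue` is what makes this "cached": tasks are never queued, so each `execute` either hands off to an idle thread or creates a new one (up to the maximum).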
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
msokolov commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1513268905 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,68 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor implements Executor { + +private final AtomicInteger activeCount = new AtomicInteger(0); +private final ThreadPoolExecutor executor; + +public ScaledExecutor() { + this.executor = + new ThreadPoolExecutor( + 0, Math.max(1, maxThreadCount), 1L, TimeUnit.MINUTES, new SynchronousQueue<>()); +} + +void shutdown() { + executor.shutdown(); +} + +private void updatePoolSize() { + executor.setMaximumPoolSize(Math.max(1, maxThreadCount)); +} + +@Override +public void execute(Runnable command) { + assert mergeThreads.contains(Thread.currentThread()) : "caller is not a merge thread"; + Thread currentThread = Thread.currentThread(); + if (currentThread instanceof MergeThread mergeThread) { +execute(mergeThread, command); + } else { +command.run(); + } +} + +private void execute(MergeThread mergeThread, Runnable command) { + // don't do multithreaded merges for small merges + if (mergeThread.merge.estimatedMergeBytes < MIN_BIG_MERGE_MB * 1024 * 1024) { +command.run(); + } else { +final boolean isThreadAvailable; +// we need to check if a thread is available before submitting the task to the executor +// synchronize on CMS to get an accurate count of current threads +synchronized (ConcurrentMergeScheduler.this) { + int max = maxThreadCount - mergeThreads.size(); + int value = activeCount.get(); + if (value < max) { +activeCount.incrementAndGet(); +isThreadAvailable = true; + } else { +isThreadAvailable = false; + } +} Review Comment: I am going to just make stuff up here since I don't know this code well at all -- but I think I was assuming there was a ThreadPoolExecutor somewhere that was doling out threads for both of these operations. 
Now I'm realizing perhaps that's not the case and CMS is doing all that tracking on its own. I wonder if we should open a separate issue to migrate CMS to use JDK's thread pool abstractions? It could make some of this easier to handle.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1513208670 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,68 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor implements Executor { + +private final AtomicInteger activeCount = new AtomicInteger(0); +private final ThreadPoolExecutor executor; + +public ScaledExecutor() { + this.executor = + new ThreadPoolExecutor( + 0, Math.max(1, maxThreadCount), 1L, TimeUnit.MINUTES, new SynchronousQueue<>()); +} + +void shutdown() { + executor.shutdown(); +} + +private void updatePoolSize() { + executor.setMaximumPoolSize(Math.max(1, maxThreadCount)); +} + +@Override +public void execute(Runnable command) { + assert mergeThreads.contains(Thread.currentThread()) : "caller is not a merge thread"; + Thread currentThread = Thread.currentThread(); + if (currentThread instanceof MergeThread mergeThread) { +execute(mergeThread, command); + } else { +command.run(); + } +} + +private void execute(MergeThread mergeThread, Runnable command) { + // don't do multithreaded merges for small merges + if (mergeThread.merge.estimatedMergeBytes < MIN_BIG_MERGE_MB * 1024 * 1024) { +command.run(); + } else { +final boolean isThreadAvailable; +// we need to check if a thread is available before submitting the task to the executor +// synchronize on CMS to get an accurate count of current threads +synchronized (ConcurrentMergeScheduler.this) { + int max = maxThreadCount - mergeThreads.size(); + int value = activeCount.get(); + if (value < max) { +activeCount.incrementAndGet(); +isThreadAvailable = true; + } else { +isThreadAvailable = false; + } +} Review Comment: One major decision I made here is that CMS thread throttling knows nothing about intra-merge threads. 
This means, if there are intra-merge threads running, it is possible that concurrency creeps above `maxThreadCount` until the intra-merge thread(s) finish(es). Making throttling & overall CMS behavior more aware of intra-merge thread usage will be a much trickier change. I wasn't sure it was worth it. I am open to opinions on this @msokolov @jpountz.
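The "borrow a thread if one is free, otherwise run inline" decision discussed above can be sketched in isolation. This is a simplified stand-in for the quoted CMS code, not the actual patch: `BorrowOrRunInline` and its fields are hypothetical names, and the real code also gates on merge size.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

public class BorrowOrRunInline {
  final AtomicInteger activeCount = new AtomicInteger();
  final int maxThreadCount;      // total thread budget of the scheduler
  final int mergeThreadCount;    // threads already used by top-level merges
  final Executor pool;           // intra-merge worker pool

  BorrowOrRunInline(int maxThreadCount, int mergeThreadCount, Executor pool) {
    this.maxThreadCount = maxThreadCount;
    this.mergeThreadCount = mergeThreadCount;
    this.pool = pool;
  }

  void execute(Runnable command) {
    final boolean threadAvailable;
    // synchronize so the borrowed-thread count stays consistent with the
    // scheduler's view of how many merge threads are running
    synchronized (this) {
      int budget = maxThreadCount - mergeThreadCount; // threads CMS is not using
      if (activeCount.get() < budget) {
        activeCount.incrementAndGet();
        threadAvailable = true;
      } else {
        threadAvailable = false;
      }
    }
    if (threadAvailable) {
      pool.execute(() -> {
        try {
          command.run();
        } finally {
          activeCount.decrementAndGet(); // return the borrowed slot
        }
      });
    } else {
      command.run(); // saturated: fall back to the calling merge thread
    }
  }

  public static void main(String[] args) {
    // all CMS threads busy: every task must fall back to the caller
    BorrowOrRunInline saturated = new BorrowOrRunInline(4, 4, Runnable::run);
    int[] ran = new int[1];
    saturated.execute(() -> ran[0]++);
    System.out.println(ran[0] + " " + saturated.activeCount.get()); // prints 1 0
  }
}
```

The inline fallback is what keeps concurrency from creeping unboundedly: when the budget is exhausted, intra-merge work costs no extra thread, only the merge thread's own time.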
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
shubhamvishu commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1506927786
## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -56,13 +58,19 @@ final class SegmentMerger { InfoStream infoStream, Directory dir, FieldInfos.FieldNumbers fieldNumbers, - IOContext context) + IOContext context, + Executor parallelMergeTaskExecutor) throws IOException { if (context.context != IOContext.Context.MERGE) { throw new IllegalArgumentException( "IOContext.context should be MERGE; got: " + context.context); } -mergeState = new MergeState(readers, segmentInfo, infoStream); +mergeState = +new MergeState( +readers, +segmentInfo, +infoStream, +parallelMergeTaskExecutor == null ? null : new TaskExecutor(parallelMergeTaskExecutor));
Review Comment: > it's a bit weird to use a class from the search package for merging (TaskExecutor)
+1 .. @jpountz since now we are using `TaskExecutor` on both search and indexing use cases does it make sense to move it into `org.apache.lucene.util` package maybe?
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1503892704
## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,58 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor extends ThreadPoolExecutor { + +AtomicInteger activeCount = new AtomicInteger(0); + +public ScaledExecutor() { + super( + Math.max(0, maxThreadCount - 1), + Math.max(1, maxThreadCount - 1), + Long.MAX_VALUE, + TimeUnit.NANOSECONDS, + new SynchronousQueue<>()); +} + +private void updatePoolSize() { + int newMax = Math.max(0, maxThreadCount - 1); + if (newMax > getCorePoolSize()) { +setMaximumPoolSize(Math.max(newMax, 1)); +setCorePoolSize(newMax); + } else { +setCorePoolSize(newMax); +setMaximumPoolSize(Math.max(newMax, 1));
Review Comment: The order of setting the max is important. But I am gonna change this. I think the core thread count should be `0` and we shouldn't update it at all and allow the threadpool to scale.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1503101967
## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,58 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor extends ThreadPoolExecutor { + +AtomicInteger activeCount = new AtomicInteger(0); + +public ScaledExecutor() { + super( + Math.max(0, maxThreadCount - 1), + Math.max(1, maxThreadCount - 1), + Long.MAX_VALUE, + TimeUnit.NANOSECONDS, + new SynchronousQueue<>()); +} + +private void updatePoolSize() { + int newMax = Math.max(0, maxThreadCount - 1); + if (newMax > getCorePoolSize()) { +setMaximumPoolSize(Math.max(newMax, 1)); +setCorePoolSize(newMax); + } else { +setCorePoolSize(newMax); +setMaximumPoolSize(Math.max(newMax, 1));
Review Comment: Aren't those two branches the same?
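The two branches differ only in update order, and the order is the point: modern JDKs reject `corePoolSize > maximumPoolSize` in both setters, so growing must raise the maximum first while shrinking must lower the core first. A small sketch (my own illustration, not the PR's code, and without the PR's `maxThreadCount - 1` adjustment):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolResizeSketch {

  // Resize core and max together. ThreadPoolExecutor validates core <= max
  // on every setter call, so each branch keeps the invariant at each step.
  static void resize(ThreadPoolExecutor executor, int newSize) {
    if (newSize > executor.getCorePoolSize()) {
      executor.setMaximumPoolSize(newSize); // grow: raise the ceiling first
      executor.setCorePoolSize(newSize);
    } else {
      executor.setCorePoolSize(newSize); // shrink: lower the floor first
      executor.setMaximumPoolSize(newSize);
    }
  }

  public static void main(String[] args) {
    ThreadPoolExecutor executor =
        new ThreadPoolExecutor(2, 2, 1L, TimeUnit.MINUTES, new LinkedBlockingQueue<>());
    resize(executor, 8); // grow
    resize(executor, 1); // shrink
    System.out.println(executor.getCorePoolSize() + "/" + executor.getMaximumPoolSize()); // prints 1/1
    executor.shutdown();
  }
}
```

Swapping the calls in either branch would pass through an intermediate state with core above max and throw `IllegalArgumentException`.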
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
msokolov commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1964229191 ooh exciting! I left some comments in a related issue that were maybe a little clueless given all the discussion here that I missed until now. Still I'm happy about the direction this seems to be going re: moving the configuration out of codec and into merge policy, and coordinating thread usage across inter/intra segment merges.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1502165551
## lucene/core/src/java/org/apache/lucene/index/MergeScheduler.java: ## @@ -52,6 +56,14 @@ public Directory wrapForMerge(OneMerge merge, Directory in) { return in; } + /** + * Provides an executor for parallelism during a single merge operation. By default, this method + * returns an executor that runs tasks in the calling thread. + */ + public Executor getInterMergeExecutor(OneMerge merge) {
Review Comment: 🤦 `inter` vs `intra` is just as bad as `affect` and `effect`.
Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1502165014
## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,58 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms) { } }); } + + private class ScaledExecutor extends ThreadPoolExecutor { + +AtomicInteger activeCount = new AtomicInteger(0); + +public ScaledExecutor() { + super( + Math.max(0, maxThreadCount - 1), + Math.max(1, maxThreadCount - 1), + Long.MAX_VALUE, + TimeUnit.NANOSECONDS, + new SynchronousQueue<>()); +} + +private void updatePoolSize() { + int newMax = Math.max(0, maxThreadCount - 1); + if (newMax > getCorePoolSize()) { +setMaximumPoolSize(Math.max(newMax, 1)); +setCorePoolSize(newMax); + } else { +setCorePoolSize(newMax); +setMaximumPoolSize(Math.max(newMax, 1)); + } +} + +boolean incrementUpTo(int max) { + while (true) { +int value = activeCount.get(); +if (value >= max) { + return false; +} +if (activeCount.compareAndSet(value, value + 1)) { + return true; +}
Review Comment: @jpountz you are correct, my logic here isn't what it needs to be. I will fix it up.
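The `incrementUpTo` loop quoted above is the standard compare-and-set "increment unless at the cap" pattern. A standalone sketch of that pattern (the class name is mine, for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IncrementUpToSketch {

  // Atomically increment `count` unless it has already reached `max`.
  // Returns true if a slot was claimed; the caller must decrement when done.
  static boolean incrementUpTo(AtomicInteger count, int max) {
    while (true) {
      int value = count.get();
      if (value >= max) {
        return false; // pool saturated, no slot available
      }
      if (count.compareAndSet(value, value + 1)) {
        return true; // claimed a slot without ever exceeding max
      }
      // CAS lost a race with another thread: re-read and retry
    }
  }

  public static void main(String[] args) {
    AtomicInteger active = new AtomicInteger(0);
    System.out.println(incrementUpTo(active, 2)); // prints true
    System.out.println(incrementUpTo(active, 2)); // prints true
    System.out.println(incrementUpTo(active, 2)); // prints false
  }
}
```

The loop itself is race-free; the bug being discussed in the thread was in how the surrounding scheduler used the count, not in the CAS retry.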
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1502163736

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
+  private class ScaledExecutor extends ThreadPoolExecutor {
+
+    AtomicInteger activeCount = new AtomicInteger(0);
+
+    public ScaledExecutor() {
+      super(
+          Math.max(0, maxThreadCount - 1),
+          Math.max(1, maxThreadCount - 1),
+          Long.MAX_VALUE,
+          TimeUnit.NANOSECONDS,
+          new SynchronousQueue<>());
+    }
```

Review Comment:

> Thinking out loud: for the case when tens of index writers are open in the same JVM, we may want to configure a timeout on threads in order to avoid spending too much heap on idle threads?

The timeout would only apply to non-core threads. But your point is taken: it is possible for merges to be idle for a while, and we don't want threads just sitting around taking up unnecessary space. So I can make `core` always `0` and rely on the pool dynamically adding threads up to the max.
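The "core always 0" idea can be sketched like this; the 30-second idle timeout and the `ScaledPools`/`newScaledPool` names are assumptions for illustration, not values from the PR:

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ScaledPools {
  // With zero core threads and a finite keep-alive, workers are created on
  // demand up to the max and reclaimed after idling, so many idle writers in
  // one JVM don't pin heap on parked threads.
  static ThreadPoolExecutor newScaledPool(int maxThreadCount) {
    return new ThreadPoolExecutor(
        0,                               // core = 0: nothing kept alive while idle
        Math.max(1, maxThreadCount - 1), // leave one thread's worth for the merge itself
        30L, TimeUnit.SECONDS,           // illustrative idle timeout
        new SynchronousQueue<>());       // direct hand-off, no task queue
  }
}
```

Because a `SynchronousQueue` never buffers tasks, a submit that finds all threads busy is rejected, which pairs naturally with a caller-runs fallback.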
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1502160415

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
+  private class ScaledExecutor extends ThreadPoolExecutor {
```

Review Comment: Yeah, no doubt.
jpountz commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1501838905

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
+  private class ScaledExecutor extends ThreadPoolExecutor {
```

Review Comment: nit: I usually have a preference for composition over inheritance, i.e. could we wrap the thread-pool executor instead of extending it?

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
+    boolean incrementUpTo(int max) {
+      while (true) {
+        int value = activeCount.get();
+        if (value >= max) {
+          return false;
+        }
+        if (activeCount.compareAndSet(value, value + 1)) {
+          return true;
+        }
```

Review Comment: If I read correctly, this tries to keep `activeCount <= maxThreadCount`. I was thinking we should try to keep `activeCount <= maxThreadCount - mergeThreadCount()`. Otherwise we're effectively using more than `maxThreadCount` threads for merging in total, which I find a bit surprising?

## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:

```diff
+    public ScaledExecutor() {
+      super(
+          Math.max(0, maxThreadCount - 1),
+          Math.max(1, maxThreadCount - 1),
+          Long.MAX_VALUE,
+          TimeUnit.NANOSECONDS,
+          new SynchronousQueue<>());
+    }
```

Review Comment: Thinking out loud: for the case when tens of index writers are open in the same JVM, we may want to configure a timeout on threads in order to avoid spending too much heap on idle threads?

## lucene/core/src/java/org/apache/lucene/index/MergeScheduler.java:

```diff
+  public Executor getInterMergeExecutor(OneMerge merge) {
```

Review Comment: Should it be get**Intra**MergeExecutor?

## lucene/core/src/java/org/apache/lucene/index/NoMergeScheduler.java:

```diff
+  @Override
+  public Executor getInterMergeExecutor(OneMerge merge) {
+    return null;
```

Review Comment: nit: throw instead?

## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java:

```diff
+    mergeState =
+        new MergeState(
+            readers,
+            segmentInfo,
+            infoStream,
+            parallelMergeTaskExecutor == null ? null : new TaskExecutor(parallelMergeTaskExecutor));
```

Review Comment: Nit: it's a bit weird to use a class from the search package for merging (`TaskExecutor`). Should merging get access to the raw `Executor`, which is a bit more flexible (I don't know if all formats will be able to split work into a list of tasks up-front)? Vectors could still wrap inside a `TaskExecutor` for convenience?
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961990406

OK, I took a stab at this, @jpountz; the latest commit has a POC. It still needs testing, but early performance testing shows indexing speed is better and forcemerge speed is just about the same. I am still using the numWorkers setting from HNSW to determine workers. This is mainly to avoid testing too many performance changes at a time.
zhaih commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961832502

> I am not thinking about binding them. I think that MergeScheduler itself should be extended to return a TaskExecutor (probably defaulting to null to indicate none, or maybe SameThreadExecutorService).

Sounds good.

> I really don't think users should configure numWorkers or workPerThread at all. I would much prefer us supply good defaults and remove configuration.

No, I'm not suggesting that either. I'm open to letting users specify it or providing a default; I just want to make sure the performance is not affected (by not pre-dividing the doc space into each thread).
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961808528

@zhaih I don't see any reason why we also cannot extend the SerialMS and allow multiple threads per merge run. We will have to update the base `MergeScheduler` class anyway, as the merges being run don't know anything about who kicked off the merge (and shouldn't). The reason for all the CMS discussion is that it is the hardest to implement correctly (to me anyway...). For SerialMS, users could provide a number and it's a static executor with that number of threads.

> But still keep the current way of merge to keep the performance?

I really don't think users should configure numWorkers or workPerThread at all. I would much prefer us supply good defaults and remove configuration. If we had done the parallelism outside the codec to begin with, I don't think we would have added any configurable values to the HNSW codec.

> So if we bind those two together whether we potentially prevent a part of users using the intra-segment merges?

I am not thinking about binding them. I think that `MergeScheduler` itself should be extended to return a TaskExecutor (probably defaulting to `null` to indicate none, or maybe `SameThreadExecutorService`). CMS is just the most difficult one to figure out.
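The "SameThreadExecutorService"-style default floated above is tiny; a sketch (the class name is hypothetical):

```java
import java.util.concurrent.Executor;

// Default for schedulers that opt out of intra-merge parallelism: every task
// submitted to the executor simply runs on the calling merge thread.
class SameThreadExecutor implements Executor {
  @Override
  public void execute(Runnable command) {
    command.run(); // no forking: the merge thread does the work itself
  }
}
```

Returning this instead of `null` lets codec code always call `execute` without a null check.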
zhaih commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961788368

The current way the HNSW concurrent merge is implemented: each worker uses an AtomicInteger to coordinate and only does a small batch of work (1024 documents) at a time. The advantage is that we are able to load-balance between workers, and I remember this did bring some (5-10%) performance gain when I was testing it. Maybe, instead of specifying a numWorkers per merge, we can default to an expected workload per thread (like 10K) and then allocate numWorkers dynamically? But still keep the current way of merging to keep the performance?

One thing I'm worried about with putting everything into CMS is that we're binding intra-segment merging with CMS. To my understanding, using CMS means we're using background threads to merge, and merging becomes nondeterministic, such that some users are still using SMS (SerialMS) or similar to keep merging deterministic. But on the other hand, the HNSW concurrent merge does not affect that aspect at all: no matter how many threads you're using, it won't affect the determinism of the merge result. So if we bind those two together, do we potentially prevent some users from using the intra-segment merges?
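The batch-claiming scheme described above can be sketched as follows; `drainBatches` and its parameters are illustrative names, not Lucene's HNSW merge code:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

class BatchedWork {
  // Each worker repeatedly claims the next batch from a shared counter, so
  // fast workers naturally take more batches — the load balancing described
  // above. The batch size (e.g. 1024 docs) is configurable.
  static void drainBatches(AtomicInteger nextDoc, int maxDoc, int batchSize, IntConsumer perDoc) {
    while (true) {
      int start = nextDoc.getAndAdd(batchSize); // atomically claim [start, start + batchSize)
      if (start >= maxDoc) {
        return; // doc ID space exhausted
      }
      int end = Math.min(maxDoc, start + batchSize);
      for (int doc = start; doc < end; doc++) {
        perDoc.accept(doc);
      }
    }
  }
}
```

Several workers would share one `nextDoc` counter and each call `drainBatches` until it returns; the result is deterministic per doc even though the worker-to-batch assignment is not.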
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961730452

> Intra-merge concurrency would take advantage of the fact that there will sometimes be fewer active merges than threads to enable intra-merge concurrency.

Sorry for being so dense; `Executor#execute` finally made it click for me. `Executor#execute` can check `currentRunningThreads + mergeThreads.size()`. If that is `>= maxThreadCount`, execute in the current thread; otherwise spawn or hand out a thread from a pool, which would then increment `currentRunningThreads`. I will spin on this a little to see what I can come up with.
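That dispatch rule could be sketched roughly like this; the class name, the delegate pool, and the `IntSupplier` standing in for `mergeThreads.size()` are all illustrative, not what CMS ended up with:

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Fork while total merge-related threads stay under maxThreadCount;
// otherwise run inline on the calling merge thread.
class CallerRunsWhenBusyExecutor implements Executor {
  private final Executor delegate; // backing pool for forked tasks
  private final AtomicInteger currentRunningThreads = new AtomicInteger(0);
  private final int maxThreadCount;
  private final IntSupplier mergeThreadCount; // stand-in for mergeThreads.size()

  CallerRunsWhenBusyExecutor(Executor delegate, int maxThreadCount, IntSupplier mergeThreadCount) {
    this.delegate = delegate;
    this.maxThreadCount = maxThreadCount;
    this.mergeThreadCount = mergeThreadCount;
  }

  @Override
  public void execute(Runnable task) {
    while (true) {
      int running = currentRunningThreads.get();
      if (running + mergeThreadCount.getAsInt() >= maxThreadCount) {
        task.run(); // at capacity: degrade to single-threaded merging
        return;
      }
      if (currentRunningThreads.compareAndSet(running, running + 1)) {
        delegate.execute(
            () -> {
              try {
                task.run();
              } finally {
                currentRunningThreads.decrementAndGet(); // release the slot
              }
            });
        return;
      }
    }
  }
}
```

Either way the task runs, so callers can submit work without caring whether headroom existed at that instant.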
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961602616

Maybe some of these things are too ambitious, but ideally I'd like it to work this way. `ConcurrentMergeScheduler` already tracks a `maxMergeCount`, which controls the max number of running merges, and a `maxThreadCount`, which tracks the max number of threads that merges may use at most. Ideally I'd like `maxThreadCount` to include both threads used for inter-merge concurrency and intra-merge concurrency. So this is similar to your first suggestion, except that I'm bounding the total number of threads to `maxThreadCount` rather than `maxThreadCount + maxMergeCount`. Intra-merge concurrency would take advantage of the fact that there will sometimes be fewer active merges than threads.

E.g. we could have a pool of threads for intra-merge concurrency that would try to ensure that its number of active threads is always less than or equal to `max(0, maxThreadCount - mergeThreads.size())`. For instance, `Executor#execute` could be implemented such that it runs the runnable in the current thread if the number of active merges plus the number of active threads in the intra-merge thread pool is greater than or equal to `maxThreadCount`. Otherwise it would fork to the intra-merge thread pool.

Concurrent merging for vectors wants to know the number of available workers today, but maybe we can change the logic (like you suggested) to split the doc ID space into some number of slices, e.g. `max(128, maxDoc / 2^16)`, and sequentially send these slices to `Executor#execute` (sometimes running in the same thread, sometimes forked to the intra-merge thread pool), except the last one, which would be forced to run in the current thread (like we used to do in `IndexSearcher` until recently).
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961417895

OK, my head is spinning a bit trying to grok the side effects of the CMS. I think I understand most of it now.

Currently we have two adjustable parameters. `maxThreadCount` dictates merge throughput, or how many merges we run at a time. `mergeWorkerCount` dictates back pressure, or when we stop allowing merges to be queued at all and block upstream indexing from continuing.

Throughout the life of a merge, it can become (un)paused numerous times; this is controlled via the `RateLimitedIndexOutput`, which only rate-limits WRITING results, not reading or anything else related to threads.

Now we want to add a new configuration, `maxMergeThreads`, which will control merge latency by adding parallelism to each individual merge. Since merge-thread pausing only has to do with index output, I am not sure we need to add any individual thread throttling other than what's already there. The Directory wrapper will pause/throttle all writing occurring for the merge. This is acceptable even if the merge is using multiple threads.

I also think that small merges (<50MB) should never be allowed to run over multiple threads. Similar to how we never throttle those because they are so small, the benefit in latency reduction will be minuscule, and we should reserve the extra thread usage for larger merges.

What I am stuck on is this:
- Should we use a common pool for all merges, thus restricting the total threads used by merging to `maxThreadCount + maxMergeThreads`? This would simplify the logic in the CMS significantly, as a single task executor could be used for all merges.
- Or should we use an individual executor/pool per merge, thus restricting total threads to `maxThreadCount * maxMergeThreads` (or some fraction of `maxMergeThreads`)? This could get interesting… How do we determine how many threads each merge can get? Are we OK with creating a new task executor on every larger merge and then closing it?

What do you think @jpountz?

One other thing we need to fix is this idea of "numWorkers" in HNSW. It seems like it should just pick optimal slices given the number of vectors it has (similar to the multi-segment search stuff), chunk itself into those slices, and then be at the behest of the task executor. What say you @zhaih? It seems weird to use `numWorkers` as a way to say "only use these many threads" when we have no knowledge of how many threads the task executor actually has.
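The slice-then-submit shape under discussion (fork every slice but run the last on the calling thread) might look like this sketch; the class, interface, and slicing math are illustrative, not Lucene's HNSW merge code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

class SlicedMerge {
  interface Slice {
    void merge(int startDoc, int endDoc); // process docs in [startDoc, endDoc)
  }

  // Splits [0, maxDoc) into numSlices contiguous ranges, forks all but the
  // last to the executor, and runs the last slice on the calling thread.
  static void runSliced(int maxDoc, int numSlices, Executor executor, Slice slice) {
    int sliceSize = (maxDoc + numSlices - 1) / numSlices; // ceiling division
    CountDownLatch done = new CountDownLatch(numSlices - 1);
    int start = 0;
    for (int i = 0; i < numSlices - 1; i++) {
      final int s = start;
      final int e = Math.min(maxDoc, s + sliceSize);
      executor.execute(
          () -> {
            try {
              slice.merge(s, e);
            } finally {
              done.countDown();
            }
          });
      start = e;
    }
    slice.merge(start, maxDoc); // last slice: never forked
    try {
      done.await(); // forked slices must finish before the merge completes
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }
}
```

If the executor is a caller-runs-when-busy one, this degrades gracefully to sequential merging whenever no headroom exists.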
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1499916055

## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java:

```diff
   public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) {
-    this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null);
```

Review Comment: @zhaih I see what you are saying about the weighting of different codecs, where certain codecs could merge in parallel using a different number of threads. My concern is how this would ever be effectively communicated to the merge scheduler. While it isn't a 1-1 relationship, adding more threads here in HNSW will increase I/O. I assume that for other codecs that could use parallelism, additional I/O will also be a concern. The scheduler needs to know how to throttle I/O if this gets too large so that merges and indexing can be smoothed out, correct? This indicates to me that the scheduler should be in charge of handing out threads to do work.
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1959856815

> For instance the merge scheduler could return a custom Executor that dynamically decides to run a new task in the current thread or to fork to a separate thread depending on how many threads are currently busy across all merges vs. the current value of ConcurrentMergeScheduler.maxThreadCount?

This makes me think that instead of it being a separate executor, it should return a dynamic value for `numParallelMergeWorkers`, or if `numParallelMergeWorkers == 1` return `null` so that there is no parallelism. This would require some sort of "planning" on the merge scheduler (knowing how many merges are queued and how many `numParallelMergeWorkers` it provided to each executing merge action). I guess we could rely on an executor's `getActiveCount()`, but that seems trappy.

This all implies that the configuration will actually be `maxParallelMergeWorkers`, and the MergeScheduler is free to provide any number of workers up to that limit. The tricky part to me is determining how busy the current merges are. The MergeScheduler could pass `numParallelMergeWorkers` to the `SegmentMerger`, but it could go completely unused. Maybe this is OK and we just assume it will be used.

Another confusion is determining the total number of threads allowed for merging (inter/intra). We could default this to `maxParallelMergeWorkers * maxThreadCount`. In this instance, `maxParallelMergeWorkers == 1` would behave as things work today; `maxParallelMergeWorkers == 2` would mean that we potentially use twice as many resources, and the user should adjust `maxThreadCount` accordingly.
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1959736838

> So, we would have to update the MergeScheduler to have some methods to return the executor for us to use and pass to MergeState (which is only created via the SegmentMerger object). This means that the scheduler for individual merges and the parallelism available to those individual merges are independent.

I agree with passing some executor from the scheduler to the `MergeState`, but I'm not sure I agree that this implies that inter-merge and intra-merge parallelism would be independent. For instance the merge scheduler could return a custom `Executor` that dynamically decides to run a new task in the current thread or to fork to a separate thread depending on how many threads are currently busy across all merges vs. the current value of `ConcurrentMergeScheduler.maxThreadCount`?
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1959551147

> So I think it is still better to have two separate thread pools for inter-segment merge and inner segment merge, but I wonder whether we can have a ThreadPoolManager which configures both inter and inner pool. And CMS and HNSW merge (and any future merges) will source from this manager?

Maybe? But this doesn't seem to simplify anything for users. They will still need two things to configure (number of merges that can happen at a time, number of threads available to all those merges to take action). There is always the issue of best using resources and the potential of over-allocating the CPU cores. I don't think any finagling we do will ultimately change that. Selfishly, I really don't want to mess with the CMS code at all. Even if we switch it to use an executor of some sort, the complexities of I/O throttling & executable queuing would still all exist.
benwtrent commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1499311351

## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java:

```diff
   public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) {
-    this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null);
```

Review Comment: I am not sure we want such configuration at all in a default codec. I would rather the concurrency all be controlled upstream. My thought is that if we had had all this as part of the MergeState before, we wouldn't have updated the codec definition at all. In the future, as more things use parallelism, maybe then we should consider adding this setting back (or a completely new setting).
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1498656246

## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java:

```diff
   public Lucene99HnswVectorsFormat(int maxConn, int beamWidth) {
-    this(maxConn, beamWidth, DEFAULT_NUM_MERGE_WORKER, null);
```

Review Comment: If there will be more inner-segment concurrent merging in the future, I think it might still be better to leave the `numMergeWorkers` here rather than put it in IWC? Otherwise we'll either have too many similar configurations in IWC, or it will be hard to allocate weight to the different types of merges.
zhaih commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1958716375

+1 to move the executor away from the Codec API (altho it's me who placed it there LOL)

> it would be nice to fully encapsulate the merging concurrency there instead of having two sources of merging concurrency that are not aware of one another.

I like the idea, but I'm a little bit worried about the situation where the HNSW merging threads take most of the resources such that the original CMS threads cannot be executed. Since the `numWorkers` we configure is a per-merge limit, if there are several big merges happening at the same time, say `n` merges, they will use `n * (numWorkers + 1)` threads; the thread pool can run out of threads very fast and result in smaller merges being blocked by large merges.

So I think it is still better to have two separate thread pools for inter-segment merges and inner-segment merges, but I wonder whether we can have a `ThreadPoolManager` which configures both the inter and inner pools, with CMS and HNSW merges (and any future merges) sourcing from this manager. Also, I heard there's a use case where multiple IWs in the same process currently share one CMS for better resource management; maybe this new `ThreadPoolManager` could be a solution for them as well?
benwtrent commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1957696114

> it would be nice to fully encapsulate the merging concurrency there instead of having two sources of merging concurrency that are not aware of one another.

To take advantage of the executor, etc., it needs to be accessible from the scheduler. So we would have to update the `MergeScheduler` to have some method that returns the executor for us to use and pass to `MergeState` (which is only created via the `SegmentMerger` object). This means that the scheduler for individual merges and the parallelism available to those individual merges are independent. I don't know how they would ever be "aware" of one another.

> I'm also keen on keeping the codec API as simple as possible and adding the executor as a member of the MergeState rather than a new parameter of all codec write APIs.

I updated the PR to do just this. It's a much cleaner API.
jpountz commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1957581351

Thinking out loud: since merge schedulers already have the ability to merge concurrently (across multiple merges rather than within a merge though), it would be nice to fully encapsulate the merging concurrency there instead of having two sources of merging concurrency that are not aware of one another. I'm also keen on keeping the codec API as simple as possible and adding the executor as a member of the `MergeState` rather than a new parameter of all codec write APIs.