shubhamvishu commented on code in PR #12183: URL: https://github.com/apache/lucene/pull/12183#discussion_r1325027898
########## lucene/core/src/java/org/apache/lucene/index/TermStates.java: ########## @@ -86,19 +93,58 @@ public TermStates( * @param needsStats if {@code true} then all leaf contexts will be visited up-front to collect * term statistics. Otherwise, the {@link TermState} objects will be built only when requested */ - public static TermStates build(IndexReaderContext context, Term term, boolean needsStats) + public static TermStates build(IndexSearcher indexSearcher, Term term, boolean needsStats) throws IOException { - assert context != null && context.isTopLevel; + IndexReaderContext context = indexSearcher.getTopReaderContext(); + assert context != null; final TermStates perReaderTermState = new TermStates(needsStats ? null : term, context); if (needsStats) { - for (final LeafReaderContext ctx : context.leaves()) { - // if (DEBUG) System.out.println(" r=" + leaves[i].reader); - TermsEnum termsEnum = loadTermsEnum(ctx, term); - if (termsEnum != null) { - final TermState termState = termsEnum.termState(); - // if (DEBUG) System.out.println(" found"); - perReaderTermState.register( - termState, ctx.ord, termsEnum.docFreq(), termsEnum.totalTermFreq()); + Executor executor = indexSearcher.getExecutor(); + if (executor == null) { + executor = Runnable::run; + } + List<FutureTask<TermStateInfo>> tasks = + context.leaves().stream() + .map( + ctx -> + new FutureTask<>( + () -> { + TermsEnum termsEnum = loadTermsEnum(ctx, term); + if (termsEnum != null) { + return new TermStateInfo( + termsEnum.termState(), + ctx.ord, + termsEnum.docFreq(), + termsEnum.totalTermFreq()); + } + return null; + })) + .toList(); + for (FutureTask<TermStateInfo> task : tasks) { + if (executor instanceof ThreadPoolExecutor pool) { + if ((pool.getCorePoolSize() - pool.getActiveCount()) <= 1) { + task.run(); Review Comment: > If you really want to do this use StackWalker. But take care that it might require additional privileges if used in a wrong way. I did give this a shot in the new revision(s). I tried to use `Thread.currentThread().getStackTrace()` first and also `StackWalker` using `StackWalker.Option.SHOW_HIDDEN_FRAMES` option. I tried using different `StackWalker.Option`s and observed the tests failing while using `StackWalker.Option.RETAIN_CLASS_REFERENCE` due to missing privileges. Hence, I tried to keep a fail-safe/fallback approach i.e. incase of permission issues we would fallback to running the tasks on the caller thread sequentially. Since I'm using `StackWalker.Option.SHOW_HIDDEN_FRAMES` here maybe we don't need the fail-safe approach and directly throw the exception instead but I reckon we would not want to fail here due to permission issues? PS: I'm not much aware of the privileges in `StackWalker` and the scenarios it could cause issues. > The main problem with instance of checks is that there are many other implementations of Executor that may deadlock in similar ways. Think also about virtual threads in java 21! Totally agreed💯....in the new revision I have tried to only go with concurrent approach if its a `ThreadPoolExecutor` and it's not already running on executor thread, if not the tasks would run sequentially on the caller thread. This means for any other executor implementations we would not make use of concurrency and stick to current behaviour of `TermStates#build` which is non concurrent atleast till we have the nice permanent fix to this. > I think the only way is to go away with the stupid old Executor abstraction and use the ForkJoin framework of JDK. It is capable of forking from inside threads which were already forked and can handle that without deadlocks. Nice! I really liked this idea....it's time to bid bye to these old executor implementations. I'll try to dig and see more into this Fork/Join strategy. ------------- @javanna @uschindler In the new revision I have tried to work around and made changes to `TaskExecutor` to handle this specific deadlock issue/situation only if 1) its a `ThreadPoolExecutor` and 2) we are already not running executor's thread. So this solves the deadlock issues but we only run `TermStates#build` concurrently when the above 2 situations suffice. This seems to be a fix to unblock this PR but indeed we need to get rid of this and make use of Fork/Join strategy as permanent fix (also avoid such annoying `instanceof` checks for executors). Please let me know if you think we should hold on this PR until we get the ideal fix in and we could instead go with this fix meanwhile?. Thanks! ########## lucene/core/src/java/org/apache/lucene/index/TermStates.java: ########## @@ -86,19 +93,58 @@ public TermStates( * @param needsStats if {@code true} then all leaf contexts will be visited up-front to collect * term statistics. Otherwise, the {@link TermState} objects will be built only when requested */ - public static TermStates build(IndexReaderContext context, Term term, boolean needsStats) + public static TermStates build(IndexSearcher indexSearcher, Term term, boolean needsStats) throws IOException { - assert context != null && context.isTopLevel; + IndexReaderContext context = indexSearcher.getTopReaderContext(); + assert context != null; final TermStates perReaderTermState = new TermStates(needsStats ? null : term, context); if (needsStats) { - for (final LeafReaderContext ctx : context.leaves()) { - // if (DEBUG) System.out.println(" r=" + leaves[i].reader); - TermsEnum termsEnum = loadTermsEnum(ctx, term); - if (termsEnum != null) { - final TermState termState = termsEnum.termState(); - // if (DEBUG) System.out.println(" found"); - perReaderTermState.register( - termState, ctx.ord, termsEnum.docFreq(), termsEnum.totalTermFreq()); + Executor executor = indexSearcher.getExecutor(); + if (executor == null) { + executor = Runnable::run; + } + List<FutureTask<TermStateInfo>> tasks = + context.leaves().stream() + .map( + ctx -> + new FutureTask<>( + () -> { + TermsEnum termsEnum = loadTermsEnum(ctx, term); + if (termsEnum != null) { + return new TermStateInfo( + termsEnum.termState(), + ctx.ord, + termsEnum.docFreq(), + termsEnum.totalTermFreq()); + } + return null; + })) + .toList(); + for (FutureTask<TermStateInfo> task : tasks) { + if (executor instanceof ThreadPoolExecutor pool) { + if ((pool.getCorePoolSize() - pool.getActiveCount()) <= 1) { + task.run(); Review Comment: > If you really want to do this use StackWalker. But take care that it might require additional privileges if used in a wrong way. I did give this a shot in the new revision(s). I tried to use `Thread.currentThread().getStackTrace()` first and also `StackWalker` using `StackWalker.Option.SHOW_HIDDEN_FRAMES` option. I tried using different `StackWalker.Option`s and observed the tests failing while using `StackWalker.Option.RETAIN_CLASS_REFERENCE` due to missing privileges. Hence, I tried to keep a fail-safe/fallback approach i.e. incase of permission issues we would fallback to running the tasks on the caller thread sequentially. Since I'm using `StackWalker.Option.SHOW_HIDDEN_FRAMES` here maybe we don't need the fail-safe approach and directly throw the exception instead but I reckon we would not want to fail here due to permission issues? PS: I'm not much aware of the privileges in `StackWalker` and the scenarios it could cause issues. > The main problem with instance of checks is that there are many other implementations of Executor that may deadlock in similar ways. Think also about virtual threads in java 21! Totally agreed💯....in the new revision I have tried to only go with concurrent approach if its a `ThreadPoolExecutor` and it's not already running on executor thread, if not the tasks would run sequentially on the caller thread. This means for any other executor implementations we would not make use of concurrency and stick to current behaviour of `TermStates#build` which is non concurrent atleast till we have the nice permanent fix to this. > I think the only way is to go away with the stupid old Executor abstraction and use the ForkJoin framework of JDK. It is capable of forking from inside threads which were already forked and can handle that without deadlocks. Nice! I really liked this idea....it's time to bid bye to these old executor implementations. I'll try to dig and see more into this Fork/Join strategy. ------------- @javanna @uschindler In the new revision I have tried to work around and made changes to `TaskExecutor` to handle this specific deadlock issue/situation only if 1) its a `ThreadPoolExecutor` and 2) we are already not running executor's thread. So this solves the deadlock issues but we only run `TermStates#build` concurrently when the above 2 situations suffice. This seems to be a fix to unblock this PR but indeed we need to get rid of this and make use of Fork/Join strategy as permanent fix (also avoid such annoying `instanceof` checks for executors). Please let me know if you think we should hold on this PR until we get the ideal fix in and we could instead go with this fix meanwhile?. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org