Hi Adrien! That makes sense, I’ll see if I can put something together for TopFieldCollectorManager.
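
For illustration only, sharing a competitive field-value threshold between tasks could look roughly like the sketch below. It assumes an ascending sort on a long-valued field, and that a task only publishes its local bottom value once its queue is full. The class SharedFieldThreshold and its methods are hypothetical names; this is not an existing Lucene API, nor how MaxScoreAccumulator or TopFieldCollectorManager are actually implemented.

```java
import java.util.concurrent.atomic.LongAccumulator;

/**
 * Hypothetical helper, not part of Lucene: shares the best known
 * "bottom" value of a full top-k queue across concurrent search tasks.
 * Assumes an ascending sort on a long field, so smaller values are better.
 */
final class SharedFieldThreshold {

  // Holds the minimum bottom value published so far.
  // Long.MAX_VALUE means no task has filled its queue yet.
  private final LongAccumulator bound =
      new LongAccumulator(Math::min, Long.MAX_VALUE);

  /**
   * Called by a task once its local queue holds k hits.
   * localBottom is the worst (largest) value in that queue.
   */
  void offerLocalBottom(long localBottom) {
    bound.accumulate(localBottom);
  }

  /** Current global bound; values above it cannot enter the global top-k. */
  long get() {
    return bound.get();
  }

  /**
   * Ties are kept as competitive here, since tie-breaking rules
   * (e.g. by doc ID) are outside the scope of this sketch.
   */
  boolean isCompetitive(long value) {
    return value <= bound.get();
  }
}
```

The reasoning behind taking the minimum: if any one task already holds k hits whose values are all at most b, then no document with a value greater than b can make the global top k, so later tasks could use b to skip or terminate early.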

> On 15 Jan 2026, at 13:47, Adrien Grand <[email protected]> wrote:
>
> Hi Alan!
>
> It's exciting that you're looking into this, it's long overdue! I suspect
> that fully merging skipping by score or field is not going to work as they
> behave very differently (skipping by score skips on the postings of the very
> field that is used by the Query, while skipping by field needs to introduce a
> new conjunctive clause (the competitive iterator) that works by looking at
> the index structures of the field that is used for sorting) and that it'd be
> easier to introduce sharing of the competitive value thresholds in a somewhat
> independent way.
>
> On Thu, Jan 15, 2026 at 1:33 PM Alan Woodward <[email protected]> wrote:
>> Hi all,
>>
>> I've been working on improving Sort skipping by using segment re-ordering[1]
>> for sorted segments, which on benchmarking was working brilliantly on the
>> wikimedium10k dataset, but then showing no improvements at all on
>> wikimediumall - very confusing! On digging further, I found the issue to
>> be with search concurrency. The segment re-ordering happens within a single
>> unit of work, but for sufficiently large segments we generate large numbers
>> of tasks even if the concurrency is low - e.g., with an ExecutorService with
>> 2 threads, we still get 17 tasks.
>>
>> For score-based searches we share competitive scores between tasks using a
>> MaxScoreAccumulator, but we don’t have anything similar for field-sort-based
>> searches. So these 17 tasks all run without any knowledge of what has
>> happened before, and even if we re-ordered the tasks so that the first ones
>> contained the best hits, the subsequent tasks wouldn’t be able to early
>> terminate.
>>
>> I can look into adding some support for sharing competitive value thresholds
>> between threads, but I wonder if it’s worth considering merging the skipping
>> infrastructure between score-based sorting and field-based sorting?
>>
>> - Alan
>>
>> [1] https://github.com/apache/lucene/pull/15436
>
> --
> Adrien
