The query latency increased from 14.65 ms to 20.90 ms. We create the collector manager with `TopScoreDocCollector.createSharedManager(/*batchSize*/ 101, /*searchAfterFieldDoc*/ null, /*hitsThreshold*/ 1000)`, so the total hit count threshold is 1,000.
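As a quick sanity check, the two timings above can be compared against the 43% regression quoted in the original report. This is only a minimal sketch in plain Java; the class and method names (`LatencyRegression`, `relativeIncreasePercent`) are made up for illustration and are not part of our test harness or of Lucene.

```java
import java.util.Locale;

public class LatencyRegression {

    // Relative increase of `after` over `before`, expressed in percent.
    static double relativeIncreasePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double lucene97Ms = 14.65;  // latency measured on Lucene 9.7
        double lucene911Ms = 20.90; // latency measured on Lucene 9.11

        double pct = relativeIncreasePercent(lucene97Ms, lucene911Ms);

        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.println(String.format(Locale.ROOT, "regression: %.1f%%", pct));
    }
}
```

The result is about 42.7%, which rounds to the 43% figure in the report.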
Thanks a lot!

On Tue, Sep 17, 2024 at 6:45 AM Adrien Grand <jpou...@gmail.com> wrote:

> Can you tell us how long this query used to take, and how long it
> takes now?
> Also are you using IndexSearcher's default total hit count threshold of
> 1,000, or are you passing a custom value to TopScoreDocCollectorManager?
>
> On Tue, Sep 17, 2024 at 10:14 AM Rui Wu <rui...@mongodb.com> wrote:
>
>> Hi Adrien,
>>
>> Thanks for looking into this! Here are more screenshots of the
>> flamegraph. The original flamegraph HTMLs have stack traces from our app so
>> I don't share it here.
>> [image: Screenshot 2024-09-17 at 1.13.07 AM.png][image: Screenshot
>> 2024-09-17 at 1.12.01 AM.png]
>>
>> On Tue, Sep 17, 2024 at 1:00 AM Adrien Grand <jpou...@gmail.com> wrote:
>>
>>> Hello Rui,
>>>
>>> We actually released a change that should make MaxScoreBulkScorer faster
>>> on dense disjunctions in 9.8:
>>> https://github.com/apache/lucene/pull/12444. Your benchmark case is
>>> quite specific though as all clauses match all docs and produce constant
>>> scores, so I would expect the scorer to quickly realize that it can skip
>>> all documents once it's scored the first k docs. This makes me wonder if it
>>> bottlenecks on skipping blocks of documents rather than on scoring them.
>>> Would you be able to share your whole flame graph, it looks like it may be
>>> truncated at the top?
>>>
>>> On Mon, Sep 16, 2024 at 10:01 PM Rui Wu <rui...@mongodb.com> wrote:
>>>
>>>> Correction: The index has 3.6 million documents.
>>>>
>>>> On Mon, Sep 16, 2024 at 1:00 PM Rui Wu <rui...@mongodb.com> wrote:
>>>>
>>>>> Dear experts,
>>>>>
>>>>> In our Mongodb Atlas Search performance regression test between Lucene
>>>>> 9.7 and Lucene 9.11, we detect a 43% latency regression in this query
>>>>> shape:
>>>>> 12 SHOULD clause, and each clause matches all of the documents. Each
>>>>> should clause is wrapped in ConstantScoreQuery.
>>>>>
>>>>> The index has 3.6 documents, and every document is identical: Every
>>>>> document is {"path": ["1", "2", "3" ... "12"]}
>>>>> The query shape is a BooleanQuery of SHOULD "1", SHOULD "2", ...
>>>>> SHOULD "12".
>>>>>
>>>>> Our flamegraphs show that most of the time in search() is spent on
>>>>> the MaxScoreBulkScorer class:
>>>>> [image: image.png]
>>>>>
>>>>> We wonder if this extreme test case is expected to be slow on
>>>>> MaxScoreBulkScorer?
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Rui Wu
>>>>> Lead Engineer, MongoDB
>>>>>
>>>>
>>>
>>> --
>>> Adrien
>>>
>>
>
> --
> Adrien