This query's latency increased from 14.65 ms to 20.90 ms.
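
As a quick sanity check, these two numbers are consistent with the roughly 43% regression reported at the start of the thread, assuming 14.65 ms is the Lucene 9.7 measurement and 20.90 ms is the Lucene 9.11 one (that mapping is an assumption here). A minimal Java sketch of the arithmetic, with purely illustrative class and method names:

```java
import java.util.Locale;

public class LatencyRegression {
    // Relative increase from `before` to `after`, expressed in percent.
    static double percentIncrease(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double before = 14.65; // ms, assumed to be the Lucene 9.7 latency
        double after = 20.90;  // ms, assumed to be the Lucene 9.11 latency
        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.println(String.format(Locale.ROOT, "%.2f%%", percentIncrease(before, after)));
    }
}
```

The relative increase is (20.90 - 14.65) / 14.65, which is a little under 43%.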

We use `TopScoreDocCollector.createSharedManager(/*batchSize*/ 101,
/*searchAfterFieldDoc*/ null, /*hitsThreshold*/ 1000);`

Thanks a lot!

On Tue, Sep 17, 2024 at 6:45 AM Adrien Grand <jpou...@gmail.com> wrote:

> Can you tell us how long this query used to take, and how long it
> takes now?
> Also are you using IndexSearcher's default total hit count threshold of
> 1,000, or are you passing a custom value to TopScoreDocCollectorManager?
>
> On Tue, Sep 17, 2024 at 10:14 AM Rui Wu <rui...@mongodb.com> wrote:
>
>> Hi Adrien,
>>
>> Thanks for looking into this! Here are more screenshots of the
>> flamegraph. The original flamegraph HTML files contain stack traces from
>> our app, so I am not sharing them here.
>> [image: Screenshot 2024-09-17 at 1.13.07 AM.png][image: Screenshot
>> 2024-09-17 at 1.12.01 AM.png]
>>
>> On Tue, Sep 17, 2024 at 1:00 AM Adrien Grand <jpou...@gmail.com> wrote:
>>
>>> Hello Rui,
>>>
>>> We actually released a change that should make MaxScoreBulkScorer faster
>>> on dense disjunctions in 9.8:
>>> https://github.com/apache/lucene/pull/12444. Your benchmark case is
>>> quite specific though as all clauses match all docs and produce constant
>>> scores, so I would expect the scorer to quickly realize that it can skip
>>> all documents once it's scored the first k docs. This makes me wonder if it
>>> bottlenecks on skipping blocks of documents rather than on scoring them.
>>> Would you be able to share your whole flame graph? It looks like it may be
>>> truncated at the top.
>>>
>>> On Mon, Sep 16, 2024 at 10:01 PM Rui Wu <rui...@mongodb.com> wrote:
>>>
>>>> Correction: The index has 3.6 million documents.
>>>>
>>>> On Mon, Sep 16, 2024 at 1:00 PM Rui Wu <rui...@mongodb.com> wrote:
>>>>
>>>>> Dear experts,
>>>>>
>>>>> In our MongoDB Atlas Search performance regression test between Lucene
>>>>> 9.7 and Lucene 9.11, we detected a 43% latency regression in this query
>>>>> shape:
>>>>> 12 SHOULD clauses, each of which matches all of the documents. Each
>>>>> SHOULD clause is wrapped in a ConstantScoreQuery.
>>>>>
>>>>> The index has 3.6 documents, and every document is identical: Every
>>>>> document is {"path": ["1", "2", "3" ... "12"]}
>>>>> The query shape is a BooleanQuery of SHOULD "1", SHOULD "2", ...
>>>>> SHOULD "12".
>>>>>
>>>>> Our flamegraphs show that most of the time in search() is spent on
>>>>> the MaxScoreBulkScorer class:
>>>>> [image: image.png]
>>>>>
>>>>> Is this extreme test case expected to be slow with
>>>>> MaxScoreBulkScorer?
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Rui Wu
>>>>> Lead Engineer, MongoDB
>>>>>
>>>>
>>>
>>> --
>>> Adrien
>>>
>>
>
> --
> Adrien
>
