Re: MaxScoreBulkScorer increased latency for a extreme test case (many SHOULD and each SHOULD clause matches all docs)

Rui Wu Tue, 17 Sep 2024 01:16:08 -0700

Hi Adrien,

Thanks for looking into this! Here are more screenshots of the flamegraph.
The original flamegraph HTML files contain stack traces from our app, so I
am not sharing them here.
[image: Screenshot 2024-09-17 at 1.13.07 AM.png][image: Screenshot
2024-09-17 at 1.12.01 AM.png]


On Tue, Sep 17, 2024 at 1:00 AM Adrien Grand <jpou...@gmail.com> wrote:

> Hello Rui,
>
> We actually released a change that should make MaxScoreBulkScorer faster
> on dense disjunctions in 9.8: https://github.com/apache/lucene/pull/12444.
> Your benchmark case is quite specific though, as all clauses match all docs
> and produce constant scores, so I would expect the scorer to quickly
> realize that it can skip all documents once it has scored the first k docs.
> This makes me wonder if it is bottlenecked on skipping blocks of documents
> rather than on scoring them. Would you be able to share your whole flame
> graph? It looks like it may be truncated at the top.
>
> On Mon, Sep 16, 2024 at 10:01 PM Rui Wu <rui...@mongodb.com> wrote:
>
>> Correction: The index has 3.6 million documents.
>>
>> On Mon, Sep 16, 2024 at 1:00 PM Rui Wu <rui...@mongodb.com> wrote:
>>
>>> Dear experts,
>>>
>>> In our MongoDB Atlas Search performance regression test between Lucene
>>> 9.7 and Lucene 9.11, we detected a 43% latency regression in this query shape:
>>> 12 SHOULD clauses, each of which matches all of the documents. Each
>>> SHOULD clause is wrapped in a ConstantScoreQuery.
>>>
>>> The index has 3.6 documents, and every document is identical: Every
>>> document is {"path": ["1", "2", "3" ... "12"]}
>>> The query shape is a BooleanQuery of SHOULD "1", SHOULD "2", ... SHOULD
>>> "12".
>>>
>>> Our flamegraphs show that most of the time in search() is spent on
>>> the MaxScoreBulkScorer class:
>>> [image: image.png]
>>>
>>> Is this extreme test case expected to be slow with
>>> MaxScoreBulkScorer?
>>>
>>> Thanks a lot!
>>>
>>> Rui Wu
>>> Lead Engineer, MongoDB
>>>
>>
>
> --
> Adrien
>
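
For readers following the thread, the expectation quoted above (that a top-k search can stop scoring after the first k docs when every clause matches every doc with a constant score) can be illustrated with a toy model. This is a minimal sketch and not Lucene code: the function name `count_scored_docs`, the constant per-clause score of 1.0, and the rule that ties lose to earlier docs are all assumptions made for illustration.

```python
import heapq


def count_scored_docs(num_docs, num_clauses, k):
    """Toy model (not Lucene code): count the docs that must be scored when
    every clause matches every doc with a constant score of 1.0. Assumes k >= 1."""
    max_possible = float(num_clauses)  # upper bound on any doc's score
    top_k = []  # min-heap holding the k best scores seen so far
    scored = 0
    for _doc in range(num_docs):
        # Once the heap is full, a later doc is competitive only if it can
        # strictly beat the current k-th best score. Treating ties as
        # non-competitive is an assumption of this sketch.
        if len(top_k) == k and max_possible <= top_k[0]:
            break  # no remaining doc can enter the top k
        score = float(num_clauses)  # each matching clause contributes 1.0
        scored += 1
        if len(top_k) < k:
            heapq.heappush(top_k, score)
        elif score > top_k[0]:
            heapq.heapreplace(top_k, score)
    return scored
```

With the numbers from the thread, `count_scored_docs(3_600_000, 12, 10)` scores only the first 10 docs in this model. The single `break` hides the cost the quoted message asks about: a real scorer would still have to skip the remaining documents block by block.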
