zclllyybb commented on issue #65118:
URL: https://github.com/apache/doris/issues/65118#issuecomment-4854200807

   Breakwater-GitHub-Analysis-Slot: slot_cb135054c297
   This content is generated by AI for reference only.
   
   Initial code-level triage:
   
   This looks like a real `score()` correctness bug, not just unstable row 
ordering caused by `LIMIT` ties. The screenshot shows one execution returning 
non-zero BM25 scores and the next execution returning `0` for all returned rows.
   
   I checked the Doris 4.1.2 code path, where the public `4.1.2` tag points to 
the same commit as `4.1.2-rc01`. The likely root cause is the interaction 
between `score()` materialization and the segment condition cache:
   
   - `enable_condition_cache` is enabled by default and FE sets 
`condition_cache_digest` when it is enabled.
   - BE disables condition cache for TopN filters, but not for `ORDER BY 
score()` / `ScoreRuntime`.
   - On a condition-cache hit, 
`SegmentIterator::_init_row_bitmap_by_condition_cache()` filters `_row_bitmap` 
from the cached bitmap/block result.
   - For `SEARCH(...)`, the inverted-index expression can then be treated as 
already evaluated and removed from `_common_expr_ctxs_push_down`.
   - Later, `SegmentIterator::_prepare_score_column_materialization()` still 
materializes `score()` from `IndexQueryContext::collection_similarity`.
   - The condition cache stores only filter results, not BM25 scores. On a 
cache-hit path the per-query `CollectionSimilarity` score map can therefore be 
empty, and `CollectionSimilarity::get_topn_bm25_scores()` / `get_bm25_scores()` 
fall back to `0.0F` for rows not found in `_bm25_scores`.
   
   This also explains why the first execution can return non-zero scores and a 
later execution of the same SQL can return zeros. The inverted-index query 
cache itself already avoids cache hits when similarity scoring is required, so 
the problematic cache here is the separate condition cache.
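   
   The suspected sequence can be sketched as a toy model. This is not Doris 
code: `ToyConditionCache`, `ToySimilarity`, `run_index_query` and `scan_scores` 
are hypothetical stand-ins, and the row ids and the score `1.5` are made up. The 
sketch only assumes what is described above: the cache keeps the filter result, 
the score map is per query, and a missing row falls back to `0.0F`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>
#include <unordered_map>
#include <vector>

// Toy stand-ins only; none of these are the real Doris classes.
using RowId = uint32_t;

struct ToyConditionCache {
    // Holds which rows passed the filter, never their scores.
    std::optional<std::set<RowId>> cached_rows;
};

struct ToySimilarity {
    // Filled only when the index query actually runs.
    std::unordered_map<RowId, float> bm25_scores;

    float score_of(RowId row) const {
        auto it = bm25_scores.find(row);
        // Same fallback as described above: unknown rows score 0.0F.
        return it == bm25_scores.end() ? 0.0F : it->second;
    }
};

// Pretend inverted-index evaluation: rows 1..3 match with an equal score.
std::set<RowId> run_index_query(ToySimilarity& sim) {
    std::set<RowId> rows{1, 2, 3};
    for (RowId r : rows) sim.bm25_scores[r] = 1.5F;
    return rows;
}

// One call models one query execution with a fresh per-query score map.
std::vector<float> scan_scores(ToyConditionCache& cache) {
    ToySimilarity sim;
    std::set<RowId> rows;
    if (cache.cached_rows) {
        rows = *cache.cached_rows;    // cache hit: index query is skipped
    } else {
        rows = run_index_query(sim);  // cache miss: scores are collected
        cache.cached_rows = rows;
    }
    std::vector<float> out;
    for (RowId r : rows) out.push_back(sim.score_of(r));
    assert(out.size() == rows.size());
    return out;
}
```

   Calling `scan_scores` twice on the same `ToyConditionCache` returns three 
scores of `1.5` on the first call and three zeros on the second, which matches 
the symptom in the screenshot.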
   
   Suggested confirmation:
   
   1. Re-run the query after `SET enable_condition_cache = false;`. If this 
diagnosis is correct, `score()` should stay non-zero across repeated executions.
   2. Compare the runtime profile of the bad execution and check whether 
`ConditionCacheSegmentHit` is greater than 0.
   3. Please also provide the exact build hash/tag and the output of `SHOW 
VARIABLES LIKE 'enable_condition_cache';`.
   
   Suggested fix direction:
   
   - Do not use condition cache for scans that need `ScoreRuntime` / `score()` 
materialization, or
   - make the cache path force inverted-index scoring execution so 
`collection_similarity` is populated before `score()` is materialized.
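   
   A minimal sketch of the first option, again as a toy model rather than a 
patch. `ToyCache`, `ToyScanResult`, `run_index_query`, `scan` and the 
`needs_score` flag are hypothetical names; in the real code the decision would 
presumably hang off whatever signals `ScoreRuntime` to the scan.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <set>

using RowId = uint32_t;

struct ToyCache {
    std::optional<std::set<RowId>> rows;  // filter result only, no scores
};

struct ToyScanResult {
    std::set<RowId> rows;
    std::map<RowId, float> scores;  // empty when the index query was skipped
    bool index_query_ran = false;
};

// Pretend inverted-index evaluation: rows 1..3 match with an equal score.
ToyScanResult run_index_query() {
    ToyScanResult r;
    r.rows = {1, 2, 3};
    for (RowId id : r.rows) r.scores[id] = 1.5F;
    r.index_query_ran = true;
    return r;
}

// needs_score is true for scans that must materialize score().
ToyScanResult scan(ToyCache& cache, bool needs_score) {
    // Guard: a scoring scan neither reads nor writes the condition cache.
    const bool may_use_cache = !needs_score;
    if (may_use_cache && cache.rows) {
        ToyScanResult hit;
        hit.rows = *cache.rows;  // filter-only scan keeps the fast path
        return hit;
    }
    ToyScanResult fresh = run_index_query();
    if (may_use_cache) cache.rows = fresh.rows;
    // A scoring scan always has one score per returned row.
    assert(!needs_score || fresh.scores.size() == fresh.rows.size());
    return fresh;
}
```

   With this guard, filter-only scans still benefit from a warm cache, while 
repeated scoring scans re-run the index query each time and keep their scores. 
Whether a scoring scan should also be allowed to populate the cache is a design 
choice this sketch does not settle; it conservatively skips both read and write.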
   
   A regression test should run the same `SEARCH(...) ORDER BY score() DESC 
LIMIT ...` query twice with condition cache enabled and assert that the second 
execution still returns non-zero scores. Since all rows in this repro contain 
the same searched term and should have equal BM25 scores, add a deterministic 
tie breaker such as `ORDER BY relevance DESC, id ASC` if the expected row order 
itself is asserted.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

