github-actions[bot] commented on code in PR #64165:
URL: https://github.com/apache/doris/pull/64165#discussion_r3366774040
##########
be/src/exec/scan/scanner.cpp:
##########
@@ -197,7 +197,8 @@ Status Scanner::get_block(RuntimeState* state, Block* block, bool* eof) {
 Status Scanner::_filter_output_block(Block* block) {
     auto old_rows = block->rows();
-    Status st = VExprContext::filter_block(_conjuncts, block, block->columns());
+    Status st = VExprContext::filter_block(_conjuncts, block, block->columns(),
+                                           ScanFilterStage::EXEC_RESIDUAL);
Review Comment:
Recording all remaining scanner conjuncts as `EXEC_RESIDUAL` here makes
file-scan `ScanFilterInfo` wrong when native readers already applied
pushed-down predicates. For Parquet, `RowGroupReader` filters
`_filter_conjuncts` inside `vparquet_group_reader.cpp` before this block
reaches the scanner, and ORC has the same pattern around its reader-local
`execute_conjuncts()` calls. Those reader-local filters do not receive a
scan-filter stage, so the real filtered rows are invisible; this residual pass
then sees only surviving rows and often reports `FilteredRows=0`. Please record
the reader-local predicate evaluations with an appropriate stage, or avoid
attributing predicates already applied by the reader as residual-only work.
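
To make the accounting gap concrete, here is a minimal standalone sketch, not Doris code. Every name in it (`StageStats`, `filter_rows`, `simulate_scan`, and the `READER_LOCAL` stage) is hypothetical; only `EXEC_RESIDUAL` comes from the diff. It assumes the reader and the scanner evaluate the same predicate, which is the situation described above.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical stage tags. Only EXEC_RESIDUAL appears in the PR diff;
// READER_LOCAL is an assumed name used for illustration.
enum class ScanFilterStage { READER_LOCAL, EXEC_RESIDUAL };

// Hypothetical per-stage counters standing in for ScanFilterInfo.
struct StageStats {
    std::size_t input_rows = 0;
    std::size_t filtered_rows = 0;
};

using Predicate = std::function<bool(int)>;
using Stats = std::map<ScanFilterStage, StageStats>;

// Keeps the rows that satisfy pred. When stats is non-null, the evaluation
// is attributed to the given stage; when it is null, nothing is recorded.
std::vector<int> filter_rows(const std::vector<int>& rows, const Predicate& pred,
                             ScanFilterStage stage, Stats* stats) {
    std::vector<int> kept;
    for (int v : rows) {
        if (pred(v)) kept.push_back(v);
    }
    if (stats != nullptr) {
        StageStats& s = (*stats)[stage];
        s.input_rows += rows.size();
        s.filtered_rows += rows.size() - kept.size();
    }
    return kept;
}

// Simulates a scan in which the reader applies pred first and the scanner
// then re-applies the same predicate as a residual pass.
Stats simulate_scan(bool record_reader_stage) {
    const std::vector<int> rows = {1, 5, 10, 15, 20};
    const Predicate pred = [](int v) { return v > 8; };
    Stats stats;
    std::vector<int> survivors =
            filter_rows(rows, pred, ScanFilterStage::READER_LOCAL,
                        record_reader_stage ? &stats : nullptr);
    filter_rows(survivors, pred, ScanFilterStage::EXEC_RESIDUAL, &stats);
    return stats;
}
```

With `simulate_scan(false)`, the only recorded stage is the residual one, and it sees three input rows and filters none, even though the reader dropped two rows. With `simulate_scan(true)`, the two dropped rows are attributed to the reader-local stage and the residual pass is correctly shown as doing no filtering work.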
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]