discord9 opened a new issue, #22925:
URL: https://github.com/apache/datafusion/issues/22925

   ### Describe the bug
   
   `AggregateExec::gather_filters_for_pushdown` can reorder the parent filter 
results it returns to the filter pushdown optimizer.
   
   The filter pushdown optimizer maps child pushdown results back to parent 
filters by position. `AggregateExec` currently splits incoming filters into 
`safe_filters` and `unsafe_filters`, builds the child filter description from 
the safe filters, then appends the unsafe filters marked as unsupported. When 
an unsafe filter precedes a safe one, the results come back in a different 
order than the parent filters.
   
   For example, with a filter above an aggregate such as:
   
   ```text
   cnt@2 = 1 AND b@1 = bar
   ```
   
   where `cnt` is an aggregate output and `b` is a grouping column:
   
   - `b@1 = bar` is safe to push below the aggregate
   - `cnt@2 = 1` must remain above the aggregate
   
   Because the results are reordered, the optimizer can interpret the 
pushed-down grouping-column filter result as belonging to the aggregate-output 
filter. The aggregate-output filter can then be removed incorrectly, while the 
already pushed-down grouping-column filter remains above the aggregate.
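
   The mismatch can be reproduced in isolation. The following is a minimal 
   sketch, not DataFusion code: `Pushed`, `split_then_append`, and 
   `filters_kept_above` are hypothetical stand-ins, and it assumes the 
   conjunction has already been split into two parent filters, each tagged 
   with whether it is safe to push down.

   ```rust
   // Hypothetical stand-in for a per-filter pushdown result (not a DataFusion type).
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum Pushed {
       Yes,
       No,
   }

   /// Mimics the split-then-append behavior described above: results for safe
   /// filters come first, results for unsafe filters are appended at the end.
   fn split_then_append(parent_filters: &[(&str, bool)]) -> Vec<Pushed> {
       let mut safe_results = Vec::new();
       let mut unsafe_results = Vec::new();
       for (_expr, is_safe) in parent_filters {
           if *is_safe {
               safe_results.push(Pushed::Yes);
           } else {
               unsafe_results.push(Pushed::No);
           }
       }
       safe_results.extend(unsafe_results);
       safe_results
   }

   /// Mimics a caller that pairs filter `i` with result `i` and keeps every
   /// filter whose paired result says it was not pushed down.
   fn filters_kept_above<'a>(
       parent_filters: &[(&'a str, bool)],
       results: &[Pushed],
   ) -> Vec<&'a str> {
       parent_filters
           .iter()
           .zip(results)
           .filter(|(_, result)| **result == Pushed::No)
           .map(|((expr, _), _)| *expr)
           .collect()
   }

   fn main() {
       // `cnt` is an aggregate output (unsafe), `b` is a grouping column (safe).
       let parent_filters = [("cnt@2 = 1", false), ("b@1 = bar", true)];
       let results = split_then_append(&parent_filters);
       let kept = filters_kept_above(&parent_filters, &results);
       // The wrong filter stays above the aggregate.
       println!("results = {:?}, kept above = {:?}", results, kept);
   }
   ```

   In this sketch the safe result lands in slot 0, so the positional pairing 
   attributes it to `cnt@2 = 1` and leaves `b@1 = bar` above the aggregate.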
   
   ### To Reproduce
   
   Add a regression test with a mixed predicate above `AggregateExec`:
   
   ```text
   FilterExec: cnt@2 = 1 AND b@1 = bar
     AggregateExec: mode=Final, gby=[a@0 as a, b@1 as b], aggr=[cnt]
       DataSourceExec: ...
   ```
   
   The optimized plan should keep only `cnt@2 = 1` above the aggregate and 
push `b@1 = bar` into the scan.
   
   On current `main`, the resulting plan keeps `b@1 = bar` above the aggregate 
instead.
   
   ### Expected behavior
   
   `AggregateExec::gather_filters_for_pushdown` should preserve the order of 
`parent_filters` in its returned parent filter results, marking unsupported 
filters in place rather than moving them to the end.
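
   One way to picture the order-preserving behavior is a single pass that 
   emits one result per parent filter, in input order. This is only a sketch 
   under the same assumptions as the issue text: `Pushed` and `mark_in_place` 
   are hypothetical names, not part of the DataFusion API, and the actual fix 
   may be structured differently.

   ```rust
   // Hypothetical stand-in for a per-filter pushdown result (not a DataFusion type).
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum Pushed {
       Yes,
       No,
   }

   /// Emits exactly one result per parent filter, in the same order as the
   /// input, so result `i` always describes filter `i`.
   fn mark_in_place(parent_filters: &[(&str, bool)]) -> Vec<Pushed> {
       parent_filters
           .iter()
           .map(|(_expr, is_safe)| if *is_safe { Pushed::Yes } else { Pushed::No })
           .collect()
   }

   fn main() {
       // `cnt` is an aggregate output (unsafe), `b` is a grouping column (safe).
       let parent_filters = [("cnt@2 = 1", false), ("b@1 = bar", true)];
       let results = mark_in_place(&parent_filters);
       for ((expr, _), result) in parent_filters.iter().zip(&results) {
           println!("{expr}: {result:?}");
       }
   }
   ```

   With the results kept in input order, a positional consumer sees 
   `cnt@2 = 1` as not pushed and `b@1 = bar` as pushed, which matches the 
   plan described under "To Reproduce".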
   
   ### Additional context
   
   This affects correctness for mixed aggregate-output and grouping-column 
predicates during physical filter pushdown.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

