nuno-faria opened a new issue, #17196: URL: https://github.com/apache/datafusion/issues/17196
### Describe the bug In the following example, the Dynamic Filter Pushdown is built from `t2 (k, v)` and correctly pushed to `t1 (k)`, to reduce the number of scanned rows from `t1`: ```sql select * from t1 -- 1M rows, will become the right side join t2 on t1.k = t2.k where t2.v = 1000000; -- 1 row, will become the left side HashJoinExec join_type=Inner build_time=78.306µs, join_time=22.402µs DataSourceExec t2.parquet predicate=v@1 = 1000000 output_rows=1 DataSourceExec t1.parquet predicate=DynamicFilterPhysicalExpr [ k@0 >= 1000000 AND k@0 <= 1000000 ] -- dynamic filter added here output_rows=1 -- only returns 1 row ``` Now if the `t2` side is greater than `t1`, then the order of the join switches, as the smaller relation becomes the build side. However, the Dynamic Filter Pushdown is still applied to the right relation (now `t2`) instead of `t1`: ```sql select * from t1 -- 1M rows, will become the left side join t2 on t1.k = t2.k where t2.v >= 1000000; -- 9M rows, will become the right side HashJoinExec join_type=Inner build_time=54.3462ms, join_time=2.403µs -- join is more expensive DataSourceExec t1.parquet output_rows=1000000 DataSourceExec t2.parquet predicate=v@1 >= 1000000 AND DynamicFilterPhysicalExpr [ k@0 >= 3 AND k@0 <= 999998 ] -- dynamic filter added here output_rows=0 -- should be 1, possibly due to issue https://github.com/apache/datafusion/issues/17188 ``` ### To Reproduce ```sql copy (select i as k from generate_series(1, 1000000) t(i)) to 't1.parquet'; copy (select i as k, i as v from generate_series(1, 10000000) t(i)) to 't2.parquet'; create external table t1 stored as parquet location 't1.parquet'; create external table t2 stored as parquet location 't2.parquet'; -- works explain analyze select * from t1 join t2 on t1.k = t2.k where t2.v = 1000000; -- pushed to the wrong relation explain analyze select * from t1 join t2 on t1.k = t2.k where t2.v >= 1000000; ``` ### Expected behavior Dynamic Filter applied to the correct table. ### Additional context _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org