LiaCastaneda opened a new pull request, #18938: URL: https://github.com/apache/datafusion/pull/18938
## Which issue does this PR close?

Closes https://github.com/apache/datafusion/issues/17527

## Rationale for this change

Currently, DataFusion computes bounds for every query that contains a `HashJoinExec` node whenever the option `enable_dynamic_filter_pushdown` is set to true (the default). It might make sense to compute these bounds only when we explicitly know there is a consumer that will use them.

## What changes are included in this PR?

As suggested in https://github.com/apache/datafusion/issues/17527#issuecomment-3576945224, this PR adds an `is_used()` method to `DynamicFilterPhysicalExpr` that checks whether any consumer is holding a reference to the filter, using `Arc::strong_count()`. During filter pushdown, consumers that accept the filter and use it later in execution have to retain a reference to `Arc<DynamicFilterPhysicalExpr>`. Scan nodes such as `ParquetSource` are one example.

## Are these changes tested?

I added a unit test in `dynamic_filters.rs` (`test_is_used`) that verifies the `Arc` reference-counting behavior. Existing integration tests in `datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs` validate the end-to-end behavior: they verify that dynamic filters are computed and filled when consumers are present.

## Are there any user-facing changes?

Yes: a new `is_used()` method on `DynamicFilterPhysicalExpr`.
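
For readers unfamiliar with the approach, the reference-counting check described under "What changes are included in this PR?" can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in the PR: `DynamicFilter`, `Scan`, and `maybe_update_bounds` are hypothetical stand-ins, and the sketch assumes the producer (the join) holds exactly one `Arc` to the filter, so any count above one means a consumer kept a clone during pushdown.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `DynamicFilterPhysicalExpr` (not the real DataFusion type).
#[derive(Debug, Default)]
struct DynamicFilter {
    /// Min/max bounds of the build-side join keys, once computed.
    bounds: Mutex<Option<(i64, i64)>>,
}

impl DynamicFilter {
    /// Returns true if someone other than the caller holds a reference.
    /// Assumption: the producer owns exactly one `Arc`, so a strong count
    /// above 1 means at least one consumer retained a clone.
    fn is_used(this: &Arc<Self>) -> bool {
        Arc::strong_count(this) > 1
    }
}

/// Hypothetical consumer (think of a scan node) that accepted the filter
/// during pushdown and keeps a reference for use at execution time.
#[allow(dead_code)]
struct Scan {
    filter: Arc<DynamicFilter>,
}

/// Hypothetical build-side step: compute bounds only when a consumer exists.
/// Returns whether the bounds were computed.
#[allow(dead_code)]
fn maybe_update_bounds(filter: &Arc<DynamicFilter>, build_keys: &[i64]) -> bool {
    if !DynamicFilter::is_used(filter) {
        // No consumer retained the filter, so skip the work entirely.
        return false;
    }
    let min = build_keys.iter().copied().min();
    let max = build_keys.iter().copied().max();
    // `zip` yields `None` for an empty build side.
    *filter.bounds.lock().unwrap() = min.zip(max);
    true
}
```

In this sketch, calling `maybe_update_bounds` on a freshly created `Arc<DynamicFilter>` does nothing, while calling it after constructing a `Scan` with `Arc::clone(&filter)` fills in the bounds. Note that `Arc::strong_count` is only a snapshot of the count at the moment of the call, so the check is meaningful only once pushdown has finished and consumers have had their chance to retain the filter.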
