LiaCastaneda opened a new pull request, #18938:
URL: https://github.com/apache/datafusion/pull/18938

   ## Which issue does this PR close?
   
   Closes https://github.com/apache/datafusion/issues/17527
   
   
   ## Rationale for this change
   
   Currently, DataFusion computes dynamic filter bounds for every query that 
contains a HashJoinExec node whenever the option enable_dynamic_filter_pushdown 
is set to true (the default). It makes more sense to compute these bounds only 
when we explicitly know there is a consumer that will use them.
   
   ## What changes are included in this PR?
   
   As suggested in 
https://github.com/apache/datafusion/issues/17527#issuecomment-3576945224, this 
PR adds an is_used() method to DynamicFilterPhysicalExpr that checks if any 
consumers are holding a reference to the filter using Arc::strong_count(). 
   
   During filter pushdown, a consumer that accepts the filter and uses it later 
in execution has to retain a reference to Arc<DynamicFilterPhysicalExpr>. Scan 
nodes such as ParquetSource are one example of such a consumer.
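
The reference-counting idea can be illustrated with a minimal, self-contained sketch. It does not use the DataFusion types: `DynamicFilter`, `ScanConsumer`, `is_used` as a free function, and `bounds_if_used` are hypothetical stand-ins, and the sketch assumes the producer (the join) holds exactly one strong reference of its own. The actual method in this PR may count references differently.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for DynamicFilterPhysicalExpr (not the real type).
#[allow(dead_code)]
struct DynamicFilter {
    description: String,
}

/// Hypothetical stand-in for a scan node that accepted the pushed-down filter
/// and therefore keeps a clone of the Arc alive.
#[allow(dead_code)]
struct ScanConsumer {
    filter: Arc<DynamicFilter>,
}

/// Assumption: the producer holds one strong reference, so any count above 1
/// means at least one consumer retained the filter.
fn is_used(filter: &Arc<DynamicFilter>) -> bool {
    Arc::strong_count(filter) > 1
}

/// Hypothetical gate: compute min/max bounds of the build side only when a
/// consumer exists; otherwise skip the work and return None.
fn bounds_if_used(filter: &Arc<DynamicFilter>, build_side: &[i64]) -> Option<(i64, i64)> {
    if !is_used(filter) {
        return None;
    }
    let min = *build_side.iter().min()?;
    let max = *build_side.iter().max()?;
    Some((min, max))
}

/// Walks through the lifecycle: no consumer, one consumer, consumer dropped.
fn demo() -> (bool, bool, bool) {
    let filter = Arc::new(DynamicFilter {
        description: "placeholder".to_string(),
    });
    let before = is_used(&filter);

    // A consumer accepts the filter during pushdown and retains a clone.
    let consumer = ScanConsumer {
        filter: Arc::clone(&filter),
    };
    let during = is_used(&filter);

    // Once the consumer goes away, only the producer's reference remains.
    drop(consumer);
    let after = is_used(&filter);

    (before, during, after)
}
```

In this sketch a consumer that rejects the filter simply never clones the Arc, so no explicit registration step is needed; the trade-off is that any incidental clone held elsewhere would also count as a consumer.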
   
   ## Are these changes tested?
   
   I added a unit test in dynamic_filters.rs (test_is_used) that verifies the 
Arc reference counting behavior.
   Existing integration tests in 
datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs validate the 
end-to-end behavior. These tests verify that dynamic filters are computed and 
filled when consumers are present. 
   
   ## Are there any user-facing changes?
   
   Yes: a new is_used() method on DynamicFilterPhysicalExpr.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

