gabotechs commented on code in PR #21641:
URL: https://github.com/apache/datafusion/pull/21641#discussion_r3088377599


##########
datafusion/physical-plan/src/joins/hash_join/stream.rs:
##########
@@ -216,9 +214,6 @@ pub(super) struct HashJoinStream {
     right_side_ordered: bool,
     /// Shared build accumulator for coordinating dynamic filter updates (collects hash maps and/or bounds, optional)
     build_accumulator: Option<Arc<SharedBuildAccumulator>>,
-    /// Optional future to signal when build information has been reported by all partitions
-    /// and the dynamic filter has been updated
-    build_waiter: Option<OnceFut<()>>,

Review Comment:
   One pattern that I've seen in both Trino and DataFusion data source 
implementations is to accept the dynamic filter pushdown and, during 
execution, wait for a grace period on the consumer side of the dynamic filter 
before starting to pull data from the data source.
   
   This means that the responsibility for deferring further execution lies not 
with whoever produces the dynamic filter, but with whoever is willing to consume it.
   
   I think we already have the right APIs for this 
([DynamicFilterPhysicalExpr::wait_update](https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/src/expressions/dynamic_filters.rs#L283-L283)
 and 
[DynamicFilterPhysicalExpr::wait_complete](https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/src/expressions/dynamic_filters.rs#L303)),
 although I can imagine that whether it's worth waiting depends on the source 
of the dynamic filter (a TopK or a Join).
   
   I see that this PR does not close the door to having something like that, so 
I think it should be good.
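
   To make the consumer-side grace period concrete, here is a minimal std-only 
sketch. `FilterSignal`, `wait_complete_for`, `ScanMode` and `start_scan` are 
hypothetical names invented for illustration; they are not DataFusion APIs. A 
real consumer would presumably await the dynamic filter's own wait methods 
under a timeout instead of a `Condvar`.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical stand-in for the "filter is complete" signal shared between
/// the producer (e.g. a join build side) and the consumer (e.g. a scan).
pub struct FilterSignal {
    complete: Mutex<bool>,
    changed: Condvar,
}

impl FilterSignal {
    pub fn new() -> Self {
        Self {
            complete: Mutex::new(false),
            changed: Condvar::new(),
        }
    }

    /// Producer side: publish that the filter will no longer change.
    pub fn mark_complete(&self) {
        *self.complete.lock().unwrap() = true;
        self.changed.notify_all();
    }

    /// Consumer side: block for at most `grace`, returning whether the
    /// filter completed within that window.
    pub fn wait_complete_for(&self, grace: Duration) -> bool {
        let guard = self.complete.lock().unwrap();
        let (guard, _timed_out) = self
            .changed
            .wait_timeout_while(guard, grace, |done| !*done)
            .unwrap();
        *guard
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum ScanMode {
    /// The filter arrived in time, so the scan can prune with it.
    Filtered,
    /// The grace period expired, so the scan starts without the filter.
    Unfiltered,
}

/// The consumer decides how long it is willing to wait; the producer never
/// blocks on anyone.
pub fn start_scan(signal: &FilterSignal, grace: Duration) -> ScanMode {
    if signal.wait_complete_for(grace) {
        ScanMode::Filtered
    } else {
        ScanMode::Unfiltered
    }
}
```

   The point of the sketch is only that the grace period is a consumer-side 
parameter, so a source fed by a TopK filter and one fed by a Join filter could 
pick different values, or skip the wait entirely.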



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

