NGA-TRAN commented on code in PR #20246:
URL: https://github.com/apache/datafusion/pull/20246#discussion_r2800370451


##########
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##########
@@ -450,6 +524,25 @@ impl PhysicalExpr for DynamicFilterPhysicalExpr {
     }
 }
 
+/// Snapshot a `PhysicalExpr` tree, replacing any [`DynamicFilterPhysicalExpr`] that
+/// has per-partition data with its partition-specific filter expression.
+/// If a `DynamicFilterPhysicalExpr` does not have partitioned data, it is left unchanged.
+pub fn snapshot_physical_expr_for_partition(
+    expr: Arc<dyn PhysicalExpr>,
+    partition: usize,
+) -> Result<Arc<dyn PhysicalExpr>> {
+    expr.transform_up(|e| {
+        if let Some(dynamic) = e.as_any().downcast_ref::<DynamicFilterPhysicalExpr>()
+            && dynamic.has_partitioned_filters()
+        {
+            let snapshot = dynamic.current_for_partition(partition)?;
+            return Ok(Transformed::yes(snapshot));

Review Comment:
   Suppose the plan has two joins, as in the diagram below, with dynamic
   filtering turned on for both of them but on different partitioning
   expressions. Will this transform stop at the right place and pick up the
   right expression? Adding some tests for those cases would help verify that
   and uncover any bugs.
   
                  Join2
                 /     \
         repartition    T3
             /
          Join1
         /     \
       T1       T2
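
   A test along these lines might look like the sketch below. It is a toy
   model, not DataFusion code: `Expr`, `snapshot_for_partition`, and
   `two_join_predicate` are hypothetical stand-ins for `PhysicalExpr`,
   `snapshot_physical_expr_for_partition`, and a predicate that carries
   dynamic filters from both the lower and the upper join. It assumes the
   rewrite applies one partition index to every dynamic filter it meets, which
   is the behavior the question above is probing.

```rust
/// Toy stand-in for a physical expression tree (NOT DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A plain, already-resolved predicate.
    Literal(String),
    /// A dynamic filter placeholder. `per_partition` is empty when the
    /// producing join did not publish partitioned filters.
    Dynamic { join: &'static str, per_partition: Vec<String> },
    /// Conjunction of two sub-expressions.
    And(Box<Expr>, Box<Expr>),
}

/// Bottom-up rewrite: replace every dynamic filter that has per-partition data
/// with its entry for `partition`; leave everything else unchanged.
fn snapshot_for_partition(expr: &Expr, partition: usize) -> Result<Expr, String> {
    match expr {
        Expr::Literal(_) => Ok(expr.clone()),
        Expr::Dynamic { join, per_partition } if !per_partition.is_empty() => per_partition
            .get(partition)
            .map(|f| Expr::Literal(f.clone()))
            .ok_or_else(|| format!("{join}: no filter for partition {partition}")),
        // No partitioned data: keep the placeholder as-is.
        Expr::Dynamic { .. } => Ok(expr.clone()),
        Expr::And(l, r) => Ok(Expr::And(
            Box::new(snapshot_for_partition(l, partition)?),
            Box::new(snapshot_for_partition(r, partition)?),
        )),
    }
}

/// Hypothetical predicate holding filters from both joins: the lower join is
/// partitioned on `a` (2 partitions), the upper join on `b` (3 partitions).
fn two_join_predicate() -> Expr {
    Expr::And(
        Box::new(Expr::Dynamic {
            join: "lower_join",
            per_partition: vec!["a IN p0".to_string(), "a IN p1".to_string()],
        }),
        Box::new(Expr::Dynamic {
            join: "upper_join",
            per_partition: vec![
                "b IN p0".to_string(),
                "b IN p1".to_string(),
                "b IN p2".to_string(),
            ],
        }),
    )
}
```

   In this model, snapshotting for partition 1 resolves both filters with the
   same index, and partition 2 fails because the lower join only has two
   partitions. Whether the real transform can ever see both filters in one
   expression is exactly what a test on the actual plan would need to confirm.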



##########
datafusion/common/src/config.rs:
##########
@@ -996,6 +996,11 @@ config_namespace! {
         ///
         /// Note: This may reduce parallelism, rooting from the I/O level, if the number of distinct
         /// partitions is less than the target_partitions.
+        ///
+        /// Note for partitioned hash join dynamic filtering:
+        /// preserving file partitions can allow partition-index routing (`i -> i`) instead of
+        /// CASE-hash routing, but this assumes build/probe partition indices stay aligned for
+        /// dynamic filter consumers.

Review Comment:
   Agreed. An example would explain things more clearly here. Monodraw is a
   good tool for this.
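
   As a starting point for such an example, here is a minimal sketch of the
   two routing modes the doc comment contrasts. Everything in it is
   illustrative: `route_by_index`, `route_by_hash`, and `toy_partition_of` are
   hypothetical names, and the modulo "hash" is a placeholder, not the hash
   function DataFusion uses for repartitioning.

```rust
/// Toy partitioning function (an assumption for illustration only):
/// maps a join key to one of `n` partitions.
fn toy_partition_of(key: i64, n: usize) -> usize {
    key.rem_euclid(n as i64) as usize
}

/// Partition-index routing (`i -> i`): probe partition `i` reads the filter
/// built from build partition `i`. This is only correct if build and probe
/// partition indices stay aligned.
fn route_by_index<'a>(filters: &[&'a str], probe_partition: usize) -> Option<&'a str> {
    filters.get(probe_partition).copied()
}

/// Hash routing: pick the filter by re-deriving the partition from the join
/// key itself, so it does not depend on which probe partition the row is in.
fn route_by_hash<'a>(filters: &[&'a str], key: i64) -> Option<&'a str> {
    if filters.is_empty() {
        return None;
    }
    filters.get(toy_partition_of(key, filters.len())).copied()
}
```

   With three per-partition filters, `route_by_index(&filters, 1)` and
   `route_by_hash(&filters, 4)` select the same filter only because key 4
   happens to live in partition 1 under the toy partitioning. If the probe
   side were laid out differently, index routing would select the wrong
   filter while hash routing would still select the right one.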



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

