NGA-TRAN commented on code in PR #20246:
URL: https://github.com/apache/datafusion/pull/20246#discussion_r2800370451
##########
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##########
@@ -450,6 +524,25 @@ impl PhysicalExpr for DynamicFilterPhysicalExpr {
}
}
+/// Snapshot a `PhysicalExpr` tree, replacing any [`DynamicFilterPhysicalExpr`] that
+/// has per-partition data with its partition-specific filter expression.
+/// If a `DynamicFilterPhysicalExpr` does not have partitioned data, it is left unchanged.
+pub fn snapshot_physical_expr_for_partition(
+    expr: Arc<dyn PhysicalExpr>,
+    partition: usize,
+) -> Result<Arc<dyn PhysicalExpr>> {
+    expr.transform_up(|e| {
+        if let Some(dynamic) = e.as_any().downcast_ref::<DynamicFilterPhysicalExpr>()
+            && dynamic.has_partitioned_filters()
+        {
+            let snapshot = dynamic.current_for_partition(partition)?;
+            return Ok(Transformed::yes(snapshot));
Review Comment:
Suppose the plan has two joins like this, with dynamic filtering turned on
for both of them but each on a different partitioning expression. Will this
transform stop at the right place and pick up the right expression? Adding
tests for those cases would help verify that and uncover any bugs.
         Join2
        /     \
 repartition   T3
      /
    Join2
    /   \
  T1     T2
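
To make the concern concrete, here is a minimal, self-contained sketch of the same bottom-up rewrite over a toy expression tree. It does not use DataFusion: `Expr`, `snapshot_for_partition`, and the `join` tag are hypothetical stand-ins for `PhysicalExpr`, `snapshot_physical_expr_for_partition`, and the join that owns each dynamic filter. Under those assumptions it shows that a single `partition` index is applied to every dynamic filter in the tree, whichever join produced it.

```rust
/// Toy stand-in for a physical expression tree (not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    And(Box<Expr>, Box<Expr>),
    /// A dynamic filter tagged with the (hypothetical) join that produced it.
    /// `per_partition` is empty when the filter has no partitioned data.
    Dynamic { join: String, per_partition: Vec<Expr> },
}

/// Bottom-up rewrite: children are rewritten first, then the node itself.
/// Every dynamic filter with per-partition data is replaced by the entry
/// at `partition`; filters without partitioned data are left unchanged.
fn snapshot_for_partition(expr: Expr, partition: usize) -> Result<Expr, String> {
    match expr {
        Expr::And(l, r) => Ok(Expr::And(
            Box::new(snapshot_for_partition(*l, partition)?),
            Box::new(snapshot_for_partition(*r, partition)?),
        )),
        Expr::Dynamic { join, per_partition } if !per_partition.is_empty() => per_partition
            .get(partition)
            .cloned()
            .ok_or_else(|| format!("{join}: no filter for partition {partition}")),
        other => Ok(other),
    }
}
```

If the two joins partition on different expressions, index `partition` may not refer to the same rows for both filters, and that is the case a test should pin down.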
##########
datafusion/common/src/config.rs:
##########
@@ -996,6 +996,11 @@ config_namespace! {
///
/// Note: This may reduce parallelism, rooting from the I/O level, if the number of distinct
/// partitions is less than the target_partitions.
+ ///
+ /// Note for partitioned hash join dynamic filtering:
+ /// preserving file partitions can allow partition-index routing (`i -> i`) instead of
+ /// CASE-hash routing, but this assumes build/probe partition indices stay aligned for
+ /// dynamic filter consumers.
Review Comment:
Agreed. An example would explain things more clearly here. Monodraw is a good
tool for this.
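
As a starting point, the difference between the two routing modes can be sketched in a few lines of standalone Rust. Everything here is an illustrative assumption rather than the PR's implementation: `Bounds` stands in for a per-partition dynamic filter, `toy_hash_partition` stands in for the real repartitioning hash, and `index_routed` / `hash_routed` are hypothetical names.

```rust
/// Toy per-partition dynamic filter: min/max bounds on the join key.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: i64,
    max: i64,
}

impl Bounds {
    fn contains(&self, key: i64) -> bool {
        self.min <= key && key <= self.max
    }
}

/// Stand-in for the repartitioning hash (assumption: the real hash differs).
fn toy_hash_partition(key: i64, n: usize) -> usize {
    key.rem_euclid(n as i64) as usize
}

/// Partition-index routing (`i -> i`): probe partition `i` consults the
/// filter built from build partition `i`. Only correct if indices are aligned.
fn index_routed(filters: &[Bounds], probe_partition: usize, key: i64) -> bool {
    filters[probe_partition].contains(key)
}

/// CASE-hash routing: the filter is chosen from the key itself, so the
/// result does not depend on which probe partition the row arrives in.
fn hash_routed(filters: &[Bounds], key: i64) -> bool {
    filters[toy_hash_partition(key, filters.len())].contains(key)
}
```

With `filters = [Bounds { min: 0, max: 9 }, Bounds { min: 10, max: 19 }]`, a probe row with key 12 passes `index_routed` when it arrives in partition 1 but is pruned if it arrives in partition 0, which is the misalignment the doc note warns about.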