kosiew commented on code in PR #17518:
URL: https://github.com/apache/datafusion/pull/17518#discussion_r2359423705


##########
datafusion/common/src/join_type.rs:
##########
@@ -74,6 +74,12 @@ pub enum JoinType {
     RightMark,
 }
 
+const LEFT_PRESERVING: &[JoinType] =
+    &[JoinType::Left, JoinType::Full, JoinType::LeftMark];

Review Comment:
   Thanks for double-checking! In this file, the LEFT_PRESERVING array is meant 
to capture only the join variants that are guaranteed to emit every left row at 
least once, i.e. the outer and mark joins (Left, Full, and LeftMark). Semi/anti 
joins intentionally aren’t listed because they drop rows from their respective 
inputs, so they don’t satisfy that preservation property.
   
   That distinction matters a few lines later when we decide which child can 
safely receive a dynamic filter. If we marked LeftSemi/LeftAnti as 
left-preserving, dynamic_filter_pushdown_side would classify them the same way 
as a left outer join and start attaching the dynamic filter to the right 
input. That is the exact misbehaviour you’re warning about for predicates that 
reference b.y. Because they remain non-preserving, those join types fall 
through to the (false, false) arm and we keep the dynamic filter on the left 
side only. As a result, predicates like a.x < 5 are still pushed while 
b.y < 10 is not.
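
   To make the classification concrete, here is a minimal, self-contained 
sketch of the idea. It is not the code in this PR. Only LEFT_PRESERVING is 
copied from the diff; the local JoinType stand-in, RIGHT_PRESERVING, 
PushdownSide, and pushdown_side_sketch are hypothetical names, and every match 
arm other than the behaviour described above for LeftSemi/LeftAnti is an 
assumption made purely for illustration.

```rust
/// Local stand-in for DataFusion's `JoinType`, so the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum JoinType {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    RightSemi,
    LeftAnti,
    RightAnti,
    LeftMark,
    RightMark,
}

/// Copied from the diff: joins that emit every left row at least once.
const LEFT_PRESERVING: &[JoinType] =
    &[JoinType::Left, JoinType::Full, JoinType::LeftMark];

/// ASSUMPTION: a mirror-image list for the right side (not shown in the diff).
const RIGHT_PRESERVING: &[JoinType] =
    &[JoinType::Right, JoinType::Full, JoinType::RightMark];

/// HYPOTHETICAL result type: which child receives the dynamic filter.
#[derive(Debug, PartialEq, Eq)]
pub enum PushdownSide {
    Left,
    Right,
    None,
}

impl JoinType {
    pub fn preserves_left(self) -> bool {
        LEFT_PRESERVING.contains(&self)
    }

    pub fn preserves_right(self) -> bool {
        RIGHT_PRESERVING.contains(&self)
    }
}

/// HYPOTHETICAL sketch of a classifier in the spirit of
/// `dynamic_filter_pushdown_side`; the real function may differ.
pub fn pushdown_side_sketch(join_type: JoinType) -> PushdownSide {
    match (join_type.preserves_left(), join_type.preserves_right()) {
        // Left-preserving only: the filter is attached to the right input.
        (true, false) => PushdownSide::Right,
        // ASSUMPTION: the symmetric case attaches the filter to the left input.
        (false, true) => PushdownSide::Left,
        // ASSUMPTION: when both sides are preserved, no side is filtered.
        (true, true) => PushdownSide::None,
        // Per the discussion above: LeftSemi/LeftAnti land here and keep
        // the dynamic filter on the left side only.
        (false, false) => PushdownSide::Left,
    }
}
```

   In this sketch, pushdown_side_sketch(JoinType::LeftSemi) falls into the 
(false, false) arm, whereas adding LeftSemi to LEFT_PRESERVING would move it 
into the (true, false) arm and redirect the filter to the right input.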
   
   I’ll add a short comment in the code/tests to spell this out so it’s harder 
to miss in the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

