neilconway opened a new issue, #22931:
URL: https://github.com/apache/datafusion/issues/22931

   ### Is your feature request related to a problem or challenge?
   
   To evaluate a semi join, we support two orientations: `LeftSemi` or 
`RightSemi` (analogously for anti and mark joins; I'll just refer to semijoins 
here to simplify the discussion). Under `RightSemi`, we build the non-preserved 
input and stream the preserved input; these are swapped for `LeftSemi`. While 
it might seem like these two orientations are symmetrical, there are actually 
significant differences in evaluation behavior between them:
   
   * The build-side hash table has to be resident in memory; all else being 
equal, building the smaller join input is a good general rule, and that's the 
main rule we follow today.
   * `RightSemi` only needs to store the join keys for the build side; 
`LeftSemi` needs to store wider rows. By definition, the consumer of a semijoin 
can't be interested in any values from the non-preserved side of the join. So 
even if the non-preserved side has more rows than the preserved side, building 
the hash table on the non-preserved side might still require less memory.
   * `RightSemi` preserves the partitioning of the preserved input, whereas 
`LeftSemi` always emits with `UnknownPartitioning`.
   * `RightSemi` works better with dynamic filter pushdown: XXX elaborate
   
   Two additional factors that might change:
   
   * `RightSemi` only needs to build on distinct values from the non-preserved 
side. In the future, we can optimize `RightSemi` to discard duplicate 
build-side rows. We don't do that today but we might in the future (#22930)
   * `RightSemi` allows emitting join results incrementally: as we see each 
probe row, we can immediately determine if it should be output or not. Whereas 
`LeftSemi` consumes the _entire_ non-preserved side, marking which of the 
preserved-side rows matched, and only at the end of the non-preserved input 
stream can we do a pass over the matched bitmap to determine which 
preserved-side rows to emit. This is not fundamental though; probably worth 
fixing LeftSemi to emit incrementally (#22929)
   
   The current optimizer rules don't reflect this:
   
   * `LeftSemi` and `RightSemi` are considered symmetrically; whichever 
semijoin input is predicted to be smaller is  placed on the build side
   * If there are absent stats, `LeftSemi` is chosen
   
   I think revising these rules as follows would make more sense:
   
   * Prefer `RightSemi` over `LeftSemi`, _unless_ the non-preserved input is k 
times larger than the preserved input. Choosing `k` is a bit arbitrary, but a 
value in the range of 2-4 seems reasonable.
   * If there are absent stats, prefer `RightSemi`
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to