gengliangwang opened a new pull request, #56076: URL: https://github.com/apache/spark/pull/56076

### What changes were proposed in this pull request?

This is a sub-task of [SPARK-56908](https://issues.apache.org/jira/browse/SPARK-56908).

`HashJoin.codegenOuter` emits a `boolean conditionPassed` variable plus either an `if (!conditionPassed) { reset }` block (unique-key path) or an `if (conditionPassed) { ... }` wrap around the inner loop body (non-unique-key path), regardless of whether `condition` is defined. When `condition.isEmpty`:

- the variable is initialized to `true` and never reassigned;
- the `if (!conditionPassed)` reset block is dead;
- the `if (conditionPassed)` wrap is unconditional.

Detect `condition.isEmpty` and omit the variable, the reset block, and the wrap.

### Why are the changes needed?

Smaller generated Java per stage for the common case where outer joins have no join condition. The JIT already eliminates the dead code at runtime; the win is smaller generated source, more headroom under the 64KB method limit, and slightly faster Janino compilation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing `OuterJoinSuite` covers `BroadcastHashJoin` and `ShuffledHashJoin` outer joins with whole-stage codegen on and off, with and without join conditions.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code
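
For illustration, the effect of the proposed change on the emitted Java can be sketched roughly as below. This is a simplified, hypothetical model and not the actual `HashJoin.codegenOuter` implementation: the class `OuterJoinCodegenSketch`, its two methods, and the emitted helper names (`lookup`, `consume`, `found`, `matched`) are assumptions made up for the example, and the real generated code contains more than is shown here.

```java
import java.util.Optional;

/** Hypothetical sketch of the idea; not Spark's actual code generator. */
final class OuterJoinCodegenSketch {

    /**
     * Emits a simplified unique-key outer-join body.
     * conditionCheck is an assumed Java boolean expression for the join condition.
     */
    static String uniqueKeyBody(Optional<String> conditionCheck) {
        StringBuilder sb = new StringBuilder();
        sb.append("UnsafeRow matched = lookup(key);\n");
        if (conditionCheck.isPresent()) {
            // Condition defined: track its result and reset the match on failure.
            sb.append("boolean conditionPassed = true;\n");
            sb.append("if (matched != null) {\n");
            sb.append("  conditionPassed = ").append(conditionCheck.get()).append(";\n");
            sb.append("}\n");
            sb.append("if (!conditionPassed) {\n");
            sb.append("  matched = null;\n");
            sb.append("}\n");
        }
        // No condition: the variable and the reset block are not emitted at all.
        sb.append("consume(matched);\n");
        return sb.toString();
    }

    /**
     * Emits a simplified body for one iteration of the non-unique-key inner loop.
     */
    static String nonUniqueKeyLoopBody(Optional<String> conditionCheck) {
        if (!conditionCheck.isPresent()) {
            // No condition: emit the loop body without the if-wrap.
            return "found = true;\nconsume(matched);\n";
        }
        // Condition defined: wrap the loop body in the conditionPassed check.
        return "boolean conditionPassed = " + conditionCheck.get() + ";\n"
             + "if (conditionPassed) {\n"
             + "  found = true;\n"
             + "  consume(matched);\n"
             + "}\n";
    }
}
```

In this sketch, passing `Optional.empty()` yields source that never mentions `conditionPassed`, while passing a condition expression keeps the variable together with the reset block or the wrap, which mirrors the two cases the description distinguishes.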
