Cheng Su created SPARK-34681:
--------------------------------
Summary: Full outer shuffled hash join when building left side
produces wrong result
Key: SPARK-34681
URL: https://issues.apache.org/jira/browse/SPARK-34681
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.1.1, 3.2.0
Reporter: Cheng Su
A full outer shuffled hash join that builds its hash map on the left side and
has a non-equi join condition can produce wrong results.
The root cause is that `boundCondition` in `HashJoin.scala` always assumes the
left side row is from `streamedPlan` and the right side row is from `buildPlan`
(`streamedPlan.output ++ buildPlan.output`). This is a valid assumption, except
for the full outer + build left case.
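The mismatch can be illustrated with a toy model in plain Python (this is not Spark code). `bind` is a hypothetical stand-in for resolving a condition's column references to ordinals against a schema, and the column names `l_k`, `l_v`, `r_k`, `r_v` are made up for illustration. The sketch assumes that in the full outer + build left path the joined row is laid out with the build (left) columns first, while the condition was bound against the streamed-first order.

```python
from typing import Callable, Sequence

Row = Sequence[int]


def bind(schema: Sequence[str], left_col: str, right_col: str) -> Callable[[Row], bool]:
    """Resolve `left_col < right_col` to ordinals in `schema` (hypothetical helper)."""
    i, j = schema.index(left_col), schema.index(right_col)
    return lambda joined_row: joined_row[i] < joined_row[j]


# The build side is the join's left child; the streamed side is the right child.
build_output = ["l_k", "l_v"]
streamed_output = ["r_k", "r_v"]

build_row = [1, 10]    # l_k=1, l_v=10
streamed_row = [1, 5]  # r_k=1, r_v=5

# Assumed layout of the joined row for full outer + build left: build columns first.
joined_row = build_row + streamed_row

# Non-equi condition: l_v < r_v, i.e. 10 < 5, which is false for this pair.
bound_streamed_first = bind(streamed_output + build_output, "l_v", "r_v")
bound_build_first = bind(build_output + streamed_output, "l_v", "r_v")

mis_bound_result = bound_streamed_first(joined_row)  # reads the wrong ordinals
correct_result = bound_build_first(joined_row)       # ordinals match the row layout
```

With the streamed-first binding, the ordinal resolved for `l_v` lands on the `r_v` value and vice versa, so the condition is evaluated with its operands swapped.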
The fix is to correct `boundCondition` in `HashJoin.scala` to handle the full
outer + build left case properly. For a reproduction, see
https://issues.apache.org/jira/browse/SPARK-32399?focusedCommentId=17298414&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17298414
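As a rough sketch of the kind of case split such a fix implies, the following plain Python function picks the column order a join condition would be bound against. It is a toy model, not the actual patch: `condition_input_schema` and the string labels `"full_outer"`, `"inner"`, `"left"`, and `"right"` are hypothetical, and it assumes the full outer path always lays out the joined row left side first.

```python
from typing import List


def condition_input_schema(join_type: str, build_side: str,
                           streamed_output: List[str],
                           build_output: List[str]) -> List[str]:
    """Column order a join condition should be bound against (toy model)."""
    if join_type == "full_outer" and build_side == "left":
        # The joined row is assumed to be left-first, and left is the build side.
        return build_output + streamed_output
    # Every other case keeps the streamed-first order.
    return streamed_output + build_output


left_cols, right_cols = ["l_k", "l_v"], ["r_k", "r_v"]

# Build left: the streamed side is the right child.
full_outer_build_left = condition_input_schema("full_outer", "left", right_cols, left_cols)
# Build right: the streamed side is the left child, so streamed-first is already left-first.
full_outer_build_right = condition_input_schema("full_outer", "right", left_cols, right_cols)
# A non-full-outer join keeps the streamed-first order even when building left.
inner_build_left = condition_input_schema("inner", "left", right_cols, left_cols)
```

Under these assumptions both full outer variants resolve to the same left-first order, which is what the joined row layout requires.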