agrawaldevesh commented on a change in pull request #29304:
URL: https://github.com/apache/spark/pull/29304#discussion_r464105940
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
##########
@@ -245,7 +244,7 @@ case class BroadcastHashJoinExec(
|boolean $found = false;
|// generate join key for stream side
|${keyEv.code}
- |if ($anyNull) {
+ |if (${if (isLongHashedRelation) s"$anyNull" else
s"${keyEv.value}.allNull()"}) {
Review comment:
Okay lets add tests here for @viirya's case of multiple short keys
packing into a single long. Please ensure that at least two of those keys are
nullable in your test. I am curious if that will trigger the LongHashedRelation
or the (exploding) UnsafeHashedRelation. I believe that the former is incorrect.
Basically I am claiming that LongHashedRelation cannot be used with more
than one null keys. This was moot earlier because we were dropping rows that
had any null key.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]