aokolnychyi commented on code in PR #41499:
URL: https://github.com/apache/spark/pull/41499#discussion_r1227428110
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala:
##########
@@ -341,6 +341,16 @@ trait JoinSelectionHelper {
)
}
+ def getBroadcastNestedLoopJoinBuildSide(hint: JoinHint): Option[BuildSide] = {
+ if (hintToNotBroadcastAndReplicateLeft(hint)) {
+ Some(BuildRight)
+ } else if (hintToNotBroadcastAndReplicateRight(hint)) {
+ Some(BuildLeft)
+ } else {
+ None
+ }
+ }
Review Comment:
At the moment, the new hint can only be set on one side, never on both.
BNLJ is the default (fallback) join strategy, so putting the
no-broadcast-and-replicate hint on both sides would leave no applicable
fallback join strategy. If we were to adapt the method above to handle that
case, we couldn't keep the existing default logic that picks the broadcast
side based on size, since that could cause a correctness problem. What about
adding validation that the new hint is set on only one side?
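
A minimal sketch of what such validation could look like. It uses simplified
stand-in types, not Spark's actual `JoinHint` or `BuildSide` classes;
`SimpleJoinHint`, `BnljBuildSide`, and `validate` are hypothetical names
introduced only for illustration, and where the check would live in the real
code is left open.

```scala
// Simplified stand-ins for Spark's catalyst types; NOT the real classes.
sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

// Hypothetical model: each flag says whether that side carries the new
// no-broadcast-and-replicate hint.
final case class SimpleJoinHint(
    noBroadcastAndReplicateLeft: Boolean,
    noBroadcastAndReplicateRight: Boolean)

object BnljBuildSide {
  // Reject the hint on both sides: BNLJ would have no valid build side left.
  def validate(hint: SimpleJoinHint): Unit =
    require(
      !(hint.noBroadcastAndReplicateLeft && hint.noBroadcastAndReplicateRight),
      "The no-broadcast-and-replicate hint can be set on only one side of a join")

  // Mirrors the shape of the method in the diff: a hinted side is never the
  // build (broadcast) side, so the opposite side is chosen.
  def buildSide(hint: SimpleJoinHint): Option[BuildSide] = {
    validate(hint)
    if (hint.noBroadcastAndReplicateLeft) Some(BuildRight)
    else if (hint.noBroadcastAndReplicateRight) Some(BuildLeft)
    else None // no hint: the caller falls back to the size-based choice
  }
}
```

With the check in place, the both-sides case fails fast with an
`IllegalArgumentException` instead of silently falling through to the
size-based default.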
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]