viirya commented on code in PR #194:
URL:
https://github.com/apache/arrow-datafusion-comet/pull/194#discussion_r1522083809
##########
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##########
@@ -1836,6 +1838,48 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {
}
}
+ case join: ShuffledHashJoinExec if isCometOperatorEnabled(op.conf, "hash_join") =>
+ if (join.buildSide == BuildRight) {
+ // DataFusion HashJoin assumes build side is always left.
+ // TODO: support BuildRight
Review Comment:
Yes. In DataFusion, only the left side can be the build side. In Spark, the
HashJoin operator takes a build side parameter that indicates which side is
the build side, and the operator handles either case internally. So currently
we cannot simply create a DataFusion HashJoin operator with the right side as
the build side.
The left and right sides can be swapped, but only if we also swap the outputs
and rebind the columns in the join keys and the join filter. I'd rather relax
the build side constraint in DataFusion than do the swap in Comet.
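To make the cost of the swap concrete, here is a rough sketch of the bookkeeping
it would involve. This is a simplified, hypothetical model, not Spark, Comet, or
DataFusion code: `HashJoin`, `Swapped`, and `swapToBuildLeft` are names made up
for illustration, children are plain strings, and filter columns are assumed to
be ordinals into the concatenated (left ++ right) row. Join type handling (for
example, LeftOuter becoming RightOuter) is left out.

```scala
object BuildSideSwap {
  sealed trait BuildSide
  case object BuildLeft extends BuildSide
  case object BuildRight extends BuildSide

  // Hypothetical stand-in for a hash join node.
  // leftKeys / rightKeys are ordinals into their own child's columns.
  // filterColumns are ordinals into the joined row (left ++ right).
  final case class HashJoin(
      left: String,
      right: String,
      leftWidth: Int,
      rightWidth: Int,
      leftKeys: Seq[Int],
      rightKeys: Seq[Int],
      filterColumns: Seq[Int],
      buildSide: BuildSide)

  // The rewritten join plus a projection that restores the original
  // (left ++ right) output column order on top of it.
  final case class Swapped(join: HashJoin, outputProjection: Seq[Int])

  def swapToBuildLeft(j: HashJoin): Swapped = j.buildSide match {
    case BuildLeft =>
      // Already in the shape a left-build-only engine expects.
      Swapped(j, 0 until (j.leftWidth + j.rightWidth))
    case BuildRight =>
      // After the swap the joined row is (old right ++ old left), so an old
      // left ordinal shifts up by rightWidth and an old right ordinal shifts
      // down by leftWidth.
      def rebind(i: Int): Int =
        if (i < j.leftWidth) i + j.rightWidth else i - j.leftWidth

      val swapped = HashJoin(
        left = j.right,
        right = j.left,
        leftWidth = j.rightWidth,
        rightWidth = j.leftWidth,
        leftKeys = j.rightKeys,
        rightKeys = j.leftKeys,
        filterColumns = j.filterColumns.map(rebind),
        buildSide = BuildLeft)

      // Original output column k now lives at rebind(k) in the swapped output.
      val projection = (0 until (j.leftWidth + j.rightWidth)).map(rebind)
      Swapped(swapped, projection)
  }
}
```

Even in this toy model the swap touches the children, the keys, the filter
bindings, and the output order, which is why relaxing the constraint on the
DataFusion side looks like the simpler path.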