JkSelf commented on code in PR #8931:
URL: https://github.com/apache/incubator-gluten/pull/8931#discussion_r2904525268
##########
backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxSparkPlanExecApi.scala:
##########
@@ -678,9 +682,136 @@ class VeloxSparkPlanExecApi extends SparkPlanExecApi {
child: SparkPlan,
numOutputRows: SQLMetric,
dataSize: SQLMetric): BuildSideRelation = {
+
+ val buildKeys = mode match {
+ case mode1: HashedRelationBroadcastMode =>
+ mode1.key
+ case _ =>
+ // IdentityBroadcastMode
+ Seq.empty
+ }
+ var offload = true
+ val (newChild, newOutput, newBuildKeys) =
+ if (VeloxConfig.get.enableBroadcastBuildOncePerExecutor) {
+
+ // Try to lookup from TreeNodeTag using child's logical plan
+ // Need to recursively find logicalLink in case of AQE or other wrappers
+ @scala.annotation.tailrec
+ def findLogicalLink(
+ plan: SparkPlan): Option[org.apache.spark.sql.catalyst.plans.logical.LogicalPlan] = {
+ plan.logicalLink match {
+ case some @ Some(_) => some
+ case None =>
+ plan.children match {
+ case Seq(child) => findLogicalLink(child)
+ case _ => None
+ }
+ }
+ }
+
+ val newBuildKeys = findLogicalLink(child)
Review Comment:
> 1. Under what scenarios would the logical link fail to contain the tag?
In some unit tests (e.g.,
[GlutenOuterJoinSuiteForceShjOn](https://github.com/apache/spark/blob/dc9f559660937098ceeded113c2318c5c14ba73f/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala#L136-L138)),
the BHJ is constructed manually, so `GlutenJoinKeysCapture` is never triggered
and the tag is never set. In that case the original key expressions are lost.
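
To make the missing-tag case concrete, here is a minimal, Spark-free sketch of the lookup. `Plan`, `Logical`, `JoinKeysTag`, and `resolveKeys` are hypothetical stand-ins, not the classes in this PR, and the fallback to the mode's keys is an assumption: what the real code does when the tag is absent is not visible in this hunk.

```scala
// Simplified model; every name here is hypothetical and only for illustration.
final case class Logical(tags: Map[String, Seq[String]])
final case class Plan(logicalLink: Option[Logical], children: Seq[Plan])

object BuildKeyLookup {
  // Stand-in for the TreeNodeTag that the capture rule would set.
  val JoinKeysTag = "join-keys"

  // Walk down single-child wrappers until a node carries a logical link.
  @scala.annotation.tailrec
  def findLogicalLink(plan: Plan): Option[Logical] = plan.logicalLink match {
    case some @ Some(_) => some
    case None =>
      plan.children match {
        case Seq(child) => findLogicalLink(child)
        case _ => None
      }
  }

  // Prefer the captured keys; assume a fallback to the mode's keys otherwise.
  def resolveKeys(plan: Plan, modeKeys: Seq[String]): Seq[String] =
    findLogicalLink(plan).flatMap(_.tags.get(JoinKeysTag)).getOrElse(modeKeys)
}
```

A manually built plan corresponds to a `Logical` with an empty `tags` map, so `resolveKeys` returns `modeKeys` unchanged in this model.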
> 2. The tag seems to store the original Attributes, but the buildKeys in
the mode are already bindReferenced. If getOriginalKeysFromPacked returns
BoundReferences, don't we need to resolve them back to Attributes using the
child's output schema?
We added the `getOriginalKeysFromPacked` method so that it extracts the original
expressions rather than the bound references.
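
For context on the question, a toy sketch of what binding does to a key and what resolving it back through the child's output would look like. `Attr`, `BoundRef`, `bind`, and `unbind` are hypothetical names, not Spark's classes, and this is not the implementation of `getOriginalKeysFromPacked`, which is not shown in this hunk.

```scala
// Toy expression model; names are hypothetical and only for illustration.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class BoundRef(ordinal: Int) extends Expr

object Binding {
  // Binding replaces each attribute with its position in the input schema.
  def bind(keys: Seq[Attr], output: Seq[Attr]): Seq[BoundRef] =
    keys.map(k => BoundRef(output.indexOf(k)))

  // The reverse mapping the question refers to: ordinal back to the
  // attribute at that position in the child's output.
  def unbind(keys: Seq[BoundRef], output: Seq[Attr]): Seq[Attr] =
    keys.map(k => output(k.ordinal))
}
```

In this model a bound key carries only an ordinal, which is why it cannot be used as an attribute without either the child's output or the original expression.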
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]