EnricoMi commented on code in PR #39131:
URL: https://github.com/apache/spark/pull/39131#discussion_r1070415424
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PushDownLeftSemiAntiJoin.scala:
##
@@ -61,9 +61,10 @@ object PushDownLeftSemiAntiJoin extends Rule[LogicalPlan]
}
// LeftSemi/LeftAnti over Aggregate, only push down if join can be planned as broadcast join.
-case join @ Join(agg: Aggregate, rightOp, LeftSemiOrAnti(_), _, _)
+case join @ Join(agg: Aggregate, rightOp, LeftSemiOrAnti(_), joinCond, _)
if agg.aggregateExpressions.forall(_.deterministic) &&
agg.groupingExpressions.nonEmpty &&
!agg.aggregateExpressions.exists(ScalarSubquery.hasCorrelatedScalarSubquery) &&
+ canPushThroughCondition(agg.children, joinCond, rightOp) &&
Review Comment:
I don't understand. `canPushThroughCondition` is called before the `Join` is
pushed through the `Aggregate`; it was added precisely to prevent that from
happening in this situation. The other cases (e.g. `Union`) call into
`canPushThroughCondition` in the same way.
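
For context, the guard exists because in a self-join the same expression IDs can appear on both sides, so a join-condition attribute produced by the operator's children may also be an output of `rightOp`, making the pushdown ambiguous. Below is a simplified, self-contained model of that check; the object name `CanPushSketch`, the `AttrId` alias, and the plain `Set[Long]` representation are illustrative stand-ins, not Catalyst's actual `AttributeSet` API:

```scala
object CanPushSketch {
  // Catalyst attributes carry unique expression IDs; model an attribute by its ID.
  type AttrId = Long

  // Simplified analogue of canPushThroughCondition: pushing the join below
  // the operator is only safe if no attribute referenced by the join
  // condition appears both in the operator's child outputs and in the
  // right-hand side's outputs (a conflict that self-joins can create).
  def canPushThroughCondition(
      childOutputs: Set[AttrId],
      conditionRefs: Option[Set[AttrId]],
      rightOutputs: Set[AttrId]): Boolean =
    conditionRefs.forall(refs => (refs & rightOutputs & childOutputs).isEmpty)
}
```

With this model, a condition attribute shared by both sides blocks the pushdown, while disjoint attribute sets (or the absence of a condition, as in the pre-patch `_` case) allow it:

```scala
CanPushSketch.canPushThroughCondition(Set(1L, 2L), Some(Set(2L)), Set(2L, 3L)) // false: attr 2 conflicts
CanPushSketch.canPushThroughCondition(Set(1L, 2L), Some(Set(3L)), Set(3L, 4L)) // true: no overlap
```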
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org