zhengruifeng commented on a change in pull request #34850:
URL: https://github.com/apache/spark/pull/34850#discussion_r766296266
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -2148,6 +2149,17 @@ object RewriteIntersectAll extends Rule[LogicalPlan] {
}
}
+/**
+ * Deduplicate the right side of left-semi join and left-anti join.
+ */
+object DeduplicateLeftSemiLeftAntiRightSide extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformWithPruning(
+    _.containsPattern(LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case join @ Join(_, right, LeftSemiOrAnti(_), _, _) if !right.isInstanceOf[Aggregate] =>
+      join.copy(right = Aggregate(right.output, right.output, right))
Review comment:
Yes, there are some cases that this PR does not take into account. I think
https://github.com/apache/spark/commit/b5599010ba969ba4cc3a2ce85549fe226b75ae65
is a much better approach, so I will close this one.
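
For readers following the thread, the idea behind the rule in the diff is that a left-semi or left-anti join only checks whether a matching row exists on the right side, so grouping the right side by all of its output columns (which is what `Aggregate(right.output, right.output, right)` amounts to) should not change the result. Below is a minimal sketch of that property using plain Scala collections rather than Spark's API. The object and helper names are hypothetical, rows are reduced to bare `Int` keys, and the sketch ignores SQL NULL semantics and non-equality join conditions.

```scala
// Hypothetical illustration only: plain collections, not Catalyst plans.
object SemiAntiDedupSketch {
  // Left-semi: keep each left row that has at least one match on the right.
  def leftSemi(left: Seq[Int], right: Seq[Int]): Seq[Int] =
    left.filter(l => right.contains(l))

  // Left-anti: keep each left row that has no match on the right.
  def leftAnti(left: Seq[Int], right: Seq[Int]): Seq[Int] =
    left.filterNot(l => right.contains(l))

  def main(args: Array[String]): Unit = {
    val left = Seq(1, 1, 2, 3)
    val right = Seq(1, 1, 1, 3, 3, 4)

    // Deduplicating the right side plays the role of the added Aggregate.
    val deduped = right.distinct

    // Both join types only test for existence, so the results agree.
    assert(leftSemi(left, right) == leftSemi(left, deduped))
    assert(leftAnti(left, right) == leftAnti(left, deduped))

    println(leftSemi(left, deduped))
    println(leftAnti(left, deduped))
  }
}
```

The sketch only shows that the rewrite preserves results in this simple setting. It says nothing about whether the extra aggregation is worthwhile in a given plan.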