GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109603467
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1032,6 +1251,109 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
}
/**
+ * Pushes down a subquery, in the form of [[Join LeftSemi/LeftAnti]] operator
+ * to the left or right side of a join below.
+ */
+object PushLeftSemiLeftAntiThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
+ /**
+ * Define an enumeration to identify whether a Exists/In subquery,
+ * in the form of a LeftSemi/LeftAnti, can be pushed down to
+ * the left table or the right table.
+ */
+ object subqueryPushdown extends Enumeration {
+ val toRightTable, toLeftTable, none = Value
+ }
+
+ /**
+ * Determine which side of the join an Exists/In subquery (in the form of
+ * LeftSemi/LeftAnti join) can be pushed down to.
+ */
+ private def pushTo(child: Join, subquery: LogicalPlan, joinCond: Option[Expression]) = {
+ val left = child.left
+ val right = child.right
+ val joinType = child.joinType
+ val subqueryOutput = subquery.outputSet
+
+ if (joinCond.nonEmpty) {
+ /**
+ * Note: In order to ensure correctness, it's important to not change the relative ordering of
+ * any deterministic expression that follows a non-deterministic expression. To achieve this,
+ * we only consider pushing down those expressions that precede the first non-deterministic
+ * expression in the condition.
+ */
+ val noPushdown = (subqueryPushdown.none, None)
+ val conditions = splitConjunctivePredicates(joinCond.get)
+ val (candidates, containingNonDeterministic) = conditions.span(_.deterministic)
+ lazy val (pushDownCandidates, subquery) =
--- End diff --
The `subquery` bound by this `lazy val` shadows the function parameter
`subquery: LogicalPlan`, so the two are easily confused. It would be better
to give the local value a different name.
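
To illustrate the concern, here is a minimal, self-contained sketch of how a tuple pattern in a nested block can shadow a method parameter. It does not use Spark types: `ShadowingSketch`, `shadowed`, `renamed`, and `remaining` are hypothetical names invented for this example, and the `String`/`List[String]` types merely stand in for the plan and condition types in the diff.

```scala
// Hypothetical stand-in for the pattern in the diff; no Spark types involved.
object ShadowingSketch {

  // Shadowing version: inside the `if` block the name `subquery` is rebound
  // by the tuple pattern, so it no longer refers to the method parameter.
  def shadowed(subquery: String, conds: List[String]): String = {
    if (conds.nonEmpty) {
      val (candidates, subquery) = conds.partition(_.startsWith("push"))
      // Here `subquery` is a List[String], not the String parameter.
      s"${candidates.size} pushed, ${subquery.size} kept"
    } else {
      // Outside the block, `subquery` is the parameter again.
      subquery
    }
  }

  // Renamed version: the local value gets its own name, so the parameter
  // stays visible and both can be used side by side.
  def renamed(subquery: String, conds: List[String]): String = {
    if (conds.nonEmpty) {
      val (candidates, remaining) = conds.partition(_.startsWith("push"))
      s"${candidates.size} pushed, ${remaining.size} kept for $subquery"
    } else {
      subquery
    }
  }
}
```

Both versions compile, which is why the shadowing is easy to miss: the compiler accepts it without error, and a reader has to track which `subquery` is in scope at each use.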