Github user maryannxue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20816#discussion_r178187202
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -669,11 +672,42 @@ object InferFiltersFromConstraints extends
Rule[LogicalPlan] with PredicateHelpe
val newConditionOpt = conditionOpt match {
case Some(condition) =>
val newFilters = additionalConstraints --
splitConjunctivePredicates(condition)
- if (newFilters.nonEmpty) Option(And(newFilters.reduce(And),
condition)) else None
+ if (newFilters.nonEmpty) Option(And(newFilters.reduce(And),
condition)) else conditionOpt
case None =>
additionalConstraints.reduceOption(And)
}
- if (newConditionOpt.isDefined) Join(left, right, joinType,
newConditionOpt) else join
+ // Infer filter for left/right outer joins
+ val newLeftOpt = joinType match {
+ case RightOuter if newConditionOpt.isDefined =>
+ val inferredConstraints = left.getRelevantConstraints(
+ left.constraints
+ .union(right.constraints)
+
.union(splitConjunctivePredicates(newConditionOpt.get).toSet))
+ val newFilters = inferredConstraints
+ .filterNot(left.constraints.contains)
+ .reduceLeftOption(And)
+ newFilters.map(Filter(_, left))
+ case _ => None
+ }
+ val newRightOpt = joinType match {
+ case LeftOuter if newConditionOpt.isDefined =>
+ val inferredConstraints = right.getRelevantConstraints(
+ right.constraints
+ .union(left.constraints)
+
.union(splitConjunctivePredicates(newConditionOpt.get).toSet))
+ val newFilters = inferredConstraints
+ .filterNot(right.constraints.contains)
+ .reduceLeftOption(And)
+ newFilters.map(Filter(_, right))
+ case _ => None
+ }
+
+ if ((newConditionOpt.isDefined && (newConditionOpt ne conditionOpt))
--- End diff --
```
val newConditionOpt = conditionOpt match {
case Some(condition) =>
val newFilters = additionalConstraints --
splitConjunctivePredicates(condition)
if (newFilters.nonEmpty) Option(And(newFilters.reduce(And),
condition)) else conditionOpt
case None =>
additionalConstraints.reduceOption(And)
}
```
So here, if `conditionOpt` is matched "None" and meanwhile
`additionalConstraints` is empty, I assume `newConditionOpt` and `conditionOpt`
will both be an empty Opt, but reference comparison `ne` will return false.
Since this is part of the original InferFilterFromConstraints logic, and I
only modified it so as to make `newConditionOpt` work for the rest of the
function (the new logic added), I assume it has already been covered by the
existing tests.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]