Github user maryannxue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20816#discussion_r175330576
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -669,11 +672,42 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
       val newConditionOpt = conditionOpt match {
         case Some(condition) =>
           val newFilters = additionalConstraints -- splitConjunctivePredicates(condition)
-          if (newFilters.nonEmpty) Option(And(newFilters.reduce(And), condition)) else None
+          if (newFilters.nonEmpty) Option(And(newFilters.reduce(And), condition)) else conditionOpt
         case None =>
           additionalConstraints.reduceOption(And)
       }
-      if (newConditionOpt.isDefined) Join(left, right, joinType, newConditionOpt) else join
+      // Infer filter for left/right outer joins
+      val newLeftOpt = joinType match {
+        case RightOuter if newConditionOpt.isDefined =>
+          val rightConstraints = right.constraints.union(
+            splitConjunctivePredicates(newConditionOpt.get).toSet)
+          val inferredConstraints = ExpressionSet(
+            QueryPlanConstraints.inferAdditionalConstraints(rightConstraints))
+          val leftConditions = inferredConstraints
--- End diff --
I think the `constructIsNotNullConstraints` logic does not deal with
"transitive" constraints, so we do not need to include it here. Instead,
the "isNotNull" deduction for inferred filters on the null-supplying side
is guaranteed by two things:
1) when getting constraints from the preserved side,
`constructIsNotNullConstraints` has already been called, and its results
are carried over by `inferAdditionalConstraints` to the null-supplying side;
2) the Filter matching part of `InferFiltersFromConstraints`.
That said, I'm good with the name `getRelevantConstraints` too.
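
To make point 1 concrete, here is a minimal, self-contained Scala sketch of
the transitive inference. The `Expr` ADT, the attribute names, and the
simplified substitution in `inferAdditional` are toy illustrations only, not
Catalyst's actual classes or the real `inferAdditionalConstraints`
implementation:

    // Toy sketch, not Catalyst code: shows how an equality constraint lets
    // an isnotnull constraint carry over from one side to the other.
    object InferSketch extends App {
      sealed trait Expr
      case class Attr(name: String) extends Expr
      case class EqualTo(a: Attr, b: Attr) extends Expr
      case class IsNotNull(attr: Attr) extends Expr

      // For every equality a = b, carry isnotnull constraints across it,
      // mimicking (in spirit) what inferAdditionalConstraints does.
      def inferAdditional(constraints: Set[Expr]): Set[Expr] = {
        val inferred: Set[Expr] =
          constraints.collect { case e: EqualTo => e }.flatMap {
            case EqualTo(a, b) =>
              constraints.collect {
                case IsNotNull(`a`) => IsNotNull(b)
                case IsNotNull(`b`) => IsNotNull(a)
              }
          }
        inferred -- constraints // keep only the genuinely new constraints
      }

      // Right outer join on l.a = r.a: the preserved (right) side already
      // carries isnotnull(r.a) via constructIsNotNullConstraints; the
      // equality then yields isnotnull(l.a) for the null-supplying side.
      val cs: Set[Expr] =
        Set(EqualTo(Attr("l.a"), Attr("r.a")), IsNotNull(Attr("r.a")))
      assert(inferAdditional(cs) == Set[Expr](IsNotNull(Attr("l.a"))))
    }

Running it shows the left-side isnotnull filter falls out of the equality
plus the preserved side's existing constraints alone, which is why an extra
`constructIsNotNullConstraints` call is unnecessary here.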
---