Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20444#discussion_r164948893
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceExceptWithFilter.scala ---
@@ -46,18 +46,27 @@ object ReplaceExceptWithFilter extends Rule[LogicalPlan] {
     }
     plan.transform {
-      case Except(left, right) if isEligible(left, right) =>
-        Distinct(Filter(Not(transformCondition(left, skipProject(right))), left))
+      case e @ Except(left, right) if isEligible(left, right) =>
+        val newCondition = transformCondition(left, skipProject(right))
+        newCondition.map { c =>
+          Distinct(Filter(Not(c), left))
+        }.getOrElse {
+          e
+        }
     }
   }
-  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Expression = {
+  private def transformCondition(left: LogicalPlan, right: LogicalPlan): Option[Expression] = {
     val filterCondition =
       InferFiltersFromConstraints(combineFilters(right)).asInstanceOf[Filter].condition
     val attributeNameMap: Map[String, Attribute] = left.output.map(x => (x.name, x)).toMap
-    filterCondition.transform { case a : AttributeReference => attributeNameMap(a.name) }
+    if (filterCondition.references.forall(r => attributeNameMap.contains(r.name))) {
+      Some(filterCondition.transform { case a: AttributeReference => attributeNameMap(a.name) })
--- End diff --
Yes. There are multiple cases we could improve here, but making the change more complicated would only take longer to review, and this is blocking the 2.3 RC. Thus, I would like to fix it in a conservative way. A minimal standalone sketch of the fallback pattern is included below.
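
For reference, here is a minimal standalone sketch (no Spark dependencies; the names `Attr` and `remapByName` are made up for illustration, not Spark's API) of the Option-based fallback the diff introduces: attempt to remap every referenced attribute to the left plan's output by name, and return None when any name is missing so the rule can keep the original Except.

```scala
// Standalone sketch of the "remap by name or fall back" pattern.
// Attr and remapByName are hypothetical stand-ins for Catalyst's
// AttributeReference handling in transformCondition.
object OptionFallbackSketch {
  case class Attr(name: String)

  // Try to resolve every referenced attribute against the left side's output.
  // Returns None if any reference cannot be resolved, mirroring
  // transformCondition now returning Option[Expression].
  def remapByName(references: Seq[Attr], leftOutput: Seq[Attr]): Option[Seq[Attr]] = {
    val byName = leftOutput.map(a => a.name -> a).toMap
    if (references.forall(r => byName.contains(r.name))) {
      Some(references.map(r => byName(r.name)))
    } else {
      None
    }
  }

  def main(args: Array[String]): Unit = {
    val leftOutput = Seq(Attr("id"), Attr("value"))
    // All names resolvable: the rewrite would proceed.
    println(remapByName(Seq(Attr("id")), leftOutput))      // Some(List(Attr(id)))
    // Unresolvable name: fall back, i.e. the rule returns the original Except.
    println(remapByName(Seq(Attr("missing")), leftOutput)) // None
  }
}
```

Returning Option instead of failing keeps the rewrite conservative: the rule only fires when the name remap is known to succeed, which matches the intent of the change above.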
---