GitHub user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/15857#discussion_r87998688
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
@@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] {
}
case _ => Nil
})
+ val replaceFoldable: PartialFunction[Expression, Expression] = {
+ case a: AttributeReference if foldableMap.contains(a) => foldableMap(a)
+ }
if (foldableMap.isEmpty) {
plan
} else {
var stop = false
CleanupAliases(plan.transformUp {
- case u: Union =>
- stop = true
- u
- case c: Command =>
- stop = true
- c
- // For outer join, although its output attributes are derived from its children, they are
- // actually different attributes: the output of outer join is not always picked from its
- // children, but can also be null.
+ // Allow all leafnodes
+ case l: LeafNode =>
+ l
+
+ // Whitelist of all nodes we are allowed to apply this rule to.
+ case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window |
+ _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct |
+ _: AppendColumns | _: AppendColumnsWithObject | _: BroadcastHint |
+ _: RedistributeData | _: Repartition | _: Sort | _: TypedFilter) if !stop =>
+ p.transformExpressions(replaceFoldable)
+
+ // Allow inner joins. We do not allow outer join, although its output attributes are
+ // derived from its children, they are actually different attributes: the output of outer
+ // join is not always picked from its children, but can also be null.
// TODO(cloud-fan): It seems more reasonable to use new attributes as the output attributes
// of outer join.
- case j @ Join(_, _, LeftOuter | RightOuter | FullOuter, _) =>
+ case j @ Join(_, _, Inner, _) =>
+ j.transformExpressions(replaceFoldable)
+
+ // We can fold the projections an expand holds. However expand changes the output columns
+ // and often reuses the underlying attributes; so we cannot assume that a column is still
+ // foldable after the expand has been applied.
+ case expand: Expand if !stop =>
--- End diff --
I have added the TODO. I think that we really should revisit attribute
reuse in general (it causes a lot of subtle bugs).