frankyin-factual commented on a change in pull request #28898:
URL: https://github.com/apache/spark/pull/28898#discussion_r446598259
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
##########
@@ -39,6 +39,22 @@ object NestedColumnAliasing {
NestedColumnAliasing.replaceToAliases(plan, nestedFieldToAlias,
attrToAliases)
}
+ /**
+ * This handles a `LogicalPlan` like `Project`->`Filter`->`Window`.
+ * In this case, `Window` is a plan that satisfies `canProjectPushThrough`.
+ * Adding this case allows nested columns to be passed on to later stages.
+ * `Filter` is deliberately not added to `canProjectPushThrough`, because
+ * doing so causes an infinite loop in the optimizer during the predicate
+ * push-down rule.
+ */
Review comment:
I don't know exactly why it's broken, but here is a simple query that
reproduces the issue:
`select name.last from contacts where name.first='Jane'`
The error message looks like this:
```
20/06/27 21:17:41 WARN internal.BaseSessionStateBuilder$$anon$2: Max
iterations (100) reached for batch Operator Optimization before Inferring
Filters, please set 'spark.sql.optimizer.maxIterations' to a larger value.
20/06/27 21:17:41 WARN internal.BaseSessionStateBuilder$$anon$2: Max
iterations (100) reached for batch Operator Optimization after Inferring
Filters, please set 'spark.sql.optimizer.maxIterations' to a larger value.
```
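For reference, here is a minimal spark-shell sketch that sets up a `contacts` view matching the query above. The schema and data are assumptions for illustration; the actual test dataset in the PR may differ.

```scala
// Hypothetical reproduction in spark-shell (Spark session `spark` and
// implicits already in scope). The schema here is an assumption.
case class Name(first: String, last: String)
case class Contact(name: Name)

val contacts = Seq(
  Contact(Name("Jane", "Doe")),
  Contact(Name("John", "Smith"))
).toDF()
contacts.createOrReplaceTempView("contacts")

// Without guarding against `Filter` in `canProjectPushThrough`, this query
// previously made the optimizer oscillate between project push-through and
// predicate push-down until it hit the max-iterations limit shown above.
spark.sql("select name.last from contacts where name.first = 'Jane'").show()
```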
---------------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]