dbtsai commented on a change in pull request #23542: [WIP] [SPARK-25603][SQL] Pushing Down Nested Field projections
URL: https://github.com/apache/spark/pull/23542#discussion_r248107850
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -611,6 +611,29 @@ object ColumnPruning extends Rule[LogicalPlan] {
// Can't prune the columns on LeafNode
case p @ Project(_, _: LeafNode) => p
+ // If the current project or the child references to nested fields, we can substitute them
+ // by alias attributes; then a project of the nested fields as aliases on the children
+ // of the child will be created.
+ //
+ // Note that if the child is also a [[Project]], it will be handled by the previous rule.
+ // If the child is a [[Filter]], it will be conflict with PushPredicatesThroughProject.
+ // When the child is a [[SerializeFromObject]], we can not push the projection through it.
+ case p @ Project(_, child) if !child.isInstanceOf[Project] && !child.isInstanceOf[Filter] &&
Review comment:
I was debating the same. Will change this to a whitelist of operators.
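For illustration only, the difference between the two guard styles can be sketched with toy plan nodes. Everything below is hypothetical: the case classes are stand-ins rather than the real Catalyst operators, `PushdownGuard` and its methods are made-up names, and the choice of `Limit` and `Sample` as whitelisted operators is an assumption, not what the PR ends up using.

```scala
// Toy stand-ins for logical plan nodes; these are NOT the real Catalyst classes.
sealed trait Plan
case class Leaf(name: String) extends Plan
case class Project(fields: Seq[String], child: Plan) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan
case class Sample(fraction: Double, child: Plan) extends Plan
case class SerializeFromObject(child: Plan) extends Plan

object PushdownGuard {
  // Blacklist style, mirroring only the visible part of the guard in the diff:
  // every operator that is not explicitly excluded is accepted.
  def blacklistAllows(child: Plan): Boolean = child match {
    case _: Project | _: Filter => false
    case _ => true
  }

  // Whitelist style: only operators explicitly listed as safe are accepted.
  // Limit and Sample are assumed-safe examples chosen for this sketch.
  def whitelistAllows(child: Plan): Boolean = child match {
    case _: Limit | _: Sample => true
    case _ => false
  }
}
```

With the blacklist, an operator nobody remembered to exclude (here the toy `SerializeFromObject`) is accepted by default; with the whitelist it is rejected until someone adds it deliberately.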