cloud-fan commented on a change in pull request #35147:
URL: https://github.com/apache/spark/pull/35147#discussion_r783690630
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SchemaPruning.scala
##########
@@ -204,5 +205,58 @@ object SchemaPruning extends Rule[LogicalPlan] {
}
}
+  private def applyMetadataSchemaPruning(plan: LogicalPlan): LogicalPlan =
+    plan transformDown {
+      case op @ PhysicalOperation(projects, filters, l @ LogicalRelation(_, _, _, _))
+          if containsMetadataAttributes(l) =>
+        pruneMetadataSchema(l, projects, filters).getOrElse(op)
+    }
+  /**
+   * This method returns an optional logical plan with a pruned metadata schema.
+   * `None` is returned if no nested field is required or all nested fields are required.
+   */
+  private def pruneMetadataSchema(
Review comment:
The code structure is very similar to `prunePhysicalColumns`. Can we combine
the two and prune the data and metadata schemas in a single pass?
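The idea of merging two near-identical pruning passes into one can be sketched in plain Scala, without any Spark dependency. `Relation`, `pruneInOnePass`, and the field names below are purely hypothetical illustrations, not Spark's actual API:

```scala
// Hypothetical sketch: instead of running two separate transformDown
// passes (one over data columns, one over metadata columns), a single
// pruning step restricts both schemas to the fields actually referenced.
case class Relation(dataFields: Set[String], metadataFields: Set[String])

// One pass over the relation: keep only the referenced fields in each schema.
def pruneInOnePass(rel: Relation, referenced: Set[String]): Relation =
  Relation(
    dataFields = rel.dataFields.intersect(referenced),
    metadataFields = rel.metadataFields.intersect(referenced))

val relation = Relation(
  dataFields = Set("id", "name", "payload"),
  metadataFields = Set("_metadata.file_path", "_metadata.row_index"))

// Suppose only `id` and `_metadata.file_path` are referenced by the query.
val pruned = pruneInOnePass(relation, Set("id", "_metadata.file_path"))
```

Doing both prunings in one traversal also avoids matching the same `PhysicalOperation` pattern twice over the plan.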
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]