joshrosen-stripe commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#discussion_r363479947
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ##########
 @@ -318,10 +318,33 @@ private[parquet] class ParquetRowConverter(
         new ParquetMapConverter(parquetType.asGroupType(), t, updater)
 
       case t: StructType =>
 +        val wrappedUpdater = {
 +          // SPARK-30338: avoid unnecessary InternalRow copying for nested structs:
 +          if (updater.isInstanceOf[RowUpdater]) {
 +            // `updater` is a RowUpdater, implying that the parent container is a struct.
 +            // We do NOT need to perform defensive copying here because either:
 +            //
 +            //   1. The path from the schema root to this field consists only of nested
 
 Review comment:
   Yes, that's right. After thinking about this some more, I think I've come up 
with a clearer explanation and have updated the code comment: 
https://github.com/apache/spark/pull/26993/commits/4651b2fd724a56515c087903284682c9ba947c31
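
 For readers without the full diff handy, below is a minimal, self-contained Scala sketch of the copy-avoidance idea behind the snippet above. The names (MutableRow, ParentUpdater, ArrayElementUpdater, StructFieldUpdater, CopyAvoidanceSketch) are simplified stand-ins invented for illustration, not Spark's actual InternalRow / ParentContainerUpdater / RowUpdater classes. The point is only that an updater backed by an array element must take a defensive copy of a reused row buffer, while an updater backed by a struct field can store the reference and defer any copying to the outermost record.

    import scala.collection.mutable.ArrayBuffer

    // Minimal stand-in for a reusable mutable row buffer (the role InternalRow plays).
    final class MutableRow(val values: Array[Any]) {
      def copy(): MutableRow = new MutableRow(values.clone())
    }

    // Stand-in for a parent container updater: receives the finished row of a nested struct.
    trait ParentUpdater {
      def set(row: MutableRow): Unit
    }

    // When the parent container is an array, the converter reuses the same row buffer for
    // every element, so this updater takes a defensive copy before storing it.
    final class ArrayElementUpdater(buffer: ArrayBuffer[MutableRow]) extends ParentUpdater {
      def set(row: MutableRow): Unit = buffer += row.copy()
    }

    // When the parent container is a struct, this updater stores the row by reference:
    // any copy that is still needed can be taken once at the outermost level when the
    // whole record is emitted, so a per-field copy here would be redundant.
    final class StructFieldUpdater(parentFields: Array[Any], ordinal: Int) extends ParentUpdater {
      def set(row: MutableRow): Unit = parentFields(ordinal) = row
    }

    object CopyAvoidanceSketch {
      def main(args: Array[String]): Unit = {
        val reusedBuffer = new MutableRow(Array[Any](1, "a"))

        val parentFields = new Array[Any](1)
        new StructFieldUpdater(parentFields, 0).set(reusedBuffer)
        println(parentFields(0).asInstanceOf[MutableRow] eq reusedBuffer) // true: stored by reference

        val elements = ArrayBuffer.empty[MutableRow]
        new ArrayElementUpdater(elements).set(reusedBuffer)
        println(elements(0) eq reusedBuffer)                              // false: defensively copied
      }
    }

 This is only an illustration of the general idea; the exact conditions under which the PR skips the copy are spelled out in the updated code comment linked above.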
