HyukjinKwon commented on a change in pull request #35090:
URL: https://github.com/apache/spark/pull/35090#discussion_r777788852



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcDeserializer.scala
##########
@@ -137,26 +137,38 @@ class OrcDeserializer(
         v.changePrecision(precision, scale)
         updater.set(ordinal, v)
 
-      case st: StructType => (ordinal, value) =>
+      case st: StructType =>
         val result = new SpecificInternalRow(st)
         val fieldUpdater = new RowUpdater(result)
         val fieldConverters = st.map(_.dataType).map { dt =>
           newWriter(dt, fieldUpdater)
         }.toArray
-        val orcStruct = value.asInstanceOf[OrcStruct]
 
-        var i = 0
-        while (i < st.length) {
-          val value = orcStruct.getFieldValue(i)
-          if (value == null) {
-            result.setNullAt(i)
-          } else {
-            fieldConverters(i)(i, value)
-          }
-          i += 1
+        val containerUpdater = updater match {
+          case r: RowUpdater => r
+          case _ =>
+            // If the struct is contained by an array or map, we cannot reuse 
the same result row.
+            // We must copy the result row before setting it into the array or 
map
+            new CatalystDataUpdater {
+              override def set(ordinal: Int, value: Any) = {
+                updater.set(ordinal, 
value.asInstanceOf[SpecificInternalRow].copy())

Review comment:
       One nitpick question might be that if it's faster to copy instead of 
creating a new row .. but I guess that's less common to create a map or array 
of structs (compared to flat rows). Would be great to make sure covering that 
case in the benchmark




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to