JoshRosen commented on code in PR #36683:
URL: https://github.com/apache/spark/pull/36683#discussion_r883099524


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala:
##########
@@ -190,32 +191,30 @@ private[sql] object ArrowConverters {
   }
 
   /**
-   * Create a DataFrame from an RDD of serialized ArrowRecordBatches.
+   * Create a DataFrame from an iterator of serialized ArrowRecordBatches.
    */
-  private[sql] def toDataFrame(
-      arrowBatchRDD: JavaRDD[Array[Byte]],
+  def toDataFrame(
+      arrowBatches: Iterator[Array[Byte]],
       schemaString: String,
       session: SparkSession): DataFrame = {
-    val schema = DataType.fromJson(schemaString).asInstanceOf[StructType]
-    val timeZoneId = session.sessionState.conf.sessionLocalTimeZone
-    val rdd = arrowBatchRDD.rdd.mapPartitions { iter =>
-      val context = TaskContext.get()
-      ArrowConverters.fromBatchIterator(iter, schema, timeZoneId, context)
-    }
-    session.internalCreateDataFrame(rdd.setName("arrow"), schema)
+    val attrs = DataType.fromJson(schemaString).asInstanceOf[StructType].toAttributes
+    val data = ArrowConverters.fromBatchIterator(
+      arrowBatches,
+      DataType.fromJson(schemaString).asInstanceOf[StructType],

Review Comment:
   Maybe we can avoid calling `DataType.fromJson` twice by storing the parsed 
   `schema` in a local variable and deriving `attrs` from it? Like this:
   
   ```suggestion
       val schema = DataType.fromJson(schemaString).asInstanceOf[StructType]
       val attrs = schema.toAttributes
       val data = ArrowConverters.fromBatchIterator(
         arrowBatches,
         schema,
   ```
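
   As a standalone illustration of the same parse-once pattern, here is a 
   minimal sketch. It assumes `spark-sql` is on the classpath. The 
   `ParseSchemaOnce` object and the one-column schema JSON are hypothetical, 
   and `fieldNames` stands in for `toAttributes`, which may not be accessible 
   outside Spark's own packages:

   ```scala
   import org.apache.spark.sql.types.{DataType, StructType}

   // Hypothetical wrapper object, for illustration only (not part of the PR).
   object ParseSchemaOnce {
     // Assumed example input: a one-column schema in Spark's JSON schema format.
     val schemaString: String =
       """{"type":"struct","fields":[{"name":"id","type":"long","nullable":true,"metadata":{}}]}"""

     def main(args: Array[String]): Unit = {
       // Parse the JSON a single time and keep the result in a local variable.
       val schema = DataType.fromJson(schemaString).asInstanceOf[StructType]

       // Every later use derives from `schema` instead of re-parsing the string.
       val names = schema.fieldNames
       val width = schema.fields.length

       println(s"columns=${names.mkString(",")} width=$width")
     }
   }
   ```

   Both derived values come from the same `StructType` instance, so the JSON 
   is parsed once and the two uses cannot drift apart.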



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

