Kimahriman commented on code in PR #731:
URL: https://github.com/apache/datafusion-comet/pull/731#discussion_r1693649707
##########
spark/src/main/scala/org/apache/spark/sql/comet/CometRowToColumnarExec.scala:
##########

@@ -60,8 +62,17 @@ case class CometRowToColumnarExec(child: SparkPlan)
     val timeZoneId = conf.sessionLocalTimeZone
     val schema = child.schema

-    child
-      .execute()
+    val rdd: RDD[InternalRow] = if (child.supportsColumnar) {
+      child
+        .executeColumnar()
+        .mapPartitionsInternal { iter =>
+          iter.flatMap(_.rowIterator().asScala)
+        }
+    } else {
+      child.execute()
+    }
+
+    rdd

Review Comment:
   There might be a more efficient way than going through a row iterator to feed the row-based Arrow writer, but since this path is mostly for testing/fallback purposes, I didn't try to work out a faster Spark-vector-to-Arrow-vector approach. Hopefully someone is able to add complex type support to the Comet Parquet reader. If not, it could be worth considering the Spark reader as a real fallback use case rather than just for testing.
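For context, the fallback in the diff relies on standard Spark APIs: `ColumnarBatch.rowIterator()` returns a `java.util.Iterator[InternalRow]` over the batch, so flat-mapping it over each partition turns a columnar RDD back into a row RDD. A minimal sketch of the same pattern outside Comet (the helper name `toRowRdd` is hypothetical, and `mapPartitions` is used in place of Spark's package-private `mapPartitionsInternal`):

```scala
import scala.jdk.CollectionConverters._

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.vectorized.ColumnarBatch

// Sketch only: mirrors the PR's fallback. If the child plan produces
// ColumnarBatches, flatten each batch into its rows via rowIterator();
// otherwise use the child's row-based execute() directly.
def toRowRdd(child: SparkPlan): RDD[InternalRow] =
  if (child.supportsColumnar) {
    child.executeColumnar().mapPartitions { batches: Iterator[ColumnarBatch] =>
      batches.flatMap(_.rowIterator().asScala)
    }
  } else {
    child.execute()
  }
```

Note that `rowIterator()` typically yields mutable row views backed by the batch's column vectors, which is fine for streaming into a writer but means rows should not be buffered without copying.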