hvanhovell commented on code in PR #48664:
URL: https://github.com/apache/spark/pull/48664#discussion_r1838077151


##########
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##########
@@ -95,9 +95,14 @@ private[sql] object Dataset {
   def ofRows(sparkSession: SparkSession, logicalPlan: LogicalPlan): DataFrame =
     sparkSession.withActive {
       val qe = sparkSession.sessionState.executePlan(logicalPlan)
-      qe.assertAnalyzed()
-      new Dataset[Row](qe, RowEncoder.encoderFor(qe.analyzed.schema))
-  }
+      val encoder = if (qe.isLazyAnalysis) {
+        RowEncoder.encoderFor(new StructType())

Review Comment:
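   @ueshin this breaks `collect` (and other operations) on these DataFrames,
because in the lazy-analysis case the encoder is built from an empty
`StructType` rather than the analyzed schema. An alternative would be to defer
constructing the encoder until it is needed.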
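
A minimal sketch of the deferral idea, assuming hypothetical stand-in types (`Schema`, `Encoder`, `FakeQueryExecution`, `FakeDataset`) rather than the real Spark classes. It is not the PR's actual implementation. It only illustrates that passing an encoder generator and caching it in a `lazy val` keeps analysis from running until an operation such as collect asks for the encoder.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
final case class Schema(fieldNames: Seq[String])
final case class Encoder(schema: Schema)

// Stand-in for a query execution: analysis runs only when
// `analyzedSchema` is first read, and the result is cached.
final class FakeQueryExecution(analyze: () => Schema) {
  lazy val analyzedSchema: Schema = analyze()
}

// Stand-in for a dataset: receives a generator instead of a ready-made
// encoder, and materializes the encoder on first use.
final class FakeDataset(val qe: FakeQueryExecution, encoderGenerator: () => Encoder) {
  lazy val encoder: Encoder = encoderGenerator()

  // Any operation that needs the encoder (collect, for example) forces it here.
  def collectFieldNames(): Seq[String] = encoder.schema.fieldNames
}

object LazyEncoderSketch {
  // Analogue of `ofRows`: no analysis happens at construction time,
  // and no placeholder empty-schema encoder is ever created.
  def ofRows(analyze: () => Schema): FakeDataset = {
    val qe = new FakeQueryExecution(analyze)
    new FakeDataset(qe, () => Encoder(qe.analyzedSchema))
  }

  def main(args: Array[String]): Unit = {
    var runs = 0
    val ds = ofRows { () => runs += 1; Schema(Seq("id", "name")) }
    assert(runs == 0)                  // construction did not trigger analysis
    println(ds.collectFieldNames())    // first use forces analysis
    assert(runs == 1)                  // analysis ran exactly once
  }
}
```

With this shape the dataset never holds an encoder built from an empty schema, so operations that read the encoder see the real analyzed schema. The trade-off is that analysis errors surface at first use instead of at construction.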



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

