joshrosen-stripe commented on a change in pull request #25503: [SPARK-28702][SQL] Display useful error message (instead of NPE) for invalid Dataset operations
URL: https://github.com/apache/spark/pull/25503#discussion_r315749862

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -184,11 +184,22 @@ private[sql] object Dataset {
  */
 @Stable
 class Dataset[T] private[sql](
-    @transient val sparkSession: SparkSession,
+    @transient private val _sparkSession: SparkSession,
     @DeveloperApi @Unstable @transient val queryExecution: QueryExecution,
     @DeveloperApi @Unstable @transient val encoder: Encoder[T])
   extends Serializable {
 
+  @transient lazy val sparkSession: SparkSession = {
+    if (_sparkSession == null) {
+      throw new SparkException(
+        "Dataset transformations and actions can only be invoked by the driver, not inside of" +
+        " other transformations; for example, dataset1.map(x => dataset2.values.count() * x)" +

Review comment:
What about `not inside of other Dataset transformations`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org