EnricoMi commented on a change in pull request #26969: [SPARK-30319][SQL] Add a stricter version of `as[T]`
URL: https://github.com/apache/spark/pull/26969#discussion_r366256637
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -495,6 +495,25 @@ class Dataset[T] private[sql](
     select(newCols : _*)
   }
 
+  /**
+   * Returns a new Dataset where each record has been mapped onto the specified type.
+   * This only supports `U` being a class. Fields of the class will be mapped to columns of the
+   * same name (case sensitivity is determined by `spark.sql.caseSensitive`).
+   *
+   * If the schema of the Dataset does not match the desired `U` type, you can use `select`
+   * along with `alias` or `as` to rearrange or rename as required.
+   *
+   * This method eagerly projects away any columns that are not present in the specified class.
+   * It further guarantees that the order of columns as well as their data types match `U`.
+   *
+   * @group basic
+   * @since 3.0.0
+   */
+  def toDS[U : Encoder]: Dataset[U] = {
+    val columns = implicitly[Encoder[U]].schema.fields.map(f => col(f.name).cast(f.dataType))
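
 For reference, a short usage sketch of the proposed method. The `spark` session, the `NameAge` case class, and the column names are illustrative assumptions, not taken from the PR:

```scala
import spark.implicits._  // provides the implicit Encoder for the case class

case class NameAge(name: String, age: Int)

// A DataFrame that is wider than NameAge and whose `age` column has a
// different data type (LongType instead of the IntegerType NameAge expects).
val df = spark.range(1).selectExpr("'Alice' AS name", "1L AS age", "'x' AS other")

// toDS[NameAge] projects away `other`, casts `age` to IntegerType,
// and orders the columns to match NameAge.
val ds: org.apache.spark.sql.Dataset[NameAge] = df.toDS[NameAge]
```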
 
 Review comment:
   As the documentation of this method says, it only supports `U` being a class. Tuples and structs are not supported; those would require knowledge that only the encoder has.

   For classes, a simple projection is sufficient and very cheap, so an encoder round trip should be avoided in this use case. The stricter `as` method could later be extended to other encoders and other kinds of `U`, but I think classes are a common enough use case to justify adding this already. See the sketch below.
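
The projection described above could look roughly like the following standalone sketch. The helper name `strictAs` is hypothetical, and the body is an assumption extrapolated from the quoted diff line, not the exact PR implementation:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder}
import org.apache.spark.sql.functions.col

// Build one column per field of U's encoder schema, casting each to that
// field's data type, then select them in schema order.
def strictAs[U: Encoder](df: DataFrame): Dataset[U] = {
  val columns = implicitly[Encoder[U]].schema.fields
    .map(f => col(f.name).cast(f.dataType))
  // A plain select is a cheap, lazily planned projection; no per-row
  // encoder round trip (deserializing to U and re-serializing) is needed.
  df.select(columns: _*).as[U]
}
```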
