GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21944#discussion_r207248446
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1367,6 +1367,22 @@ class Dataset[T] private[sql](
}: _*)
}
+ /**
+ * Casts all the values of the current Dataset following the types of a specific StructType.
+ * This method works also with nested structTypes.
+ *
+ * @group typedrel
+ * @since 2.4.0
+ */
+ def castBySchema(schema: StructType): DataFrame = {
+ assert(schema.fields.map(_.name).toList.sameElements(this.schema.fields.map(_.name).toList),
+ "schema should have the same fields as the original schema")
+
+ selectExpr(schema.map(
--- End diff --
There are many good one-liner tricks, and I would just leave those good
tricks on the mailing list or somewhere similar.
We shouldn't add an API only because it _might be_ useful to some users. I
would consider adding this only if it has been requested multiple times, it
is not a one-liner change, and there's no easy workaround for it.
Otherwise, every system will have an API to send an email.
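
For illustration, the kind of one-liner workaround meant here might look roughly like the sketch below. It is not the code from this PR (the diff above is cut off at `selectExpr(schema.map(`); `castToSchema` and `CastBySchemaSketch` are hypothetical names, and the sketch assumes the target schema uses the same top-level column names as the Dataset and that those names contain no dots or other characters needing escaping.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

object CastBySchemaSketch {
  // Hypothetical helper, not part of the Spark API: selects each top-level
  // column named in `target` and casts it to the type declared there.
  // Nested struct columns are handled by Spark's own struct-to-struct cast,
  // which requires the same number of fields on both sides.
  def castToSchema(df: DataFrame, target: StructType): DataFrame =
    df.select(target.fields.map(f => col(f.name).cast(f.dataType)): _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cast-by-schema-sketch")
      .getOrCreate()
    import spark.implicits._

    // Both columns start out as strings.
    val df = Seq(("1", "2.5"), ("2", "3.5")).toDF("id", "score")

    // Target types for the same column names.
    val target = StructType(Seq(
      StructField("id", IntegerType),
      StructField("score", DoubleType)))

    castToSchema(df, target).printSchema()
    spark.stop()
  }
}
```

Since this fits in a single `select`, users can keep it as a local helper without a new method on `Dataset`.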
---