GitHub user mt40 opened a pull request:
https://github.com/apache/spark/pull/22309
[SPARK-20384][core] Support value class in schema of Dataset
## What changes were proposed in this pull request?
This PR adds support for [Scala value classes][1] in the schema of Datasets
(both as a top-level class and as a nested field).
The idea is to treat a value class as its underlying type at run time. For
example:
```scala
case class Id(get: Int) extends AnyVal
case class User(id: Id) // field `id` will be treated as Int
```
However, if the value class is top-level (e.g. `Dataset[Id]`), it must
be treated like a boxed type and must be instantiated. I'm not sure why it
behaves this way, but I suspect it is related to the [expansion of value
classes][2] when we do casting (e.g. `asInstanceOf[T]`).
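The difference between the nested and the top-level case can be observed without Spark. The following is a minimal sketch using plain JVM reflection; it is not taken from the patch, and the names `ValueClassErasure`, `nestedFieldType` and `runtimeClassOf` are hypothetical helpers introduced only for illustration:

```scala
// Value classes must be top-level or members of a statically accessible
// object, so everything is wrapped in an object here.
object ValueClassErasure {
  case class Id(get: Int) extends AnyVal
  case class User(id: Id)

  // JVM type of the field that backs `User.id`. Because `Id` is a value
  // class, the compiler stores the underlying Int rather than an `Id`.
  def nestedFieldType: Class[_] =
    classOf[User].getDeclaredField("id").getType

  // Runtime class of a value passed through a generic (erased) position.
  // A value class used as a type argument has to be boxed into a real
  // instance of `Id`.
  def runtimeClassOf[T](value: T): Class[_] =
    value.asInstanceOf[AnyRef].getClass

  def main(args: Array[String]): Unit = {
    println(nestedFieldType == classOf[Int])        // nested: underlying type
    println(runtimeClassOf(Id(1)) == classOf[Id])   // generic: boxed instance
  }
}
```

This seems consistent with the behavior described above: a nested field can be handled as its underlying type, while a value that flows through a type parameter `T` (as in `Dataset[T]`) is an actual instance of the value class.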
This feature was addressed earlier in [SPARK-17368][3], but that
patch only supports the top-level case. Hence the error still occurs when a
value class is nested, as reported in [SPARK-19741][4] and [SPARK-20384][5].
[1]: https://docs.scala-lang.org/sips/value-classes.html
[2]: https://docs.scala-lang.org/sips/value-classes.html#example-1
[3]: https://issues.apache.org/jira/browse/SPARK-17368
[4]: https://issues.apache.org/jira/browse/SPARK-19741
[5]: https://issues.apache.org/jira/browse/SPARK-20384
## How was this patch tested?
I added unit tests for the top-level and nested cases.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mt40/spark dataset_value_class
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22309.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22309
----
commit 5613217771b1929b9f66106468fd2da2c3ea7dec
Author: minhthai <minhthai40@...>
Date: 2018-08-31T13:49:21Z
[SPARK-20384] Support value class in schema of Dataset
----