GitHub user mt40 opened a pull request:

    https://github.com/apache/spark/pull/22309

    [SPARK-20384][core] Support value class in schema of Dataset

    ## What changes were proposed in this pull request?
    This PR adds support for [Scala value classes][1] in the schema of Datasets, both as the top-level class and as a nested field.
    The idea is to treat a value class as its underlying type at run time. For example:
    ```scala
    case class Id(get: Int) extends AnyVal
    case class User(id: Id) // field `id` will be treated as Int
    ```
    However, if the value class is the top-level type (e.g. `Dataset[Id]`), it must be treated like a boxed type and actually instantiated. I'm not sure why it behaves this way, but I suspect it is related to the [expansion of value classes][2] that happens when casting (e.g. `asInstanceOf[T]`). The intended user-facing behaviour is sketched below.
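    
    For illustration, here is a minimal, self-contained sketch of how this looks from the user's side once the feature is in place (the object and application names are only for the example; the schema comment describes the intended behaviour rather than output quoted from the patch):
    ```scala
    import org.apache.spark.sql.SparkSession

    object ValueClassExample {
      // Value classes must be top-level or members of a statically accessible object.
      case class Id(get: Int) extends AnyVal
      case class User(id: Id)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("value-class-schema")
          .getOrCreate()
        import spark.implicits._

        // Nested field: `id` is encoded as its underlying Int, so the schema
        // should look the same as for `case class User(id: Int)`.
        val users = Seq(User(Id(1)), User(Id(2))).toDS()
        users.printSchema()

        spark.stop()
      }
    }
    ```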
    
    This feature was previously addressed in [SPARK-17368][3], but that patch only supports the top-level case. Hence we still see errors when a value class is nested, as reported in [SPARK-19741][4] and [SPARK-20384][5].
    
    [1]: https://docs.scala-lang.org/sips/value-classes.html
    [2]: https://docs.scala-lang.org/sips/value-classes.html#example-1
    [3]: https://issues.apache.org/jira/browse/SPARK-17368
    [4]: https://issues.apache.org/jira/browse/SPARK-19741
    [5]: https://issues.apache.org/jira/browse/SPARK-20384
    
    ## How was this patch tested?
    I added unit tests for both the top-level and nested cases.
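    
    Reusing `Id`, `User`, and the `SparkSession` from the sketch above, a simplified sketch of the kind of round-trip check these tests exercise (not the actual test code in this patch) is:
    ```scala
    import spark.implicits._

    // Nested case: the value class field should survive an encode/decode round trip.
    val nested = Seq(User(Id(1)), User(Id(2))).toDS()
    assert(nested.collect().toSeq == Seq(User(Id(1)), User(Id(2))))

    // Top-level case: Dataset[Id] is treated like a boxed type and should also round-trip.
    val topLevel = Seq(Id(1), Id(2)).toDS()
    assert(topLevel.collect().toSeq == Seq(Id(1), Id(2)))
    ```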


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mt40/spark dataset_value_class

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22309.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22309
    
----
commit 5613217771b1929b9f66106468fd2da2c3ea7dec
Author: minhthai <minhthai40@...>
Date:   2018-08-31T13:49:21Z

    [SPARK-20384] Support value class in schema of Dataset

----


---
