These common UDTs can always be wrapped in libraries and published to spark-packages (http://spark-packages.org/) :-)

Cheng

On 4/12/15 3:00 PM, Justin Yip wrote:
Cheng, this is great info. I have a follow-up question. There are a few very common data types (e.g. Joda DateTime) that are not directly supported by Spark SQL. Do you know if there are any plans for accommodating some common data types in Spark SQL? They don't need to be first-class data types, but if they were available as UDTs provided by the Spark SQL library, that would make DataFrame users' lives easier.

Justin

On Sat, Apr 11, 2015 at 5:41 AM, Cheng Lian <lian.cs....@gmail.com> wrote:

    One possible approach is to define a UDT (user-defined type)
    for Joda time. A UDT maps an arbitrary type to and from Spark SQL
    data types. You may check ExamplePointUDT [1] for more details.

    [1]: https://github.com/apache/spark/blob/694aef0d71d2683eaf63cbd1d8e95c2da423b72e/sql/core/src/main/scala/org/apache/spark/sql/test/ExamplePointUDT.scala
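
    For illustration, a rough sketch of such a UDT, modeled on
    ExamplePointUDT against the Spark 1.3-era API. All class names here
    are made up, and since Joda's DateTime can't carry the
    @SQLUserDefinedType annotation itself, this sketch wraps it:

        import org.apache.spark.sql.types._
        import org.joda.time.DateTime

        // Hypothetical wrapper: the annotation must live on the user
        // class, and we can't add it to Joda's DateTime directly.
        @SQLUserDefinedType(udt = classOf[JodaDateTimeUDT])
        class JodaDateTimeWrapper(val dt: DateTime) extends Serializable

        class JodaDateTimeUDT extends UserDefinedType[JodaDateTimeWrapper] {
          // Store the instant as epoch milliseconds in a LongType column.
          override def sqlType: DataType = LongType

          override def serialize(obj: Any): Any = obj match {
            case w: JodaDateTimeWrapper => w.dt.getMillis
          }

          override def deserialize(datum: Any): JodaDateTimeWrapper = datum match {
            case millis: Long => new JodaDateTimeWrapper(new DateTime(millis))
          }

          override def userClass: Class[JodaDateTimeWrapper] =
            classOf[JodaDateTimeWrapper]
        }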



    On 4/8/15 6:09 AM, adamgerst wrote:

        I've been using Joda Time in all my Spark jobs (via the
        nscala-time package) and had not run into any issues until I
        started trying to use Spark SQL. When I try to convert a case
        class that has a com.github.nscala_time.time.Imports.DateTime
        field in it, an exception is thrown with a MatchError.

        My assumption is that this is because the only date/time types
        Spark SQL supports are java.sql.Timestamp and java.sql.Date, and
        therefore Spark doesn't know what to do with the DateTime value.

        How can I get around this? I would prefer not to have to change
        my code to make the values be Timestamps, but I'm concerned that
        might be the only way. Would something like implicit conversions
        work here?
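
        For concreteness, the conversion I'm trying to avoid would look
        roughly like this (the Event class, events RDD, and sqlContext
        names are made up):

            import java.sql.Timestamp
            import org.joda.time.DateTime

            // Original case class with a Joda DateTime field.
            case class Event(id: Long, when: DateTime)
            // Mirror case class using a type Spark SQL handles natively.
            case class EventRow(id: Long, when: Timestamp)

            // Convert at the boundary before creating the DataFrame;
            // DateTime.getMillis returns epoch milliseconds.
            val rows = events.map(e =>
              EventRow(e.id, new Timestamp(e.when.getMillis)))
            val df = sqlContext.createDataFrame(rows)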

        It seems that even if I specify the schema manually I would
        still have the issue, since you have to specify the column type,
        which has to be an org.apache.spark.sql.types.DataType.
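
        For example (again with made-up names), a manual schema still
        forces the same conversion, because the Row values have to be
        Spark SQL-compatible:

            import java.sql.Timestamp
            import org.apache.spark.sql.Row
            import org.apache.spark.sql.types._

            val schema = StructType(Seq(
              StructField("id", LongType, nullable = false),
              StructField("when", TimestampType, nullable = false)))

            // Rows must hold Spark SQL-compatible values, so the Joda
            // DateTime still has to become a java.sql.Timestamp here.
            val rowRDD = events.map(e =>
              Row(e.id, new Timestamp(e.when.getMillis)))
            val df = sqlContext.createDataFrame(rowRDD, schema)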



