Hi Daoyuan,

I am curious whether there is a more straightforward way of manipulating
Joda DateTime. In my work, I have lots of case classes like:

case class Event(
  t: DateTime,
  a: Int,
  b: Double)

Spark SQL can infer the schema by reflecting on the case class. It would be
very helpful if there were a way to tell sqlContext how to handle a custom
class (e.g. DateTime) via a conversion, so that I can keep using this case
class.
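
For now I work around it by mapping to a parallel case class that uses
java.sql.Timestamp. A minimal sketch, assuming Spark 1.3's toDF() and an
existing SQLContext in scope (EventRow is just a hypothetical helper name):

import java.sql.Timestamp
import org.joda.time.DateTime
import sqlContext.implicits._  // assumes an existing SQLContext named sqlContext

case class EventRow(t: Timestamp, a: Int, b: Double)

val events = sc.parallelize(Seq(Event(DateTime.now(), 1, 2.0)))
// Convert each Joda DateTime to a java.sql.Timestamp, then let Spark SQL
// infer the schema from EventRow by reflection.
val df = events
  .map(e => EventRow(new Timestamp(e.t.getMillis), e.a, e.b))
  .toDF()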

Thanks.

Justin

On Sun, Apr 12, 2015 at 9:20 AM, Wang, Daoyuan <daoyuan.w...@intel.com>
wrote:

>  Actually, I did a little investigation of Joda Time when I was working
> on SPARK-4987 for Timestamp ser-de in Parquet format. I think Joda natively
> offers interfaces for converting between Joda Time objects and Java objects.
>
>
>
> For example, to transform a java.util.Date (parent of java.sql.Date and
> java.sql.Timestamp) object named jd, in Java code you can use
>
> DateTime dt = new DateTime(jd);
>
> Or in scala code
>
> val dt: DateTime = new DateTime(jd)
>
>
>
> On the other hand, given a DateTime object named dt, you can use code like
>
> val jd: java.sql.Timestamp = new java.sql.Timestamp(dt.getMillis)
>
> to get the java object.
>
>
>
> Thanks,
>
> Daoyuan.
>
>
>
> *From:* Cheng Lian [mailto:lian.cs....@gmail.com]
> *Sent:* Sunday, April 12, 2015 11:51 PM
> *To:* Justin Yip
> *Cc:* adamgerst; user@spark.apache.org
> *Subject:* Re: How to use Joda Time with Spark SQL?
>
>
>
> These common UDTs can always be wrapped in libraries and published to
> Spark Packages (http://spark-packages.org/) :-)
>
> Cheng
>
> On 4/12/15 3:00 PM, Justin Yip wrote:
>
>  Cheng, this is great info. I have a follow-up question. There are a few
> very common data types (e.g. Joda DateTime) that are not directly supported
> by Spark SQL. Do you know if there are any plans to accommodate some
> common data types in Spark SQL? They don't need to be first-class
> datatypes, but if they were available as UDTs provided by the Spark SQL
> library, that would make DataFrame users' lives easier.
>
>
>
> Justin
>
>
>
> On Sat, Apr 11, 2015 at 5:41 AM, Cheng Lian <lian.cs....@gmail.com> wrote:
>
> One possible approach is to define a UDT (user-defined type) for Joda
> Time. A UDT maps an arbitrary type to and from Spark SQL data types. You
> may check ExamplePointUDT [1] for more details.
>
> [1]:
> https://github.com/apache/spark/blob/694aef0d71d2683eaf63cbd1d8e95c2da423b72e/sql/core/src/main/scala/org/apache/spark/sql/test/ExamplePointUDT.scala
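>
> For reference, here is a minimal sketch of what such a UDT could look like,
> modeled on ExamplePointUDT and storing the instant as epoch millis. The
> class name is made up, and the override signatures follow the 1.3-era
> developer API, which may change between releases:
>
> import org.apache.spark.sql.types._
> import org.joda.time.DateTime
>
> class JodaDateTimeUDT extends UserDefinedType[DateTime] {
>   // Physically store the instant as epoch millis in a LongType column.
>   // Note: only the instant survives a round trip; the time zone is lost.
>   override def sqlType: DataType = LongType
>
>   override def serialize(obj: Any): Any = obj match {
>     case dt: DateTime => dt.getMillis
>   }
>
>   override def deserialize(datum: Any): DateTime = datum match {
>     case millis: Long => new DateTime(millis)
>   }
>
>   override def userClass: Class[DateTime] = classOf[DateTime]
> }
>
> One caveat: UDTs are attached via the @SQLUserDefinedType annotation on the
> user class, which you cannot add to a third-party class like DateTime, so a
> small wrapper class may be needed in practice.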
>
>
>
> On 4/8/15 6:09 AM, adamgerst wrote:
>
> I've been using Joda Time in all my Spark jobs (via the nscala-time
> package) and had not run into any issues until I started trying to use
> Spark SQL.  When I try to convert a case class that has a
> com.github.nscala_time.time.Imports.DateTime field in it, an exception is
> thrown with a MatchError.
>
> My assumption is that this is because the basic types of Spark SQL are
> java.sql.Timestamp and java.sql.Date, and therefore Spark doesn't know
> what to do with the DateTime value.
>
> How can I get around this? I would prefer not to have to change my code
> to make the values Timestamps, but I'm concerned that might be the only
> way. Would something like implicit conversions work here?
>
> It seems that even if I specify the schema manually, I would still have
> the issue, since you have to specify each column's type, which has to be
> an org.apache.spark.sql.types.DataType.
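>
> For concreteness, the kind of manual conversion I'm trying to avoid would
> look something like this sketch, assuming a hypothetical case class
> Event(t: DateTime, a: Int, b: Double), an RDD[Event] named events, and
> Spark 1.3's sqlContext.createDataFrame:
>
> import java.sql.Timestamp
> import org.apache.spark.sql.Row
> import org.apache.spark.sql.types._
>
> val schema = StructType(Seq(
>   StructField("t", TimestampType, nullable = false),
>   StructField("a", IntegerType, nullable = false),
>   StructField("b", DoubleType, nullable = false)))
>
> // Convert each DateTime to a java.sql.Timestamp by hand while building
> // Rows, then apply the schema explicitly.
> val rows = events.map(e => Row(new Timestamp(e.t.getMillis), e.a, e.b))
> val df = sqlContext.createDataFrame(rows, schema)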
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-Joda-Time-with-Spark-SQL-tp22415.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
>