These common UDTs can always be wrapped in libraries and published to
spark-packages http://spark-packages.org/ :-)
Cheng
On 4/12/15 3:00 PM, Justin Yip wrote:
Cheng, this is great info. I have a follow-up question. There are a
few very common data types (e.g. Joda DateTime) that are not directly
supported by Spark SQL. Do you know if there are any plans for
accommodating some of these common data types in Spark SQL? They don't
need to be first-class data types, but if they were available as UDTs
provided by the Spark SQL library, that would make DataFrame users'
lives easier.
Justin
On Sat, Apr 11, 2015 at 5:41 AM, Cheng Lian <lian.cs....@gmail.com> wrote:
One possible approach is to define a UDT (user-defined type)
for Joda time. A UDT maps an arbitrary type to and from Spark SQL
data types. You may check ExamplePointUDT [1] for more details.
[1]:
https://github.com/apache/spark/blob/694aef0d71d2683eaf63cbd1d8e95c2da423b72e/sql/core/src/main/scala/org/apache/spark/sql/test/ExamplePointUDT.scala
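For illustration, here is a minimal sketch of such a UDT against the
Spark 1.3 API, storing the value as epoch milliseconds in a LongType
column. The JodaDateTime wrapper and JodaDateTimeUDT names are
hypothetical; the @SQLUserDefinedType annotation has to go on a class
you own, so the Joda DateTime is wrapped rather than annotated directly:

import org.apache.spark.sql.types._
import org.joda.time.DateTime

// Hypothetical wrapper: @SQLUserDefinedType must annotate a class you
// control, so the Joda DateTime is wrapped rather than annotated directly.
@SQLUserDefinedType(udt = classOf[JodaDateTimeUDT])
case class JodaDateTime(underlying: DateTime)

class JodaDateTimeUDT extends UserDefinedType[JodaDateTime] {

  // Store the value as epoch milliseconds in a plain LongType column.
  override def sqlType: DataType = LongType

  override def serialize(obj: Any): Any = obj match {
    case JodaDateTime(dt) => dt.getMillis
  }

  override def deserialize(datum: Any): JodaDateTime = datum match {
    case millis: Long => JodaDateTime(new DateTime(millis))
  }

  override def userClass: Class[JodaDateTime] = classOf[JodaDateTime]
}

Note that UserDefinedType is a developer API, so the exact method
signatures may differ in other Spark versions.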
On 4/8/15 6:09 AM, adamgerst wrote:
I've been using Joda Time in all my Spark jobs (via the nscala-time
package) and had not run into any issues until I started trying to use
Spark SQL. When I try to convert a case class that has a
com.github.nscala_time.time.Imports.DateTime field in it, a MatchError
is thrown.

My assumption is that this is because the basic date/time types of
Spark SQL are java.sql.Timestamp and java.sql.Date, and therefore Spark
doesn't know what to do with the DateTime value.

How can I get around this? I would prefer not to have to change my code
to make the values Timestamps, but I'm concerned that might be the only
way. Would something like implicit conversions work here?

It seems that even if I specify the schema manually I would still have
the issue, since you have to specify the column type, which has to be of
type org.apache.spark.sql.types.DataType.
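To illustrate the Timestamp route mentioned above, here is a minimal
sketch, assuming Spark 1.3; the Event/EventRow case classes and the
implicit helper are hypothetical names, but java.sql.Timestamp itself
maps directly to Spark SQL's TimestampType:

import java.sql.Timestamp
import org.joda.time.DateTime

object JodaTimestampConversion {

  // Hypothetical case classes: Event carries the Joda value used elsewhere
  // in the job, EventRow mirrors it with java.sql.Timestamp for Spark SQL.
  case class Event(name: String, createdAt: DateTime)
  case class EventRow(name: String, createdAt: Timestamp)

  // Implicit enrichment so the conversion reads naturally at the call site.
  implicit class RichDateTime(val dt: DateTime) extends AnyVal {
    def toTimestamp: Timestamp = new Timestamp(dt.getMillis)
  }

  // Convert just before handing the data to Spark SQL, e.g.:
  //   val df = sqlContext.createDataFrame(
  //     events.map(e => EventRow(e.name, e.createdAt.toTimestamp)))
}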
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-Joda-Time-with-Spark-SQL-tp22415.html
Sent from the Apache Spark User List mailing list archive at
Nabble.com.