[ 
https://issues.apache.org/jira/browse/SPARK-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229506#comment-15229506
 ] 

Hyukjin Kwon commented on SPARK-14428:
--------------------------------------

I can work on this if it is decided that this should be supported. (I am
already working on the CSV equivalent: https://github.com/apache/spark/pull/11550)

> [SQL] Allow more flexibility when parsing dates and timestamps in json 
> datasources
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-14428
>                 URL: https://issues.apache.org/jira/browse/SPARK-14428
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Michel Lemay
>            Priority: Minor
>              Labels: date, features, json, timestamp
>
> Reading JSON with dates and timestamps is limited to predetermined string 
> formats or long values.
> 1) It should be possible to set an option on the JSON data source to parse 
> dates and timestamps using a custom string format.
> 2) It should be possible to change how long values since the epoch are 
> interpreted. It could support different precisions such as days, seconds, 
> milliseconds, microseconds and nanoseconds.
> Something along the lines of:
> {code}
> object Precision extends Enumeration {
>   val days, seconds, milliseconds, microseconds, nanoseconds = Value
> }
> def convertWithPrecision(
>     time: Long, from: Precision.Value, to: Precision.Value): Long = ...
> ...
>   val dateFormat = parameters.getOrElse("dateFormat", "").trim
>   val timestampFormat = parameters.getOrElse("timestampFormat", "").trim
>   // Parse the option strings into Precision values so that they can be
>   // passed to convertWithPrecision.
>   val longDatePrecision =
>     Precision.withName(parameters.getOrElse("longDatePrecision", "days"))
>   val longTimestampPrecision = Precision.withName(
>     parameters.getOrElse("longTimestampPrecision", "milliseconds"))
> {code}
> and 
> {code}
>       case (VALUE_STRING, DateType) =>
>         val stringValue = parser.getText
>         val days = if (configOptions.dateFormat.nonEmpty) {
>           // User-defined format; make sure the result complies with the
>           // SQL DATE format (number of days).
>           // SimpleDateFormat is not thread safe.
>           val sdf = new SimpleDateFormat(configOptions.dateFormat)
>           DateTimeUtils.convertWithPrecision(
>             sdf.parse(stringValue).getTime, Precision.milliseconds, Precision.days)
>         } else if (stringValue.forall(_.isDigit)) {
>           DateTimeUtils.convertWithPrecision(
>             stringValue.toLong, configOptions.longDatePrecision, Precision.days)
>         } else {
>           // The format of this string will probably be "yyyy-mm-dd".
>           DateTimeUtils.convertWithPrecision(
>             DateTimeUtils.stringToTime(stringValue).getTime,
>             Precision.milliseconds, Precision.days)
>         }
>         days.toInt
>       case (VALUE_NUMBER_INT, DateType) =>
>         DateTimeUtils.convertWithPrecision(
>           parser.getLongValue, configOptions.longDatePrecision, Precision.days).toInt
> {code}
> With similar handling for Timestamps.
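
The proposal leaves the body of convertWithPrecision elided. One possible way to fill it in is sketched below; this is an illustration only, not code from Spark. The PrecisionSketch object, the nanosPer table and the parseDateToDays helper are all hypothetical names. Two assumptions are made: pre-epoch values should round toward negative infinity when converting to a coarser unit, and overflow when converting to a finer unit should raise an error rather than wrap silently. The date helper swaps in java.time.DateTimeFormatter in place of SimpleDateFormat, since the former is immutable and thread safe; note that it parses a calendar date directly and so does not go through a time zone, unlike the millisecond-based path in the proposal.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Same enumeration as in the proposal above.
object Precision extends Enumeration {
  val days, seconds, milliseconds, microseconds, nanoseconds = Value
}

// Hypothetical container object; not part of Spark.
object PrecisionSketch {
  // Number of nanoseconds in one unit of each precision.
  private val nanosPer: Map[Precision.Value, Long] = Map(
    Precision.days         -> 86400L * 1000000000L,
    Precision.seconds      -> 1000000000L,
    Precision.milliseconds -> 1000000L,
    Precision.microseconds -> 1000L,
    Precision.nanoseconds  -> 1L
  )

  def convertWithPrecision(
      time: Long, from: Precision.Value, to: Precision.Value): Long = {
    val f = nanosPer(from)
    val t = nanosPer(to)
    if (f >= t) {
      // Same or finer target unit: multiply, throwing ArithmeticException
      // on overflow instead of wrapping around.
      java.lang.Math.multiplyExact(time, f / t)
    } else {
      // Coarser target unit: floor division, so that values before the
      // epoch round toward negative infinity.
      java.lang.Math.floorDiv(time, t / f)
    }
  }

  // Parses a date string with a user-supplied pattern and returns the
  // number of days since 1970-01-01. DateTimeFormatter is thread safe.
  def parseDateToDays(text: String, pattern: String): Int =
    LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern)).toEpochDay.toInt
}
```

With this sketch, PrecisionSketch.convertWithPrecision(86400000L, Precision.milliseconds, Precision.days) yields 1, and PrecisionSketch.parseDateToDays("1970-01-02", "yyyy-MM-dd") also yields 1.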


