[ 
https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028709#comment-17028709
 ] 

Wenchen Fan commented on SPARK-30668:
-------------------------------------

I checked the doc in [Spark 
2.4|https://github.com/apache/spark/blob/branch-2.4/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2967],
 and it says the pattern string follows java.text.SimpleDateFormat, so I think 
this is a breaking change.

AFAIK we fixed several bugs by switching to 
java.time.format.DateTimeFormatter, so it should be OK to make this change in 
3.0. We can make the migration smoother by:
1. providing a legacy config to restore the old behavior
2. falling back to the old formatter when the new formatter fails to parse. 
This would at least fix the problem reported in this ticket.
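
To make the two options concrete, here is a minimal plain-Java sketch of the idea, not Spark's actual implementation. The class name `FallbackTimestampParser`, the method `parse`, and the `legacyOnly` flag (standing in for a legacy config) are hypothetical names used only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper: illustrates the proposal only, not Spark code.
final class FallbackTimestampParser {

    // legacyOnly stands in for option 1 (a legacy config restoring old behavior).
    // When it is false, option 2 applies: try the new formatter first and
    // fall back to the old one if the new one cannot parse the input.
    static Instant parse(String text, String pattern, boolean legacyOnly) {
        if (!legacyOnly) {
            try {
                DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.US);
                // Requires the pattern to carry zone or offset information.
                return ZonedDateTime.parse(text, fmt).toInstant();
            } catch (DateTimeParseException | IllegalArgumentException e) {
                // The new formatter rejected the input or the pattern;
                // continue with the legacy formatter below.
            }
        }
        try {
            // SimpleDateFormat is not thread-safe, so create one per call.
            SimpleDateFormat legacy = new SimpleDateFormat(pattern, Locale.US);
            return legacy.parse(text).toInstant();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "'", e);
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd'T'HH:mm:ss.SSSz";
        String input = "2020-01-27T20:06:11.847-0800";
        System.out.println(parse(input, pattern, false));
    }
}
```

One caveat with this approach: SimpleDateFormat is lenient by default and accepts input with trailing text, so a silent fallback can also hide inputs that the new formatter rejects on purpose.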

Thoughts?

> to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz"
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30668
>                 URL: https://issues.apache.org/jira/browse/SPARK-30668
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiao Li
>            Priority: Blocker
>
> {code:java}
> SELECT to_timestamp("2020-01-27T20:06:11.847-0800", 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz")
> {code}
> This returns a valid value in Spark 2.4 but returns NULL on the latest 
> master.
> **2.4.5 RC2**
> {code}
> scala> sql("""SELECT to_timestamp("2020-01-27T20:06:11.847-0800", 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz")""").show
> +----------------------------------------------------------------------------+
> |to_timestamp('2020-01-27T20:06:11.847-0800', 'yyyy-MM-dd\'T\'HH:mm:ss.SSSz')|
> +----------------------------------------------------------------------------+
> |                                                         2020-01-27 20:06:11|
> +----------------------------------------------------------------------------+
> {code}
> **2.2.3 ~ 2.4.4** (2.0.2 ~ 2.1.3 don't have `to_timestamp`).
> {code}
> spark-sql> SELECT to_timestamp("2020-01-27T20:06:11.847-0800", 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz");
> 2020-01-27 20:06:11
> {code}


