[
https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025549#comment-17025549
]
Maxim Gekk commented on SPARK-30668:
------------------------------------
Date/timestamp parsing in Spark 3.0 is based on Java 8's DateTimeFormatter,
which may interpret pattern letters differently (see
[https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]):
{code}
Symbol  Meaning                     Presentation  Examples
------  -------                     ------------  --------
V       time-zone ID                zone-id       America/Los_Angeles; Z; -08:30
z       time-zone name              zone-name     Pacific Standard Time; PST
O       localized zone-offset       offset-O      GMT+8; GMT+08:00; UTC-08:00;
X       zone-offset 'Z' for zero    offset-X      Z; -08; -0830; -08:30; -083015; -08:30:15;
x       zone-offset                 offset-x      +0000; -08; -0830; -08:30; -083015; -08:30:15;
Z       zone-offset                 offset-Z      +0000; -0800; -08:00;
{code}
As you can see, 'z' is for a time zone name, but you are trying to parse a zone
offset. You can use 'x' or 'Z' in the pattern instead of 'z':
{code}
scala> spark.sql("""SELECT to_timestamp("2020-01-27T20:06:11.847-0800",
"yyyy-MM-dd'T'HH:mm:ss.SSSZ")""").show(false)
+----------------------------------------------------------------------------+
|to_timestamp('2020-01-27T20:06:11.847-0800', 'yyyy-MM-dd\'T\'HH:mm:ss.SSSZ')|
+----------------------------------------------------------------------------+
|2020-01-28 07:06:11.847                                                     |
+----------------------------------------------------------------------------+
{code}
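The same behaviour can be reproduced outside Spark with plain java.time. The sketch below is only an illustration: the class name ZonePatternDemo and the helper parseInstant are made up for this example, and Spark's own formatter setup may differ in details such as locale and resolver style. It assumes that a pattern mismatch should map to null, mirroring the NULL that Spark returns.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper class, not part of Spark.
final class ZonePatternDemo {

    // Parses the input with a java.time pattern; returns null on a mismatch.
    public static Instant parseInstant(String pattern, String input) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.US);
        try {
            return ZonedDateTime.parse(input, formatter).toInstant();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String input = "2020-01-27T20:06:11.847-0800";
        // 'z' expects a zone name, so the bare offset -0800 is not accepted.
        System.out.println(parseInstant("yyyy-MM-dd'T'HH:mm:ss.SSSz", input));
        // 'Z' expects a zone offset such as -0800.
        System.out.println(parseInstant("yyyy-MM-dd'T'HH:mm:ss.SSSZ", input));
    }
}
```

The timestamp printed by Spark above is rendered in the session time zone, so it can differ from the UTC instant that this sketch prints.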
Parsing in Spark 2.4 is based on SimpleDateFormat (see
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
where 'z' has a slightly different meaning: it denotes a general time zone,
and when parsing it also accepts RFC 822 offsets such as -0800.
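For comparison, here is a minimal sketch of the legacy behaviour using SimpleDateFormat directly. The class name LegacyZonePatternDemo and the helper parseInstant are hypothetical, and Spark 2.4 wraps this API in its own code, so treat this as an approximation of what happens there rather than the actual implementation.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Locale;

// Hypothetical helper class, not part of Spark.
final class LegacyZonePatternDemo {

    // Parses the input with a SimpleDateFormat pattern; returns null on a mismatch.
    public static Instant parseInstant(String pattern, String input) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        try {
            return format.parse(input).toInstant();
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // With SimpleDateFormat, 'z' also accepts an RFC 822 offset when parsing.
        System.out.println(
            parseInstant("yyyy-MM-dd'T'HH:mm:ss.SSSz", "2020-01-27T20:06:11.847-0800"));
    }
}
```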
> to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern
> "yyyy-MM-dd'T'HH:mm:ss.SSSz"
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-30668
> URL: https://issues.apache.org/jira/browse/SPARK-30668
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Xiao Li
> Priority: Blocker
>
> {code:java}
> SELECT to_timestamp("2020-01-27T20:06:11.847-0800",
> "yyyy-MM-dd'T'HH:mm:ss.SSSz")
> {code}
> This returns a valid value in Spark 2.4 but returns NULL on the latest
> master.