[
https://issues.apache.org/jira/browse/SPARK-17914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750292#comment-16750292
]
Chaitanya P Chandurkar commented on SPARK-17914:
------------------------------------------------
I'm still seeing this issue in Spark 2.4.0 when using the from_json() function.
For datetimes in ISO 8601 Zulu format, the timestamp is not parsed accurately
once the fractional part has more than 3 digits: every digit after the 3rd adds
extra seconds to the parsed datetime. For example, the datetime
"2019-01-23T17:50:29.9991Z", when parsed using Spark's built-in from_json()
function, results in "2019-01-23T17:50:38.991+0000" (note the number of seconds
added).
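
The numbers in this example are consistent with the digits after the decimal
point being read as a whole count of milliseconds (9991 ms = 9.991 s, and
29 s + 9.991 s = 38.991 s). That reading is an assumption drawn from the
arithmetic alone, not a confirmed description of what from_json() does
internally. A minimal Python sketch of it, where parse_fraction_as_millis is a
hypothetical helper written only for illustration:

```python
from datetime import datetime, timedelta, timezone


def parse_fraction_as_millis(text: str) -> datetime:
    """Hypothetical helper: treat the fractional digits as a millisecond count.

    This only mimics the suspected behaviour; it is not Spark code.
    """
    base, fraction = text.rstrip("Z").split(".")
    whole = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # "9991" becomes 9991 ms, which overflows into the seconds field.
    return whole + timedelta(milliseconds=int(fraction))


parsed = parse_fraction_as_millis("2019-01-23T17:50:29.9991Z")
print(parsed.isoformat(timespec="milliseconds"))  # 2019-01-23T17:50:38.991+00:00
```

With exactly 3 fractional digits the same helper leaves the seconds untouched,
which matches the report that the drift only appears from the 4th digit on.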
> Spark SQL casting to TimestampType with nanosecond results in incorrect
> timestamp
> ---------------------------------------------------------------------------------
>
> Key: SPARK-17914
> URL: https://issues.apache.org/jira/browse/SPARK-17914
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1
> Reporter: Oksana Romankova
> Assignee: Anton Okolnychyi
> Priority: Major
> Fix For: 2.2.0, 2.3.0
>
>
> In some cases when timestamps contain nanoseconds they will be parsed
> incorrectly.
> Examples:
> "2016-05-14T15:12:14.0034567Z" -> "2016-05-14 15:12:14.034567"
> "2016-05-14T15:12:14.000345678Z" -> "2016-05-14 15:12:14.345678"
> The issue seems to be in DateTimeUtils.stringToTimestamp(), which assumes
> that the fraction of a second has at most 6 digits.
> Given that, I would suggest either discarding the nanosecond digits
> automatically or throwing an exception that prompts the caller to pre-format
> timestamps to microsecond precision before casting to Timestamp.
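
The pre-formatting suggested in the description can be sketched outside Spark
as a plain string cleanup that keeps at most 6 fractional digits. This is only
an illustration of the idea, not the fix that was merged; truncate_to_micros is
a hypothetical helper, and it truncates rather than rounds.

```python
import re
from datetime import datetime

# Matches the fractional-second digits that follow the seconds field.
_FRACTION = re.compile(r"\.(\d+)")


def truncate_to_micros(text: str) -> str:
    """Hypothetical helper: keep at most 6 fractional digits, drop the rest."""
    return _FRACTION.sub(lambda m: "." + m.group(1)[:6], text, count=1)


for raw in ("2016-05-14T15:12:14.0034567Z", "2016-05-14T15:12:14.000345678Z"):
    cleaned = truncate_to_micros(raw)
    # %f accepts 1 to 6 fractional digits, so the cleaned value parses directly.
    parsed = datetime.strptime(cleaned, "%Y-%m-%dT%H:%M:%S.%fZ")
    print(raw, "->", parsed)
```

For the two inputs from the description this gives fractions of .003456 and
.000345 instead of the reported .034567 and .345678.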