[
https://issues.apache.org/jira/browse/SPARK-30767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-30767.
----------------------------------
Resolution: Not A Problem
> from_json changes times of timestamps by several minutes without error
> -----------------------------------------------------------------------
>
> Key: SPARK-30767
> URL: https://issues.apache.org/jira/browse/SPARK-30767
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.4
> Environment: We ran the example code with Spark 2.4.4 via Azure
> Databricks with Databricks Runtime version 6.3 within an interactive cluster.
> We encountered the issue first on a Job Cluster running a streaming
> application on Databricks Runtime Version 5.4.
> Reporter: Benedikt Maria Beckermann
> Priority: Major
>
> When a JSON text column contains a timestamp formatted like
> {{2020-01-25T06:39:45.887429Z}} (six fractional-second digits), the function
> {{from_json(Column,StructType)}} parses it as a timestamp, but the resulting
> value is shifted by several minutes.
> Spark does not throw any error and continues to run with the incorrect
> timestamp.
> The following Scala snippet reproduces the issue.
>
> {code:scala}
> import org.apache.spark.sql._
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.types._
> val df = Seq("""{"time":"2020-01-25T06:39:45.887429Z"}""").toDF("json")
> val struct = new StructType().add("time", TimestampType, nullable = true)
> val timeDF = df
>   .withColumn("time (string)", get_json_object(col("json"), "$.time"))
>   .withColumn("time casted directly (CORRECT)", col("time (string)").cast(TimestampType))
>   .withColumn("time casted via struct (INVALID)", from_json(col("json"), struct))
> display(timeDF)
> {code}
> Output:
> ||json||time (string)||time casted directly (CORRECT)||time casted via struct (INVALID)||
> |{"time":"2020-01-25T06:39:45.887429Z"}|2020-01-25T06:39:45.887429Z|2020-01-25T06:39:45.887+0000|{"time":"2020-01-25T06:54:32.429+0000"}|
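
The size of the shift in the output above is consistent with the six fractional digits ({{887429}}) being read as a count of milliseconds rather than microseconds. This is an assumption inferred from the numbers alone; the thread does not state the cause. The sketch below uses only {{java.time}} (no Spark) to reproduce the arithmetic, and the object and value names are hypothetical.

```scala
import java.time.{Duration, Instant}

object FractionAsMillis {
  // Timestamp from the report, split into whole seconds and fractional digits.
  val wholeSeconds: Instant = Instant.parse("2020-01-25T06:39:45Z")
  val fractionDigits: String = "887429"

  // Intended reading: six fractional digits are microseconds.
  val asMicros: Instant = wholeSeconds.plusNanos(fractionDigits.toLong * 1000L)

  // Assumed faulty reading: the same digits taken as a millisecond count,
  // i.e. 887429 ms = 887.429 s, which carries over into the minutes field.
  val asMillis: Instant = wholeSeconds.plusMillis(fractionDigits.toLong)

  def main(args: Array[String]): Unit = {
    println(asMicros) // 2020-01-25T06:39:45.887429Z
    println(asMillis) // 2020-01-25T06:54:32.429Z
    println(Duration.between(asMicros, asMillis)) // size of the shift
  }
}
```

Under that assumption the second value matches the {{06:54:32.429}} shown in the "time casted via struct (INVALID)" column, while the direct cast keeps {{06:39:45.887}}.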