Maxim Gekk created SPARK-31183:
----------------------------------
Summary: Incompatible Avro dates/timestamps with Spark 2.4
Key: SPARK-31183
URL: https://issues.apache.org/jira/browse/SPARK-31183
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.0.0
Reporter: Maxim Gekk
Write dates/timestamps to Avro files in Spark 2.4.5:
{code}
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.5
{code}
{code:scala}
scala> df.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_date_avro")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-01|
+----------+
scala> df2.write.format("avro").save("/Users/maxim/tmp/before_1582/2_4_5_ts_avro")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-01 01:02:03.123456|
+--------------------------+
{code}
Spark 3.0.0-preview2 (and 3.1.0-SNAPSHOT) reads different values from the same files than Spark 2.4.5 does:
{code}
$ export TZ="America/Los_Angeles"
$ bin/spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.5
{code}
{code:scala}
scala> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_date_avro").show(false)
+----------+
|date      |
+----------+
|1001-01-07|
+----------+
scala> spark.read.format("avro").load("/Users/maxim/tmp/before_1582/2_4_5_ts_avro").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-07 01:09:05.123456|
+--------------------------+
{code}
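A possible explanation, not stated in this report: the 6-day shift in the date column is what one gets when a day count produced under the hybrid Julian/Gregorian calendar of java.util.GregorianCalendar is decoded under the proleptic Gregorian calendar of java.time. The sketch below only illustrates that arithmetic with plain JDK classes. It assumes, without checking the Spark code paths, that 2.4.5 stores the Avro date as days since the epoch counted in the hybrid calendar and that 3.0 interprets the same number in the proleptic calendar. The object and method names are made up for the illustration and are not part of Spark.

```scala
import java.time.LocalDate
import java.util.{GregorianCalendar, TimeZone}

// Hypothetical helper for illustration only; not part of Spark.
object CalendarShift {
  private val MillisPerDay = 86400000L

  // Days since 1970-01-01 for a local date interpreted in the hybrid
  // Julian/Gregorian calendar (Julian rules before 1582-10-15).
  def hybridEpochDay(year: Int, month: Int, day: Int): Long = {
    val cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    cal.clear()
    cal.set(year, month - 1, day) // Calendar months are 0-based
    java.lang.Math.floorDiv(cal.getTimeInMillis, MillisPerDay)
  }

  // Decode the same day count in the proleptic Gregorian calendar.
  def asProleptic(year: Int, month: Int, day: Int): LocalDate =
    LocalDate.ofEpochDay(hybridEpochDay(year, month, day))

  def main(args: Array[String]): Unit = {
    println(asProleptic(1001, 1, 1)) // 1001-01-07, same as the 3.0 output above
    println(asProleptic(2020, 1, 1)) // 2020-01-01, modern dates are unaffected
  }
}
```

If this reading is right, only values before the 1582 Gregorian cutover would come back shifted, which fits the before_1582 directory name used in the reproduction.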