This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new b6e8f64 [SPARK-31284][SQL][TESTS] Check rebasing of timestamps in ORC datasource
b6e8f64 is described below
commit b6e8f64d49caf1f0a1f1b910d603e8e000270d01
Author: Maxim Gekk <[email protected]>
AuthorDate: Fri Mar 27 09:06:59 2020 -0700
[SPARK-31284][SQL][TESTS] Check rebasing of timestamps in ORC datasource
### What changes were proposed in this pull request?
This PR adds 2 tests to check that timestamps are rebased correctly between
the hybrid calendar (Julian + Gregorian) and the Proleptic Gregorian calendar,
in both directions.
1. The test `compatibility with Spark 2.4 in reading timestamps` loads an ORC
file saved by Spark 2.4.5 via:
```shell
$ export TZ="America/Los_Angeles"
```
```scala
scala> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
scala> val df = Seq("1001-01-01 01:02:03.123456").toDF("tsS").select($"tsS".cast("timestamp").as("ts"))
df: org.apache.spark.sql.DataFrame = [ts: timestamp]
scala> df.write.orc("/Users/maxim/tmp/before_1582/2_4_5_ts_orc")
scala> spark.read.orc("/Users/maxim/tmp/before_1582/2_4_5_ts_orc").show(false)
+--------------------------+
|ts                        |
+--------------------------+
|1001-01-01 01:02:03.123456|
+--------------------------+
```
2. The test `rebasing timestamps in write` is a round-trip test. Since the
previous test confirms that timestamps are rebased correctly on read, this test
should pass only if rebasing also works correctly on write.
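
For background on what "rebasing" means here, the sketch below (not part of this patch) uses only JDK classes to compare the day number that the same calendar label gets in the two calendars. The names `CalendarDiff`, `hybridEpochDay` and `prolepticEpochDay` are illustrative, and the sketch assumes that `java.util.GregorianCalendar` with its default 1582-10-15 cutover is a fair stand-in for the hybrid calendar:

```scala
import java.time.LocalDate
import java.util.{GregorianCalendar, TimeZone}

object CalendarDiff {
  private val utc = TimeZone.getTimeZone("UTC")
  private val millisPerDay = 86400000L

  // Days since 1970-01-01 when (y, m, d) is interpreted in the hybrid
  // calendar: Julian before the 1582-10-15 cutover, Gregorian afterwards.
  def hybridEpochDay(y: Int, m: Int, d: Int): Long = {
    val cal = new GregorianCalendar(utc)
    cal.clear()            // reset all fields, including time of day
    cal.set(y, m - 1, d)   // Calendar months are 0-based
    Math.floorDiv(cal.getTimeInMillis, millisPerDay)
  }

  // Days since 1970-01-01 when (y, m, d) is interpreted in the
  // Proleptic Gregorian calendar, which is what java.time uses.
  def prolepticEpochDay(y: Int, m: Int, d: Int): Long =
    LocalDate.of(y, m, d).toEpochDay

  def main(args: Array[String]): Unit = {
    val diff = hybridEpochDay(1001, 1, 1) - prolepticEpochDay(1001, 1, 1)
    println(s"1001-01-01: hybrid - proleptic = $diff day(s)")
  }
}
```

For the label `1001-01-01` the two day numbers differ by 6, while for dates on or after 1582-10-15 they coincide. Under these assumptions, a value stored as a hybrid-calendar day or microsecond count has to be shifted before the Proleptic Gregorian side shows the same label, which is the behavior the two tests exercise end to end.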
### Why are the changes needed?
To guarantee that rebasing works correctly for timestamps in the ORC datasource.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
By running `OrcSourceSuite` for Hive 1.2 and 2.3 via the commands:
```
$ build/sbt -Phive-2.3 "test:testOnly *OrcSourceSuite"
```
and
```
$ build/sbt -Phive-1.2 "test:testOnly *OrcSourceSuite"
```
Closes #28047 from MaxGekk/rebase-ts-orc-test.
Authored-by: Maxim Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit fc2a974e030c82bf500a81c3908f853c3eeb761d)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../test-data/before_1582_ts_v2_4.snappy.orc | Bin 0 -> 251 bytes
.../execution/datasources/orc/OrcSourceSuite.scala | 28 +++++++++++++++++++++
2 files changed, 28 insertions(+)
diff --git a/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc b/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc
new file mode 100644
index 0000000..af9ef04
Binary files /dev/null and b/sql/core/src/test/resources/test-data/before_1582_ts_v2_4.snappy.orc differ
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
index b5e002f..0b7500c 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
@@ -508,6 +508,34 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
       }
     }
   }
+
+  test("SPARK-31284: compatibility with Spark 2.4 in reading timestamps") {
+    Seq(false, true).foreach { vectorized =>
+      withSQLConf(SQLConf.ORC_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
+        checkAnswer(
+          readResourceOrcFile("test-data/before_1582_ts_v2_4.snappy.orc"),
+          Row(java.sql.Timestamp.valueOf("1001-01-01 01:02:03.123456")))
+      }
+    }
+  }
+
+  test("SPARK-31284: rebasing timestamps in write") {
+    withTempPath { dir =>
+      val path = dir.getAbsolutePath
+      Seq("1001-01-01 01:02:03.123456").toDF("tsS")
+        .select($"tsS".cast("timestamp").as("ts"))
+        .write
+        .orc(path)
+
+      Seq(false, true).foreach { vectorized =>
+        withSQLConf(SQLConf.ORC_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
+          checkAnswer(
+            spark.read.orc(path),
+            Row(java.sql.Timestamp.valueOf("1001-01-01 01:02:03.123456")))
+        }
+      }
+    }
+  }
 }

 class OrcSourceSuite extends OrcSuite with SharedSparkSession {