Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/21741#discussion_r201889164
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
---
@@ -387,6 +389,82 @@ class ParquetFilterSuite extends QueryTest with
ParquetTest with SharedSQLContex
}
}
+ test("filter pushdown - timestamp(TIMESTAMP_MILLIS)") {
+ val ts1 = Timestamp.valueOf("2018-06-14 08:28:53.123")
+ val ts2 = Timestamp.valueOf("2018-06-15 08:28:53.123")
+ val ts3 = Timestamp.valueOf("2018-06-16 08:28:53.123")
+ val ts4 = Timestamp.valueOf("2018-06-17 08:28:53.123")
+
+ val data = Seq(ts1, ts2, ts3, ts4)
+
+ withSQLConf(SQLConf.PARQUET_OUTPUT_TIMESTAMP_TYPE.key ->
--- End diff --
I changed to:
```scala
// spark.sql.parquet.outputTimestampType = TIMESTAMP_MILLIS
val millisData = Seq(Timestamp.valueOf("2018-06-14 08:28:53.123"),
Timestamp.valueOf("2018-06-15 08:28:53.123"),
Timestamp.valueOf("2018-06-16 08:28:53.123"),
Timestamp.valueOf("2018-06-17 08:28:53.123"))
withSQLConf(SQLConf.PARQUET_OUTPUT_TIMESTAMP_TYPE.key ->
ParquetOutputTimestampType.TIMESTAMP_MILLIS.toString) {
testTimestampPushdown(millisData)
}
// spark.sql.parquet.outputTimestampType = TIMESTAMP_MICROS
val microsData = Seq(Timestamp.valueOf("2018-06-14 08:28:53.123456"),
Timestamp.valueOf("2018-06-15 08:28:53.123456"),
Timestamp.valueOf("2018-06-16 08:28:53.123456"),
Timestamp.valueOf("2018-06-17 08:28:53.123456"))
withSQLConf(SQLConf.PARQUET_OUTPUT_TIMESTAMP_TYPE.key ->
ParquetOutputTimestampType.TIMESTAMP_MICROS.toString) {
testTimestampPushdown(microsData)
}
```
We shouldn't use same data to test `TIMESTAMP_MILLIS` type and
`TIMESTAMP_MICROS` type:
1. `TIMESTAMP_MILLIS` type will truncate `456` if use `microsData` to
test.
2. It can't test
`DateTimeUtils.fromJavaTimestamp(t.asInstanceOf[Timestamp]` if use `millisData`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]