GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19702#discussion_r150378883
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -285,8 +285,24 @@ object SQLConf {
.booleanConf
.createWithDefault(true)
+ object ParquetOutputTimestampType extends Enumeration {
+ val INT96, TIMESTAMP_MICROS, TIMESTAMP_MILLIS = Value
+ }
+
+ val PARQUET_OUTPUT_TIMESTAMP_TYPE = buildConf("spark.sql.parquet.outputTimestampType")
+ .doc("Sets which Parquet timestamp type to use when Spark writes data to Parquet files. " +
+ "INT96 is a non-standard but commonly used timestamp type in Parquet. TIMESTAMP_MICROS " +
+ "is a standard timestamp type in Parquet, which stores number of microseconds from the " +
+ "Unix epoch. TIMESTAMP_MILLIS is also standard, but with millisecond precision, which " +
+ "means Spark has to truncate the microsecond portion of its timestamp value.")
+ .stringConf
+ .transform(_.toUpperCase(Locale.ROOT))
+ .checkValues(ParquetOutputTimestampType.values.map(_.toString))
+ .createWithDefault(ParquetOutputTimestampType.INT96.toString)
--- End diff --
To make sure this introduces no bugs or test failures, could you run all the
test cases after changing the default?
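
For reference, here is a minimal sketch in plain Scala (no Spark dependency) of what the `.transform(...)` and `.checkValues(...)` steps in the diff appear to do to a user-supplied value. The enumeration is copied from the diff; the `OutputTimestampTypeSketch` object, the `normalize` helper and the error text are hypothetical and exist only for illustration, they are not Spark APIs.

```scala
import java.util.Locale

// Copied from the diff: the three supported output timestamp types.
object ParquetOutputTimestampType extends Enumeration {
  val INT96, TIMESTAMP_MICROS, TIMESTAMP_MILLIS = Value
}

// Hypothetical helper object, not part of Spark.
object OutputTimestampTypeSketch {
  // Names of the allowed values, as strings.
  private val allowed: Set[String] =
    ParquetOutputTimestampType.values.toList.map(_.toString).toSet

  // Upper-case the raw value (locale-independent), then check it against
  // the allowed names. Returns Right(normalized) or Left(error message).
  def normalize(raw: String): Either[String, String] = {
    val upper = raw.toUpperCase(Locale.ROOT)
    if (allowed.contains(upper)) Right(upper)
    else Left(s"Unsupported value '$raw'; expected one of: " +
      allowed.toList.sorted.mkString(", "))
  }
}
```

Under these assumptions, `OutputTimestampTypeSketch.normalize("timestamp_micros")` is accepted regardless of case, while a value outside the enumeration is rejected, which is the behaviour a full test run with a different default would exercise.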
---