Ganesha Shreedhara created SPARK-51359:
------------------------------------------
Summary: Set INT64 as the default timestamp type for Parquet files
Key: SPARK-51359
URL: https://issues.apache.org/jira/browse/SPARK-51359
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.5.5
Reporter: Ganesha Shreedhara
The INT96 timestamp type has been deprecated as part of PARQUET-323. However,
Apache Spark still uses INT96 as the default value of
spark.sql.parquet.outputTimestampType for Parquet
files ([code link|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1157]).
This can cause incompatibilities when Parquet data written by Spark is read
by readers that do not support the INT96 type. We should consider changing the
default outputTimestampType to an INT64-based type (TIMESTAMP_MICROS or
TIMESTAMP_MILLIS) unless there is a compelling reason to keep INT96 as the
default.
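
Until the default changes, writers can opt in per session. The following is a
minimal sketch, not a proposed patch: it assumes PySpark is installed, and the
names parquet_timestamp_conf and build_session are hypothetical helpers
introduced only for illustration. The configuration key and its three accepted
values (INT96, TIMESTAMP_MICROS, TIMESTAMP_MILLIS) are the existing Spark SQL
options; the two TIMESTAMP_* values are stored as INT64.

```python
# Session-level override for the timestamp encoding Spark uses in Parquet output.
CONF_KEY = "spark.sql.parquet.outputTimestampType"

# Values accepted by this setting; both TIMESTAMP_* values are INT64-backed.
ALLOWED_VALUES = ("INT96", "TIMESTAMP_MICROS", "TIMESTAMP_MILLIS")


def parquet_timestamp_conf(value="TIMESTAMP_MICROS"):
    """Return the config entry that selects the Parquet timestamp encoding.

    Hypothetical helper: normalizes case and rejects unknown values early.
    """
    normalized = value.upper()
    if normalized not in ALLOWED_VALUES:
        raise ValueError(f"unsupported outputTimestampType: {value!r}")
    return {CONF_KEY: normalized}


def build_session(app_name="int64-parquet-timestamps"):
    """Hypothetical helper: build a SparkSession that writes INT64 timestamps."""
    # Imported lazily so the helper above stays usable without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, val in parquet_timestamp_conf().items():
        builder = builder.config(key, val)
    return builder.getOrCreate()
```

With a session built this way, a write such as
build_session().range(1).selectExpr("current_timestamp() AS ts").write.parquet("/tmp/ts_int64")
(the output path is a placeholder) should produce timestamp columns stored as
INT64 rather than INT96.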