Hello Sarah,

> Does the community have any plans to lift the isAdjustedToUTC=false
> restriction in the future?

So far there are no such plans, but we could introduce a SQL config that
switches the Parquet writer to a backward-compatible mode for the TIME data
type and stores it as isAdjustedToUTC=true. I believe this shouldn't be the
default because it is semantically incorrect. Could you please open a
sub-task of SPARK-51342 for future discussion?
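
To make the idea concrete, here is a rough sketch of the switch, written in
Python rather than Spark's Scala. Everything in it is hypothetical: no such
config exists today, the names TimeAnnotation, time_annotation_for_write and
legacy_compat_mode are made up for illustration, and the MICROS unit is an
assumption about how TimeType would be stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeAnnotation:
    """Simplified stand-in for Parquet's TIME logical type annotation."""
    unit: str                 # assumed to be "MICROS" for Spark's TimeType
    is_adjusted_to_utc: bool  # the flag discussed in this thread


def time_annotation_for_write(legacy_compat_mode: bool = False) -> TimeAnnotation:
    """Pick the TIME annotation a writer would emit.

    legacy_compat_mode models the hypothetical SQL config: off by default,
    so TIME stays isAdjustedToUTC=false unless the user opts in.
    """
    return TimeAnnotation(unit="MICROS", is_adjusted_to_utc=legacy_compat_mode)


if __name__ == "__main__":
    print(time_annotation_for_write())
    print(time_annotation_for_write(legacy_compat_mode=True))
```

The default stays at isAdjustedToUTC=false, and only an explicit opt-in
produces the backward-compatible annotation.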

Yours faithfully,
Max Gekk


On Tue, Aug 19, 2025 at 10:51 PM Sarah Gilmore
<sgilm...@mathworks.com.invalid> wrote:

> Hi all,
>
> My name is Sarah Gilmore, and I am a software developer at MathWorks[1] as
> well as a committer for the apache/arrow project.
>
> I noticed that the Spark ecosystem is introducing a new data type called
> TimeType[2] to represent time of day values in the upcoming 4.1.0 release,
> and I'm very excited to see this work come to fruition!
>
> However, I also noticed that the accompanying enhancement to Spark's
> Parquet reader only adds the ability to read Parquet TIME data if
> isAdjustedToUTC=false[3].
>
> Does the community have any plans to lift the isAdjustedToUTC=false
> restriction in the future?
>
> My question stems from the fact that some Parquet writers generate TIME
> data with isAdjustedToUTC=true to adhere to Parquet's compatibility
> guidelines[4] with respect to the deprecation of the ConvertedType
> TIME_MICROS. For example, Arrow's Parquet writer sets
> isAdjustedToUTC=true[5] even though Arrow's time types themselves are
> timezone-agnostic. Consequently, Spark's Parquet reader will still be
> unable to import Parquet files that contain TIME data that were generated
> by Parquet writers that follow the Parquet compatibility guidelines - such
> as the Arrow Parquet writer - even after the release of the TimeType Spark
> datatype.
>
> For context, the MATLAB parquetwrite function leverages Arrow's Parquet
> writer[6], and many MATLAB users want to read MATLAB-generated Parquet
> files that contain TIME data in Spark.
>
> I appreciate the community's time and consideration on this topic.
>
> Thanks!
>
> Best,
>
> Sarah Gilmore
>
> [1] https://www.mathworks.com/
> [2] https://issues.apache.org/jira/browse/SPARK-51342
> [3]
> https://github.com/apache/spark/blob/77413d443f23dd7a14194e516a12d2c959a357be/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L309
> [4]
> https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#deprecated-time-convertedtype
> [5]
> https://github.com/apache/arrow/blob/066b2162206825f2d628f97f4113b0403da1f4ec/cpp/src/parquet/arrow/schema.cc#L434
> [6]
> https://www.mathworks.com/help/matlab/import_export/datatype-mappings-matlab-parquet.html
>
