alamb opened a new issue, #15721: URL: https://github.com/apache/datafusion/issues/15721
### Describe the bug

`datafusion.execution.parquet.coerce_int96` is documented as:

> If true, parquet reader will read columns of physical type int96 as originating from a different resolution than nanosecond. This is useful for reading data from systems like Spark which stores microsecond resolution timestamps in an int96 allowing it to write values with a larger date range than 64-bit timestamps with nanosecond resolution.

However, when I set this to `ms`, the type is still reported as `Timestamp(Nanosecond, None)`.

### To Reproduce

```sql
-- Enable coercion of int96 to microseconds
set datafusion.execution.parquet.coerce_int96 = ms;

-- Create external table
CREATE EXTERNAL TABLE int96_from_spark
STORED AS PARQUET
LOCATION 'parquet-testing/data/int96_from_spark.parquet';

-- Print schema
describe int96_from_spark;
```

Results in

```sql
+-------------+-----------------------------+-------------+
| column_name | data_type                   | is_nullable |
+-------------+-----------------------------+-------------+
| a           | Timestamp(Nanosecond, None) | YES         |
+-------------+-----------------------------+-------------+
1 row(s) fetched.
Elapsed 0.001 seconds.
```

### Expected behavior

I expect the output type to be `Timestamp(Microsecond, None)`.

### Additional context

- The new feature was added in https://github.com/apache/datafusion/pull/15537
- Possibly related to https://github.com/apache/arrow-rs/issues/7287
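
For context on the date-range claim in the option's description, here is a rough back-of-the-envelope sketch (not DataFusion code) of how far a signed 64-bit timestamp reaches from the epoch at nanosecond versus microsecond resolution. The helper name `years_of_range` and the average-year constant are illustrative assumptions.

```python
# Rough arithmetic only: how many years a signed 64-bit tick counter
# can represent on either side of the epoch at a given resolution.
I64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # average Gregorian year (assumption)


def years_of_range(ticks_per_second: int) -> float:
    """Years covered by I64_MAX ticks at the given resolution (hypothetical helper)."""
    return I64_MAX / ticks_per_second / SECONDS_PER_YEAR


ns_years = years_of_range(10**9)  # nanosecond resolution
us_years = years_of_range(10**6)  # microsecond resolution

print(f"nanoseconds:  about +/- {ns_years:.0f} years around the epoch")
print(f"microseconds: about +/- {us_years:.0f} years around the epoch")
```

At nanosecond resolution the range is only a few hundred years either side of 1970, which is presumably why values written by Spark can fall outside it unless they are read at a coarser resolution.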