leaves12138 opened a new pull request, #7845: URL: https://github.com/apache/paimon/pull/7845
### Purpose

Native Parquet writers such as Arrow can encode `TIMESTAMP(9)` as `INT64` with the `TIMESTAMP(NANOS)` logical annotation. This is valid Parquet, but Paimon's vectorized Parquet reader did not use the logical timestamp unit when decoding `INT64` timestamp columns, so nanosecond timestamps could not be read correctly as Paimon `timestamp(9)`.

### Changes

- Decode `INT64` Parquet timestamps according to their logical time unit (`MILLIS`, `MICROS`, or `NANOS`).
- Convert `TIMESTAMP(NANOS)` schema annotations to Paimon timestamp precision 9.
- Add regression coverage for top-level `timestamp(9)` and `array<timestamp(9)>` written as Parquet `INT64 TIMESTAMP(NANOS)`.

### Tests

- `mvn -pl paimon-format -DskipTests compile`
- `mvn -pl paimon-format -Pfast-build -Dtest=ParquetReadWriteTest,ParquetSchemaConverterTest test`
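
For reviewers, the unit-aware decoding described under Changes can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: the class `Int64TimestampSketch`, the enum `Unit`, and the methods `decode` and `precision` are hypothetical names, and `java.time.Instant` stands in for Paimon's own timestamp type. The precision values 3 and 6 for `MILLIS` and `MICROS` are assumed by analogy with the precision 9 stated above for `NANOS`.

```java
import java.time.Instant;

// Hypothetical illustration only; not Paimon's reader or schema converter.
final class Int64TimestampSketch {

    /** Hypothetical stand-in for the Parquet logical timestamp unit. */
    enum Unit { MILLIS, MICROS, NANOS }

    /** Decodes an INT64 value counted in the given unit since the Unix epoch. */
    static Instant decode(long value, Unit unit) {
        switch (unit) {
            case MILLIS:
                return Instant.ofEpochMilli(value);
            case MICROS:
                // floorDiv/floorMod keep the sub-second part non-negative
                // for values before the epoch.
                return Instant.ofEpochSecond(
                        Math.floorDiv(value, 1_000_000L),
                        Math.floorMod(value, 1_000_000L) * 1_000L);
            case NANOS:
                return Instant.ofEpochSecond(
                        Math.floorDiv(value, 1_000_000_000L),
                        Math.floorMod(value, 1_000_000_000L));
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    /** Maps a unit to a timestamp precision (3 and 6 are assumptions). */
    static int precision(Unit unit) {
        switch (unit) {
            case MILLIS:
                return 3;
            case MICROS:
                return 6;
            case NANOS:
                return 9;
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        // The same instant family expressed in three different units.
        System.out.println(decode(1_700_000_000_123L, Unit.MILLIS));
        System.out.println(decode(1_700_000_000_123_456L, Unit.MICROS));
        System.out.println(decode(1_700_000_000_123_456_789L, Unit.NANOS));
    }
}
```

The point of the sketch is that the same `INT64` value means different instants depending on the annotated unit, so a reader that ignores the unit will misinterpret `NANOS` columns. Floor division is used so that pre-epoch values still yield a non-negative sub-second component.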
