leaves12138 opened a new pull request, #7845:
URL: https://github.com/apache/paimon/pull/7845

   ### Purpose
   
   Native Parquet writers such as Arrow can encode `TIMESTAMP(9)` as `INT64` with the `TIMESTAMP(NANOS)` logical annotation. This is valid Parquet, but Paimon's vectorized Parquet reader ignored the logical timestamp unit when decoding `INT64` timestamp columns, so nanosecond timestamps could not be read correctly as Paimon `timestamp(9)`.
   
   ### Changes
   
   - Decode `INT64` Parquet timestamps according to their logical time unit 
(`MILLIS`, `MICROS`, or `NANOS`).
   - Convert `TIMESTAMP(NANOS)` schema annotations to Paimon timestamp 
precision 9.
   - Add regression coverage for top-level `timestamp(9)` and 
`array<timestamp(9)>` written as Parquet `INT64 TIMESTAMP(NANOS)`.
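
As a rough illustration of the unit-aware decoding described in the list above, the following minimal sketch converts a raw `INT64` value to a `java.time.Instant` according to its logical unit. It is not the code from this pull request: the `Unit` enum, the `decode` and `precision` methods, and the class name are hypothetical stand-ins, and the `MILLIS` to 3 and `MICROS` to 6 precision mapping is an assumption (only `NANOS` to 9 is stated above).

```java
import java.time.Instant;

public class Int64TimestampDecoding {

    /** Hypothetical stand-in for the Parquet logical time unit; not a Paimon or parquet-mr type. */
    enum Unit { MILLIS, MICROS, NANOS }

    /** Converts a raw INT64 value (time since the epoch, in the given unit) to an Instant. */
    static Instant decode(long value, Unit unit) {
        switch (unit) {
            case MILLIS:
                return Instant.ofEpochMilli(value);
            case MICROS:
                // floorDiv/floorMod keep the sub-second part non-negative for pre-epoch values.
                return Instant.ofEpochSecond(
                        Math.floorDiv(value, 1_000_000L),
                        Math.floorMod(value, 1_000_000L) * 1_000L);
            case NANOS:
                return Instant.ofEpochSecond(
                        Math.floorDiv(value, 1_000_000_000L),
                        Math.floorMod(value, 1_000_000_000L));
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    /** Assumed mapping from logical unit to timestamp precision (digits of fractional seconds). */
    static int precision(Unit unit) {
        switch (unit) {
            case MILLIS:
                return 3;
            case MICROS:
                return 6;
            case NANOS:
                return 9;
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        long raw = 1_700_000_000_123_456_789L; // nanoseconds since the epoch
        System.out.println(decode(raw, Unit.NANOS) + " precision=" + precision(Unit.NANOS));
    }
}
```

Reading the same raw value with the wrong unit shifts the result by a factor of 1,000 or 1,000,000, which is why the decoder has to consult the logical annotation rather than assume one unit for every `INT64` timestamp column.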
   
   ### Tests
   
   - `mvn -pl paimon-format -DskipTests compile`
   - `mvn -pl paimon-format -Pfast-build -Dtest=ParquetReadWriteTest,ParquetSchemaConverterTest test`
   