aykut-bozkurt commented on PR #6299:
URL: https://github.com/apache/arrow-rs/pull/6299#issuecomment-2308771820

   > Right but this is not compatible with the parquet logical type definition, so is simply incorrect...
   >
   > > There is currently no way to write the interval with millis to parquet via arrow
   >
   > The broader issue here is that parquet doesn't support nanosecond precision intervals, and we're constrained by what the format itself supports - [apache/parquet-format#313](https://github.com/apache/parquet-format/issues/313)
   
   Yes, I totally understand the point. But do you think the approach below is broken or fragile in the context of the arrow-to-parquet reader/writer?
   
   On `writing` to parquet:
   - when the user passes a metadata key `adjusted_as_millisec` =>
     - the arrow-to-parquet writer can safely truncate the 8-byte nanoseconds part to parquet's 4-byte milliseconds part (the user is fully aware of the conversion, having declared that the value is already in milliseconds),
   - when the metadata key is not passed =>
     - the arrow-to-parquet writer fails, as it does now.
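
A minimal sketch of the write-side rule above, in plain Rust with no arrow-rs types. The function name, the error enum, and the boolean standing in for the `adjusted_as_millisec` metadata key are all hypothetical. The sketch assumes the nanosecond field is narrowed by integer division by 1_000_000, and it adds a range check that the proposal does not mention, because parquet stores the milliseconds field as an unsigned 32-bit integer.

```rust
/// Hypothetical error type for this sketch; not an arrow-rs type.
#[derive(Debug, PartialEq)]
enum IntervalWriteError {
    /// The caller did not pass the opt-in key, so the writer fails as it does today.
    MissingMillisOptIn,
    /// The value is negative or too large for parquet's unsigned 32-bit millisecond field.
    OutOfRange(i64),
}

const NANOS_PER_MILLI: i64 = 1_000_000;

/// Narrow the 8-byte nanosecond part of an interval to a 4-byte millisecond part.
/// `adjusted_as_millisec` stands in for the proposed metadata key (an assumption).
fn nanos_to_parquet_millis(
    nanos: i64,
    adjusted_as_millisec: bool,
) -> Result<u32, IntervalWriteError> {
    if !adjusted_as_millisec {
        return Err(IntervalWriteError::MissingMillisOptIn);
    }
    if nanos < 0 {
        return Err(IntervalWriteError::OutOfRange(nanos));
    }
    // Integer division drops any sub-millisecond remainder; this is the
    // truncation the caller opted in to.
    u32::try_from(nanos / NANOS_PER_MILLI).map_err(|_| IntervalWriteError::OutOfRange(nanos))
}
```

With the opt-in set, `nanos_to_parquet_millis(1_500_000_000, true)` yields `Ok(1500)`; without it, the call returns `Err(IntervalWriteError::MissingMillisOptIn)`.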
   
   On `reading` back from parquet:
   - the arrow parquet reader reads the milliseconds part, then safely converts it to nanoseconds by multiplying it by 1_000_000 to get a valid `IntervalNano` array (reading is not an issue).
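
The read side can be sketched the same way. The function names are hypothetical and nothing here is arrow-rs API. The sketch relies only on the parquet `INTERVAL` layout, a 12-byte value holding three little-endian unsigned 32-bit fields (months, days, milliseconds).

```rust
/// Decode a 12-byte parquet INTERVAL value into (months, days, milliseconds).
/// Each field is a little-endian unsigned 32-bit integer.
fn decode_parquet_interval(bytes: [u8; 12]) -> (u32, u32, u32) {
    let field = |i: usize| {
        u32::from_le_bytes([bytes[i], bytes[i + 1], bytes[i + 2], bytes[i + 3]])
    };
    (field(0), field(4), field(8))
}

/// Widen the millisecond part to nanoseconds. The largest possible result,
/// u32::MAX * 1_000_000, fits in an i64, so the multiplication cannot overflow.
fn parquet_millis_to_nanos(millis: u32) -> i64 {
    i64::from(millis) * 1_000_000
}
```

Widening is lossless, which is why only the write direction needs an explicit opt-in.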
   


