aykut-bozkurt commented on PR #6299: URL: https://github.com/apache/arrow-rs/pull/6299#issuecomment-2308771820
> Right but this is not compatible with the parquet logical type definition, so is simply incorrect...
>
> > There is currently no way to write the interval with millis to parquet via arrow
>
> The broader issue here is that parquet doesn't support nanosecond precision intervals, and we're constrained by what the format itself supports - [apache/parquet-format#313](https://github.com/apache/parquet-format/issues/313)

Yes, I totally understand the point. But do you think the approach below is broken or fragile in the context of the arrow-to-parquet reader/writer?

On `writing` to parquet:
- when the user passes a metadata key `adjusted_as_millisec` =>
  - the arrow-to-parquet writer can safely truncate the 8-byte nanoseconds part to parquet's 4-byte milliseconds part (the user is fully aware of the conversion; the user is asserting that the value is already in milliseconds),
- when the metadata key is not passed =>
  - the arrow-to-parquet writer fails, as it does now.

On `reading` back from parquet:
- the arrow parquet reader reads the milliseconds part, then safely converts it to nanoseconds by multiplying it by 1_000_000 to get a valid `IntervalNano` array (reading is not an issue).
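
The write and read paths described above could look roughly like the following. This is only a minimal sketch in plain Rust with no dependency on the arrow or parquet crates: the types `MonthDayNano` and `ParquetInterval`, the functions `to_parquet` and `from_parquet`, and the `ConvertError` enum are all hypothetical names invented for illustration, and the `adjusted_as_millisec` flag stands in for the proposed metadata key. The range checks are an extra assumption, added because parquet's `INTERVAL` stores three unsigned 32-bit fields.

```rust
/// Nanoseconds in one millisecond.
const NANOS_PER_MILLI: i64 = 1_000_000;

/// Hypothetical stand-in for an Arrow month-day-nano interval value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MonthDayNano {
    months: i32,
    days: i32,
    nanoseconds: i64,
}

/// Hypothetical stand-in for a parquet INTERVAL value
/// (months, days, milliseconds as unsigned 32-bit integers).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ParquetInterval {
    months: u32,
    days: u32,
    millis: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum ConvertError {
    /// The writer was not told that the values are millisecond-aligned.
    MissingOptIn,
    /// A field does not fit the target integer type (assumed extra check).
    OutOfRange,
}

/// Write path: only converts when the caller opted in via the
/// (proposed, hypothetical) `adjusted_as_millisec` metadata key.
fn to_parquet(
    value: MonthDayNano,
    adjusted_as_millisec: bool,
) -> Result<ParquetInterval, ConvertError> {
    if !adjusted_as_millisec {
        // Mirrors "the writer fails, as it does now".
        return Err(ConvertError::MissingOptIn);
    }
    // Integer division drops any sub-millisecond remainder; the proposal
    // relies on the user asserting that there is none.
    let millis = value.nanoseconds / NANOS_PER_MILLI;
    Ok(ParquetInterval {
        months: u32::try_from(value.months).map_err(|_| ConvertError::OutOfRange)?,
        days: u32::try_from(value.days).map_err(|_| ConvertError::OutOfRange)?,
        millis: u32::try_from(millis).map_err(|_| ConvertError::OutOfRange)?,
    })
}

/// Read path: widen milliseconds back to nanoseconds.
/// u32::MAX * 1_000_000 fits in an i64, so the multiplication cannot overflow.
fn from_parquet(value: ParquetInterval) -> Result<MonthDayNano, ConvertError> {
    Ok(MonthDayNano {
        months: i32::try_from(value.months).map_err(|_| ConvertError::OutOfRange)?,
        days: i32::try_from(value.days).map_err(|_| ConvertError::OutOfRange)?,
        nanoseconds: i64::from(value.millis) * NANOS_PER_MILLI,
    })
}
```

In this sketch a value whose nanoseconds field is a whole number of milliseconds round-trips unchanged, while a negative or oversized field is rejected instead of being wrapped into the unsigned parquet representation.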
