joellubi opened a new issue, #39456: URL: https://github.com/apache/arrow/issues/39456
### Describe the bug, including details regarding any error messages, version, and platform.

The Parquet [DATE](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date) logical type must annotate an `int32` representing days since the UNIX epoch, per the spec. The Arrow DATE64 type (an `int64` of milliseconds since the UNIX epoch) has no direct analog in Parquet, so it must be coerced into a compatible representation when writing Arrow data to Parquet.

The prevailing convention is to coerce DATE64 to `int32` days since the UNIX epoch (the Parquet DATE logical type) [e.g. [C++](https://github.com/apache/arrow/blob/ccc674c56f3473c9556a5af96dff9d156f559663/cpp/src/parquet/arrow/schema.cc#L372-L375), [Rust](https://github.com/apache/arrow-rs/blob/2f383e764aa2b79e52d562e24eb0d1dce41f5ce7/parquet/src/arrow/schema/mod.rs#L425-L429)]. The behavior for an `int64` value that is not on a date boundary (i.e. not divisible by 86400000) is not defined. Some implementations [validate](https://github.com/apache/arrow/blob/bda727f9fe56e0abd4fa2770d7175c9074306573/cpp/src/arrow/array/validate.cc#L172-L190) this condition, while others truncate to the date that the physical value falls within.

The current Go implementation diverges from the approach followed by these languages, coercing instead to a [UTC-normalized TIMESTAMP[ms]](https://github.com/apache/arrow/blob/ccc674c56f3473c9556a5af96dff9d156f559663/go/parquet/pqarrow/schema.go#L328-L330). This may lead to surprising behavior in cross-language use cases, and it alters the original semantics of the type (at least for non-Arrow consumers that don't handle [store_schema](https://pkg.go.dev/github.com/apache/arrow/go/[email protected]/parquet/pqarrow#WithStoreSchema)). Aligning Go with the convention currently followed by the other implementations would likely increase overall compatibility in the ecosystem.
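To make the two behaviors for off-boundary values concrete, here is a minimal sketch of a DATE64-to-days conversion. The function names `date64ToDaysStrict` and `date64ToDaysFloor` are hypothetical and are not part of the `pqarrow` API. The floor-based rounding for pre-epoch values is an assumption about what "truncate to the date the value falls within" would mean, and neither function checks that the result fits in an `int32`.

```go
package main

import (
	"errors"
	"fmt"
)

// millisPerDay is the number of milliseconds in one day (86400000).
const millisPerDay int64 = 86_400_000

// errNotOnDayBoundary is a hypothetical error for the "validate" behavior.
var errNotOnDayBoundary = errors.New("date64 value is not a whole number of days")

// date64ToDaysStrict converts milliseconds since the UNIX epoch to days,
// rejecting any value that is not divisible by 86400000.
// Assumption: the caller guarantees the result fits in an int32.
func date64ToDaysStrict(ms int64) (int32, error) {
	if ms%millisPerDay != 0 {
		return 0, errNotOnDayBoundary
	}
	return int32(ms / millisPerDay), nil
}

// date64ToDaysFloor converts milliseconds since the UNIX epoch to the day
// the instant falls within. Go's integer division rounds toward zero, so
// negative remainders are adjusted to get floor semantics before the epoch.
// Assumption: the caller guarantees the result fits in an int32.
func date64ToDaysFloor(ms int64) int32 {
	days := ms / millisPerDay
	if ms%millisPerDay < 0 {
		days--
	}
	return int32(days)
}

func main() {
	// One millisecond past the start of day 1 still belongs to day 1.
	fmt.Println(date64ToDaysFloor(millisPerDay + 1))

	// The strict variant reports the same input as an error.
	if _, err := date64ToDaysStrict(millisPerDay + 1); err != nil {
		fmt.Println("strict:", err)
	}
}
```

Either variant produces a value suitable for a Parquet DATE column; which one a writer should use is exactly the undefined behavior described above.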
See also: [https://lists.apache.org/thread/q036r1q3cw5ysn3zkpvljx3s9ho18419](https://lists.apache.org/thread/q036r1q3cw5ysn3zkpvljx3s9ho18419)

### Component(s)

Go, Parquet
