Jorge Leitão created ARROW-14891:
------------------------------------
Summary: [parquet] 9999-12-31 date is wrapped to 1816
Key: ARROW-14891
URL: https://issues.apache.org/jira/browse/ARROW-14891
Project: Apache Arrow
Issue Type: Bug
Reporter: Jorge Leitão
A parquet file containing the int96 date 9999-12-31 (which does not fit in an
i64 of nanoseconds) is read with the value wrapped around, resulting in the
date 1816-03-29 05:56:08.066277376.
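The wrapped value can be reproduced with plain integer arithmetic. The sketch below is not the reader's actual code path; it assumes the int96 value is converted to nanoseconds since the Unix epoch and then truncated to a signed 64-bit integer. The helper name wrap_i64 is made up for illustration.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def wrap_i64(value: int) -> int:
    """Reinterpret an integer as a two's-complement signed 64-bit value."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value

# Exact nanoseconds since the epoch for 9999-12-31 00:00:00.
days = (datetime(9999, 12, 31) - EPOCH).days
true_ns = days * 86_400 * 1_000_000_000

# Assumed overflow behavior: silent wrap-around into the i64 range.
wrapped_ns = wrap_i64(true_ns)

# datetime only carries microseconds, so keep the nanosecond remainder apart.
secs, nanos = divmod(wrapped_ns, 1_000_000_000)
wrapped = EPOCH + timedelta(seconds=secs)
wrapped_str = f"{wrapped.isoformat(sep=' ')}.{nanos:09d}"
print(wrapped_str)  # 1816-03-29 05:56:08.066277376
```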
Spark seems to discard the nanoseconds and read int96 only to microseconds,
which gives it a 1000x larger range of dates (which happens to cover the year
9999, but not every value int96 can hold).
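As a rough check of the ranges involved, the sketch below compares what a signed 64-bit count of nanoseconds and of microseconds since the Unix epoch can hold. It only illustrates the arithmetic and makes no claim about how Spark implements the conversion. The year figures are approximate (mean Gregorian year).

```python
from datetime import date

I64_MAX = (1 << 63) - 1
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.2425  # mean Gregorian year, approximation

days = (date(9999, 12, 31) - date(1970, 1, 1)).days

# Does 9999-12-31 fit in an i64 at each resolution?
fits_ns = days * SECONDS_PER_DAY * 10**9 <= I64_MAX
fits_us = days * SECONDS_PER_DAY * 10**6 <= I64_MAX

# Approximate number of years representable on each side of the epoch.
years_ns = I64_MAX / 10**9 / SECONDS_PER_DAY / DAYS_PER_YEAR
years_us = I64_MAX / 10**6 / SECONDS_PER_DAY / DAYS_PER_YEAR

print(fits_ns, fits_us)
print(round(years_ns), round(years_us))
```

With nanoseconds the range is roughly 292 years around 1970, while microseconds extend it to roughly 292,000 years, which is why the year 9999 is representable in the second case only.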
There is a long discussion of this issue, including an MWE for pyarrow, here:
https://github.com/apache/arrow-rs/issues/982
--
This message was sent by Atlassian Jira
(v8.20.1#820001)