Adrien Pacifico created ARROW-17192:
---------------------------------------
Summary: .to_pandas can't read_feather if a date column contains
dates before 1677 and after 2262
Key: ARROW-17192
URL: https://issues.apache.org/jira/browse/ARROW-17192
Project: Apache Arrow
Issue Type: Bug
Components: Python
Environment: Any environment
Reporter: Adrien Pacifico
A feather file with a column containing dates earlier than 1677 or later than
2262 cannot be read with pandas, due to the `.to_pandas` method.
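
As background on where the years 1677 and 2262 come from: they appear to match the range of a signed 64-bit count of nanoseconds since 1970-01-01, which is how nanosecond-resolution pandas timestamps are stored. The sketch below recomputes those limits with the standard library only. The helper name `fits_in_int64_ns` is hypothetical and is not part of pandas or pyarrow, and the limits are truncated to microseconds because `datetime` has no finer resolution.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest signed 64-bit value, read as nanoseconds

# datetime only resolves microseconds, so drop the last three digits.
SPAN = timedelta(microseconds=INT64_MAX // 1000)
LOWER = EPOCH - SPAN  # falls in the year 1677
UPPER = EPOCH + SPAN  # falls in the year 2262


def fits_in_int64_ns(value: datetime) -> bool:
    """Hypothetical helper: True if `value` lies inside the int64 nanosecond range."""
    return LOWER <= value <= UPPER


for text in ("1654-01-01", "1920-01-01"):
    value = datetime.fromisoformat(text)
    print(text, fits_in_int64_ns(value))
```

With the two dates used in the reproduction, only the 1920 value falls inside the range.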
To reproduce the issue:
# create feather file
from datetime import datetime
import pandas as pd
df = pd.DataFrame({"date": [
    datetime.fromisoformat("1654-01-01"),
    datetime.fromisoformat("1920-01-01"),
]})
df.to_feather("to_trash.feather")
# read feather file
from pyarrow.feather import read_feather
read_feather("to_trash.feather")  # fails in the .to_pandas conversion
I think the expected behavior would be to get an object column containing
datetime objects.
I think the problem comes from the `_array_like_to_pandas` method:
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L1584
or from `_to_pandas()`
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L2742
or from `to_pandas`:
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L673
--
This message was sent by Atlassian Jira
(v8.20.10#820010)