Adrien Pacifico created ARROW-17192:
---------------------------------------

             Summary: .to_pandas  can't read_feather if a date column contains 
dates before 1677 and after 2262
                 Key: ARROW-17192
                 URL: https://issues.apache.org/jira/browse/ARROW-17192
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
         Environment: Any environment
            Reporter: Adrien Pacifico


A feather file with a column containing dates earlier than 1677 or later than 
2262 cannot be read with pandas, due to the `.to_pandas` method.
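
For context, the 1677 and 2262 limits appear to match the range of a signed 64-bit count of nanoseconds since the Unix epoch, which is how pandas stores datetime64[ns] values. A minimal sketch using only the standard library; the names EPOCH, lower, upper and fits_in_ns are illustrative, and the bounds are truncated to microseconds because datetime has no nanosecond field:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest nanosecond offset an int64 can hold

# The smallest int64 value is assumed to be reserved for NaT, so the usable
# range is taken as symmetric: [-INT64_MAX, INT64_MAX] nanoseconds.
# Integer division by 1000 converts to microseconds, truncating toward the epoch.
upper = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
lower = EPOCH - timedelta(microseconds=INT64_MAX // 1000)


def fits_in_ns(dt):
    """Return True if dt lies inside the approximate datetime64[ns] range."""
    return lower <= dt <= upper


print(lower.year, upper.year)
print(fits_in_ns(datetime(1654, 1, 1)), fits_in_ns(datetime(1920, 1, 1)))
```

Under these assumptions the 1654 date from the reproduction falls outside the range, while the 1920 date fits.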

To reproduce the issue:
# create feather file
from datetime import datetime

import pandas as pd

df = pd.DataFrame({"date": [
            datetime.fromisoformat("1654-01-01"),
            datetime.fromisoformat("1920-01-01"),
        ],})
df.to_feather("to_trash.feather")

# read feather file

from pyarrow.feather import read_feather

read_feather("to_trash.feather")



I think that the expected behavior would be to have an object column containing 
datetime objects.

I think that the problem comes from `_array_like_to_pandas`:
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L1584

or from `_to_pandas()`:
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L2742

or from `to_pandas`:
https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L673



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
