[ 
https://issues.apache.org/jira/browse/ARROW-17192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Pacifico updated ARROW-17192:
------------------------------------
    Description: 
A feather file with a column containing dates earlier than 1677 or later than 
2262 cannot be read with pandas, due to the `.to_pandas` method.
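
The two years quoted above match the range of a signed 64-bit count of nanoseconds since the Unix epoch, which is how pandas stores `datetime64[ns]` values by default. The following standard-library sketch reproduces those bounds approximately. It assumes that representation; `fits_in_datetime64_ns` is a hypothetical helper written for illustration, not a pandas or pyarrow function.

```python
from datetime import datetime, timedelta

# Assumption: timestamps are signed 64-bit nanosecond counts relative to
# the Unix epoch, as in pandas' default datetime64[ns] dtype.
EPOCH = datetime(1970, 1, 1)
MAX_NS = 2**63 - 1

# datetime only has microsecond resolution, so the bounds are truncated.
UPPER = EPOCH + timedelta(microseconds=MAX_NS // 1000)
LOWER = EPOCH - timedelta(microseconds=MAX_NS // 1000)


def fits_in_datetime64_ns(value: datetime) -> bool:
    """Hypothetical helper: is a naive datetime inside the approximate ns range?"""
    return LOWER <= value <= UPPER


print(LOWER, UPPER)
print(fits_in_datetime64_ns(datetime.fromisoformat("1654-01-01")))
print(fits_in_datetime64_ns(datetime.fromisoformat("1920-01-01")))
```

Under this assumption, the first date in the reproduction falls outside the range and the second falls inside it.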

To reproduce the issue:

{code:python}
### create feather file
from datetime import datetime

import pandas as pd

# One date before 1677 and one date inside the supported range
df = pd.DataFrame({"date": [
    datetime.fromisoformat("1654-01-01"),
    datetime.fromisoformat("1920-01-01"),
]})
df.to_feather("to_trash.feather")

### read feather file
from pyarrow.feather import read_feather
read_feather("to_trash.feather")  # the conversion to pandas is where the read fails
{code}
 

I think the expected behavior would be an object column containing 
datetime objects.

I think that the problem comes from the `_array_like_to_pandas` method:
[https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L1584]

or from `_to_pandas()`:
[https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L2742]

or from `to_pandas`:
[https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L673]


> .to_pandas  can't read_feather if a date column contains dates before 1677 
> and after 2262
> -----------------------------------------------------------------------------------------
>
>                 Key: ARROW-17192
>                 URL: https://issues.apache.org/jira/browse/ARROW-17192
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Any environment
>            Reporter: Adrien Pacifico
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
