[ 
https://issues.apache.org/jira/browse/ARROW-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield updated ARROW-5125:
-----------------------------------
    Labels: pull-request-available windows  (was: parquet 
pull-request-available windows)

> [Python] Cannot roundtrip extreme dates through pyarrow
> -------------------------------------------------------
>
>                 Key: ARROW-5125
>                 URL: https://issues.apache.org/jira/browse/ARROW-5125
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.13.0
>         Environment: Windows 10, Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 
> 2019, 22:22:05)
>            Reporter: Max Bolingbroke
>            Assignee: Micah Kornfield
>            Priority: Major
>              Labels: pull-request-available, windows
>             Fix For: 0.15.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> You can roundtrip many dates through a pyarrow array:
>  
> {noformat}
> >>> pa.array([datetime.date(1980, 1, 1)], type=pa.date32())[0]
> datetime.date(1980, 1, 1){noformat}
>  
> But (on Windows at least), not extreme ones:
>  
> {noformat}
> >>> pa.array([datetime.date(1960, 1, 1)], type=pa.date32())[0]
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "pyarrow\scalar.pxi", line 74, in pyarrow.lib.ArrayValue.__repr__
>  File "pyarrow\scalar.pxi", line 226, in pyarrow.lib.Date32Value.as_py
> OSError: [Errno 22] Invalid argument
> >>> pa.array([datetime.date(3200, 1, 1)], type=pa.date32())[0]
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "pyarrow\scalar.pxi", line 74, in pyarrow.lib.ArrayValue.__repr__
>  File "pyarrow\scalar.pxi", line 226, in pyarrow.lib.Date32Value.as_py
> {noformat}
> This is because datetime.utcfromtimestamp and datetime.timestamp fail on 
> these dates, but it seems we should be able to avoid invoking these 
> functions entirely when deserializing dates. Ideally we would be able to 
> roundtrip these as datetimes too, of course, but it is less clear that this 
> will be easy. For some context on this see [https://bugs.python.org/issue29097].
> This may be related to ARROW-3176 and ARROW-4746.
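
The idea of avoiding the timestamp functions when deserializing dates can be sketched with plain date arithmetic. This is a minimal illustration, not the actual pyarrow fix: the helper names `date32_to_py` and `py_to_date32` are hypothetical, and it assumes only that a date32 value is a signed count of days since 1970-01-01.

```python
import datetime

# Assumption: a date32 value is a signed number of days since the UNIX epoch.
_EPOCH = datetime.date(1970, 1, 1)


def date32_to_py(days):
    """Hypothetical helper: convert days-since-epoch to datetime.date.

    Uses date + timedelta arithmetic only, so it never calls
    datetime.utcfromtimestamp or any platform time function.
    """
    return _EPOCH + datetime.timedelta(days=days)


def py_to_date32(value):
    """Hypothetical helper: convert a datetime.date to days-since-epoch."""
    return (value - _EPOCH).days


# The dates from the report roundtrip without touching the C time functions.
for d in (datetime.date(1960, 1, 1),
          datetime.date(1980, 1, 1),
          datetime.date(3200, 1, 1)):
    assert date32_to_py(py_to_date32(d)) == d
```

Because this arithmetic is implemented inside the `datetime` module itself, it should behave the same on Windows as elsewhere across the full `datetime.date` range (years 1 through 9999).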



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
