spenczar opened a new issue, #38050:
URL: https://github.com/apache/arrow/issues/38050
### Describe the bug, including details regarding any error messages, version, and platform.
In PyArrow, `Date64Array` values do not keep their sub-day (millisecond) precision when loaded back from pandas with `pa.array`.

For example, let's make a date64 array, convert it to a pandas Series (taking care to avoid Python `datetime` objects), and then load it back:
```py
import pyarrow as pa
import pyarrow.compute as pc
import pandas as pd

date64_array = pa.array([1, 2, 3], pa.date64())
date64_pd = date64_array.to_pandas(date_as_object=False)

# Now load it back in:
date64_roundtripped = pa.array(date64_pd, pa.date64())

# It ought to be unchanged - but it's not, this assertion fails:
assert date64_roundtripped == date64_array
```
Printing `pc.subtract(date64_roundtripped, date64_array)` shows that they differ:
```
<pyarrow.lib.DurationArray object at 0x10537f160>
[
  -1,
  -2,
  -3
]
```
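Looking at the raw 64-bit storage makes the loss explicit. This is just a quick sketch assuming the variables from the snippet above; the roundtripped values are inferred from the subtraction result:

```py
# Reinterpret the date64 storage as plain int64 (milliseconds since epoch).
print(date64_array.view(pa.int64()))         # values: 1, 2, 3
print(date64_roundtripped.view(pa.int64()))  # values: 0, 0, 0 (the milliseconds are gone)
```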
Note that this does not occur for date32:
```py
import pyarrow as pa
import pandas as pd

date32_array = pa.array([1, 2, 3], pa.date32())
date32_pd = date32_array.to_pandas(date_as_object=False)
date32_roundtripped = pa.array(date32_pd, pa.date32())

# just fine:
assert date32_roundtripped == date32_array
```
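Presumably that is because date32 stores whole days, so there is no sub-day component for the roundtrip to drop. A quick sketch inspecting the raw storage of the arrays from the snippet above:

```py
# date32 stores days since the epoch as int32, so nothing can be truncated.
print(date32_array.view(pa.int32()))         # values: 1, 2, 3
print(date32_roundtripped.view(pa.int32()))  # values: 1, 2, 3
```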
It appears to me that `date64_pd` itself is fine. It prints as:
```
0   1970-01-01 00:00:00.001
1   1970-01-01 00:00:00.002
2   1970-01-01 00:00:00.003
dtype: datetime64[ns]
```
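The underlying nanosecond integers confirm that nothing is lost on the way out to pandas (sketch, assuming the Series from above):

```py
# The datetime64[ns] values are exactly 1, 2 and 3 ms after the epoch.
print(date64_pd.to_numpy().astype("int64"))  # [1000000 2000000 3000000]
```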
One hint at what's going on is to use `pa.Array.from_pandas`, which actually returns a `TimestampArray`:
```
In [31]: pa.Array.from_pandas(date64_pd)
Out[31]:
<pyarrow.lib.TimestampArray object at 0x12d434d60>
[
  1970-01-01 00:00:00.001000000,
  1970-01-01 00:00:00.002000000,
  1970-01-01 00:00:00.003000000
```
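As a workaround I can recover the original values by casting that `TimestampArray` down to millisecond precision and reinterpreting the storage as date64, which skips the lossy timestamp-to-date cast entirely. Just a sketch of what I'm doing on my end:

```py
# ns -> ms is exact here, so the safe cast succeeds; view() then reinterprets
# the int64 milliseconds as date64 without truncating to a day boundary.
recovered = pa.Array.from_pandas(date64_pd).cast(pa.timestamp("ms")).view(pa.date64())
assert recovered == date64_array  # passes
```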
The issue _might_ be that the conversion from `TimestampArray` to `Date64Array` truncates away the sub-day precision.
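A quick check of that hypothesis (sketch; I pass `safe=False` because I'm not sure whether the safe cast would refuse the truncation outright):

```py
# Mirror the intermediate that from_pandas produces: timestamp[ns] values
# 1, 2 and 3 ms after the epoch. Casting to date64 drops the sub-day part.
ts_ns = pa.array([1_000_000, 2_000_000, 3_000_000], pa.timestamp("ns"))
print(ts_ns.cast(pa.date64(), safe=False).view(pa.int64()))  # [0, 0, 0] if so
```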
### Component(s)
Python