seanslma commented on issue #38171:
URL: https://github.com/apache/arrow/issues/38171#issuecomment-1755379996
Thanks. Apologies if I did not explain the issue clearly.
I used pandas 2.1.0. The `"pandas_version": "2.1.0"` entry can be found in the
parquet bytes string.
This is the code used to create the df and the parquet bytes:
```py
import pandas as pd

t1 = '2023-09-01'
ds = pd.date_range(t1, t1, freq='30T')
df = pd.DataFrame({
    'ds': ds,
})

# Created in an environment with pyarrow 12.0.0
df_parquet_bytes_v12_ns = df.astype({'ds': 'datetime64[ns]'}).to_parquet()
df_parquet_bytes_v12_us = df.astype({'ds': 'datetime64[us]'}).to_parquet()

# Created in an environment with pyarrow 13.0.0
df_parquet_bytes_v13_ns = df.astype({'ds': 'datetime64[ns]'}).to_parquet()
df_parquet_bytes_v13_us = df.astype({'ds': 'datetime64[us]'}).to_parquet()
```
I created the parquet bytes with pyarrow 12.0.0 because our API uses pyarrow
12.0.0. On the client side we use pyarrow 13.0.0 to convert the parquet bytes
back to a pandas df.
For this one (**BUG**)
```
input                     output_v12      output_v13      comment
df_parquet_bytes_v12_ns   datetime64[ns]  datetime64[us]  v13 ns -> us, lost resolution
```
The input parquet bytes were created with pyarrow 12.0.0 from a df whose `ds`
column is `datetime64[ns]`.
When the bytes are converted back to a pandas df with pyarrow 12.0.0, the unit
is still `datetime64[ns]`.
With pyarrow 13.0.0, the unit becomes `datetime64[us]`.