seanslma opened a new issue, #38171:
URL: https://github.com/apache/arrow/issues/38171

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I converted a pandas DataFrame (in two variants: one with a datetime64[ns] column and one with a datetime64[us] column) to Parquet bytes using both pyarrow 12.0.0 and 13.0.0.
   
   When I convert the Parquet bytes back to a pandas DataFrame, the datetime unit no longer matches that of the original DataFrame. Here is a summary of the results:
   
   ```
   input                     output_v12      output_v13      comment
   df_parquet_bytes_v12_ns:  datetime64[ns]  datetime64[us]  v13 ns -> us, lost resolution
   df_parquet_bytes_v12_us:  datetime64[ns]  datetime64[us]  v12 us -> ns, acceptable
   df_parquet_bytes_v13_ns:  datetime64[ns]  datetime64[ns]  all match, no issues
   df_parquet_bytes_v13_us:  datetime64[ns]  datetime64[us]  v12 us -> ns, acceptable
   ```
   
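   The change in pyarrow 13.0.0 leads to an unacceptable result in case 1 (the first row): bytes written from a datetime64[ns] DataFrame with pyarrow 12.0.0 are read back as datetime64[us].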
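
To make "lost resolution" concrete, here is a minimal, standard-library-only sketch of what a nanosecond-to-microsecond conversion does to an integer epoch timestamp. The helper names and the sample timestamp are hypothetical and are not taken from the payloads in this report; the sketch only illustrates the arithmetic, not pyarrow's internal conversion.

```python
NS_PER_US = 1_000  # nanoseconds per microsecond


def ns_to_us(ts_ns: int) -> int:
    """Truncate a nanosecond epoch timestamp to whole microseconds."""
    return ts_ns // NS_PER_US


def us_to_ns(ts_us: int) -> int:
    """Widen a microsecond epoch timestamp back to nanoseconds."""
    return ts_us * NS_PER_US


# Hypothetical timestamp whose last three digits (789 ns) are sub-microsecond.
original_ns = 1_693_526_400_123_456_789
roundtrip_ns = us_to_ns(ns_to_us(original_ns))
lost_ns = original_ns - roundtrip_ns

print(original_ns, roundtrip_ns, lost_ns)
```

Any value that is already a whole number of microseconds survives the round trip unchanged, so the loss only shows up for data that actually uses the extra three digits.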
   
   Here is the code to reproduce the issue:
   ```py
   import io

   import pandas as pd

   df_parquet_bytes_v12_us = 
b'PAR1\x15\x04\x15\x10\x15\x14L\x15\x02\x15\x00\x12\x00\x00\x08\x1c\x00`}\xd7@\x04\x06\x00\x15\x00\x15\x12\x15\x16,\x15\x02\x15\x10\x15\x06\x15\x06\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x00\x00\t
 
\x02\x00\x00\x00\x02\x01\x01\x02\x00&\xc8\x01\x1c\x15\x04\x195\x10\x00\x06\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x15\x04\x19,5\x00\x18\x06schema\x15\x02\x00\x15\x04%\x02\x18\x02ds%\x14L\x8c\x12\x1c,\x00\x00\x00\x00\x00\x16\x02\x19\x1c\x19\x1c&\xc8\x01\x1c\x15\x04\x195\x10\x00\x06\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x
 
18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x16\xb8\x01\x16\x02&\x08\x16\xc0\x01\x14\x00\x00\x19,\x18\x06pandas\x18\xb4\x03{"index_columns":
 [{"kind": "range", "name": null, "start": 0, "stop": 1, "step": 1}], 
"column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", 
"numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": 
[{"name": "ds", "field_name": "ds", "pandas_type": "datetime", "numpy_type": 
"datetime64[us]", "metadata": null}], "creator": {"library": "pyarrow", 
"version": "12.0.0"}, "pandas_version": 
"2.1.0"}\x00\x18\x0cARROW:schema\x18\xb8\x06/////2ACAAAQAAAAAAAKAA4ABgAFAAgACgAAAAABBAAQAAAAAAAKAAwAAAAEAAgACgAAAOwBAAAEAAAAAQAAAAwAAAAIAAwABAAIAAgAAAAIAAAAEAAAAAYAAABwYW5kYXMAALQBAAB7ImluZGV4X2NvbHVtbnMiOiBbeyJraW5kIjogInJhbmdlIiwgIm5hbWUiOiBudWxsLCAic3RhcnQiOiAwLCAic3RvcCI6IDEsICJzdGVwIjogMX1dLCAiY29sdW1uX2luZGV4ZXMiOiBbeyJuYW1lIjogbnVsbCwgImZpZWxkX25hbWUiOiBudWxsLCAicGFuZGFzX3R5cGU
 
iOiAidW5pY29kZSIsICJudW1weV90eXBlIjogIm9iamVjdCIsICJtZXRhZGF0YSI6IHsiZW5jb2RpbmciOiAiVVRGLTgifX1dLCAiY29sdW1ucyI6IFt7Im5hbWUiOiAiZHMiLCAiZmllbGRfbmFtZSI6ICJkcyIsICJwYW5kYXNfdHlwZSI6ICJkYXRldGltZSIsICJudW1weV90eXBlIjogImRhdGV0aW1lNjRbdXNdIiwgIm1ldGFkYXRhIjogbnVsbH1dLCAiY3JlYXRvciI6IHsibGlicmFyeSI6ICJweWFycm93IiwgInZlcnNpb24iOiAiMTIuMC4wIn0sICJwYW5kYXNfdmVyc2lvbiI6ICIyLjEuMCJ9AAAAAAEAAAAUAAAAEAAUAAgABgAHAAwAAAAQABAAAAAAAAEKEAAAABwAAAAEAAAAAAAAAAIAAABkcwAAAAAGAAgABgAGAAAAAAACAA==\x00\x18
 parquet-cpp-arrow version 12.0.0\x19\x1c\x1c\x00\x00\x00\xc8\x05\x00\x00PAR1'
   
   df_parquet_bytes_v12_ns = 
b'PAR1\x15\x04\x15\x10\x15\x14L\x15\x02\x15\x00\x12\x00\x00\x08\x1c\x00`}\xd7@\x04\x06\x00\x15\x00\x15\x12\x15\x16,\x15\x02\x15\x10\x15\x06\x15\x06\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x00\x00\t
 
\x02\x00\x00\x00\x02\x01\x01\x02\x00&\xc8\x01\x1c\x15\x04\x195\x10\x00\x06\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x15\x04\x19,5\x00\x18\x06schema\x15\x02\x00\x15\x04%\x02\x18\x02ds%\x14L\x8c\x12\x1c,\x00\x00\x00\x00\x00\x16\x02\x19\x1c\x19\x1c&\xc8\x01\x1c\x15\x04\x195\x10\x00\x06\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x
 
18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x16\xb8\x01\x16\x02&\x08\x16\xc0\x01\x14\x00\x00\x19,\x18\x06pandas\x18\xb4\x03{"index_columns":
 [{"kind": "range", "name": null, "start": 0, "stop": 1, "step": 1}], 
"column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", 
"numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": 
[{"name": "ds", "field_name": "ds", "pandas_type": "datetime", "numpy_type": 
"datetime64[ns]", "metadata": null}], "creator": {"library": "pyarrow", 
"version": "12.0.0"}, "pandas_version": 
"2.1.0"}\x00\x18\x0cARROW:schema\x18\xb8\x06/////2ACAAAQAAAAAAAKAA4ABgAFAAgACgAAAAABBAAQAAAAAAAKAAwAAAAEAAgACgAAAOwBAAAEAAAAAQAAAAwAAAAIAAwABAAIAAgAAAAIAAAAEAAAAAYAAABwYW5kYXMAALQBAAB7ImluZGV4X2NvbHVtbnMiOiBbeyJraW5kIjogInJhbmdlIiwgIm5hbWUiOiBudWxsLCAic3RhcnQiOiAwLCAic3RvcCI6IDEsICJzdGVwIjogMX1dLCAiY29sdW1uX2luZGV4ZXMiOiBbeyJuYW1lIjogbnVsbCwgImZpZWxkX25hbWUiOiBudWxsLCAicGFuZGFzX3R5cGU
 
iOiAidW5pY29kZSIsICJudW1weV90eXBlIjogIm9iamVjdCIsICJtZXRhZGF0YSI6IHsiZW5jb2RpbmciOiAiVVRGLTgifX1dLCAiY29sdW1ucyI6IFt7Im5hbWUiOiAiZHMiLCAiZmllbGRfbmFtZSI6ICJkcyIsICJwYW5kYXNfdHlwZSI6ICJkYXRldGltZSIsICJudW1weV90eXBlIjogImRhdGV0aW1lNjRbbnNdIiwgIm1ldGFkYXRhIjogbnVsbH1dLCAiY3JlYXRvciI6IHsibGlicmFyeSI6ICJweWFycm93IiwgInZlcnNpb24iOiAiMTIuMC4wIn0sICJwYW5kYXNfdmVyc2lvbiI6ICIyLjEuMCJ9AAAAAAEAAAAUAAAAEAAUAAgABgAHAAwAAAAQABAAAAAAAAEKEAAAABwAAAAEAAAAAAAAAAIAAABkcwAAAAAGAAgABgAGAAAAAAADAA==\x00\x18
 parquet-cpp-arrow version 12.0.0\x19\x1c\x1c\x00\x00\x00\xc8\x05\x00\x00PAR1'
   
   df_parquet_bytes_v13_us = 
b'PAR1\x15\x04\x15\x10\x15\x14L\x15\x02\x15\x00\x12\x00\x00\x08\x1c\x00`}\xd7@\x04\x06\x00\x15\x00\x15\x12\x15\x16,\x15\x02\x15\x10\x15\x06\x15\x06\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x00\x00\t
 
\x02\x00\x00\x00\x02\x01\x01\x02\x00&\xc8\x01\x1c\x15\x04\x195\x00\x06\x10\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x15\x04\x19,5\x00\x18\x06schema\x15\x02\x00\x15\x04%\x02\x18\x02ds%\x14L\x8c\x12\x1c,\x00\x00\x00\x00\x00\x16\x02\x19\x1c\x19\x1c&\xc8\x01\x1c\x15\x04\x195\x00\x06\x10\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00`}\xd7@\x04\x06\x00\x18\x08\x00`}\xd7@\x04\x06\x00\x16\x00(\x08\x00`}\xd7@\x04\x06\x00\x
 
18\x08\x00`}\xd7@\x04\x06\x00\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x16\xb8\x01\x16\x02&\x08\x16\xc0\x01\x14\x00\x00\x19,\x18\x06pandas\x18\xb4\x03{"index_columns":
 [{"kind": "range", "name": null, "start": 0, "stop": 1, "step": 1}], 
"column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", 
"numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": 
[{"name": "ds", "field_name": "ds", "pandas_type": "datetime", "numpy_type": 
"datetime64[us]", "metadata": null}], "creator": {"library": "pyarrow", 
"version": "13.0.0"}, "pandas_version": 
"2.1.0"}\x00\x18\x0cARROW:schema\x18\xb8\x06/////2ACAAAQAAAAAAAKAA4ABgAFAAgACgAAAAABBAAQAAAAAAAKAAwAAAAEAAgACgAAAOwBAAAEAAAAAQAAAAwAAAAIAAwABAAIAAgAAAAIAAAAEAAAAAYAAABwYW5kYXMAALQBAAB7ImluZGV4X2NvbHVtbnMiOiBbeyJraW5kIjogInJhbmdlIiwgIm5hbWUiOiBudWxsLCAic3RhcnQiOiAwLCAic3RvcCI6IDEsICJzdGVwIjogMX1dLCAiY29sdW1uX2luZGV4ZXMiOiBbeyJuYW1lIjogbnVsbCwgImZpZWxkX25hbWUiOiBudWxsLCAicGFuZGFzX3R5cGU
 
iOiAidW5pY29kZSIsICJudW1weV90eXBlIjogIm9iamVjdCIsICJtZXRhZGF0YSI6IHsiZW5jb2RpbmciOiAiVVRGLTgifX1dLCAiY29sdW1ucyI6IFt7Im5hbWUiOiAiZHMiLCAiZmllbGRfbmFtZSI6ICJkcyIsICJwYW5kYXNfdHlwZSI6ICJkYXRldGltZSIsICJudW1weV90eXBlIjogImRhdGV0aW1lNjRbdXNdIiwgIm1ldGFkYXRhIjogbnVsbH1dLCAiY3JlYXRvciI6IHsibGlicmFyeSI6ICJweWFycm93IiwgInZlcnNpb24iOiAiMTMuMC4wIn0sICJwYW5kYXNfdmVyc2lvbiI6ICIyLjEuMCJ9AAAAAAEAAAAUAAAAEAAUAAgABgAHAAwAAAAQABAAAAAAAAEKEAAAABwAAAAEAAAAAAAAAAIAAABkcwAAAAAGAAgABgAGAAAAAAACAA==\x00\x18
 parquet-cpp-arrow version 13.0.0\x19\x1c\x1c\x00\x00\x00\xc8\x05\x00\x00PAR1'
   
   df_parquet_bytes_v13_ns = 
b'PAR1\x15\x04\x15\x10\x15\x14L\x15\x02\x15\x00\x12\x00\x00\x08\x1c\x00\x00\xbf\xc1I\x9d\x80\x17\x15\x00\x15\x12\x15\x16,\x15\x02\x15\x10\x15\x06\x15\x06\x1c\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x16\x00(\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x00\x00\x00\t
 
\x02\x00\x00\x00\x02\x01\x01\x02\x00&\xc8\x01\x1c\x15\x04\x195\x00\x06\x10\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x16\x00(\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x15\x04\x19,5\x00\x18\x06schema\x15\x02\x00\x15\x04%\x02\x18\x02dsl\x8c\x12\x1c<\x00\x00\x00\x00\x00\x16\x02\x19\x1c\x19\x1c&\xc8\x01\x1c\x15\x04\x195\x00\x06\x10\x19\x18\x02ds\x15\x02\x16\x02\x16\xb8\x01\x16\xc0\x01&8&\x08\x1c\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x
 
00\xbf\xc1I\x9d\x80\x17\x16\x00(\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x18\x08\x00\x00\xbf\xc1I\x9d\x80\x17\x00\x19,\x15\x04\x15\x00\x15\x02\x00\x15\x00\x15\x10\x15\x02\x00\x00\x00\x16\xb8\x01\x16\x02&\x08\x16\xc0\x01\x14\x00\x00\x19,\x18\x06pandas\x18\xb4\x03{"index_columns":
 [{"kind": "range", "name": null, "start": 0, "stop": 1, "step": 1}], 
"column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", 
"numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": 
[{"name": "ds", "field_name": "ds", "pandas_type": "datetime", "numpy_type": 
"datetime64[ns]", "metadata": null}], "creator": {"library": "pyarrow", 
"version": "13.0.0"}, "pandas_version": 
"2.1.0"}\x00\x18\x0cARROW:schema\x18\xb8\x06/////2ACAAAQAAAAAAAKAA4ABgAFAAgACgAAAAABBAAQAAAAAAAKAAwAAAAEAAgACgAAAOwBAAAEAAAAAQAAAAwAAAAIAAwABAAIAAgAAAAIAAAAEAAAAAYAAABwYW5kYXMAALQBAAB7ImluZGV4X2NvbHVtbnMiOiBbeyJraW5kIjogInJhbmdlIiwgIm5hbWUiOiBudWxsLCAic3RhcnQiOiAwLCAic3RvcCI6IDEsICJzdGVwIjogMX1dLCAiY29sdW1uX2luZG
 
V4ZXMiOiBbeyJuYW1lIjogbnVsbCwgImZpZWxkX25hbWUiOiBudWxsLCAicGFuZGFzX3R5cGUiOiAidW5pY29kZSIsICJudW1weV90eXBlIjogIm9iamVjdCIsICJtZXRhZGF0YSI6IHsiZW5jb2RpbmciOiAiVVRGLTgifX1dLCAiY29sdW1ucyI6IFt7Im5hbWUiOiAiZHMiLCAiZmllbGRfbmFtZSI6ICJkcyIsICJwYW5kYXNfdHlwZSI6ICJkYXRldGltZSIsICJudW1weV90eXBlIjogImRhdGV0aW1lNjRbbnNdIiwgIm1ldGFkYXRhIjogbnVsbH1dLCAiY3JlYXRvciI6IHsibGlicmFyeSI6ICJweWFycm93IiwgInZlcnNpb24iOiAiMTMuMC4wIn0sICJwYW5kYXNfdmVyc2lvbiI6ICIyLjEuMCJ9AAAAAAEAAAAUAAAAEAAUAAgABgAHAAwAAAAQABAAAAAAAAEKEAAAABwAAAAEAAAAAAAAAAIAAABkcwAAAAAGAAgABgAGAAAAAAADAA==\x00\x18
 parquet-cpp-arrow version 13.0.0\x19\x1c\x1c\x00\x00\x00\xc6\x05\x00\x00PAR1'
   
   for v in [12, 13]:
       for s in ['ns', 'us']:
           name = f'df_parquet_bytes_v{v}_{s}'
           # Read the Parquet payload back and report the dtype of its only column
           print(f'{name}: ', pd.read_parquet(io.BytesIO(globals()[name])).dtypes.iloc[0])
   ```
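
The payloads can also be compared without pyarrow. The v12_us, v12_ns and v13_us payloads all repeat one 8-byte sequence, while v13_ns repeats a different one; both are copied into the sketch below. Assuming these are little-endian int64 epoch values (the stored timestamp and its min/max statistics, which is an interpretation and not something stated in this report), they can be decoded with the standard library. The constant and function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# 8-byte sequences copied from the payloads in the reproduction code.
RAW_SHARED = b'\x00`}\xd7@\x04\x06\x00'        # appears in v12_us, v12_ns, v13_us
RAW_V13_NS = b'\x00\x00\xbf\xc1I\x9d\x80\x17'  # appears in v13_ns only


def decode_int64_le(raw: bytes) -> int:
    """Interpret 8 bytes as a signed little-endian 64-bit integer."""
    return struct.unpack('<q', raw)[0]


value_shared = decode_int64_le(RAW_SHARED)
value_v13_ns = decode_int64_le(RAW_V13_NS)

# Assumption: the shared value counts microseconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
as_datetime = EPOCH + timedelta(microseconds=value_shared)

print(value_shared)
print(value_v13_ns)
print(value_v13_ns == value_shared * 1_000)
print(as_datetime.isoformat())
```

If this reading is right, the v12_ns payload physically holds a microsecond value and records the nanosecond unit only in its embedded metadata, which would suggest the v12/v13 difference lies in how that metadata is honored on read rather than in the stored integers.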
   
   ### Component(s)
   
   Python

