avm19 commented on PR #44538: URL: https://github.com/apache/arrow/pull/44538#issuecomment-3237432695
Please include a clear changelog entry. As a user, I have found managing nullable vs. non-nullable ints in Pandas a huge headache, and some user code may break or become redundant simply because you fixed this bug.

To maintainers: it would be great if Pandas' `arrow_table_to_pandas()` and Arrow's `Table.to_pandas()` were always consistent. Would it be possible to steer in this direction in the long term? This is desirable because Arrow's `pyarrow.parquet.read_pandas(...).to_pandas()` and Pandas' `pd.read_parquet(..., dtype_backend="pyarrow")` do essentially the same thing, and from a naive user's perspective they ought to return identical results.

Until this PR is merged, Pandas converts to nullable integers correctly, but Pyarrow returns floats, as described in the issue. See [`pandas.io._util.arrow_table_to_pandas()`](https://github.com/pandas-dev/pandas/blob/3e1d6d5bc853bd8bc983291b12aec2dbf477dde6/pandas/io/_util.py#L71) for how Pandas does it.