avm19 commented on PR #44538:
URL: https://github.com/apache/arrow/pull/44538#issuecomment-3237432695

   Please include a clear changelog entry. For me as a user, managing nullable 
vs non-nullable ints in Pandas has been a huge headache. Some user code may 
break or become redundant simply because you fixed this bug.
   
   To maintainers: it would be great if Pandas' `arrow_table_to_pandas()` and 
Arrow's `Table.to_pandas()` were always consistent. I wonder if it is possible 
to steer this way in the long term? This is desirable because Arrow's 
`pyarrow.parquet.read_pandas(...).to_pandas()` and Pandas' 
`pd.read_parquet(..., dtype_backend="pyarrow")` do essentially the same thing, and 
from a naive user's perspective, they should behave identically. Until this PR is 
merged, Pandas converts to nullable integers correctly, but Pyarrow 
returns floats as described in the issue. See 
[`pandas.io._util.arrow_table_to_pandas()`](https://github.com/pandas-dev/pandas/blob/3e1d6d5bc853bd8bc983291b12aec2dbf477dde6/pandas/io/_util.py#L71)
 for how they do it.
   
   

