Re: [PR] GH-43112: [Python] Set nullable `Int64` `dtype` for integer columns with `None` values when converting to pandas [arrow]

via GitHub Fri, 29 Aug 2025 08:36:43 -0700


avm19 commented on PR #44538:
URL: https://github.com/apache/arrow/pull/44538#issuecomment-3237432695


   Please include a clear changelog entry. For me as a user, managing nullable 
vs non-nullable ints in Pandas has been a huge headache. Some user code may 
break or become redundant simply because you fixed this bug.
   
   To maintainers: it would be great if Pandas' `arrow_table_to_pandas()` and 
Arrow's `Table.to_pandas()` were always consistent. I wonder if it is possible 
to steer this way in the long term? This is desirable because Arrow's 
`pyarrow.parquet.read_pandas(...).to_pandas()` and Pandas' 
`pd.read_parquet(..., dtype_backend="pyarrow")` do essentially the same thing, and 
from a naive user's perspective, they must be identical. Until this PR is 
merged, Pandas converts to nullable integers correctly, but PyArrow 
returns floats as described in the issue. See 
[`pandas.io._util.arrow_table_to_pandas()`](https://github.com/pandas-dev/pandas/blob/3e1d6d5bc853bd8bc983291b12aec2dbf477dde6/pandas/io/_util.py#L71)
 for how they do it.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

