wjones127 edited a comment on issue #12416:
URL: https://github.com/apache/arrow/issues/12416#issuecomment-1039292700


   Hi, the problem here likely isn't the partitioned read, but the conversion 
to pandas. From [the 
docs](https://arrow.apache.org/docs/python/pandas.html#nullable-types):
   
   > In Arrow all data types are nullable, meaning they support storing missing 
values. In pandas, however, not all data types have support for missing data. 
Most notably, the default integer data types do not, and will get casted to 
float when missing values are introduced. Therefore, when an Arrow array or 
table gets converted to pandas, integer columns will become float when missing 
values are present:
   
   There is a workaround using `types_mapper` in [that section of the 
docs](https://arrow.apache.org/docs/python/pandas.html#nullable-types) for 
Int64 specifically, so it is probably worth reading.
   
   If you do find that the inferred partitioning schema is the problem, you 
can pass the partitioning schema explicitly. This should be available in version 
4.0.1: 
https://arrow.apache.org/docs/4.0/python/dataset.html#different-partitioning-schemes
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

