wjones127 edited a comment on issue #12416:
URL: https://github.com/apache/arrow/issues/12416#issuecomment-1039292700
Hi, the problem here likely isn't the read of the partitioned dataset but the conversion to pandas. From [the docs](https://arrow.apache.org/docs/python/pandas.html#nullable-types):

> In Arrow all data types are nullable, meaning they support storing missing values. In pandas, however, not all data types have support for missing data. Most notably, the default integer data types do not, and will get casted to float when missing values are introduced. Therefore, when an Arrow array or table gets converted to pandas, integer columns will become float when missing values are present:

[That section of the docs](https://arrow.apache.org/docs/python/pandas.html#nullable-types) also shows a workaround for Int64 specifically, using the `types_mapper` argument of `to_pandas`, so it is probably worth reading.

If you do find there is an issue with the inferred partitioning schema, you can pass the partitioning schema manually. This should be available in version 4.0.1: https://arrow.apache.org/docs/4.0/python/dataset.html#different-partitioning-schemes