stinodego commented on issue #40128: URL: https://github.com/apache/arrow/issues/40128#issuecomment-1953888244
> Why do we need to do this? Could you explain your use case?

For Polars, we use `pyarrow` to convert our data to `pandas`.

Our `Enum` data type is similar to a pyarrow Dictionary type with a set number of categories. It converts into a pandas categorical type, which also has a set number of categories. In this conversion, we have to cast the index type from `UInt32` (our default) to `Int64`, since pandas does not support unsigned indices. If the data is empty, this cast now loses the category information, and the data type after the conversion is wrong.

See the original issue opened in our repo: https://github.com/pola-rs/polars/issues/14582
