Damian Barabonkov created ARROW-18099:
-----------------------------------------

             Summary: Cannot create pandas categorical from table only with nulls
                 Key: ARROW-18099
                 URL: https://issues.apache.org/jira/browse/ARROW-18099
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 9.0.0
         Environment: OSX 12.6 M1 silicon
            Reporter: Damian Barabonkov


A pyarrow Table with a column that contains only null values cannot be converted to a pandas DataFrame with that column as a categorical.

However, pandas does support "empty" categoricals. A simple workaround is therefore to load the pa.Table with the column as an object dtype first and then, once in pandas, convert it to a categorical, which will be empty. That does not fix the pyarrow bug at its root, though.

Sample reproducible example

```python
import pyarrow as pa

pylist = [{'x': None, '__index_level_0__': 2}, {'x': None, '__index_level_0__': 3}]
tbl = pa.Table.from_pylist(pylist)

# Errors
df_broken = tbl.to_pandas(categories=["x"])

# Works
df_works = tbl.to_pandas()
df_works = df_works.astype({"x": "category"})
```