Tomas Remes created ARROW-9369:
----------------------------------

             Summary: Can't convert dictionary type using table.from_pandas
                 Key: ARROW-9369
                 URL: https://issues.apache.org/jira/browse/ARROW-9369
             Project: Apache Arrow
          Issue Type: Bug
    Affects Versions: 0.17.1
            Reporter: Tomas Remes

Hello,

I am trying to do the following (please correct me if I am doing something nonsensical):
{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

fields = [pa.field("object", pa.dictionary(pa.int64(), pa.string()))]
data = {"object": {"a": "a", "b": "b", "c": "c", "s": "d"}}
df = pd.DataFrame(data)
table = pa.Table.from_pandas(df, pa.schema(fields))
pq.write_table(table, "test.parquet")
{code}
and I am getting:
{noformat}
Traceback (most recent call last):
  File "pa_test.py", line 17, in <module>
    table = pa.Table.from_pandas(df, pa.schema(fields))
  File "pyarrow/table.pxi", line 1451, in pyarrow.lib.Table.from_pandas
  File "/home/tremes/GITHUB/data-pipeline/venv/lib64/python3.7/site-packages/pyarrow/pandas_compat.py", line 575, in dataframe_to_arrays
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/home/tremes/GITHUB/data-pipeline/venv/lib64/python3.7/site-packages/pyarrow/pandas_compat.py", line 575, in <listcomp>
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/home/tremes/GITHUB/data-pipeline/venv/lib64/python3.7/site-packages/pyarrow/pandas_compat.py", line 566, in convert_column
    raise e
  File "/home/tremes/GITHUB/data-pipeline/venv/lib64/python3.7/site-packages/pyarrow/pandas_compat.py", line 560, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 265, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 80, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 106, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: ('Sequence converter for type dictionary<values=string, indices=int64, ordered=0> not implemented', 'Conversion failed for column object with type object')
{noformat}
A workaround is to use {{df.to_parquet("test.parquet")}}.