[
https://issues.apache.org/jira/browse/ARROW-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li resolved ARROW-16898.
------------------------------
Fix Version/s: 9.0.0
Resolution: Fixed
Issue resolved by pull request 13402
[https://github.com/apache/arrow/pull/13402]
> [Python] TypeError from `Table.from_pandas(df)` when df using non-str index name
> --------------------------------------------------------------------------------
>
> Key: ARROW-16898
> URL: https://issues.apache.org/jira/browse/ARROW-16898
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Martin Liu
> Assignee: Martin Liu
> Priority: Minor
> Labels: pull-request-available
> Fix For: 9.0.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When calling {{Table.from_pandas(df)}}, the current code does not convert the
> {{index}} name to str (although it does [convert {{column}} names to
> str|https://github.com/apache/arrow/blob/apache-arrow-8.0.0/python/pyarrow/pandas_compat.py#L356]),
> so the call fails when the df has a *non-str index name*.
> Code to reproduce:
> {code:python}
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6]})
> df = df.set_index(0)
> pa.Table.from_pandas(df)
> {code}
> Error:
> {code:none}
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> Input In [3], in <module>
>       4 df = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6]})
>       5 df = df.set_index(0)
> ----> 6 pa.Table.from_pandas(df)
>
> File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/table.pxi:1394, in pyarrow.lib.Table.from_pandas()
>
> File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py:610, in dataframe_to_arrays(df, schema, preserve_index, nthreads, columns, safe)
>     608 for name, type_ in zip(all_names, types):
>     609     name = name if name is not None else 'None'
> --> 610     fields.append(pa.field(name, type_))
>     611 schema = pa.schema(fields)
>     613 pandas_metadata = construct_metadata(df, column_names, index_columns,
>     614                                      index_descriptors, preserve_index,
>     615                                      types)
>
> File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/types.pxi:1698, in pyarrow.lib.field()
>
> File stringsource:15, in string.from_py.__pyx_convert_string_from_py_std__in_string()
>
> TypeError: expected bytes, int found
> {code}
>
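As a rough illustration of the report above: the traceback shows an integer name reaching pa.field(), which expects a string. The snippet below is only a sketch of the kind of name coercion the description asks for. The helper name normalize_field_names is hypothetical and is not taken from pyarrow or from pull request 13402; the actual fix may be structured differently.

```python
def normalize_field_names(names):
    """Hypothetical helper (not pyarrow's actual code): coerce names to str.

    None is mapped to the string 'None', mirroring line 609 of the
    traceback; every other name is passed through str().
    """
    return ["None" if name is None else str(name) for name in names]


# Names as they would appear for the reproducer: index named 0, column named 1.
print(normalize_field_names([0, 1]))
```

On releases before 9.0.0, one possible workaround for a single-level index is to set df.index.name = str(df.index.name) before calling pa.Table.from_pandas(df), so that only string names reach the schema construction.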