atherkevin commented on issue #15133:
URL: https://github.com/apache/arrow/issues/15133#issuecomment-1372364527
Here is a reduced example that reproduces the failure:
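```python
import pandas as pd
import pyarrow as pa

# Column 'c' mixes floats with one string, so pandas stores it as dtype=object.
df = pd.DataFrame({
    'a': ['a', 'a', 'a', 'b', 'b', 'b', 'c'],
    'b': [1, 2, None, 4, 5, 1, 0],
    'c': [1.5, 2.0, 3.5, 5.0, 8.0, 10.0, 'a string'],
})
fields = [
    pa.field('a', pa.string()),
    pa.field('b', pa.int64()),
    pa.field('c', pa.string()),
]
schema = pa.schema(fields)
# Raises pyarrow.lib.ArrowTypeError (see the stack trace below).
table = pa.Table.from_pandas(df, schema)
```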
Stack trace:
Error
Traceback (most recent call last):
  File "/Users/kevin/PycharmProjects/analyzethatv1/tests/test_pyarrow_file_failure.py", line 20, in test_error_conversion
    table = pa.Table.from_pandas(df, schema)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/table.pxi", line 3475, in pyarrow.lib.Table.from_pandas
  File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 611, in dataframe_to_arrays
    arrays = [convert_column(c, f)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 611, in <listcomp>
    arrays = [convert_column(c, f)
              ^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 598, in convert_column
    raise e
  File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 592, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 123, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'float' object", 'Conversion failed for column c with type object')
So it seems pyarrow reads the schema and attempts to convert column `c` to
string, but then throws an error when it hits a float? I've also tried several
ways of casting the column in pandas before passing it to pyarrow, to no avail.
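
To see why the conversion stops at a float, it can help to look at the Python types actually stored in column `c`. The following is a minimal sketch using only the standard library; the names `c_values`, `type_counts` and `c_as_text` are illustrative and not part of the report. It does not verify whether stringified values then convert cleanly through `pa.Table.from_pandas`, which remains an assumption here.

```python
from collections import Counter

# Values of column 'c' from the reproduction above: six floats and one str.
c_values = [1.5, 2.0, 3.5, 5.0, 8.0, 10.0, 'a string']

# Count the Python types present. A mixed result is what makes pandas store
# the column as dtype=object instead of a uniform string column.
type_counts = Counter(type(v).__name__ for v in c_values)
print(type_counts)

# Explicitly turning every value into text yields a uniform list of str.
c_as_text = [str(v) for v in c_values]
print(c_as_text)
```

The type count matches the error message: the column is mostly `float` objects, while the schema asks for strings.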