atherkevin commented on issue #15133:
URL: https://github.com/apache/arrow/issues/15133#issuecomment-1372364527

   Here is a reduced example that shows the failure:
   
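   `import pandas as pd
   import pyarrow as pa

   # Column 'c' mixes floats with one string, so pandas stores it as dtype object.
   df = pd.DataFrame({
       'a': ['a', 'a', 'a', 'b', 'b', 'b', 'c'],
       'b': [1, 2, None, 4, 5, 1, 0],
       'c': [1.5, 2.0, 3.5, 5.0, 8.0, 10.0, 'a string'],
   })

   # Target schema: column 'c' is declared as a string column.
   fields = [
       pa.field('a', pa.string()),
       pa.field('b', pa.int64()),
       pa.field('c', pa.string()),
   ]
   schema = pa.schema(fields)
   table = pa.Table.from_pandas(df, schema)`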
   
   Stack trace:
   
   Error
   Traceback (most recent call last):
     File "/Users/kevin/PycharmProjects/analyzethatv1/tests/test_pyarrow_file_failure.py", line 20, in test_error_conversion
       table = pa.Table.from_pandas(df, schema)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/table.pxi", line 3475, in pyarrow.lib.Table.from_pandas
     File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 611, in dataframe_to_arrays
       arrays = [convert_column(c, f)
                ^^^^^^^^^^^^^^^^^^^^^
     File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 611, in <listcomp>
       arrays = [convert_column(c, f)
                 ^^^^^^^^^^^^^^^^^^^^
     File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 598, in convert_column
       raise e
     File "/Users/kevin/PycharmProjects/analyzethatv1/venv/lib/python3.11/site-packages/pyarrow/pandas_compat.py", line 592, in convert_column
       result = pa.array(col, type=type_, from_pandas=True, safe=safe)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
     File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
     File "pyarrow/error.pxi", line 123, in pyarrow.lib.check_status
   pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'float' object", 'Conversion failed for column c with type object')
   
   So it seems pyarrow reads the schema and attempts to convert column `c` to
string, but then throws an error when it hits a float? I've also tried
variations of casting the column in pandas before passing it to pyarrow, to no avail.

