Your code leaves a lot of Statuses unchecked. I would suggest
checking all of them and also adding a (checked!) call to
Validate() or ValidateFull() to make sure that everything is
well-formed. It seems like it should be, but this is a prerequisite
before debugging further.

On Sun, Aug 23, 2020 at 1:27 AM Yue Ni <[email protected]> wrote:
>
> Hi there,
>
> I am trying to create a Python binding for our Apache Arrow C++ based
> program, using pybind11 and the pyarrow wrapping code. For some reason, the
> code works on macOS but segfaults on Linux.
>
> I created a minimal test case to reproduce this behavior. Could anyone help
> take a look at what may be going wrong here?
>
> Here is the C++ code for creating the binding (it simply generates a
> fixed-size array, puts it into a record batch, and then creates a table):
> ```
> pybind11::object generate(const int32_t count) {
>   shared_ptr<arrow::Array> array;
>   arrow::Int64Builder builder;
>   for (auto i = 0; i < count; i++) {
>     auto _ = builder.Append(i);
>   }
>   auto _ = builder.Finish(&array);
>   auto record_batch = RecordBatch::Make(
>       arrow::schema(vector{arrow::field("int_value", arrow::int64())}),
>       count, vector{array});
>   auto table =
>       arrow::Table::FromRecordBatches(vector{record_batch}).ValueOrDie();
>   auto result = arrow::py::import_pyarrow();
>   auto wrapped_table = pybind11::reinterpret_borrow<pybind11::object>(
>       pybind11::handle(arrow::py::wrap_table(table)));
>   return wrapped_table;
> }
> ```
>
> Here is the Python code that uses the binding (it calls the binding to
> generate a single-column table with 100 rows, then prints the number of rows
> and the table schema):
> ```
> >>> table = binding.generate(100)
> >>> print(table.num_rows) # this works correctly
> 100
> >>> print(table.shape) # this works correctly
> (100, 1)
> >>> print(table.num_columns) # this works correctly
> 1
> >>> print(table.column_names) # this prints an empty column name, which is incorrect, but the program still runs
> ['']
> >>> print(table.columns) # this causes the segfault
> Segmentation fault (core dumped)
> ```
>
> The same code works correctly on macOS (Apple clang 11, Python 3.7.5,
> arrow 1.0.0 C++ lib, pyarrow 1.0.0), but it doesn't work on Debian bullseye
> (gcc 10.2.0, Python 3.8.5, arrow 1.0.1 C++ lib, pyarrow 1.0.1). I also tried
> some combinations of Python 3.7.5 and arrow/pyarrow 1.0.0 on Linux, but none
> of them worked for me.
>
> I got the core dump and used gdb for some simple debugging, and it seems the
> segfault happens when pyarrow calls `pyarrow_wrap_data_type` during
> `Field.init`.
>
> Here is the core dump:
> ```
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `python3'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f604ea484cc in __pyx_f_7pyarrow_3lib_pyarrow_wrap_data_type () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> [Current thread is 1 (Thread 0x7f60550bc740 (LWP 2205))]
> (gdb) where
> #0  0x00007f604ea484cc in __pyx_f_7pyarrow_3lib_pyarrow_wrap_data_type () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #1  0x00007f604eb05df0 in __pyx_f_7pyarrow_3lib_5Field_init(__pyx_obj_7pyarrow_3lib_Field*, std::shared_ptr<arrow::Field> const&) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #2  0x00007f604ea35d80 in __pyx_f_7pyarrow_3lib_pyarrow_wrap_field () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #3  0x00007f604ea68c8f in __pyx_pw_7pyarrow_3lib_6Schema_28_field(_object*, _object*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #4  0x00007f604ea69595 in __Pyx_PyObject_CallOneArg(_object*, _object*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #5  0x00007f604ea755de in __pyx_pw_7pyarrow_3lib_6Schema_7__getitem__(_object*, _object*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #6  0x00007f604ea476f0 in __pyx_sq_item_7pyarrow_3lib_Schema(_object*, long) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #7  0x00007f604ead554e in __pyx_pw_7pyarrow_3lib_5Table_55_column(_object*, _object*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #8  0x00007f604ea69595 in __Pyx_PyObject_CallOneArg(_object*, _object*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #9  0x00007f604ea8c8df in __pyx_getprop_7pyarrow_3lib_5Table_columns(_object*, void*) () from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
> #10 0x000000000051bafa in ?? ()
> #11 0x0000000000518b3f in _PyObject_GenericGetAttrWithDict ()
> #12 0x0000000000505509 in _PyEval_EvalFrameDefault ()
> #13 0x0000000000503b25 in _PyEval_EvalCodeWithName ()
> #14 0x00000000005ce503 in PyEval_EvalCode ()
> #15 0x00000000005ec461 in ?? ()
> #16 0x00000000005e7a5f in ?? ()
> #17 0x000000000045b2dc in ?? ()
> #18 0x000000000045aee5 in PyRun_InteractiveLoopFlags ()
> #19 0x00000000005ef745 in PyRun_AnyFileExFlags ()
> #20 0x000000000044ddde in ?? ()
> #21 0x00000000005c3899 in Py_BytesMain ()
> #22 0x00007f60550e5cca in __libc_start_main (main=0x5c3860 <main>, argc=1, argv=0x7fffc5992db8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffc5992da8) at ../csu/libc-start.c:308
> #23 0x00000000005c379a in _start ()
> ```
>
> Due to the complexity of the C++/Python conversion, I have no idea whether
> this is an issue in pyarrow, Cython, or pybind11. Can anyone shed some light
> on it, or on how I can troubleshoot such an issue? Thanks.
>
> Regards,
> Yue
>
