[ 
https://issues.apache.org/jira/browse/ARROW-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438389#comment-16438389
 ] 

EmericP commented on ARROW-2459:
--------------------------------

I can reproduce the issue easily on both Linux and macOS. The segfault happens 
in libarrow:
{noformat}
==20185== Process terminating with default action of signal 11 (SIGSEGV)
==20185==  Bad permissions for mapped region at address 0x536E696
==20185==    at 0xB7B36A6: arrow::ipc::Message::ReadFrom(std::shared_ptr<arrow::Buffer> const&, arrow::io::InputStream*, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7B4490: arrow::ipc::ReadMessage(arrow::io::InputStream*, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7B5A0C: arrow::ipc::InputStreamMessageReader::ReadNextMessage(std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7BDF41: arrow::ipc::ReadMessageAndValidate(arrow::ipc::MessageReader*, arrow::ipc::Message::Type, bool, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) [clone .constprop.261] (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7C69E0: arrow::ipc::RecordBatchStreamReader::RecordBatchStreamReaderImpl::ReadSchema() (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7C0EB5: arrow::ipc::RecordBatchStreamReader::Open(std::unique_ptr<arrow::ipc::MessageReader, std::default_delete<arrow::ipc::MessageReader> >, std::shared_ptr<arrow::RecordBatchReader>*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB7C0FB3: arrow::ipc::RecordBatchStreamReader::Open(arrow::io::InputStream*, std::shared_ptr<arrow::RecordBatchReader>*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185==    by 0xB3770C7: __pyx_pw_7pyarrow_3lib_18_RecordBatchReader_3_open(_object*, _object*) (in /usr/lib/python3.5/site-packages/pyarrow/lib.cpython-35m-x86_64-linux-gnu.so)
==20185==    by 0x288CAB: PyEval_EvalFrameEx (in /usr/bin/python3)
==20185==    by 0x28E0DE: PyEval_EvalCodeEx (in /usr/bin/python3)
==20185==    by 0x2CA5D2: ??? (in /usr/bin/python3)
==20185==    by 0x311646: PyObject_Call (in /usr/bin/python3)
{noformat}
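
The trace shows the crash happening while the stream reader fetches its first message (ReadSchema calling down into Message::ReadFrom). The trace alone does not establish the root cause. Purely as an illustration, and not Arrow's actual implementation, the sketch below reads one length-prefixed message from a byte stream and raises an error when the declared length does not match the bytes available, so that input which is not a framed stream is rejected instead of being read blindly. The simplified framing (a 4-byte little-endian length followed by the payload) and the names `read_framed_message` and `MalformedStreamError` are assumptions made for this example.

```python
import io
import struct


class MalformedStreamError(ValueError):
    """Raised when the input does not look like a length-prefixed message."""


def read_framed_message(stream):
    """Read one length-prefixed message from a binary stream.

    Hypothetical helper: assumes a 4-byte little-endian signed length
    followed by exactly that many payload bytes.
    Returns the payload, or None at a clean end of stream.
    """
    prefix = stream.read(4)
    if len(prefix) == 0:
        return None  # nothing left to read
    if len(prefix) < 4:
        raise MalformedStreamError("truncated length prefix")

    (length,) = struct.unpack("<i", prefix)
    if length < 0:
        raise MalformedStreamError("negative message length: %d" % length)

    payload = stream.read(length)
    if len(payload) != length:
        # The declared size is larger than what the stream holds.
        raise MalformedStreamError(
            "declared %d bytes but only %d available" % (length, len(payload))
        )
    return payload


if __name__ == "__main__":
    framed = struct.pack("<i", 5) + b"hello"
    print(read_framed_message(io.BytesIO(framed)))

    try:
        # Arbitrary bytes (here, the start of an HTTP request) are rejected.
        read_framed_message(io.BytesIO(b"GET / HTTP/1.1"))
    except MalformedStreamError as exc:
        print("rejected:", exc)
```

With a check like this, bytes that are not a framed message surface as a Python exception that the calling service can handle, rather than as an out-of-bounds read.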

> pyarrow: Segfault with pyarrow.deserialize_pandas
> -------------------------------------------------
>
>                 Key: ARROW-2459
>                 URL: https://issues.apache.org/jira/browse/ARROW-2459
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: OS X, Linux
>            Reporter: Travis Brady
>            Priority: Major
>
> Following up from [https://github.com/apache/arrow/issues/1884] wherein I 
> found that calling deserialize_pandas in the linked app.py script in the repo 
> linked below causes the app.py process to segfault.
> I initially observed this on OS X, but have since confirmed that the behavior 
> exists on Linux as well.
> Repo containing example: [https://github.com/travisbrady/sanic-arrow] 
> And more generally: what is the right way to get a Java-based HTTP 
> microservice to talk to a Python-based HTTP microservice using Arrow as the 
> serialization format? I'm exchanging DataFrame type objects (they are 
> pandas.DataFrame's on the Python side) between the two services for real-time 
> scoring in a few xgboost models implemented in Python.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
