[ https://issues.apache.org/jira/browse/ARROW-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438389#comment-16438389 ]
EmericP commented on ARROW-2459:
--------------------------------
I can reproduce the issue easily on both Linux and macOS. According to Valgrind,
the segfault happens in libarrow, inside arrow::ipc::Message::ReadFrom:
{noformat}
==20185== Process terminating with default action of signal 11 (SIGSEGV)
==20185== Bad permissions for mapped region at address 0x536E696
==20185== at 0xB7B36A6: arrow::ipc::Message::ReadFrom(std::shared_ptr<arrow::Buffer> const&, arrow::io::InputStream*, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7B4490: arrow::ipc::ReadMessage(arrow::io::InputStream*, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7B5A0C: arrow::ipc::InputStreamMessageReader::ReadNextMessage(std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7BDF41: arrow::ipc::ReadMessageAndValidate(arrow::ipc::MessageReader*, arrow::ipc::Message::Type, bool, std::unique_ptr<arrow::ipc::Message, std::default_delete<arrow::ipc::Message> >*) [clone .constprop.261] (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7C69E0: arrow::ipc::RecordBatchStreamReader::RecordBatchStreamReaderImpl::ReadSchema() (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7C0EB5: arrow::ipc::RecordBatchStreamReader::Open(std::unique_ptr<arrow::ipc::MessageReader, std::default_delete<arrow::ipc::MessageReader> >, std::shared_ptr<arrow::RecordBatchReader>*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB7C0FB3: arrow::ipc::RecordBatchStreamReader::Open(arrow::io::InputStream*, std::shared_ptr<arrow::RecordBatchReader>*) (in /usr/lib/python3.5/site-packages/pyarrow/libarrow.so.0)
==20185== by 0xB3770C7: __pyx_pw_7pyarrow_3lib_18_RecordBatchReader_3_open(_object*, _object*) (in /usr/lib/python3.5/site-packages/pyarrow/lib.cpython-35m-x86_64-linux-gnu.so)
==20185== by 0x288CAB: PyEval_EvalFrameEx (in /usr/bin/python3)
==20185== by 0x28E0DE: PyEval_EvalCodeEx (in /usr/bin/python3)
==20185== by 0x2CA5D2: ??? (in /usr/bin/python3)
==20185== by 0x311646: PyObject_Call (in /usr/bin/python3)
{noformat}
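The trace shows the fault occurring while the stream reader loads its first message (ReadSchema -> ReadMessage -> Message::ReadFrom). If the crash turns out to be triggered by bytes that are not a well-formed Arrow stream (an assumption; the trace alone does not prove this), a cheap guard in front of deserialize_pandas might serve as a stopgap. The sketch below uses only the standard library. It assumes the encapsulated-message framing, where a little-endian int32 metadata length opens each message, optionally preceded by the 0xFFFFFFFF continuation marker that newer Arrow versions write. The helper name plausible_ipc_prefix is hypothetical and is not part of pyarrow.

```python
import struct

# Marker that newer Arrow IPC streams write before the metadata length.
CONTINUATION = 0xFFFFFFFF


def plausible_ipc_prefix(payload: bytes) -> bool:
    """Heuristic check of the first message's length prefix.

    Hypothetical helper, not a pyarrow API. It only checks that the
    declared metadata length is positive and fits inside the payload;
    it does not validate the flatbuffer metadata itself.
    """
    if len(payload) < 4:
        return False
    offset = 0
    (first,) = struct.unpack_from("<I", payload, 0)
    if first == CONTINUATION:
        # Newer framing: skip the marker, the length follows it.
        offset = 4
        if len(payload) < 8:
            return False
    (meta_len,) = struct.unpack_from("<i", payload, offset)
    # The schema message needs non-empty metadata that fits in the buffer.
    return 0 < meta_len <= len(payload) - offset - 4
```

A request handler could return an HTTP 400 when this check fails and only then pass the body on to deserialize_pandas. It rejects obviously wrong input such as a JSON or form-encoded body, but it cannot catch every malformed stream.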
> pyarrow: Segfault with pyarrow.deserialize_pandas
> -------------------------------------------------
>
> Key: ARROW-2459
> URL: https://issues.apache.org/jira/browse/ARROW-2459
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Environment: OS X, Linux
> Reporter: Travis Brady
> Priority: Major
>
> Following up from [https://github.com/apache/arrow/issues/1884]: calling
> deserialize_pandas in the app.py script from the repo linked below causes the
> app.py process to segfault.
> I first observed this on OS X and have since confirmed the same behavior on
> Linux.
> Repo containing example: [https://github.com/travisbrady/sanic-arrow]
> And more generally: what is the right way to get a Java-based HTTP
> microservice to talk to a Python-based HTTP microservice using Arrow as the
> serialization format? I'm exchanging DataFrame-type objects (pandas.DataFrame
> instances on the Python side) between the two services for real-time
> scoring with a few xgboost models implemented in Python.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)