[ 
https://issues.apache.org/jira/browse/ARROW-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315456#comment-16315456
 ] 

ASF GitHub Bot commented on ARROW-1972:
---------------------------------------

robertnishihara commented on a change in pull request #1463: ARROW-1972: 
[Python] Import pyarrow in DeserializeObject.
URL: https://github.com/apache/arrow/pull/1463#discussion_r160059003
 
 

 ##########
 File path: python/pyarrow/tests/deserialize_buffer.py
 ##########
 @@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Deserialization of buffer objects (and pandas dataframes) segfaults on 
> different processes.
> -------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1972
>                 URL: https://issues.apache.org/jira/browse/ARROW-1972
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Robert Nishihara
>              Labels: pull-request-available
>
> To see the issue, first serialize a pyarrow buffer.
> {code}
> import pyarrow as pa
> serialized = pa.serialize(pa.frombuffer(b'hello')).to_buffer().to_pybytes()
> print(serialized)  # b'\x00\x00\x00\x00\x01...'
> {code}
> Deserializing it within the same process succeeds; however, deserializing it 
> in a **separate process** causes a segfault. For example:
> {code}
> import pyarrow as pa
> pa.deserialize(b'\x00\x00\x00\x00\x01...')  # This segfaults
> {code}
> The backtrace is
> {code}
> (lldb) bt
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
> (code=1, address=0x0)
>   * frame #0: 0x0000000000000000
>     frame #1: 0x0000000105605534 
> libarrow_python.0.dylib`arrow::py::wrap_buffer(buffer=std::__1::shared_ptr<arrow::Buffer>::element_type
>  @ 0x000000010060c348 strong=1 weak=1) at pyarrow.cc:48
>     frame #2: 0x000000010554fdee 
> libarrow_python.0.dylib`arrow::py::GetValue(context=0x0000000108f17818, 
> parent=0x0000000100645438, arr=0x0000000100622938, index=0, type=0, 
> base=0x0000000108f0e528, blobs=0x0000000108f09588, result=0x00007fff5fbfd218) 
> at arrow_to_python.cc:173
>     frame #3: 0x000000010554d93a 
> libarrow_python.0.dylib`arrow::py::DeserializeList(context=0x0000000108f17818,
>  array=0x0000000100645438, start_idx=0, stop_idx=2, base=0x0000000108f0e528, 
> blobs=0x0000000108f09588, out=0x00007fff5fbfd470) at arrow_to_python.cc:208
>     frame #4: 0x000000010554d302 
> libarrow_python.0.dylib`arrow::py::DeserializeDict(context=0x0000000108f17818,
>  array=0x0000000100645338, start_idx=0, stop_idx=2, base=0x0000000108f0e528, 
> blobs=0x0000000108f09588, out=0x00007fff5fbfddd8) at arrow_to_python.cc:74
>     frame #5: 0x000000010554f249 
> libarrow_python.0.dylib`arrow::py::GetValue(context=0x0000000108f17818, 
> parent=0x00000001006377a8, arr=0x0000000100645298, index=0, type=0, 
> base=0x0000000108f0e528, blobs=0x0000000108f09588, result=0x00007fff5fbfddd8) 
> at arrow_to_python.cc:158
>     frame #6: 0x000000010554d93a 
> libarrow_python.0.dylib`arrow::py::DeserializeList(context=0x0000000108f17818,
>  array=0x00000001006377a8, start_idx=0, stop_idx=1, base=0x0000000108f0e528, 
> blobs=0x0000000108f09588, out=0x00007fff5fbfdfe8) at arrow_to_python.cc:208
>     frame #7: 0x0000000105551fbf 
> libarrow_python.0.dylib`arrow::py::DeserializeObject(context=0x0000000108f17818,
>  obj=0x0000000108f09588, base=0x0000000108f0e528, out=0x00007fff5fbfdfe8) at 
> arrow_to_python.cc:287
>     frame #8: 0x0000000104abecae 
> lib.cpython-36m-darwin.so`__pyx_pf_7pyarrow_3lib_18SerializedPyObject_2deserialize(__pyx_v_self=0x0000000108f09570,
>  __pyx_v_context=0x0000000108f17818) at lib.cxx:88592
>     frame #9: 0x0000000104abdec4 
> lib.cpython-36m-darwin.so`__pyx_pw_7pyarrow_3lib_18SerializedPyObject_3deserialize(__pyx_v_self=0x0000000108f09570,
>  __pyx_args=0x000000010231f358, __pyx_kwds=0x0000000000000000) at 
> lib.cxx:88514
>     frame #10: 0x000000010008b5f1 python`PyCFunction_Call + 145
>     frame #11: 0x0000000104941208 
> lib.cpython-36m-darwin.so`__Pyx_PyObject_Call(func=0x0000000108f302d0, 
> arg=0x000000010231f358, kw=0x0000000000000000) at lib.cxx:116108
>     frame #12: 0x0000000104b0e3fa 
> lib.cpython-36m-darwin.so`__Pyx__PyObject_CallOneArg(func=0x0000000108f302d0, 
> arg=0x0000000108f17818) at lib.cxx:116147
>     frame #13: 0x0000000104944bc6 
> lib.cpython-36m-darwin.so`__Pyx_PyObject_CallOneArg(func=0x0000000108f302d0, 
> arg=0x0000000108f17818) at lib.cxx:116166
>     frame #14: 0x0000000104b09873 
> lib.cpython-36m-darwin.so`__pyx_pf_7pyarrow_3lib_124deserialize_from(__pyx_self=0x0000000000000000,
>  __pyx_v_source=0x0000000108ddeee8, __pyx_v_base=0x0000000108f0e528, 
> __pyx_v_context=0x0000000108f17818) at lib.cxx:90327
>     frame #15: 0x0000000104b09310 
> lib.cpython-36m-darwin.so`__pyx_pw_7pyarrow_3lib_125deserialize_from(__pyx_self=0x0000000000000000,
>  __pyx_args=0x0000000108f10d38, __pyx_kwds=0x0000000000000000) at 
> lib.cxx:90260
>     frame #16: 0x000000010008b5f1 python`PyCFunction_Call + 145
>     frame #17: 0x0000000104941208 
> lib.cpython-36m-darwin.so`__Pyx_PyObject_Call(func=0x0000000108baf1b0, 
> arg=0x0000000108f10d38, kw=0x0000000000000000) at lib.cxx:116108
>     frame #18: 0x0000000104b0bf9d 
> lib.cpython-36m-darwin.so`__pyx_pf_7pyarrow_3lib_128deserialize(__pyx_self=0x0000000000000000,
>  __pyx_v_obj=0x0000000108f0e528, __pyx_v_context=0x0000000108f17818) at 
> lib.cxx:90770
>     frame #19: 0x0000000104b0b7ec 
> lib.cpython-36m-darwin.so`__pyx_pw_7pyarrow_3lib_129deserialize(__pyx_self=0x0000000000000000,
>  __pyx_args=0x0000000108def1c8, __pyx_kwds=0x0000000000000000) at 
> lib.cxx:90680
>     frame #20: 0x000000010008b5f1 python`PyCFunction_Call + 145
>     frame #21: 0x0000000108d5c468 
> plasma.cpython-36m-darwin.so`__Pyx_PyObject_Call(func=0x0000000108baf240, 
> arg=0x0000000108def1c8, kw=0x0000000000000000) at plasma.cxx:11200
>     frame #22: 0x0000000108d744a7 
> plasma.cpython-36m-darwin.so`__pyx_pf_7pyarrow_6plasma_12PlasmaClient_10get(__pyx_v_self=0x0000000108f0e210,
>  __pyx_v_object_ids=0x0000000108deb248, __pyx_v_timeout_ms=0, 
> __pyx_v_serialization_context=0x0000000108f17818) at plasma.cxx:6480
>     frame #23: 0x0000000108d6c250 
> plasma.cpython-36m-darwin.so`__pyx_pw_7pyarrow_6plasma_12PlasmaClient_11get(__pyx_v_self=0x0000000108f0e210,
>  __pyx_args=0x0000000102363630, __pyx_kwds=0x0000000000000000) at 
> plasma.cxx:6274
>     frame #24: 0x000000010008bc5b python`_PyCFunction_FastCallDict + 363
>     frame #25: 0x00000001001637f2 python`call_function + 146
>     frame #26: 0x00000001001614d5 python`_PyEval_EvalFrameDefault + 47093
>     frame #27: 0x0000000100154aab python`_PyEval_EvalCodeWithName + 427
>     frame #28: 0x0000000100163c4c python`fast_function + 348
>     frame #29: 0x000000010016383e python`call_function + 222
>     frame #30: 0x00000001001614d5 python`_PyEval_EvalFrameDefault + 47093
>     frame #31: 0x0000000100154aab python`_PyEval_EvalCodeWithName + 427
>     frame #32: 0x0000000100163c4c python`fast_function + 348
>     frame #33: 0x000000010016383e python`call_function + 222
>     frame #34: 0x00000001001614d5 python`_PyEval_EvalFrameDefault + 47093
>     frame #35: 0x0000000100154aab python`_PyEval_EvalCodeWithName + 427
>     frame #36: 0x0000000100163c4c python`fast_function + 348
>     frame #37: 0x000000010016383e python`call_function + 222
>     frame #38: 0x00000001001614d5 python`_PyEval_EvalFrameDefault + 47093
>     frame #39: 0x0000000100154aab python`_PyEval_EvalCodeWithName + 427
>     frame #40: 0x00000001001b01dc python`PyRun_InteractiveOneObject + 1132
>     frame #41: 0x00000001001ad15e python`PyRun_InteractiveLoopFlags + 334
>     frame #42: 0x00000001001acfeb python`PyRun_AnyFileExFlags + 139
>     frame #43: 0x00000001001d3378 python`Py_Main + 4632
>     frame #44: 0x00000001000016bd python`main + 509
>     frame #45: 0x00007fffb6073235 libdyld.dylib`start + 1
> {code}
> Note, however, that if we first serialize something in the new process, the 
> same deserialization works. For example, the following succeeds.
> {code}
> import pyarrow as pa
> pa.serialize(1)
> pa.deserialize(b'\x00\x00\x00\x00\x01...')  # This succeeds!
> {code}
> I have a potential fix/workaround, which I will post momentarily.
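
Because the crash only appears when deserialization happens in a fresh process, it can help to drive the reproduction from a small harness that starts a new interpreter and checks how it exited. The following is a minimal sketch using only the standard library; the helper names `run_in_fresh_process` and `died_from_segfault` are hypothetical and not part of pyarrow, and the signal check assumes a POSIX platform, where a child killed by a signal is reported with a negative return code.

```python
import signal
import subprocess
import sys


def run_in_fresh_process(snippet, timeout=60):
    """Run `snippet` in a brand-new Python interpreter and return its exit code.

    Hypothetical helper: a new interpreter guarantees that nothing has been
    imported or serialized beforehand, which is the condition described above.
    """
    completed = subprocess.run([sys.executable, "-c", snippet], timeout=timeout)
    return completed.returncode


def died_from_segfault(returncode):
    """Return True if the child was terminated by SIGSEGV (POSIX only).

    On POSIX, subprocess reports death-by-signal as the negated signal number.
    """
    return returncode == -signal.SIGSEGV
```

To use it for this issue, one would pass a snippet that imports pyarrow and calls `pa.deserialize` on the full bytes printed by the first process (the payload is elided in the description above, so it is not reproduced here), then check `died_from_segfault` on the result. Running the "serialize something first" variant through the same helper should return exit code 0 if the workaround holds.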



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
