That works. I've tried a fair amount of debugging and several workarounds; as
far as I can tell, this is specifically a problem with deserializing from
components (pyarrow.deserialize_components) under multiprocessing.
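
One way to separate the transport from pyarrow is to exercise the same
framing used by send_payload/recv_payload in the quoted script (a metadata
object, then one message per buffer, then an empty message as terminator) with
plain bytes and no pyarrow at all. The sketch below is only an illustration:
the names send_frames, recv_frames and roundtrip are hypothetical, and it runs
in a single process, so it checks the framing only, not the child-process case.

```python
import multiprocessing as mp


def send_frames(meta, buffers, connection):
    """Send a metadata object, then each buffer, then an empty terminator."""
    connection.send(meta)           # pickled by multiprocessing
    for buf in buffers:
        connection.send_bytes(buf)  # one message per buffer
    connection.send_bytes(b'')      # zero-length message marks the end


def recv_frames(connection):
    """Receive the metadata object and all buffers up to the terminator."""
    meta = connection.recv()
    buffers = []
    while True:
        chunk = connection.recv_bytes()
        if len(chunk) == 0:         # terminator reached
            break
        buffers.append(chunk)
    return meta, buffers


def roundtrip(meta, buffers):
    """Push small frames through a one-way pipe within a single process."""
    recv_conn, send_conn = mp.Pipe(duplex=False)
    try:
        send_frames(meta, buffers, send_conn)
        return recv_frames(recv_conn)
    finally:
        send_conn.close()
        recv_conn.close()


if __name__ == '__main__':
    print(roundtrip({'num_buffers': 2}, [b'abc', b'\x00\x01']))
```

With this framing a zero-length buffer in the payload would be
indistinguishable from the terminator and would end the receive loop early.
Whether to_components() can ever produce such a buffer is not established in
this thread.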

On Fri., 6 Jul. 2018, 5:12 pm Robert Nishihara, <robertnishih...@gmail.com>
wrote:

> Can you reproduce it without all of the multiprocessing code? E.g., just
> call *pyarrow.serialize* in one interpreter. Then copy and paste the bytes
> into another interpreter and call *pyarrow.deserialize* or
> *pyarrow.deserialize_components*?
> On Thu, Jul 5, 2018 at 9:48 PM Josh Quigley <
> josh.quig...@lifetrading.com.au>
> wrote:
>
> > Attachment inline:
> >
> > import pyarrow as pa
> > import multiprocessing as mp
> > import numpy as np
> >
> > def make_payload():
> >     """Common function - make data to send"""
> >     return ['message', 123, np.random.uniform(-100, 100, (4, 4))]
> >
> > def send_payload(payload, connection):
> >     """Common function - serialize & send data through a socket"""
> >     s = pa.serialize(payload)
> >     c = s.to_components()
> >
> >     # Send
> >     data = c.pop('data')
> >     connection.send(c)
> >     for d in data:
> >         connection.send_bytes(d)
> >     connection.send_bytes(b'')
> >
> >
> > def recv_payload(connection):
> >     """Common function - recv data through a socket & deserialize"""
> >     c = connection.recv()
> >     c['data'] = []
> >     while True:
> >         r = connection.recv_bytes()
> >         if len(r) == 0:
> >             break
> >         c['data'].append(pa.py_buffer(r))
> >
> >     print('...deserialize')
> >     return pa.deserialize_components(c)
> >
> >
> > def run_same_process():
> >     """Same process: send data down a socket, then read data from the matching socket"""
> >     print('run_same_process')
> >     recv_conn,send_conn = mp.Pipe(duplex=False)
> >     payload = make_payload()
> >     print(payload)
> >     send_payload(payload, send_conn)
> >     payload2 = recv_payload(recv_conn)
> >     print(payload2)
> >
> >
> > def receiver(recv_conn):
> >     """Separate process: runs in a different process, recv data & deserialize"""
> >     print('Receiver started')
> >     payload = recv_payload(recv_conn)
> >     print(payload)
> >
> >
> > def run_separate_process():
> >     """Separate process: launch the child process, then send data"""
> >
> >
> >     print('run_separate_process')
> >     recv_conn,send_conn = mp.Pipe(duplex=False)
> >     process = mp.Process(target=receiver, args=(recv_conn,))
> >     process.start()
> >
> >     payload = make_payload()
> >     print(payload)
> >     send_payload(payload, send_conn)
> >
> >     process.join()
> >
> > if __name__ == '__main__':
> >     run_same_process()
> >     run_separate_process()
> >
> >
> > On Fri, Jul 6, 2018 at 2:42 PM Josh Quigley <
> > josh.quig...@lifetrading.com.au>
> > wrote:
> >
> > > A reproducible program is attached. It first runs serialize/deserialize
> > > in the same process, then does the same work using a separate process
> > > for the deserialize.
> > >
> > > The behaviour I see (after the same-process code executes happily) is
> > > a hang / child-process crash during the call to deserialize.
> > >
> > > Is this expected, and if not, is there a known workaround?
> > >
> > > Running Windows 10, conda distribution, with package versions listed
> > > below. I'll also see what happens if I run on *nix.
> > >
> > >   - arrow-cpp=0.9.0=py36_vc14_7
> > >   - boost-cpp=1.66.0=vc14_1
> > >   - bzip2=1.0.6=vc14_1
> > >   - hdf5=1.10.2=vc14_0
> > >   - lzo=2.10=vc14_0
> > >   - parquet-cpp=1.4.0=vc14_0
> > >   - snappy=1.1.7=vc14_1
> > >   - zlib=1.2.11=vc14_0
> > >   - blas=1.0=mkl
> > >   - blosc=1.14.3=he51fdeb_0
> > >   - cython=0.28.3=py36hfa6e2cd_0
> > >   - icc_rt=2017.0.4=h97af966_0
> > >   - intel-openmp=2018.0.3=0
> > >   - numexpr=2.6.5=py36hcd2f87e_0
> > >   - numpy=1.14.5=py36h9fa60d3_2
> > >   - numpy-base=1.14.5=py36h5c71026_2
> > >   - pandas=0.23.1=py36h830ac7b_0
> > >   - pyarrow=0.9.0=py36hfe5e424_2
> > >   - pytables=3.4.4=py36he6f6034_0
> > >   - python=3.6.6=hea74fb7_0
> > >   - vc=14=h0510ff6_3
> > >   - vs2015_runtime=14.0.25123=3
> > >
> > >
> >
>
