Same question, but stated more simply.

I am using pyarrow and processing the data in pieces, one piece per process (multiprocessing as a workaround for the GIL limitation). What is the correct way to handle this task?

1. Each parallel process creates a list of records, stores them in a record batch, and returns that batch.

2. Each parallel process creates an output buffer and a stream writer, creates a list of records, stores them in a record batch, and writes that record batch to the stream writer. The process then returns the corresponding buffer.

With option (1) I see how to merge all of the batches, but with option (2), how do I merge all the buffers into one once every process has returned its buffer?



Thanks


--
Jonathan MERCIER

Researcher computational biology

PhD, Jonathan MERCIER

Centre National de Recherche en Génomique Humaine (CNRGH)

Bioinformatics (LBI)

2, rue Gaston Crémieux

91057 Evry Cedex

Tel :(33) 1 60 87 34 88

Email :[email protected] <mailto:[email protected]>
