Same question, but simpler to understand.
I am using pyarrow and having each process work on its own piece of the
data (multiprocessing as a workaround for the GIL limitation). What is the
correct way to handle this task?
1. Each parallel process creates a list of records, stores them in a
record batch, and returns this batch.
2. Each parallel process creates an output buffer and a stream writer,
creates a list of records, stores them in a record batch, and writes this
record batch to the stream writer. The process then returns the
corresponding buffer.
With option (1) I see how to merge all of those batches, but with
option (2), how do I merge all the buffers into one once each process
has returned its buffer?
Thanks
--
Jonathan MERCIER
Researcher computational biology
PhD, Jonathan MERCIER
Centre National de Recherche en Génomique Humaine (CNRGH)
Bioinformatics (LBI)
2, rue Gaston Crémieux
91057 Evry Cedex
Tel :(33) 1 60 87 34 88
Email : [email protected]