Hi All,

I'm currently using pyarrow.csv.read_csv to parse a CSV stream that
originates from a ZIP of multiple CSV files. For now, I'm using a separate
implementation to do the streaming ZIP decompression, then
using pyarrow.csv.read_csv at each CSV file boundary.

I would love it if there were a way to leverage pyarrow to handle the
decompression. From what I've seen in examples, a ZIP file containing a
single CSV is supported -- that is, it's possible to operate on a
compressed CSV stream -- but I wonder if it's possible to handle a
compressed stream that contains multiple files?

Thank you in advance!
