First you will need a filesystem that can read from and write to GCS.
There is no native GCS filesystem yet (see [1]), so you will need to
wrap an fsspec-compatible GCS filesystem for use with pyarrow.  There
is an example of how to do this at [2].

To stream the CSV data you can either create a dataset with the CSV
file format (see [3] to learn about datasets) or you can read
incrementally with open_csv [4] and write incrementally with
CSVWriter [5].  More general CSV reading/writing information can be
found at [6].

[1] https://issues.apache.org/jira/browse/ARROW-1231
[2] https://arrow.apache.org/docs/python/filesystems.html#using-fsspec-compatible-filesystems
[3] https://arrow.apache.org/docs/python/dataset.html#tabular-datasets
[4] https://arrow.apache.org/docs/python/generated/pyarrow.csv.open_csv.html#pyarrow.csv.open_csv
[5] https://arrow.apache.org/docs/python/generated/pyarrow.csv.CSVWriter.html#pyarrow.csv.CSVWriter
[6] https://arrow.apache.org/docs/python/csv.html

On Wed, Aug 25, 2021 at 4:59 PM gates ma <[email protected]> wrote:
>
> hi folks,
>
> Looking to use the csv read stream to write to GCS. Is there a way to
> use the pyarrow csv stream to write to a GCS bucket?
>
> Thanks,
> MG.
