Hi Anton,

Does pa.parquet.write_metadata not do what you want?

https://github.com/apache/arrow/blob/master/python/pyarrow/parquet.py#L1205

See also https://issues.apache.org/jira/browse/ARROW-1983

- Wes
On Fri, Aug 31, 2018 at 5:38 PM Anton Goloborodko
<[email protected]> wrote:
>
> Dear Arrow developers,
>
> Our lab is planning to use pyarrow to store some biological information in
> Parquet files. We also have to store some metadata alongside, e.g. which
> sample the data comes from, how it was obtained and processed, etc.
>
> Parquet seems to support file-wide metadata, but I cannot find how to
> write it via pyarrow. The closest thing I could find is how to write
> row-group metadata (https://github.com/pandas-dev/pandas/pull/20534), but
> this seems like overkill, since our metadata is the same for all row
> groups in the file.
>
> Is there any way to write file-wide Parquet metadata with pyarrow?
>
> Thank you!
> Anton.
