[
https://issues.apache.org/jira/browse/ARROW-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rok Mihevc updated ARROW-1858:
------------------------------
External issue URL: https://github.com/apache/arrow/issues/17851
> [Python] Add documentation about parquet.write_to_dataset and related methods
> -----------------------------------------------------------------------------
>
> Key: ARROW-1858
> URL: https://issues.apache.org/jira/browse/ARROW-1858
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Assignee: Donal Simmie
> Priority: Major
> Labels: beginner, pull-request-available
> Fix For: 0.10.0
>
>
> {{pyarrow}} allows one not only to write a single Parquet file, but also to write just the schema metadata for a full multi-file dataset. Such a dataset can also be automatically partitioned by one or more columns. At the moment, this functionality is barely visible in the documentation: you mainly find the API reference for it, but we should have a small tutorial-like section that explains the differences and the use cases for each of these functions.
> See also
> https://stackoverflow.com/questions/47482434/can-pyarrow-write-multiple-parquet-files-to-a-folder-like-fastparquets-file-sch
--
This message was sent by Atlassian Jira
(v8.20.10#820010)