[jira] [Updated] (ARROW-1858) [Python] Add documentation about parquet.write_to_dataset and related methods

Uwe L. Korn (JIRA) Sat, 21 Apr 2018 00:05:50 -0700

     [ https://issues.apache.org/jira/browse/ARROW-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Uwe L. Korn updated ARROW-1858:
-------------------------------
    Description: 
{{pyarrow}} can write more than a single Parquet file: it can also write only 
the schema metadata for a full multi-file dataset. Such a dataset can also be 
partitioned automatically by one or more columns. At the moment, this 
functionality is barely visible in the documentation; only the API 
documentation covers it. We should add a small tutorial-like section that 
explains the differences between these functions and the use cases for each 
of them.

See also 
https://stackoverflow.com/questions/47482434/can-pyarrow-write-multiple-parquet-files-to-a-folder-like-fastparquets-file-sch

  was:See 
https://stackoverflow.com/questions/47482434/can-pyarrow-write-multiple-parquet-files-to-a-folder-like-fastparquets-file-sch


> [Python] Add documentation about parquet.write_to_dataset and related methods
> -----------------------------------------------------------------------------
>
>                 Key: ARROW-1858
>                 URL: https://issues.apache.org/jira/browse/ARROW-1858
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: beginner
>             Fix For: 0.10.0
>
>
> {{pyarrow}} can write more than a single Parquet file: it can also write only 
> the schema metadata for a full multi-file dataset. Such a dataset can also be 
> partitioned automatically by one or more columns. At the moment, this 
> functionality is barely visible in the documentation; only the API 
> documentation covers it. We should add a small tutorial-like section that 
> explains the differences between these functions and the use cases for each 
> of them.
> See also 
> https://stackoverflow.com/questions/47482434/can-pyarrow-write-multiple-parquet-files-to-a-folder-like-fastparquets-file-sch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)