[jira] [Commented] (ARROW-13224) [Python][Doc] Documentation missing for pyarrow.dataset.write_dataset

Joris Van den Bossche (Jira) Thu, 01 Jul 2021 06:30:11 -0700


    [ https://issues.apache.org/jira/browse/ARROW-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17372791#comment-17372791 ]


Joris Van den Bossche commented on ARROW-13224:
-----------------------------------------------

Indeed, we should add documentation for writing datasets
(python/dataset.rst only covers reading right now).

> [Python][Doc] Documentation missing for pyarrow.dataset.write_dataset
> ---------------------------------------------------------------------
>
>                 Key: ARROW-13224
>                 URL: https://issues.apache.org/jira/browse/ARROW-13224
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Documentation, Python
>            Reporter: Weston Pace
>            Assignee: Weston Pace
>            Priority: Major
>
> I don't believe this is meant to be internal.
> pyarrow.parquet.write_to_dataset uses this (if use_legacy_dataset=False), but
> the parquet API doesn't expose the same features. A new example should
> probably also be added to the Tabular Datasets section of the docs explaining
> why write_dataset can take in a scanner (e.g. preserving memory, ability to
> write a dataset from Flight or any record batch source, etc.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
