[ https://issues.apache.org/jira/browse/ARROW-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17372791#comment-17372791 ]
Joris Van den Bossche commented on ARROW-13224:
-----------------------------------------------
Indeed, we should add documentation for writing datasets
(python/dataset.rst currently covers only reading).
> [Python][Doc] Documentation missing for pyarrow.dataset.write_dataset
> ---------------------------------------------------------------------
>
> Key: ARROW-13224
> URL: https://issues.apache.org/jira/browse/ARROW-13224
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Documentation, Python
> Reporter: Weston Pace
> Assignee: Weston Pace
> Priority: Major
>
> I don't believe pyarrow.dataset.write_dataset is meant to be internal.
> pyarrow.parquet.write_to_dataset uses it (if use_legacy_dataset=False), but
> the parquet API doesn't expose the same features. A new example should
> probably also be added to the Tabular Datasets section of the docs, explaining
> why write_dataset can take a scanner (e.g. it is memory-preserving and makes
> it possible to write a dataset from Flight or any other record batch source).