[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056487#comment-17056487 ]
Neal Richardson commented on ARROW-8039:
----------------------------------------
Ah, good call. That sounds reasonable to me (as someone who is not a user). And
it looks easy enough to promote only read_table and drop the mention of
ParquetDataset from
https://arrow.apache.org/docs/python/parquet.html#partitioned-datasets-multiple-files.
So the idea would be that read_table would be the function that gets the new
Dataset option, and ParquetDataset would be unchanged (just no longer
encouraged for use).
[~wesm] thoughts?
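
For illustration only, the idea above (read_table gains an option that selects the new Dataset-backed path, while ParquetDataset itself stays as it is) might be sketched roughly as follows. Everything here is hypothetical: the `use_new_dataset` flag, its default, the two helper functions, and the `Table` stand-in are placeholders, not the actual pyarrow API or the eventual implementation.

```python
from typing import Dict, List

# Stand-in for pyarrow.Table so the sketch runs without pyarrow installed.
Table = List[Dict[str, str]]


def _read_with_parquet_dataset(path: str) -> Table:
    """Hypothetical placeholder for the existing ParquetDataset code path."""
    return [{"path": path, "backend": "legacy ParquetDataset"}]


def _read_with_new_dataset(path: str) -> Table:
    """Hypothetical placeholder for a reader backed by pyarrow.dataset."""
    return [{"path": path, "backend": "pyarrow.dataset"}]


def read_table(path: str, use_new_dataset: bool = False) -> Table:
    """Single entry point; the flag name and its default are assumptions.

    ParquetDataset is left untouched. Only this function decides which
    implementation is used to read the data.
    """
    if use_new_dataset:
        return _read_with_new_dataset(path)
    return _read_with_parquet_dataset(path)


# Callers keep using one function and switch backends with the flag.
legacy = read_table("data/")
new = read_table("data/", use_new_dataset=True)
```

Keeping the switch on read_table means the documentation can point users at one function, and the legacy class needs no changes.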
> [C++][Python][Dataset] Assemble a minimal ParquetDataset shim
> -------------------------------------------------------------
>
> Key: ARROW-8039
> URL: https://issues.apache.org/jira/browse/ARROW-8039
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: C++ - Dataset, Python
> Affects Versions: 0.16.0
> Reporter: Ben Kietzman
> Assignee: Ben Kietzman
> Priority: Major
> Fix For: 0.17.0
>
>
> Assemble a minimal ParquetDataset shim backed by {{pyarrow.dataset.*}}.
> Replace the existing ParquetDataset with the shim by default, and allow opt-out
> for users who need the current ParquetDataset.
> This is mostly exploratory, to see which of the Python tests fail.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)