[ https://issues.apache.org/jira/browse/ARROW-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe L. Korn resolved ARROW-5436.
--------------------------------
Resolution: Fixed
Issue resolved by pull request 4409
[https://github.com/apache/arrow/pull/4409]
> [Python] expose filters argument in parquet.read_table
> ------------------------------------------------------
>
> Key: ARROW-5436
> URL: https://issues.apache.org/jira/browse/ARROW-5436
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Labels: parquet, pull-request-available
> Fix For: 0.14.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently, the {{parquet.read_table}} function can be used both for reading a
> single file (interface to ParquetFile) and for reading a directory (interface
> to ParquetDataset).
> ParquetDataset has some extra keywords such as {{filters}} that would be nice
> to expose through {{read_table}} as well.
> Of course one can always use {{ParquetDataset}} directly when its full power
> is needed, but for pandas, which wraps pyarrow, it is easier to pass keywords
> straight through to {{parquet.read_table}} than to choose between calling
> {{read_table}} and {{ParquetDataset}}.
> Context: https://github.com/pandas-dev/pandas/issues/26551
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)