[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-2801:
-----------------------------------
    Fix Version/s:     (was: 4.0.0)
                   5.0.0

> [Python][C++][Dataset] Implement split_row_groups for ParquetDataset
> --------------------------------------------------------------------
>
>                 Key: ARROW-2801
>                 URL: https://issues.apache.org/jira/browse/ARROW-2801
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Robbie Gruener
>            Assignee: Joris Van den Bossche
>            Priority: Minor
>              Labels: dataset, dataset-parquet-read, parquet, pull-request-available
>             Fix For: 5.0.0
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently the split_row_groups argument in ParquetDataset yields a not
> implemented error. An easy and efficient way to implement this is by using
> the summary metadata file instead of opening every footer file.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
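
For illustration only, the idea in the issue description (enumerating row groups from the summary metadata file rather than opening each data file's footer) might look roughly like the sketch below. This is not the implementation tracked by this issue. The names `RowGroupPiece`, `row_group_pieces`, and `pieces_from_summary_file` are hypothetical, and the sketch assumes the summary file is a `_metadata` file whose column chunks carry a `file_path` and whose row groups appear in the same order as inside each data file.

```python
from collections import namedtuple

# Hypothetical container: one piece per (data file, row group) pair.
RowGroupPiece = namedtuple("RowGroupPiece", ["file_path", "row_group", "num_rows"])


def row_group_pieces(metadata):
    """List one piece per row group described by a summary metadata object.

    `metadata` is expected to behave like pyarrow.parquet.FileMetaData:
    it exposes `num_row_groups` and `row_group(i)`, and each row group
    exposes `num_rows` and `column(j).file_path`.
    """
    pieces = []
    next_index = {}  # per-file counter of row groups seen so far
    for i in range(metadata.num_row_groups):
        row_group = metadata.row_group(i)
        # Assumption: every column chunk of a row group lives in the same
        # file, so the first column's file_path identifies the data file.
        path = row_group.column(0).file_path
        index_in_file = next_index.get(path, 0)
        next_index[path] = index_in_file + 1
        pieces.append(RowGroupPiece(path, index_in_file, row_group.num_rows))
    return pieces


def pieces_from_summary_file(metadata_path):
    """Read a `_metadata` summary file once and split it into pieces."""
    import pyarrow.parquet as pq  # imported lazily; requires pyarrow

    return row_group_pieces(pq.read_metadata(metadata_path))
```

Each resulting piece could then be read on its own, for example with `pq.ParquetFile(path).read_row_group(index)`, where `path` is resolved relative to the dataset root. Only the single summary file is opened to plan the split.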