[jira] [Comment Edited] (ARROW-3822) [C++] parquet::arrow::FileReader::GetRecordBatchReader may not iterate through chunked columns completely

Ben Kietzman (Jira) Wed, 04 Aug 2021 07:08:05 -0700


    [ https://issues.apache.org/jira/browse/ARROW-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393190#comment-17393190 ]


Ben Kietzman edited comment on ARROW-3822 at 8/4/21, 2:07 PM:
--------------------------------------------------------------

(It's been a while since I looked at this, but IIRC: ) the row group size 
heuristic was ignored and I wasn't able to generate a parquet file smaller than 
2GB which reproduced this issue.


was (Author: bkietz):
(It's been a while since I looked at this, but IIRC:) the row group size 
heuristic was ignored and I wasn't able to generate a parquet file smaller than 
2GB which reproduced this issue.

> [C++] parquet::arrow::FileReader::GetRecordBatchReader may not iterate 
> through chunked columns completely
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-3822
>                 URL: https://issues.apache.org/jira/browse/ARROW-3822
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> EDIT: https://github.com/apache/arrow/pull/3911#issuecomment-473679153
> We don't currently test that all data is iterated through when reading from a 
> Parquet file where the result is chunked.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
