[jira] [Commented] (ARROW-14025) [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes

Neal Richardson (Jira) Mon, 04 Oct 2021 12:40:07 -0700


    [ https://issues.apache.org/jira/browse/ARROW-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424132#comment-17424132 ]


Neal Richardson commented on ARROW-14025:
-----------------------------------------

The Dataset gets created with ParquetFragmentScanOptions (or the equivalent 
for whatever format it uses), so shouldn't we grab those options and start 
from them? That is perhaps less important for Parquet, but for CSV don't we 
need them in order to get the parsing options?

> [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes
> ----------------------------------------------------------------------
>
>                 Key: ARROW-14025
>                 URL: https://issues.apache.org/jira/browse/ARROW-14025
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, R
>            Reporter: Weston Pace
>            Assignee: Neal Richardson
>            Priority: Major
>             Fix For: 6.0.0
>
>
> In ExecNode_Scan a ScanOptions object is built up. If we are reading Parquet, 
> we should enable pre-buffering. This is done by creating a 
> ParquetFragmentScanOptions object and enabling pre-buffering on it.
> Alternatively, we could just default pre-buffering to true for asynchronous 
> scans of Parquet data.
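
A rough sketch of how the two ideas in this thread could fit together: start
from the fragment scan options the dataset was created with, and turn on
pre-buffering only when those options are the Parquet kind. The classes and
the function below are hypothetical stand-ins written for illustration; they
are not the Arrow C++ or R API, and the field names (pre_buffer,
use_buffered_stream, delimiter) are assumptions chosen to resemble it.

```python
from dataclasses import dataclass, replace


# Hypothetical stand-ins for illustration only; NOT the real Arrow classes.
@dataclass(frozen=True)
class ParquetFragmentScanOptions:
    pre_buffer: bool = False
    use_buffered_stream: bool = False


@dataclass(frozen=True)
class CsvFragmentScanOptions:
    delimiter: str = ","


def resolve_fragment_scan_options(dataset_defaults, is_async_scan=True):
    """Start from the dataset's own options instead of building new ones.

    For Parquet on an asynchronous scan, return a copy with pre-buffering
    enabled. Every other setting, and every other format, passes through
    unchanged, so CSV parsing options are not lost.
    """
    if isinstance(dataset_defaults, ParquetFragmentScanOptions) and is_async_scan:
        return replace(dataset_defaults, pre_buffer=True)
    return dataset_defaults


parquet_opts = resolve_fragment_scan_options(
    ParquetFragmentScanOptions(use_buffered_stream=True)
)
csv_opts = resolve_fragment_scan_options(CsvFragmentScanOptions(delimiter="|"))
```

Copying the existing options rather than constructing a fresh object is what
keeps format-specific settings, such as a CSV delimiter, intact.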



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
