[
https://issues.apache.org/jira/browse/ARROW-18113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li resolved ARROW-18113.
------------------------------
Fix Version/s: 11.0.0
Resolution: Fixed
Issue resolved by pull request 14723
[https://github.com/apache/arrow/pull/14723]
> [C++] Implement a read range process without caching
> ----------------------------------------------------
>
> Key: ARROW-18113
> URL: https://issues.apache.org/jira/browse/ARROW-18113
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Percy Camilo Triveño Aucahuasi
> Assignee: Percy Camilo Triveño Aucahuasi
> Priority: Major
> Labels: pull-request-available
> Fix For: 11.0.0
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> The current
> [ReadRangeCache|https://github.com/apache/arrow/blob/e06e98db356e602212019cfbae83fd3d5347292d/cpp/src/arrow/io/caching.h#L100]
> mixes caching with coalescing, which makes it difficult to implement readers
> that can actually perform concurrent reads on coalesced data (see this
> [GitHub comment|https://github.com/apache/arrow/pull/14226#discussion_r999334979]
> for additional context). For instance, the prebuffering feature of those
> readers currently cannot handle concurrent invocations.
> The goal of this ticket is to implement a component similar to
> ReadRangeCache that performs non-cached reads (doing only the coalescing
> part). Once we have that capability, we can port the Parquet and IPC
> readers to the new component and keep improving the reading process (that
> work would be covered by a separate set of follow-up tickets). Similar ideas
> were mentioned in https://issues.apache.org/jira/browse/ARROW-17599
> A good place to implement this capability may be inside the file system
> abstraction (as a dedicated method for reading coalesced data), where the
> abstract file system can provide a default implementation.
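
For readers unfamiliar with the "coalescing part" mentioned above, here is a
minimal, standalone sketch of coalescing byte ranges with no cache involved.
It is not the code from pull request 14723 and it does not use Arrow's API:
the ByteRange type, the CoalesceRanges function, and the two limit parameters
are hypothetical names chosen for illustration. The assumed policy is that two
ranges are merged when the gap between them is at most hole_size_limit and the
merged range stays within range_size_limit.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration only; not Arrow's own range type.
struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Sketch of cache-free coalescing: sort the requested ranges by offset, then
// merge a range into the previous one when the hole between them is at most
// hole_size_limit and the merged range does not exceed range_size_limit.
// Nothing is stored between calls; the caller issues one read per result.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges,
                                      int64_t hole_size_limit,
                                      int64_t range_size_limit) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> out;
  for (const ByteRange& r : ranges) {
    if (r.length <= 0) continue;  // skip empty requests
    if (!out.empty()) {
      ByteRange& last = out.back();
      const int64_t last_end = last.offset + last.length;
      const int64_t new_end = std::max(last_end, r.offset + r.length);
      const bool close_enough = r.offset - last_end <= hole_size_limit;
      const bool small_enough = new_end - last.offset <= range_size_limit;
      if (close_enough && small_enough) {
        last.length = new_end - last.offset;  // extend the previous range
        continue;
      }
    }
    out.push_back(r);
  }
  return out;
}
```

Because the function keeps no state, concurrent callers can each coalesce
their own request list independently, which is the property the description
above asks for.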