Pavel Solodovnikov created ARROW-17318:
------------------------------------------
Summary: [C++][Dataset] Support async streaming interface for
getting fragments in Dataset
Key: ARROW-17318
URL: https://issues.apache.org/jira/browse/ARROW-17318
Project: Apache Arrow
Issue Type: Sub-task
Reporter: Pavel Solodovnikov
Assignee: Pavel Solodovnikov
Add `GetFragmentsAsync()` and `GetFragmentsAsyncImpl()` functions to the
generic `Dataset` interface, allowing fragments to be produced in a streaming
fashion.
This is one of the prerequisites for making `FileSystemDataset` support lazy
fragment processing, which, in turn, would let scan operations start
without waiting for the entire dataset to be discovered.
To ease the transition to an async implementation in the
`Dataset`/`AsyncScanner` code, a default implementation of
`GetFragmentsAsyncImpl()` should be provided. It would yield a VectorGenerator
over the fragments vector, which every implementation of the `Dataset`
interface currently stores.
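
A rough, standalone sketch of the proposed shape follows. It does not use Arrow headers, and it is not the actual patch: `Fragment`, `FragmentGenerator`, `MakeVectorGeneratorSketch` and `CollectPaths` are hypothetical stand-ins, with `std::future` and `std::optional` standing in for whatever future and end-of-stream types the real interface would use. It only illustrates how a public `GetFragmentsAsync()` could delegate to a virtual `GetFragmentsAsyncImpl()` whose default wraps the already-stored fragments vector.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset fragment.
struct Fragment {
  std::string path;
};

// Hypothetical generator type: each call returns a future for the next
// fragment; an empty optional marks the end of the stream.
using FragmentGenerator = std::function<std::future<std::optional<Fragment>>()>;

// Hypothetical helper: yields the elements of a vector one at a time as
// already-completed futures.
FragmentGenerator MakeVectorGeneratorSketch(std::vector<Fragment> fragments) {
  auto state = std::make_shared<std::vector<Fragment>>(std::move(fragments));
  auto index = std::make_shared<std::size_t>(0);
  return [state, index]() {
    std::promise<std::optional<Fragment>> promise;
    if (*index < state->size()) {
      promise.set_value(std::optional<Fragment>((*state)[(*index)++]));
    } else {
      promise.set_value(std::nullopt);
    }
    return promise.get_future();
  };
}

// Simplified dataset: the public entry point delegates to a virtual Impl.
class Dataset {
 public:
  explicit Dataset(std::vector<Fragment> fragments)
      : fragments_(std::move(fragments)) {}
  virtual ~Dataset() = default;

  FragmentGenerator GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  // Default implementation: stream the fragments vector that is already
  // stored. A lazily discovered dataset could override this instead.
  virtual FragmentGenerator GetFragmentsAsyncImpl() {
    return MakeVectorGeneratorSketch(fragments_);
  }

  std::vector<Fragment> fragments_;
};

// Drains the generator and collects fragment paths in arrival order.
std::vector<std::string> CollectPaths(Dataset& dataset) {
  FragmentGenerator gen = dataset.GetFragmentsAsync();
  assert(gen && "generator must be callable");
  std::vector<std::string> paths;
  while (std::optional<Fragment> next = gen().get()) {
    paths.push_back(next->path);
  }
  return paths;
}
```

With this split, existing implementations would keep working through the default, while a lazily discovered dataset could override only the `Impl` function to hand out fragments as they are found.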
--
This message was sent by Atlassian Jira
(v8.20.10#820010)