Pavel Solodovnikov created ARROW-17318:
------------------------------------------

             Summary: [C++][Dataset] Support async streaming interface for getting fragments in Dataset
                 Key: ARROW-17318
                 URL: https://issues.apache.org/jira/browse/ARROW-17318
             Project: Apache Arrow
          Issue Type: Sub-task
            Reporter: Pavel Solodovnikov
            Assignee: Pavel Solodovnikov


Add `GetFragmentsAsync()` and `GetFragmentsAsyncImpl()` functions to the 
generic `Dataset` interface so that fragments can be produced in a streamed 
fashion.

This is one of the prerequisites for making `FileSystemDataset` support lazy 
fragment processing, which, in turn, can be used to start scan operations 
without waiting for the entire dataset to be discovered.

To ease the transition to an async implementation in the 
`Dataset`/`AsyncScanner` code, a default implementation of 
`GetFragmentsAsyncImpl()` should be provided (yielding a VectorGenerator over 
the fragments vector, which every implementation of the `Dataset` interface 
currently stores).
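
A minimal sketch of the proposed default is given below. It is not Arrow code: 
`Fragment`, `FragmentGenerator`, `MakeVectorGeneratorSketch`, 
`InMemoryDataset` and `CollectPaths` are hypothetical stand-ins, and a plain 
callback replaces Arrow's own future and generator types so that the example 
compiles with the standard library alone. It only illustrates the shape of the 
idea: the base class wraps the already stored vector in a generator, and a 
subclass may later override the hook to discover fragments lazily.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Arrow's real types.
struct Fragment {
  std::string path;
};
using FragmentPtr = std::shared_ptr<Fragment>;

// Each call to the generator requests the next fragment. The callback fires
// once that fragment is available; a null pointer marks the end of the stream.
using FragmentCallback = std::function<void(FragmentPtr)>;
using FragmentGenerator = std::function<void(FragmentCallback)>;

// Wraps an already materialized vector in a generator (assumed helper name).
FragmentGenerator MakeVectorGeneratorSketch(std::vector<FragmentPtr> fragments) {
  struct State {
    std::vector<FragmentPtr> items;
    std::size_t next = 0;
  };
  auto state = std::make_shared<State>();
  state->items = std::move(fragments);
  return [state](FragmentCallback on_next) {
    if (state->next < state->items.size()) {
      on_next(state->items[state->next++]);
    } else {
      on_next(nullptr);  // end of stream
    }
  };
}

class Dataset {
 public:
  virtual ~Dataset() = default;
  FragmentGenerator GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  // Existing eager hook: every implementation can return its full vector.
  virtual std::vector<FragmentPtr> GetFragmentsImpl() = 0;

  // Default async hook: stream the eager vector. Subclasses that can discover
  // fragments lazily would override this instead.
  virtual FragmentGenerator GetFragmentsAsyncImpl() {
    return MakeVectorGeneratorSketch(GetFragmentsImpl());
  }
};

class InMemoryDataset : public Dataset {
 public:
  explicit InMemoryDataset(std::vector<FragmentPtr> fragments)
      : fragments_(std::move(fragments)) {}

 protected:
  std::vector<FragmentPtr> GetFragmentsImpl() override { return fragments_; }

 private:
  std::vector<FragmentPtr> fragments_;
};

// Drains a generator. This loop relies on the callback being invoked before
// the generator call returns, which holds for the vector-backed sketch above.
std::vector<std::string> CollectPaths(FragmentGenerator gen) {
  std::vector<std::string> paths;
  bool done = false;
  while (!done) {
    gen([&paths, &done](FragmentPtr fragment) {
      if (fragment) {
        paths.push_back(fragment->path);
      } else {
        done = true;
      }
    });
  }
  return paths;
}
```

With this shape, callers that only use `GetFragmentsAsync()` need no changes 
when a dataset later switches from the vector-backed default to a lazy 
override.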



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
