ManManson opened a new pull request, #13804:
URL: https://github.com/apache/arrow/pull/13804

   Add `GetFragmentsAsync()` and `GetFragmentsAsyncImpl()`
   functions to the generic `Dataset` interface, which
   allow fragments to be produced in a streamed fashion.
   
   This is one of the prerequisites for making
   `FileSystemDataset` support lazy fragment
   processing, which, in turn, can be used to start
   scan operations without waiting for the entire
   dataset to be discovered.
   
   To ease the transition to an async implementation
   in the `Dataset`/`AsyncScanner` code, a default
   implementation of `GetFragmentsAsyncImpl()` is
   provided. It yields a VectorGenerator over the
   fragments vector, which every implementation of
   the `Dataset` interface currently stores.
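
   The shape of the default implementation can be sketched
   roughly as follows. This is a simplified, synchronous
   illustration and not the actual Arrow code: `Fragment`,
   `Generator`, `MakeVectorGenerator`, `InMemoryDataset` and
   `CollectPaths` are hypothetical stand-ins defined here,
   and the real generator would yield futures rather than
   `std::optional` values. Requires C++17.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset fragment (not Arrow's Fragment class).
struct Fragment {
  std::string path;
};

// Simplified pull-style generator: each call yields the next item, or
// std::nullopt once exhausted. An async generator would return futures instead.
template <typename T>
using Generator = std::function<std::optional<T>()>;

// Vector-backed generator: hands out one element of the vector per call.
template <typename T>
Generator<T> MakeVectorGenerator(std::vector<T> items) {
  auto state = std::make_shared<std::vector<T>>(std::move(items));
  auto index = std::make_shared<std::size_t>(0);
  return [state, index]() -> std::optional<T> {
    if (*index >= state->size()) return std::nullopt;
    return (*state)[(*index)++];
  };
}

class Dataset {
 public:
  virtual ~Dataset() = default;

  // Public entry point; forwards to the overridable implementation.
  Generator<Fragment> GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  explicit Dataset(std::vector<Fragment> fragments)
      : fragments_(std::move(fragments)) {}

  // Default implementation: stream the already-materialized fragments vector.
  // A lazily discovered dataset would override this to yield fragments as
  // they are found.
  virtual Generator<Fragment> GetFragmentsAsyncImpl() {
    return MakeVectorGenerator(fragments_);
  }

  std::vector<Fragment> fragments_;
};

// Hypothetical concrete dataset that relies on the default implementation.
class InMemoryDataset : public Dataset {
 public:
  explicit InMemoryDataset(std::vector<Fragment> fragments)
      : Dataset(std::move(fragments)) {}
};

// Drains a generator and collects the fragment paths in order.
std::vector<std::string> CollectPaths(Generator<Fragment> gen) {
  std::vector<std::string> paths;
  while (auto fragment = gen()) paths.push_back(fragment->path);
  return paths;
}
```

   In this sketch a consumer can start handling the first
   fragment as soon as the generator yields it, which is
   the property a lazily discovered dataset would rely on.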
   
   Tests: unit(release)
   
   Signed-off-by: Pavel Solodovnikov <[email protected]>

