[ https://issues.apache.org/jira/browse/ARROW-11924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299081#comment-17299081 ]
Antoine Pitrou commented on ARROW-11924:
----------------------------------------
Note that the implementation may well be able to produce results in
parallel. However, the number of results is not known up front.
I'm not sure whether that's a problem.
> [C++] Provide streaming output from GetFileInfo
> -----------------------------------------------
>
> Key: ARROW-11924
> URL: https://issues.apache.org/jira/browse/ARROW-11924
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Affects Versions: 3.0.0
> Reporter: Ben Kietzman
> Assignee: Antoine Pitrou
> Priority: Major
>
> For situations where a monolithic call to GetFileInfo will be slow, it would
> be useful to immediately receive any results which *are* ready through an
> {{AsyncGenerator<std::vector<FileInfo>>}} or so. This is probably a
> prerequisite of ARROW-8163, where the goal is to begin scanning known
> fragments while other fragments are still being discovered.
> IIUC, one concrete example would be paging through a long output from S3's
> ListObjectsV2.
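
As a rough illustration of the streaming idea described in the issue, the
sketch below pages through a listing and hands each batch to the consumer as
soon as it is available. It is plain standard C++, not Arrow code:
`FileInfoSketch`, `BatchGenerator`, `MakePagedListing` and `ScanAll` are
hypothetical names, the generator is synchronous rather than an actual
`AsyncGenerator`, and the in-memory vector only stands in for a paged listing
such as ListObjectsV2. The end of the stream is signalled with `std::nullopt`,
so the consumer never needs to know the number of results up front.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file info entry; only the path is modeled.
struct FileInfoSketch {
  std::string path;
};

using FileInfoBatch = std::vector<FileInfoSketch>;

// Synchronous stand-in for a generator of batches (an assumption, not the
// Arrow type): each call yields the next batch, or std::nullopt once the
// listing is exhausted.
using BatchGenerator = std::function<std::optional<FileInfoBatch>()>;

// Builds a generator that walks `all_paths` in pages of `page_size` entries,
// imitating a listing API that returns one page per request.
BatchGenerator MakePagedListing(std::vector<std::string> all_paths,
                                std::size_t page_size) {
  assert(page_size > 0);
  std::size_t next = 0;  // index of the first entry of the next page
  return [paths = std::move(all_paths), page_size,
          next]() mutable -> std::optional<FileInfoBatch> {
    if (next >= paths.size()) return std::nullopt;  // end of stream
    const std::size_t end = std::min(next + page_size, paths.size());
    FileInfoBatch batch;
    for (std::size_t i = next; i < end; ++i) {
      batch.push_back(FileInfoSketch{paths[i]});
    }
    next = end;
    return batch;
  };
}

// Consumes batches one at a time, so work on early batches can start before
// later ones exist. Returns the number of batches seen.
std::size_t ScanAll(BatchGenerator& gen, std::vector<std::string>* scanned) {
  std::size_t batches = 0;
  while (auto batch = gen()) {
    ++batches;
    for (const auto& info : *batch) scanned->push_back(info.path);
  }
  return batches;
}
```

For example, calling `MakePagedListing({"a", "b", "c", "d", "e"}, 2)` and
passing the generator to `ScanAll` visits three batches (2, 2 and 1 entries).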