Will Jones created ARROW-16703:
----------------------------------
Summary: [R] Refactor map_batches() so it can stream results
Key: ARROW-16703
URL: https://issues.apache.org/jira/browse/ARROW-16703
Project: Apache Arrow
Issue Type: Improvement
Components: R
Affects Versions: 8.0.0
Reporter: Will Jones
Fix For: 9.0.0
As part of ARROW-15271, {{map_batches()}} was modified to return a
{{RecordBatchReader}}, but the implementation collects all results as a list of
record batches and then converts that list to a reader, so nothing is actually
streamed. In theory, if we push the implementation down to C++, we should be
able to make a properly streaming {{RecordBatchReader}}.

The complication is that a reader has to report its schema before it yields
any batches, and we won't know the schema of the mapped output ahead of time.
We could optionally accept the schema as an argument, which would allow the
function to be fully lazy. Or we could eagerly evaluate just the first batch
to determine the schema.
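
To make the two schema options concrete, here is a rough, language-neutral sketch in plain Python. It is not Arrow code and not a proposed implementation: {{StreamingMapReader}} and {{infer_schema}} are hypothetical names, batches are stood in for by dicts of column name to list of values, and a "schema" is just a tuple of (name, type name) pairs. It only illustrates the control flow: a caller-supplied schema keeps everything lazy, while a missing schema forces evaluation of exactly one batch.

```python
from itertools import chain


def infer_schema(batch):
    """Describe a toy batch (dict of column name -> list) as (name, type name) pairs."""
    return tuple(
        (name, type(values[0]).__name__ if values else "null")
        for name, values in batch.items()
    )


class StreamingMapReader:
    """Hypothetical stand-in for a streaming reader; not an Arrow API."""

    def __init__(self, batches, fn, schema=None):
        # Lazy pipeline: fn is not called until a result is requested.
        results = (fn(batch) for batch in batches)
        if schema is None:
            # No schema supplied: evaluate only the first batch to learn it,
            # then put that batch back at the front of the stream.
            first = next(results, None)
            if first is None:
                raise ValueError("cannot infer a schema from an empty input")
            schema = infer_schema(first)
            results = chain([first], results)
        self.schema = schema
        self._results = results

    def __iter__(self):
        return self

    def __next__(self):
        # Each call pulls (and maps) at most one more input batch.
        return next(self._results)


def add_one(batch):
    # Example user function applied to each toy batch.
    return {"x": [v + 1 for v in batch["x"]]}


# Schema known up front: constructing the reader does no work at all.
lazy_reader = StreamingMapReader(iter([{"x": [1, 2]}, {"x": [3]}]), add_one,
                                 schema=(("x", "int"),))

# Schema unknown: constructing the reader evaluates the first batch only.
peeking_reader = StreamingMapReader(iter([{"x": [1, 2]}, {"x": [3]}]), add_one)
```

In both cases the remaining batches are mapped one at a time as the consumer iterates, which is the behavior the current list-then-convert implementation lacks.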