nealrichardson opened a new pull request, #13210: URL: https://github.com/apache/arrow/pull/13210
* Pushes KeyValueMetadata (KVM) handling into ExecPlan so that Run() preserves the R metadata we want.
* Also pushes special handling for a kind of collapsed query from collect() into Build().
* Better encapsulates KVM for `$metadata` and `$r_metadata` so that as a user/developer, you never have to touch the serialize/deserialize functions; you just have a list to work with. This is a slight API change, most noticeable if you were to `print(tab$metadata)`; better is to `print(str(tab$metadata))`.
* Factors out a common utility in r/src for taking cpp11::strings (a named character vector) and producing arrow::KeyValueMetadata.

The upshot of all of this is that we can push the ExecPlan evaluation into `as_record_batch_reader()`, and all that `collect()` does on top is read the RBR into a Table/data.frame. This means that we can plug dplyr queries into anything else that expects a RecordBatchReader, and it will be streaming (to the maximum extent possible, given the limitations of ExecPlan), not requiring you to `compute()` and materialize things first.
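
As a rough illustration of the behavior described above, here is a minimal R sketch. It assumes a build of the arrow R package that includes this change, plus dplyr. The example table and the `source` metadata key are made up for illustration and are not part of the PR.

```r
library(arrow)
library(dplyr)

# Hypothetical example data; any Table would do.
tab <- arrow_table(x = 1:10, y = letters[1:10])

# Metadata is a plain named list: read it, modify it, assign it back.
md <- tab$metadata
md$source <- "example"  # illustrative key, not something the PR defines
tab$metadata <- md
str(tab$metadata)

# Build a dplyr query against the Table without collecting it.
query <- tab %>%
  filter(x > 5) %>%
  select(x)

# Turn the query into a RecordBatchReader rather than a Table/data.frame.
reader <- as_record_batch_reader(query)

# Consume the reader batch by batch; read_next_batch() returns NULL
# once the reader is exhausted.
n_rows <- 0L
while (!is.null(batch <- reader$read_next_batch())) {
  n_rows <- n_rows + batch$num_rows
}
```

In place of the loop, `reader` could be passed to any function that accepts a RecordBatchReader, which is the use case the description has in mind.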
