Neal Richardson created ARROW-17038:
---------------------------------------
Summary: [R] to_arrow() on db connection should hold reference to con
Key: ARROW-17038
URL: https://issues.apache.org/jira/browse/ARROW-17038
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Currently, to_arrow() on a duckdb connection returns a RecordBatchReader. This
works fine until you want to query again, because a RecordBatchReader is
one-shot: once you've consumed it, you can't read from it again. One place
where this gets in the way is the dplyr::glimpse() method (ARROW-16776), which
shows a preview of the data: you can't preview an RBR's data without consuming
part of it.
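
To make the one-shot problem concrete, here is a minimal Python sketch of the
same behavior. It is an analogy only, not the arrow R API: OneShotReader and
its methods are hypothetical names, and plain lists of integers stand in for
record batches.

```python
from typing import Iterable, Iterator, List


class OneShotReader:
    """Hypothetical stand-in for a one-shot batch reader (not the arrow API)."""

    def __init__(self, batches: Iterable[List[int]]) -> None:
        # Keep only an iterator: once advanced, earlier batches are gone.
        self._batches: Iterator[List[int]] = iter(batches)

    def read_next_batch(self) -> List[int]:
        # Raises StopIteration once the stream is exhausted.
        return next(self._batches)

    def read_all(self) -> List[int]:
        # Drains whatever is still left in the stream.
        return [row for batch in self._batches for row in batch]


reader = OneShotReader([[1, 2], [3, 4], [5]])
preview = reader.read_next_batch()  # peeking consumes the first batch
remaining = reader.read_all()       # the previewed rows are no longer included
second_pass = reader.read_all()     # nothing is left to read
```

The preview step is what a glimpse()-style method would need to do, and it
leaves the reader permanently missing the rows it showed.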
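Going the other direction, duckdb solves this by holding a reference to the
Dataset/query object and calling Scanner$create() on it on demand, which it
can do as many times as needed. to_arrow() could similarly hold a reference to
the connection rather than to an already-started reader.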
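
The "hold a reference and scan on demand" pattern described above can be
sketched in Python as follows. This is an illustration under stated
assumptions, not how arrow or duckdb implement it: RescannableSource,
run_query, and the method names are all hypothetical, and a generator function
stands in for re-running the query.

```python
from typing import Callable, Iterator, List


class RescannableSource:
    """Hypothetical wrapper that keeps the query source, not a consumed stream."""

    def __init__(self, make_batches: Callable[[], Iterator[List[int]]]) -> None:
        # Hold a reference to the thing that can produce the data.
        self._make_batches = make_batches

    def scan(self) -> Iterator[List[int]]:
        # Create a fresh stream on every call, so scans are repeatable.
        return self._make_batches()

    def head(self, n: int) -> List[int]:
        # Preview the first n rows using a throwaway scan.
        rows: List[int] = []
        for batch in self.scan():
            rows.extend(batch)
            if len(rows) >= n:
                break
        return rows[:n]

    def collect(self) -> List[int]:
        # Read everything from a new scan.
        return [row for batch in self.scan() for row in batch]


def run_query() -> Iterator[List[int]]:
    # Stand-in for executing the query against the database connection.
    yield [1, 2]
    yield [3, 4]
    yield [5]


source = RescannableSource(run_query)
preview = source.head(3)   # previewing does not affect later reads
full = source.collect()
again = source.collect()   # a second full read still sees every row
```

The trade-off in this sketch is that each scan re-runs the query, so a preview
costs an extra (partial) execution in exchange for leaving the data intact.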