Neal Richardson created ARROW-17038:
---------------------------------------

             Summary: [R] to_arrow() on db connection should hold reference to con
                 Key: ARROW-17038
                 URL: https://issues.apache.org/jira/browse/ARROW-17038
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
            Reporter: Neal Richardson


Currently, to_arrow() on a duckdb connection returns a RecordBatchReader. This
works fine until you want to query again, because a RecordBatchReader is
one-shot: once you've consumed it, you can't read from it again. One place
where this gets in the way is the dplyr::glimpse() method (ARROW-16776), which
shows a preview of the data: you can't preview an RBR's data without consuming
part of it.
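
The one-shot behaviour can be illustrated without Arrow at all. The following
is a minimal Python sketch, not the arrow R API: make_reader() is a
hypothetical stand-in for a RecordBatchReader, modelled as a plain generator,
and preview() is a hypothetical glimpse-like helper.

```python
from itertools import islice


def make_reader(batches):
    """Hypothetical stand-in for a one-shot reader: yields each batch once."""
    for batch in batches:
        yield batch


def preview(reader, n=1):
    """Hypothetical glimpse-like helper: pulls the first n batches off the reader."""
    return list(islice(reader, n))


reader = make_reader([[1, 2], [3, 4], [5, 6]])

head = preview(reader)   # takes the first batch off the reader
rest = list(reader)      # only the batches the preview did not take
again = list(reader)     # the reader is now exhausted, so this is empty
```

After the preview, the first batch is gone for every later consumer, which is
the problem a glimpse() method would run into.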

In the other direction (Arrow to duckdb), duckdb solves this by holding a
reference to the Dataset/query object and calling Scanner$create() on it on
demand, which it can do multiple times. to_arrow() could take the same
approach by holding a reference to the db connection.
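
The hold-a-reference pattern can be sketched in the same spirit. This is a
conceptual Python illustration under stated assumptions, not duckdb's or
arrow's implementation: ReplayableSource and its reader() method are
hypothetical names, and the lambda stands in for re-running a query.

```python
class ReplayableSource:
    """Hypothetical wrapper: keeps the source and builds a new reader per request."""

    def __init__(self, make_batches):
        # make_batches is a zero-argument callable that re-runs the query
        self._make_batches = make_batches

    def reader(self):
        # Each call starts from the beginning, much like creating a new scanner
        return iter(self._make_batches())


source = ReplayableSource(lambda: [[1, 2], [3, 4]])

first_preview = next(source.reader())   # peek at the first batch
full_result = list(source.reader())     # a fresh reader still sees every batch
```

Because the wrapper holds the source rather than a single reader, a preview
does not take anything away from the later full read.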



