Remi Dettai created ARROW-10368:
-----------------------------------

             Summary: [Rust][Datafusion] Make InMemoryScan work on iterators of 
RecordBatch
                 Key: ARROW-10368
                 URL: https://issues.apache.org/jira/browse/ARROW-10368
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust, Rust - DataFusion
            Reporter: Remi Dettai


Currently, InMemoryScan takes a Vec<Vec<RecordBatch>> as data:
- the outer Vec separates the partitions
- the inner Vec contains all the RecordBatches for one partition

The inner Vec is then converted into an iterator when the LogicalPlan is turned 
into a PhysicalPlan.

I suggest that InMemoryScan should take Vec<Iter<RecordBatch>> instead, i.e. 
one iterator of RecordBatch per partition. This would make it possible to plug 
custom Scan implementations into DataFusion without having to read their data 
entirely into memory. It would still work pretty seamlessly with 
Vec<Vec<RecordBatch>>, which would just need to be converted first with 
something like data.into_iter().map(|x| x.into_iter()).
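
To make the idea concrete, here is a rough sketch of what the conversion could 
look like. It is only an illustration, not the actual DataFusion API: 
RecordBatch is a stand-in struct so that the snippet compiles without the 
arrow crate, and the names BatchIter, partitions_from_vecs, 
generated_partition and count_rows are hypothetical. It also assumes the 
iterators are boxed trait objects, which is one possible choice among others.

```rust
// Stand-in for arrow's RecordBatch (assumption), so the sketch has no dependencies.
#[derive(Debug, Clone, PartialEq)]
pub struct RecordBatch {
    pub num_rows: usize,
}

// Hypothetical alias: one lazily evaluated stream of batches per partition.
pub type BatchIter = Box<dyn Iterator<Item = RecordBatch> + Send>;

// Adapts the current Vec<Vec<RecordBatch>> layout to one iterator per partition.
pub fn partitions_from_vecs(data: Vec<Vec<RecordBatch>>) -> Vec<BatchIter> {
    data.into_iter()
        .map(|partition| Box::new(partition.into_iter()) as BatchIter)
        .collect()
}

// A custom source that produces its batches on demand instead of
// materializing them all up front.
pub fn generated_partition(batches: usize, rows_per_batch: usize) -> BatchIter {
    Box::new((0..batches).map(move |_| RecordBatch { num_rows: rows_per_batch }))
}

// Example consumer: drains every partition and sums the row counts.
pub fn count_rows(partitions: Vec<BatchIter>) -> usize {
    partitions.into_iter().flatten().map(|batch| batch.num_rows).sum()
}
```

With this shape, existing in-memory data goes through partitions_from_vecs, 
while a custom scan can push its own iterator (such as generated_partition) 
into the same Vec, and the consumer does not need to know which is which.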





--
This message was sent by Atlassian Jira
(v8.3.4#803005)
