alamb opened a new issue, #9157:
URL: https://github.com/apache/arrow-datafusion/issues/9157

   ### Is your feature request related to a problem or challenge?
   
   Many APIs in DataFusion produce `Vec<RecordBatch>` (e.g. 
[`DataFrame::collect`](https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrame.html#method.collect)).
   
   However, there is no corresponding API to create a `DataFrame` from a 
`Vec<RecordBatch>`, which is confusing to a first-time user who just wants to do 
something like "sort my batches".
   
   It is straightforward to scan a `Vec<RecordBatch>` by creating a 
`MemTable`, as is done here: 
https://docs.rs/datafusion/latest/src/datafusion/execution/context/mod.rs.html#939-950
 but having to find that incantation is a barrier to use.
   
   There is a similar API for a single batch, 
[`SessionContext::read_batch`](https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html#method.read_batch),
 but not one for a `Vec<RecordBatch>`.
   
   ### Describe the solution you'd like
   
   I would like a `SessionContext::read_batches` API that takes anything 
iterable over `RecordBatch`es (such as a `Vec`):
   
   ```rust
   impl SessionContext {
       /// Creates a [`DataFrame`] for reading a collection of [`RecordBatch`]es
       pub fn read_batches(&self, batches: impl IntoIterator<Item = RecordBatch>) -> Result<DataFrame> {
           ...
       }
   }
   ```
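
To illustrate why an `impl IntoIterator<Item = RecordBatch>` parameter would be convenient for callers, here is a minimal standard-library-only sketch. `Batch` and `Context` are hypothetical stand-ins for `RecordBatch` and `SessionContext` (they are not DataFusion types), and the method simply collects its input rather than building a `DataFrame`; it only shows the shape of the signature, not an actual implementation.

```rust
// Hypothetical stand-in for arrow's `RecordBatch`, so that this sketch
// compiles without any external crates.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    rows: Vec<i64>,
}

// Hypothetical stand-in for `SessionContext`.
struct Context;

impl Context {
    /// Same parameter shape as the proposed `read_batches`: any
    /// `IntoIterator` of batches is accepted. Here the batches are just
    /// collected into a `Vec` instead of being wrapped in a `DataFrame`.
    fn read_batches(&self, batches: impl IntoIterator<Item = Batch>) -> Vec<Batch> {
        batches.into_iter().collect()
    }
}

fn main() {
    let ctx = Context;
    let b1 = Batch { rows: vec![3, 1] };
    let b2 = Batch { rows: vec![] };

    // A `Vec<Batch>` can be passed directly.
    let from_vec = ctx.read_batches(vec![b1.clone(), b2.clone()]);
    assert_eq!(from_vec.len(), 2);

    // So can an iterator chain, e.g. one that skips empty batches.
    let non_empty = ctx.read_batches(
        from_vec.into_iter().filter(|b| !b.rows.is_empty()),
    );
    assert_eq!(non_empty, vec![b1]);
}
```

With this shape, a caller holding the output of `DataFrame::collect` could hand the `Vec` straight back in, and callers with a lazily produced sequence of batches would not need to collect it first.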
   
   Along with:
   1. A doc example showing how to read a `Vec<RecordBatch>` (which would also 
serve as a test)
   
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   I was speaking with @carols10cents  today, and she noted that m


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
