alamb commented on PR #2177:
URL: 
https://github.com/apache/arrow-datafusion/pull/2177#issuecomment-1113608594

   @gandronchik thank you for the explanation in this PR's description. It 
helps, though I will admit I still don't fully understand what is going on. 
   
   I agree with @doki23 -- I expect a table function to logically return a 
table (that is, something with both rows and columns).
   
   > Regarding signature, I decided to use a single vector and vector with 
sizes of sections instead of vec of vecs to have better performance. If we use 
Vec, this will require a lot of memory in case of a request for millions of 
rows.
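
   To check my understanding of the layout described in the quote above, here 
is a minimal std-only sketch. The names `FlatSections`, `from_nested` and 
`sections` are hypothetical and not taken from this PR; the point is only that 
a flat layout needs two allocations in total, while a vec of vecs needs one 
allocation per input row.

   ```rust
   /// Hypothetical flat layout: all output values in one vector, plus the
   /// number of values produced for each input row.
   struct FlatSections {
       values: Vec<i64>,
       section_sizes: Vec<usize>,
   }

   impl FlatSections {
       /// Build the flat layout from a nested one (for illustration only).
       fn from_nested(nested: &[Vec<i64>]) -> Self {
           let mut values = Vec::with_capacity(nested.iter().map(Vec::len).sum());
           let mut section_sizes = Vec::with_capacity(nested.len());
           for section in nested {
               values.extend_from_slice(section);
               section_sizes.push(section.len());
           }
           Self { values, section_sizes }
       }

       /// Iterate over the per-input-row slices without copying any values.
       fn sections(&self) -> impl Iterator<Item = &[i64]> + '_ {
           let mut offset = 0;
           self.section_sizes.iter().map(move |&len| {
               let slice = &self.values[offset..offset + len];
               offset += len;
               slice
           })
       }
   }
   ```

   Either layout still holds every output value in memory at the same time, 
which is the part a streaming approach would avoid.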
   
   The way the rest of DataFusion avoids buffering all the intermediate results 
in memory at once is with `Stream`s, but that requires interacting with 
Rust's `async` ecosystem, which is nontrivial.
   
   If you wanted a streaming solution, the signature might look something 
like the following (maybe):
   
   
   ```rust
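   // `SendableRecordBatchStream` is already a pinned, boxed trait object,
   // so it needs no extra `Box<dyn ...>` wrapper.
   Arc<dyn Fn(SendableRecordBatchStream) -> Result<SendableRecordBatchStream> + Send + Sync>;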
   ```
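
   To make the shape of that idea concrete without pulling in `async`, here is 
a rough std-only sketch that swaps the async `Stream` for a synchronous 
`Iterator`. Everything in it (`Batch`, `BatchIter`, `TableFn`, `repeat_each`) 
is a hypothetical stand-in and not DataFusion API. It only shows a function 
that takes a lazy sequence of batches and returns another one.

   ```rust
   use std::sync::Arc;

   /// Stand-in for a record batch: a single column of values (hypothetical).
   type Batch = Vec<i64>;
   /// Stand-in for a stream of batches; the real thing would be an async `Stream`.
   type BatchIter = Box<dyn Iterator<Item = Result<Batch, String>> + Send>;
   /// Hypothetical table function type: consumes one lazy input, returns another.
   type TableFn = Arc<dyn Fn(BatchIter) -> Result<BatchIter, String> + Send + Sync>;

   /// Example table function: repeat every input value `n` times, batch by batch.
   fn repeat_each(n: usize) -> TableFn {
       Arc::new(move |input: BatchIter| -> Result<BatchIter, String> {
           // `map` is lazy: a batch is only transformed when the caller pulls it.
           let output: BatchIter = Box::new(input.map(move |batch| {
               batch.map(|values| {
                   values
                       .into_iter()
                       .flat_map(|v| std::iter::repeat(v).take(n))
                       .collect::<Batch>()
               })
           }));
           Ok(output)
       })
   }
   ```

   Because the output is produced one batch at a time, only the batch 
currently being processed has to be in memory, however many rows the function 
emits overall.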

