[GitHub] [arrow-datafusion] Jimexist commented on issue #1248: Optimized `RecordBatch` for constant columns

GitBox Sun, 07 Nov 2021 19:01:26 -0800


Jimexist commented on issue #1248:
URL: https://github.com/apache/arrow-datafusion/issues/1248#issuecomment-962768112



   I had a further thought on this and believe having `DFRecordBatch` is 
achievable, with some intermediate steps:
   
   - [ ] create a trait `DFRecordBatch` with basic functions such as 
`fn is_empty(&self) -> bool`, `fn columns(&self) -> Vec<ArrayRef>`, 
`fn column(&self, i: usize) -> ArrayRef`, and `fn to_arrow(&self) -> 
RecordBatch`
   - [ ] have a default `DFRecordBatchImpl` struct that wraps arrow's 
`RecordBatch` and implements `DFRecordBatch`
   - [ ] have `fn to_df(&self) -> DFRecordBatch` implemented for arrow's 
`RecordBatch`
   - [ ] change `DataFrame` to return `DFRecordBatch` instead, and release 
this breaking change to downstream users
   - [ ] add other types of impls for `DFRecordBatch` or enhance 
`DFRecordBatchImpl` to have virtual columns
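
The steps above could look roughly like the following. This is only a minimal sketch under stated assumptions, not a proposed final API: `ArrayRef` and `RecordBatch` here are simplified stand-ins so the snippet compiles without the arrow crate, the `ToDf` trait and the `ConstantColumnBatch` struct are hypothetical names, and `to_df` returns `Box<dyn DFRecordBatch>` because a trait cannot be returned by value.

```rust
use std::sync::Arc;

// Stand-ins (assumption) for arrow's `ArrayRef` and `RecordBatch`, so the
// sketch is self-contained. Real code would use the arrow types instead.
type ArrayRef = Arc<Vec<i64>>;

#[derive(Clone, Debug, PartialEq)]
struct RecordBatch {
    columns: Vec<ArrayRef>,
}

// Step 1: the trait with the basic accessors listed above.
trait DFRecordBatch {
    fn is_empty(&self) -> bool;
    fn columns(&self) -> Vec<ArrayRef>;
    fn column(&self, i: usize) -> ArrayRef;
    fn to_arrow(&self) -> RecordBatch;
}

// Step 2: the default implementation simply wraps a `RecordBatch`.
struct DFRecordBatchImpl {
    inner: RecordBatch,
}

impl DFRecordBatch for DFRecordBatchImpl {
    fn is_empty(&self) -> bool {
        // Empty when no column holds any row.
        self.inner.columns.iter().all(|c| c.is_empty())
    }
    fn columns(&self) -> Vec<ArrayRef> {
        self.inner.columns.clone()
    }
    fn column(&self, i: usize) -> ArrayRef {
        Arc::clone(&self.inner.columns[i])
    }
    fn to_arrow(&self) -> RecordBatch {
        self.inner.clone()
    }
}

// Step 3: conversion from `RecordBatch`. `ToDf` is a hypothetical
// extension trait; the boxed return type is an assumption of this sketch.
trait ToDf {
    fn to_df(&self) -> Box<dyn DFRecordBatch>;
}

impl ToDf for RecordBatch {
    fn to_df(&self) -> Box<dyn DFRecordBatch> {
        Box::new(DFRecordBatchImpl { inner: self.clone() })
    }
}

// Step 5: a hypothetical second implementation in which every column is
// one constant value, materialized only when a caller asks for it.
struct ConstantColumnBatch {
    value: i64,
    num_columns: usize,
    num_rows: usize,
}

impl DFRecordBatch for ConstantColumnBatch {
    fn is_empty(&self) -> bool {
        self.num_rows == 0
    }
    fn columns(&self) -> Vec<ArrayRef> {
        (0..self.num_columns).map(|i| self.column(i)).collect()
    }
    fn column(&self, i: usize) -> ArrayRef {
        assert!(i < self.num_columns, "column index out of bounds");
        Arc::new(vec![self.value; self.num_rows])
    }
    fn to_arrow(&self) -> RecordBatch {
        RecordBatch { columns: self.columns() }
    }
}
```

With this shape, callers that only need `column(i)` or `is_empty()` never force a constant batch to allocate all of its columns, and `to_arrow()` remains available as an escape hatch for code that still expects a plain `RecordBatch`.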


