Lordworms commented on issue #7053:
URL: https://github.com/apache/datafusion/issues/7053#issuecomment-2586009190

   The current design is:
   1. Replace the `SendableRecordBatchStream` between `SortExec` and 
`SortPreservingMergeExec` with a `RowOrColumnStream`:
   ```Rust
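   // Imports are assumed from context; adjust the paths to the crate layout.
   use std::pin::Pin;

   use arrow::record_batch::RecordBatch;
   use arrow::row::Rows;
   use datafusion_common::Result;
   use futures::Stream;

   /// A batch in either row format or columnar format.
   pub enum RowOrColumn {
       Row(Rows),
       Column(RecordBatch),
   }

   /// A stream whose items are either `Rows` or a `RecordBatch`.
   pub type RowOrColumnStream = Pin<Box<dyn Stream<Item = Result<RowOrColumn>> + Send>>;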
   ```
   
   2. At the beginning of `SortExec`, we build the `RowConverter`. (For a 
single-column sort we skip this and send a `RecordBatch` instead.)
   3. For every `RecordBatch` that `SortExec` receives, we convert it into `Rows` 
and do the spill logic in the `Rows` format. (I implemented a rudimentary reader 
and writer for `Rows`.)
   4. In `SortPreservingMergeExec`, we convert the `Rows` back to `ArrayRef`s (we 
have to do this since I didn't find any arrow method that builds a `RecordBatch` 
directly from `Rows`). This doesn't break any of the loser tree logic.
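
   To make steps 2 to 4 easier to picture, here is a minimal sketch that uses only the standard library. It is not the actual implementation and it does not use arrow's real row format: `ToyBatch`, `ToyRowOrColumn`, `to_row_or_column` and `rows_to_batch` are hypothetical stand-ins, and the columns are restricted to `i32`. In arrow itself, the corresponding conversions are `RowConverter::convert_columns` and `RowConverter::convert_rows`.

```rust
/// Hypothetical stand-in for a columnar batch: each inner Vec is one i32 column.
/// All columns are assumed to have the same length.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyBatch {
    pub columns: Vec<Vec<i32>>,
}

/// Toy counterpart of the `RowOrColumn` enum from the design.
#[derive(Debug, PartialEq)]
pub enum ToyRowOrColumn {
    /// One byte-comparable key per input row.
    Row(Vec<Vec<u8>>),
    /// The batch passed through unchanged.
    Column(ToyBatch),
}

/// Encode an i32 so that unsigned byte-wise comparison matches numeric order:
/// flip the sign bit, then write the value big-endian.
fn encode_i32(v: i32) -> [u8; 4] {
    ((v as u32) ^ 0x8000_0000).to_be_bytes()
}

/// Inverse of `encode_i32`.
fn decode_i32(b: [u8; 4]) -> i32 {
    (u32::from_be_bytes(b) ^ 0x8000_0000) as i32
}

/// Steps 2 and 3: a single-column sort skips the row conversion; otherwise
/// each row becomes the concatenation of its encoded column values.
pub fn to_row_or_column(batch: ToyBatch) -> ToyRowOrColumn {
    if batch.columns.len() <= 1 {
        return ToyRowOrColumn::Column(batch);
    }
    let num_rows = batch.columns[0].len();
    let rows: Vec<Vec<u8>> = (0..num_rows)
        .map(|r| {
            batch
                .columns
                .iter()
                .flat_map(|col| encode_i32(col[r]))
                .collect::<Vec<u8>>()
        })
        .collect();
    ToyRowOrColumn::Row(rows)
}

/// Step 4: turn rows back into columns on the merge side.
/// Each row is assumed to hold exactly `num_columns` 4-byte values.
pub fn rows_to_batch(rows: &[Vec<u8>], num_columns: usize) -> ToyBatch {
    let mut columns = vec![Vec::new(); num_columns];
    for row in rows {
        for (c, chunk) in row.chunks_exact(4).enumerate() {
            columns[c].push(decode_i32([chunk[0], chunk[1], chunk[2], chunk[3]]));
        }
    }
    ToyBatch { columns }
}
```

   Because the encoded rows compare correctly as plain bytes, calling `rows.sort()` on the `Row` variant gives a multi-column sort without looking at the columns again, and `rows_to_batch` recovers the columnar form afterwards.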


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


