alamb commented on issue #16206:
URL: https://github.com/apache/datafusion/issues/16206#issuecomment-3006040173

   > Extend the take (or interleave) kernels to eagerly allocate new data buffers for string views (i.e. add a compute::TakeOption - unfortunately also a breaking change).
   
   BTW, eagerly allocating buffers and copying string views when needed is what the (in-progress) `coalesce` kernel does:
   - https://github.com/apache/arrow-rs/blob/main/arrow-select/src/coalesce.rs
   
   Here is the logic that does it for StringView:
   - https://github.com/apache/arrow-rs/blob/main/arrow-select/src/coalesce/byte_view.rs
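
To illustrate the difference between sharing buffers and eagerly copying, here is a toy model using only the Rust standard library. It is not the arrow-rs implementation: `ViewColumn`, `View`, `take_sharing`, and `take_compacting` are hypothetical names made up for this sketch, and it ignores the fact that real string views store short strings (up to 12 bytes) inline.

```rust
use std::rc::Rc;

// Toy model of a string-view column (hypothetical, NOT the arrow-rs types).
// Each view points at a byte range inside one of the shared data buffers.
#[derive(Clone, Debug)]
struct View {
    buffer: usize, // index into `buffers`
    offset: usize, // start of the value inside that buffer
    len: usize,    // length of the value in bytes
}

#[derive(Clone, Debug)]
struct ViewColumn {
    buffers: Vec<Rc<Vec<u8>>>,
    views: Vec<View>,
}

impl ViewColumn {
    /// Build a column whose values all live in a single data buffer.
    fn from_strs(values: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut views = Vec::with_capacity(values.len());
        for v in values {
            views.push(View { buffer: 0, offset: data.len(), len: v.len() });
            data.extend_from_slice(v.as_bytes());
        }
        ViewColumn { buffers: vec![Rc::new(data)], views }
    }

    /// Read the i-th value by following its view into the buffer.
    fn value(&self, i: usize) -> &str {
        let v = &self.views[i];
        let bytes = &self.buffers[v.buffer][v.offset..v.offset + v.len];
        std::str::from_utf8(bytes).expect("toy column only stores valid UTF-8")
    }

    /// Total bytes of data buffers this column keeps alive.
    fn retained_bytes(&self) -> usize {
        self.buffers.iter().map(|b| b.len()).sum()
    }

    /// Lazy take: copy only the views and keep sharing every source buffer,
    /// even if most of their bytes are no longer referenced.
    fn take_sharing(&self, indices: &[usize]) -> Self {
        ViewColumn {
            buffers: self.buffers.clone(), // cheap: clones the Rc handles only
            views: indices.iter().map(|&i| self.views[i].clone()).collect(),
        }
    }

    /// Eager take: copy the selected bytes into one fresh buffer so the
    /// result no longer references the (possibly large) source buffers.
    fn take_compacting(&self, indices: &[usize]) -> Self {
        let mut data = Vec::new();
        let mut views = Vec::with_capacity(indices.len());
        for &i in indices {
            let bytes = self.value(i).as_bytes();
            views.push(View { buffer: 0, offset: data.len(), len: bytes.len() });
            data.extend_from_slice(bytes);
        }
        ViewColumn { buffers: vec![Rc::new(data)], views }
    }
}
```

In this model, taking one short value with `take_sharing` still retains every byte of the source buffer, while `take_compacting` retains only the bytes of the selected value, at the cost of a copy.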
   
   I hope to migrate DataFusion to use the BatchCoalescer soon. 
   
   There is a `BatchCoalescer::push_batch_with_filter` (which does the equivalent of `filter` + `concat`) that I am actively working to improve.
   
   I imagine implementing something like `BatchCoalescer::push_batch_with_indices` that does the equivalent of `take` + `concat`.
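
As a rough sketch of what "`take` + `concat`" could mean for a coalescer, here is a toy version over plain `Vec<String>` batches using only the standard library. `ToyCoalescer`, `push_with_indices`, and `finish` are hypothetical names for illustration; this is not the `BatchCoalescer` API, and a real implementation would work on Arrow arrays rather than cloning `String`s.

```rust
/// Toy coalescer (hypothetical, NOT arrow-rs's BatchCoalescer): gathers
/// selected rows from input batches into output batches of a target size.
struct ToyCoalescer {
    target_size: usize,
    in_progress: Vec<String>,    // rows buffered for the next output batch
    completed: Vec<Vec<String>>, // output batches that reached target_size
}

impl ToyCoalescer {
    fn new(target_size: usize) -> Self {
        assert!(target_size > 0, "target_size must be non-zero");
        ToyCoalescer {
            target_size,
            in_progress: Vec::with_capacity(target_size),
            completed: Vec::new(),
        }
    }

    /// Equivalent of `take` + `concat`: copy the rows at `indices` (in that
    /// order) straight into the in-progress output batch, without first
    /// materializing an intermediate "taken" batch.
    fn push_with_indices(&mut self, batch: &[String], indices: &[usize]) {
        for &i in indices {
            self.in_progress.push(batch[i].clone());
            if self.in_progress.len() == self.target_size {
                // The output batch is full: move it to the completed list.
                let full = std::mem::take(&mut self.in_progress);
                self.completed.push(full);
            }
        }
    }

    /// Flush any partially filled batch and return all output batches.
    fn finish(&mut self) -> Vec<Vec<String>> {
        if !self.in_progress.is_empty() {
            let rest = std::mem::take(&mut self.in_progress);
            self.completed.push(rest);
        }
        std::mem::take(&mut self.completed)
    }
}
```

The point of the design is that selected rows are written once, directly into their final output batch, rather than being copied by `take` and then copied again by `concat`.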
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

