alamb commented on issue #16206: URL: https://github.com/apache/datafusion/issues/16206#issuecomment-3006040173
> Extend the take (or interleave) kernels to eagerly allocate new data buffers for string views (i.e. add a compute::TakeOption - unfortunately also a breaking change).

BTW, eagerly allocating buffers / copying string views when needed is what the (in progress) `coalesce` kernel does:
- https://github.com/apache/arrow-rs/blob/main/arrow-select/src/coalesce.rs

Here is the logic that does it for StringView:
- https://github.com/apache/arrow-rs/blob/main/arrow-select/src/coalesce/byte_view.rs

I hope to migrate DataFusion to use the BatchCoalescer soon. There is a `BatchCoalescer::push_batch_with_filter` (that does the equivalent of `filter` + `concat`) that I am actively working to improve.

I imagine implementing something like `BatchCoalescer::push_batch_with_indices` that does the equivalent of `take` + `concat`.
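
To make the trade-off concrete, here is a minimal, std-only sketch of the difference between a `take` that only copies views and one that eagerly copies the referenced bytes into a fresh buffer. It is a toy model, not arrow-rs code: `ViewArray`, `take_shared` and `take_compact` are hypothetical names invented for illustration, and the model leaves out details of the real view layout such as the 4-byte prefix. The only detail borrowed from Arrow is that strings of up to 12 bytes are stored inline in the view.

```rust
use std::sync::Arc;

/// Strings up to this many bytes are stored inline in the view (as in Arrow).
const INLINE_MAX: usize = 12;

/// Toy stand-in for a string view: short strings are inline,
/// long strings point into a shared data buffer.
#[derive(Debug, Clone, PartialEq)]
enum View {
    Inline(String),
    Ref { buffer: usize, offset: usize, len: usize },
}

/// Toy stand-in for a string view array (hypothetical, not the arrow-rs type).
#[derive(Debug, Clone)]
struct ViewArray {
    views: Vec<View>,
    buffers: Vec<Arc<Vec<u8>>>,
}

impl ViewArray {
    /// Build an array with a single data buffer holding all long strings.
    fn from_strs(values: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut views = Vec::with_capacity(values.len());
        for v in values {
            if v.len() <= INLINE_MAX {
                views.push(View::Inline(v.to_string()));
            } else {
                views.push(View::Ref { buffer: 0, offset: data.len(), len: v.len() });
                data.extend_from_slice(v.as_bytes());
            }
        }
        ViewArray { views, buffers: vec![Arc::new(data)] }
    }

    /// Read the string at position `i`.
    fn value(&self, i: usize) -> &str {
        match &self.views[i] {
            View::Inline(s) => s.as_str(),
            View::Ref { buffer, offset, len } => {
                let bytes = &self.buffers[*buffer][*offset..*offset + *len];
                std::str::from_utf8(bytes).expect("buffer holds valid UTF-8")
            }
        }
    }

    /// Total bytes of data buffers this array keeps alive.
    fn buffer_bytes(&self) -> usize {
        self.buffers.iter().map(|b| b.len()).sum()
    }

    /// Zero-copy take: copies only the views and keeps every source
    /// buffer alive, even if a single row is selected.
    fn take_shared(&self, indices: &[usize]) -> Self {
        ViewArray {
            views: indices.iter().map(|&i| self.views[i].clone()).collect(),
            buffers: self.buffers.clone(),
        }
    }

    /// Eager take: copies the bytes of each selected long string into
    /// one new buffer, so the output no longer references the input buffers.
    fn take_compact(&self, indices: &[usize]) -> Self {
        let mut data = Vec::new();
        let mut views = Vec::with_capacity(indices.len());
        for &i in indices {
            match &self.views[i] {
                View::Inline(s) => views.push(View::Inline(s.clone())),
                View::Ref { buffer, offset, len } => {
                    let new_offset = data.len();
                    data.extend_from_slice(&self.buffers[*buffer][*offset..*offset + *len]);
                    views.push(View::Ref { buffer: 0, offset: new_offset, len: *len });
                }
            }
        }
        ViewArray { views, buffers: vec![Arc::new(data)] }
    }
}
```

In this model, taking one row with `take_shared` still pins the whole source buffer, while `take_compact` holds only the bytes of the selected long strings, at the cost of a copy. A `take` + `concat` style coalescer could apply the same copy as rows are appended to its in-progress output.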