zhuqi-lucas commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2808114536
> > I think the most efficient way would be to sort the indices to the arrays in one step followed by interleave, without either concat or sort followed by merge, which would benefit the most from the built-in sort algorithm and avoid copying the data.
>
> I wonder if we can skip interleave / copying entirely?
>
> Specifically, what if we sorted to indices, as you suggested, but then instead of calling `interleave` (which will copy the data) before sending it to `merge_streams`, maybe we could have some way to have the merge cursors also take the indices, so we would only copy data once 🤔

Thanks @alamb, it looks promising.
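For readers following the thread, here is a rough, std-only sketch of the "sort to indices, then interleave" idea quoted above. It is not code from this PR: `sort_to_indices` and `interleave` here are hypothetical helpers over plain `Vec<i32>` batches, and the `(batch index, row index)` pair shape is only meant to resemble what arrow's `interleave` kernel accepts.

```rust
/// Sort row indices across several batches in one step.
/// Each returned pair is (batch index, row index), ordered by the value it points at.
/// Hypothetical helper for illustration only.
fn sort_to_indices(batches: &[Vec<i32>]) -> Vec<(usize, usize)> {
    let mut indices: Vec<(usize, usize)> = batches
        .iter()
        .enumerate()
        .flat_map(|(b, batch)| (0..batch.len()).map(move |r| (b, r)))
        .collect();
    // Stable sort: equal values keep batch order, then row order.
    // Only the small index pairs move here; the batch data is untouched.
    indices.sort_by_key(|&(b, r)| batches[b][r]);
    indices
}

/// Gather the values named by `indices` into one output vector.
/// This is the single place where data is copied.
/// Hypothetical helper for illustration only.
fn interleave(batches: &[Vec<i32>], indices: &[(usize, usize)]) -> Vec<i32> {
    indices.iter().map(|&(b, r)| batches[b][r]).collect()
}
```

The follow-up suggestion in the quote would go one step further: hand the sorted index pairs to the merge cursors directly and defer the gather, so the copy in `interleave` happens only once, at merge output time.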