devanbenz commented on issue #13188: URL: https://github.com/apache/datafusion/issues/13188#issuecomment-2457485995
> I will take a look on this issue too, how is your progress on this @devanbenz ?

Looking at the following code path:

https://github.com/apache/datafusion/blob/9005585fa6f4eb6a4d0cc515b6ad76794c33c626/datafusion/physical-plan/src/coalesce/mod.rs#L246-L263

Here is what I noticed when writing up a small local example that uses this code path:

<img width="1023" alt="Screenshot 2024-11-03 at 12 30 19 PM" src="https://github.com/user-attachments/assets/7695c70c-9ced-4530-88ce-59ec60422997">

I've identified where the performance fault is, but I'm unsure what next steps to take to alleviate it. I was thinking of modifying the coalescer to "change the RecordBatch to stringview" as the input arrives, around here (if that makes sense):

https://github.com/apache/datafusion/blob/9005585fa6f4eb6a4d0cc515b6ad76794c33c626/datafusion/physical-plan/src/coalesce_batches.rs#L297-L307

I don't have a meaningful plan yet; so far I have only been doing exploratory work while benchmarking locally.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org