devanbenz commented on issue #13188:
URL: https://github.com/apache/datafusion/issues/13188#issuecomment-2457485995

   > I will take a look on this issue too, how is your progress on this @devanbenz ?
   
   Looking at the following code path: 
   
   
https://github.com/apache/datafusion/blob/9005585fa6f4eb6a4d0cc515b6ad76794c33c626/datafusion/physical-plan/src/coalesce/mod.rs#L246-L263
   
   I noticed the following when writing up a small example that uses this code path locally:
   
   <img width="1023" alt="Screenshot 2024-11-03 at 12 30 19 PM" src="https://github.com/user-attachments/assets/7695c70c-9ced-4530-88ce-59ec60422997">
   
   I've identified where the performance fault is, but I'm unsure what next
   steps to take to alleviate it. I was thinking of modifying the coalescer
   to try to "change the RecordBatch to StringView" as the input arrives,
   around here (if that makes sense):
   
   
https://github.com/apache/datafusion/blob/9005585fa6f4eb6a4d0cc515b6ad76794c33c626/datafusion/physical-plan/src/coalesce_batches.rs#L297-L307
   
   I don't have a meaningful plan yet; so far I've just been doing some
   exploratory work while benchmarking locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

