alamb commented on issue #5504:
URL: https://github.com/apache/datafusion/issues/5504#issuecomment-2719177223

   @Shreyaskr1409  thank you
   
   > This would gracefully handle different versions where different actions 
are required for different types of events (like commit-main, PR, scheduled 
event due to a cron-job). I hope this diagram explains what I am thinking.
   
   In general I think trying to use GitHub workers is likely to be challenging 
as they are shared, and so the performance can vary from run to run without any 
changes in the software. I expect that almost any benchmarking exercise will use 
dedicated hardware somewhere. 
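   
   As a rough way to see how noisy a given machine is, one could time the same 
unchanged workload several times and look at the spread. The sketch below is 
only an illustration using the Rust standard library: `time_runs`, `summarize` 
and `TimingSummary` are hypothetical names, not part of DataFusion or its 
benchmark scripts.

```rust
use std::time::{Duration, Instant};

/// Summary of repeated wall-clock timings of one unchanged workload.
/// (Hypothetical helper type, not part of DataFusion.)
#[derive(Debug)]
pub struct TimingSummary {
    pub mean_secs: f64,
    pub std_dev_secs: f64,
}

impl TimingSummary {
    /// Standard deviation relative to the mean; larger values mean noisier runs.
    pub fn coefficient_of_variation(&self) -> f64 {
        if self.mean_secs == 0.0 {
            0.0
        } else {
            self.std_dev_secs / self.mean_secs
        }
    }
}

/// Run `workload` `runs` times and record how long each run took.
pub fn time_runs<F: FnMut()>(runs: usize, mut workload: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            workload();
            start.elapsed()
        })
        .collect()
}

/// Compute the mean and population standard deviation of the samples.
/// Returns `None` when there are no samples.
pub fn summarize(samples: &[Duration]) -> Option<TimingSummary> {
    if samples.is_empty() {
        return None;
    }
    let secs: Vec<f64> = samples.iter().map(Duration::as_secs_f64).collect();
    let n = secs.len() as f64;
    let mean = secs.iter().sum::<f64>() / n;
    let variance = secs.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    Some(TimingSummary {
        mean_secs: mean,
        std_dev_secs: variance.sqrt(),
    })
}
```

   Calling `summarize(&time_runs(10, || my_query()))`, where `my_query` stands 
in for whatever is being measured, on both a shared worker and a dedicated 
machine would give a simple comparison of the run-to-run spread.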
   
   As I mentioned above I think the core thing to do is get scripts that can 
generate the data. 
   
   @logan-keede  has created some here
   - https://github.com/apache/datafusion/pull/15144
   
   I need to review that shortly
   
   > also, do we have implementations for per-operator benchmarks for 
datafusion somewhere? I would like to look into how those can be implemented. 
Maybe I can create a separate issue for this since it would be a really nice 
addition to the repository and shift later discussions in that specific issue, 
making it easy to study.
   
   New tickets sound great.  Here is an example of a benchmark for 
`SortPreservingMerge`: 
https://github.com/apache/datafusion/blob/main/datafusion/core/benches/spm.rs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


