alamb commented on issue #5504: URL: https://github.com/apache/datafusion/issues/5504#issuecomment-2719177223
@Shreyaskr1409 thank you

> This would gracefully handle different versions where different actions are required for different types of events (like commit-main, PR, scheduled event due to a cron-job). I hope this diagram explains what I am thinking.

In general I think trying to use GitHub workers is likely to be challenging, as they are shared and so the performance can vary from run to run without any changes in the software. I expect that almost any benchmarking exercise will use dedicated hardware somewhere.

As I mentioned above, I think the core thing to do is get scripts that can generate the data. @logan-keede has created some here - https://github.com/apache/datafusion/pull/15144

I need to review that shortly.

> also, do we have implementations for per-operator benchmarks for datafusion somewhere? I would like to look into how those can be implemented. Maybe I can create a separate issue for this since it would be a really nice addition to the repository and shift later discussions in that specific issue, making it easy to study.

New tickets sound great. Here is an example of a benchmark for `SortPreservingMerge`: https://github.com/apache/datafusion/blob/main/datafusion/core/benches/spm.rs
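
For anyone looking at what a per-operator benchmark measures, here is a minimal standalone sketch of the general shape: build sorted inputs, run the operation repeatedly, and report a median. It does not use any DataFusion API and is not the code in `spm.rs`. The k-way heap merge is only a stand-in for the kind of work a sort-preserving merge does, and the names `make_sorted_runs`, `merge_sorted_runs`, and `median_merge_time` are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Build `num_runs` sorted runs of `run_len` values each (hypothetical input generator).
/// Run `r` holds r, r + num_runs, r + 2 * num_runs, ... so the runs interleave.
fn make_sorted_runs(num_runs: usize, run_len: usize) -> Vec<Vec<u64>> {
    (0..num_runs)
        .map(|r| (0..run_len).map(|i| (i * num_runs + r) as u64).collect())
        .collect()
}

/// K-way merge of already-sorted runs using a min-heap.
/// This is a stand-in for the operator under test, not DataFusion code.
fn merge_sorted_runs(runs: &[Vec<u64>]) -> Vec<u64> {
    let total: usize = runs.iter().map(|r| r.len()).sum();
    let mut out = Vec::with_capacity(total);
    // Heap entries are (value, run index, offset within run), smallest value first.
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }
    while let Some(Reverse((value, run_idx, offset))) = heap.pop() {
        out.push(value);
        if let Some(&next) = runs[run_idx].get(offset + 1) {
            heap.push(Reverse((next, run_idx, offset + 1)));
        }
    }
    out
}

/// Time `iterations` merges (must be at least 1) and return the median sample,
/// which is less sensitive to outliers than the mean.
fn median_merge_time(runs: &[Vec<u64>], iterations: usize) -> Duration {
    let mut samples: Vec<Duration> = (0..iterations)
        .map(|_| {
            let start = Instant::now();
            // black_box keeps the compiler from optimizing the merge away.
            black_box(merge_sorted_runs(black_box(runs)));
            start.elapsed()
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    let runs = make_sorted_runs(8, 100_000);
    let median = median_merge_time(&runs, 11);
    println!("merged 8 runs x 100000 rows, median time: {:?}", median);
}
```

Taking the median of several samples smooths out some noise, but it does not remove the run-to-run variation of shared machines, which is why dedicated hardware still matters for comparing results over time.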
