Hello folks,

During the community sync, there was an item about next steps for benchmarks, but we could not get to it. I don't think we need to wait for the next community sync to start the discussion, so here we go.

I have a couple of tasks on my todo list for Polaris benchmarks. I would like to share those ideas and gather new ones, in case there is appetite for more benchmarks. Here is a short description of the tasks that I am working on.

1 - Remove sequential benchmarks and renew credentials (#6 <https://github.com/apache/polaris-tools/pull/6>)

Sequential benchmarks (i.e. with only one request at a time) were initially created because the current EclipseLink implementation runs into issues under concurrent load. Now that benchmark throughput and concurrency can be configured, they are no longer necessary. Additionally, this PR contains an authentication improvement to support long benchmarks (longer than 1h, which is the auth token validity).

2 - Remove the bound on the maximum number of updates

The current update-related benchmarks require the user to specify the maximum number of update operations that should be generated. The code will change to generate an infinite stream of update operations. Coupled with the ability to control the throughput and the duration of the simulation, this will simplify the user experience.

3 - Add a simulation that continuously creates table and view commits

This benchmark will continuously send table property updates. It will be a way to quickly create lots of snapshots, which can then be used for capacity planning, or for stressing the events subsystem or some metadata management facility.

4 - ... ?

What else do you think should be added to Polaris benchmarks? If you have ideas for scenarios that could benefit the project, please let me know.

Cheers
--
Pierre