Hi Jan,

> Can you explain what "restart performance" means, and what the y axis number is?

Yes, that refers to the time it took to restart 7 nodes, 2 at a time, while waiting for all replicas on those nodes to be active before proceeding to the next batch (of 2).
https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json#L33-L47
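
For anyone who hasn't opened the suite definition, the procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the solr-bench implementation: `restart_node` and `replicas_active` are hypothetical callbacks standing in for whatever the framework uses to restart a node and to check that all of its replicas are active again.

```python
import time
from typing import Callable, List, Sequence


def batches(nodes: Sequence[str], size: int) -> List[List[str]]:
    """Split the node list into consecutive groups of at most `size` nodes."""
    return [list(nodes[i:i + size]) for i in range(0, len(nodes), size)]


def rolling_restart(
    nodes: Sequence[str],
    batch_size: int,
    restart_node: Callable[[str], None],      # hypothetical: restarts one node
    replicas_active: Callable[[str], bool],   # hypothetical: True once all replicas on the node are active
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Restart nodes batch by batch and return the total elapsed seconds."""
    start = time.monotonic()
    for batch in batches(nodes, batch_size):
        # Restart every node in the current batch.
        for node in batch:
            restart_node(node)
        # Wait until all replicas on the restarted nodes are active
        # before moving on to the next batch.
        while not all(replicas_active(node) for node in batch):
            sleep(poll_interval)
    return time.monotonic() - start
```

With 7 nodes and a batch size of 2, `batches` yields four groups, the last containing a single node. The value returned by `rolling_restart` corresponds to the total time for the whole operation.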
The number on the y-axis is the total time it took for the entire operation (of restarting those 7 nodes).
https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/createGraph.py#L25-L31

Here's a sample results file from a run:

{task1=[{start-time=0.175, total-time=238334, end-time=238.509}],
 task2=[{start-time=238.523, total-time=109.801, end-time=348.324},
        {start-time=238.523, total-time=113.048, end-time=351.571},
        {start-time=348.324, total-time=122.317, end-time=470.641},
        {start-time=351.572, total-time=130.111, end-time=481.683},
        {start-time=470.642, total-time=106.068, end-time=576.71},
        {start-time=481.683, total-time=107.827, end-time=589.51},
        {start-time=576.711, total-time=89.179, end-time=665.89}]}

Thanks,
Ishan

On Wed, Nov 9, 2022 at 6:27 PM Jan Høydahl <[email protected]> wrote:

> Thanks for putting this together Ishan,
>
> Can you explain what "restart performance" means, and what the y axis number is?
>
> Jan
>
> > On 9 Nov 2022, at 07:39, Ishan Chattopadhyaya <[email protected]> wrote:
> >
> > I'm working on automating performance testing, details in
> > https://issues.apache.org/jira/browse/SOLR-16525.
> >
> > Even before I could complete the automation, I observed massive slowdown in
> > restart performance, now attributable to
> > https://issues.apache.org/jira/browse/SOLR-16414. This affected 9.1 release
> > candidate RC1, but is now fixed in 9.1 and 9x branches.
> >
> > However, while performance was back to original levels on 9.1 branch, there
> > was a 80-100% slowdown on the 9x branch even after this fix.
> > Please see: http://mostly.cool/cluster-test.json.html
> > The test is here:
> > https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json
> >
> > In order to investigate the slowdown, I retroactively applied the patch
> > that fixed the performance problem in SOLR-16414 (removing use of
> > parallelStream) to the intermediate commits and plotted the graph:
> > http://mostly.cool/cluster-test-with-patch.html
> >
> > And now, two more commits with potential slowdowns are observed. Here are
> > the JIRA issues I've opened for both:
> > https://issues.apache.org/jira/browse/SOLR-16530
> > https://issues.apache.org/jira/browse/SOLR-16531
> >
> > In a week of working on this automation, I was able to catch 3 slowdowns on
> > the first thing I automated. It might be good to keep this running and test
> > other aspects. Going forward, I'll be automating more performance suites
> > and open blocker JIRA issues on significant performance degradation,
> > whenever observed. I'll make it easy for all of us to add suites to the
> > framework and have their personal branches/PRs tested through this.
> >
> > Please let me know about any thoughts / concerns / suggestions.
> >
> > Thanks,
> > Ishan
