stillalex commented on PR #1155: URL: https://github.com/apache/solr/pull/1155#issuecomment-1334309402
I attempted to write a benchmark for this change to verify some of the perf gains. Based on my limited understanding of the code, the PR looks reasonable to me, but the results I got differ from what I expected. I am assuming the benchmark itself does not correctly mirror the setup used to perform this analysis. I am maintaining 2 branches, [one with only the benchmark](https://github.com/apache/solr/compare/main...stillalex:SOLR-16497-locks-bench?expand=1) and [one with this PR + the same benchmark](https://github.com/apache/solr/compare/main...stillalex:SOLR-16497-locks-bench-and-patch?expand=1).

* Results on main branch with no changes
```
Iteration   1: 9129555.788 ops/s
Iteration   2: 9203237.026 ops/s
Iteration   3: 9232071.659 ops/s
Iteration   4: 9187238.556 ops/s

Result "org.apache.solr.core.SolrCoresBenchTest.getCoreFromAnyList":
  9188025.757 ±(99.9%) 278959.948 ops/s [Average]
  (min, avg, max) = (9129555.788, 9188025.757, 9232071.659), stdev = 43169.361
  CI (99.9%): [8909065.809, 9466985.706] (assumes normal distribution)

Benchmark                               Mode  Cnt        Score        Error  Units
SolrCoresBenchTest.getCoreFromAnyList  thrpt    4  9188025.757 ± 278959.948  ops/s
```

* Results on patch
```
Iteration   1: 1488901.065 ops/s
Iteration   2: 1503247.470 ops/s
Iteration   3: 1531084.510 ops/s
Iteration   4: 1464588.927 ops/s

Result "org.apache.solr.core.SolrCoresBenchTest.getCoreFromAnyList":
  1496955.493 ±(99.9%) 179578.485 ops/s [Average]
  (min, avg, max) = (1464588.927, 1496955.493, 1531084.510), stdev = 27789.969
  CI (99.9%): [1317377.008, 1676533.978] (assumes normal distribution)

Benchmark                               Mode  Cnt        Score        Error  Units
SolrCoresBenchTest.getCoreFromAnyList  thrpt    4  1496955.493 ± 179578.485  ops/s
```

We effectively drop from roughly 9.2M ops/s to 1.5M ops/s. The main observation here is that the benchmark does not mirror the PR setup. This is obvious from the fact that there are no transient cores (as far as I understand, loading and unloading cores will increase the locking time under heavy load).
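To put a number on the drop reported above, here is a small sketch that compares the two reported scores. It is not part of either branch: the class name `BenchCompare` and its helper methods are hypothetical, and the only inputs are the score and error values printed in the two result summaries.

```java
public class BenchCompare {

    /** How many times slower the patched run is than the baseline (ratio of mean throughputs). */
    public static double slowdown(double baselineScore, double patchedScore) {
        return baselineScore / patchedScore;
    }

    /**
     * True if the intervals [scoreA - errA, scoreA + errA] and
     * [scoreB - errB, scoreB + errB] share at least one point.
     */
    public static boolean intervalsOverlap(double scoreA, double errA,
                                           double scoreB, double errB) {
        double lowA = scoreA - errA, highA = scoreA + errA;
        double lowB = scoreB - errB, highB = scoreB + errB;
        return lowA <= highB && lowB <= highA;
    }

    public static void main(String[] args) {
        // Score and 99.9% error values copied from the result summaries above.
        double mainScore = 9188025.757, mainErr = 278959.948;
        double patchScore = 1496955.493, patchErr = 179578.485;

        System.out.printf("slowdown: %.2fx%n", slowdown(mainScore, patchScore));
        System.out.printf("patch as share of main: %.1f%%%n",
                100.0 * patchScore / mainScore);
        System.out.println("99.9% CIs overlap: "
                + intervalsOverlap(mainScore, mainErr, patchScore, patchErr));
    }
}
```

With the numbers above this works out to a slowdown of about 6.1x (the patch reaches roughly 16% of the throughput on main), and the two 99.9% intervals do not overlap, so the gap is well outside the reported measurement error. With only 4 iterations per run the intervals themselves are rough, so this is a sanity check rather than a rigorous comparison.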
I would like to understand how common setups that use transient cores are. The Solr Guide mentions that they are not recommended under SolrCloud: see the [Core Discovery](https://solr.apache.org/guide/solr/latest/configuration-guide/core-discovery.html) page, under the `transient` property. Also, if there is interest, I can spend more time tweaking this benchmark. I think having some way to verify/reproduce any perf change is very beneficial.
