FWIW, I asked Claude to ponder this composability migration between the two benchmark systems: https://docs.google.com/document/d/1iNdtTZ90Q9cLzLrYIspLqTIV0ItfdY5EJ0_dNMYSn-k/edit?usp=sharing
I'm super impressed!
As a next step, even though it's non-critical, I'd like to see solr/benchmark decoupled from the source tree (moved to the Solr sandbox) and, most especially, decoupled from assuming/limiting itself to embedded Solr -- although that certainly needs to remain an option. Gradle's "Composite Builds <https://docs.gradle.org/current/userguide/composite_builds.html>" (aka includeBuild) feature can make it easy to keep using the benchmark module against a local source tree for testing work in progress (a current advantage of the status quo). I use includeBuild at work and love this fantastic Gradle feature.

A search-game repo that uses solr/benchmark would take responsibility for starting/stopping Solr, probably via Docker. It would probably eventually have a way of retaining a common search index so that identical data/segments can be used across the Solr versions being compared (rather critical for doing performance comparisons). Although that would mean we wouldn't see new improvements in the latest index format, I think it's the right trade-off. Hmmm... come to think of it, Solr's new index upgrader could be used to incrementally upgrade a reference index to the latest version while retaining the same "index geometry". I'll think on that later; it's a nice-to-have.

The useful real-world data / queries currently existing in search-benchmark-game can be ported to solr/benchmark to form a new set of benchmarks.

On Fri, Feb 20, 2026 at 10:23 AM David Smiley <[email protected]> wrote:

> On Fri, Feb 20, 2026 at 8:23 AM David Eric Pugh via dev
> <[email protected]> wrote:
>
>> I'll be honest, the JMH stuff, I think I need to learn it for when I try
>> to do actual writing of code and want to understand performance, but I
>> don't think right now it's a generalizable perf tool? Can I use it to say
>> "Solr 10.1 has the same performance characteristics as 9.8.2"? Which is
>> the question that I'm trying to answer.
>
> Nor do I think the code/technology in solr/benchmark should answer that
> question by itself.
> I think it's a well-scoped project that shouldn't try to address every
> use-case in the field of benchmarking.
>
> My point is, we should seek complementary / composable things rather than
> non-interoperable things that overlap significantly in scope and thus,
> unfortunately, compete with each other. That spreads our
> resources/investments thin and causes someone to put a benchmark in one
> place versus another when, ideally, there would be one natural place for
> Solr's benchmarks.
>
> I'm willing to put some time into this.
>
>> I think there are a lot of great ideas out there. Our challenge as a
>> community has been "can we actually move forward with any of them" and
>> "how do we support them". I'm totally up for any tool, and I think we
>> need to make sure perfection doesn't stop progress.
>> The Gatling-based stuff in https://github.com/apache/solr-sandbox just
>> seemed too cumbersome for me. Being able to compare across revisions
>> means storing data and keeping the perf test environment the same, which
>> I think is pretty hard to do.
>
> Frustratingly, these are all from-scratch, non-composable efforts.
>
>> I like the fact that the setup per version of Solr is stored in
>> https://github.com/epugh/search-benchmark-game/tree/master/engines and I
>> can run them on my laptop, or fire up a DigitalOcean droplet with lots of
>> CPUs and RAM and run it there... and the comparison between the versions
>> remains valid. It also just felt pretty "easy" to get started.
>> I am excited about being able to run some perf tests against single-node
>> user-managed (standalone) mode and single-node embedded-ZK SolrCloud mode
>> and get a sense of performance impacts.
>> I *do* hope to not become a performance benchmarks guy ;-).
>
> I do think "search-benchmark-game" is a promising contender to be a
> *layer* of an entire benchmark solution.
> The fact that there are multiple engines supported implies the decoupling
> that's necessary for it to be a layer, versus something all-encompassing.
> As a layer, it should not be supplying data & queries; let the underlying
> low-level benchmark do that.
>
> Note that solr/benchmark's jmh.sh script can emit its results in JSON,
> which is key for consumability by a higher layer. (Gatling doesn't
> support that, if I recall.)
>
> An MVP could just run the solr/benchmark benchmarks as they are, but I
> could see utility in decoupling solr/benchmark from MiniSolrCloudCluster
> (embedding Solr -> talking to Solr), especially to re-use an index over
> multiple Solr versions.
>
> ~ David
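P.S. To make the composite-build idea above a bit more concrete, here's a rough sketch of what the settings file of a standalone benchmark repo could look like. The repo name and the relative path are hypothetical -- this is just the general shape of includeBuild usage, not a worked-out build:

```groovy
// settings.gradle of a hypothetical standalone "solr-benchmarks" repo.
rootProject.name = 'solr-benchmarks'

// Pull in a local Solr checkout as an included build. Gradle then
// substitutes the locally built Solr artifacts for the published ones
// (matching on group:name coordinates), so work-in-progress changes in
// the checkout get benchmarked directly. The relative path here is an
// assumption -- point it at your own checkout.
includeBuild('../solr')
```

Without the includeBuild line, the same benchmark repo would resolve released Solr artifacts from Maven Central, which is exactly the dual-mode (local WIP vs. released version) behavior we'd want.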
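And to illustrate why the JSON output matters for a higher layer: it lets a wrapper diff two versions' results mechanically instead of eyeballing console output. A minimal sketch, assuming JMH's standard JSON result shape (a list of entries, each with a "benchmark" name and a "primaryMetric" object carrying a "score"); the benchmark names and numbers below are made up:

```python
import json

def load_scores(text):
    """Map benchmark name -> primary score from a JMH JSON results blob."""
    return {e["benchmark"]: e["primaryMetric"]["score"] for e in json.loads(text)}

def regressions(baseline, candidate, tolerance=0.05):
    """Return benchmarks whose score dropped by more than `tolerance`.

    Assumes higher score = better (e.g. JMH throughput mode)."""
    out = {}
    for name, base in baseline.items():
        cand = candidate.get(name)
        if cand is not None and cand < base * (1 - tolerance):
            out[name] = (base, cand)
    return out

# Hypothetical results for two Solr versions, in JMH's JSON shape.
v9 = '[{"benchmark": "CloudIndexing.index", "primaryMetric": {"score": 100.0}}]'
v10 = '[{"benchmark": "CloudIndexing.index", "primaryMetric": {"score": 90.0}}]'

print(regressions(load_scores(v9), load_scores(v10)))
# -> {'CloudIndexing.index': (100.0, 90.0)}
```

A real wrapper would read the two files jmh.sh wrote, and would want confidence intervals rather than a flat tolerance, but the point is that JSON makes this a ten-line layer rather than a screen-scraping exercise.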
