To the extent I'm in the mood to use my limited uncompensated time here, it's on ensuring that solr/benchmark is a viable base layer for a higher layer. I have commenced work on https://issues.apache.org/jira/browse/SOLR-18126.
It's my hope that you will modify your fork of search-benchmark-game as I described to leverage our existing framework. If I can be helpful in improving the base layer, let me know. I could set Claude loose on this, but I'd only want to start the effort, resulting in a PR that's hopefully useful, without necessarily seeing it through.

I'd like to propose that search-benchmark-game (or do we now call it solr-benchmark-game?) work with Solr as a single Solr Docker container per version/env being compared. This still allows multiple shards, and I *think* multiple replicas, even though that's architecturally pointless on a single node. I'm suggesting this to help simplify/constrain things, at least initially.

On Sun, Feb 22, 2026 at 9:25 AM David Eric Pugh via dev <[email protected]> wrote:

> I'm all for it.... The plan looks extensive enough that you might want to
> write it up as a SIP so there is something that we can all look at, and
> refresh our minds as time passes.
>
> I'm going to make the comment that makes me sound like a naysayer: our
> biggest challenge in this space is not ideas on HOW to do load testing and
> perf testing, but someone willing to DO the work. We have a number of
> internal proprietary solutions, but nothing that has quite "jelled" yet as
> a community solution. Which means that the act of perf testing is not
> part of our muscle memory as a community. We don't do perf comparisons as
> part of a release, for example.
>
> I think what I can offer up, as it's in my immediate paid work, is to get
> the Search Benchmark repo to cover Solr 9, 10, and 11 setups with a
> two-replica, no-shard setup. I can commit to running the comparisons as we
> cut release candidates. I also plan on doing a version that compares
> user-managed one node, no sharding to SolrCloud one node, no sharding,
> embedded ZK.
>
> I suspect that if you put together a robust SIP, with well-defined tasks,
> then as folks say "How can I help?" we can point them at the SIP and
> the associated JIRAs to work on items.
> There is clearly interest, but we all seem to build different solutions.
> Maybe with a good SIP we can all build different parts of ONE solution.
>
> On Saturday, February 21, 2026 at 08:32:31 PM EST, David Smiley
> <[email protected]> wrote:
>
> FWIW I asked Claude to ponder this composability migration between the two
> benchmark systems:
> https://docs.google.com/document/d/1iNdtTZ90Q9cLzLrYIspLqTIV0ItfdY5EJ0_dNMYSn-k/edit?usp=sharing
> I'm super impressed!
>
> As a next step, even though non-critical, I'd like to see solr/benchmark
> decoupled from the source tree (moved to the Solr sandbox) and, most
> especially, from assuming/limiting itself to embedded Solr -- although that
> certainly needs to remain an option. The Gradle "Composite Builds
> <https://docs.gradle.org/current/userguide/composite_builds.html>" (aka
> includeBuild) feature can make it easy to continue to use the benchmark
> module against a local source tree for testing WIP (a current advantage of
> the status quo). I use includeBuild at work and love this fantastic Gradle
> feature.
>
> A search-game repo that uses solr/benchmark would take responsibility for
> starting/stopping Solr, probably via Docker. And it would probably
> eventually have a way of retaining a common search index so that identical
> data/segments can be used across the Solr versions being compared (rather
> critical for doing performance comparisons). Although it'd mean we wouldn't
> see new improvements in the latest index format, I think this is the right
> trade-off. Hmmm... come to think of it, Solr's new index upgrader could be
> used to incrementally upgrade a reference index to the latest while
> retaining the same "index geometry". I'll think on that later; it's a
> nice-to-have.
>
> The useful real-world data/queries currently existing in benchmark-game
> can be ported to solr/benchmark to form a new set of benchmarks.
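To make the includeBuild idea above concrete, here is a minimal sketch of what a decoupled benchmark repo's `settings.gradle` might contain. The repo path and the module coordinates are hypothetical (solr/benchmark is not currently published as an artifact); this only illustrates the composite-build mechanism, not a working configuration:

```groovy
// settings.gradle of a hypothetical solr-benchmark-game repo.
// When benchmarking a local work-in-progress Solr checkout, substitute
// the (hypothetical) published benchmark artifact with the local project.
includeBuild('../solr') {
    dependencySubstitution {
        substitute module('org.apache.solr:solr-benchmark') using project(':solr:benchmark')
    }
}
```

Without the `includeBuild` line present, the same repo would resolve a released artifact instead, so the two modes of use (released Solr vs. local source tree) could coexist.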
>
> On Fri, Feb 20, 2026 at 10:23 AM David Smiley <[email protected]> wrote:
>
> > On Fri, Feb 20, 2026 at 8:23 AM David Eric Pugh via dev
> > <[email protected]> wrote:
> >
> >> I'll be honest: the JMH stuff, I think I need to learn it for when I try
> >> to do actual writing of code and want to understand performance, but I
> >> don't think right now it's a generalizable perf tool? Can I use it to say
> >> "Solr 10.1 has the same performance characteristics as 9.8.2"? Which is
> >> the question that I'm trying to answer.
> >
> > Nor do I think the code/technology in solr/benchmark should answer that
> > question by itself. I think it's a well-scoped project that shouldn't try
> > to address every use case in the field of benchmarking.
> >
> > My point is, we should seek complementary/composable things rather than
> > non-interoperable things that overlap significantly in scope and thus,
> > unfortunately, compete with each other. That spreads our
> > resources/investments thin and causes someone to put a benchmark in one
> > place versus another when, ideally, there would be one natural place for
> > Solr's benchmarks.
> >
> > I'm willing to put some time into this.
> >
> >> I think there are a lot of great ideas out there. Our challenge as a
> >> community has been "can we actually move forward with any of them" and
> >> "how do we support them". I'm totally up for any tool, and I think we
> >> need to make sure perfection doesn't stop progress.
> >> The Gatling-based stuff in https://github.com/apache/solr-sandbox just
> >> seemed too cumbersome for me. Being able to compare across revisions
> >> means storing data and keeping the perf test environment the same, which
> >> I think is pretty hard to do.
> >
> > Frustratingly, these are all from-scratch, non-composable efforts.
> >> I like the fact that the setup per version of Solr is stored in
> >> https://github.com/epugh/search-benchmark-game/tree/master/engines and I
> >> can run them on my laptop, or fire up a DigitalOcean droplet with lots of
> >> CPUs and RAM and run it there... and the comparison between the versions
> >> remains valid. It also just felt pretty "easy" to get started.
> >> I am excited about being able to run some perf tests against single-node
> >> user-managed (standalone) mode and single-node embedded-ZK SolrCloud mode
> >> and get a sense of performance impacts.
> >> I *do* hope to not become a performance benchmarks guy ;-).
> >
> > I do think "search-benchmark-game" is a promising contender to be a
> > *layer* of an entire benchmark solution. The fact that there are multiple
> > engines supported implies decoupling that's necessary for it to be a
> > layer, versus something all-encompassing. As a layer, it should not be
> > supplying data & queries; let the underlying low-level benchmark do that.
> >
> > Note that solr/benchmark's jmh.sh can emit its results in JSON, which is
> > key for consumability by a higher layer. (Gatling doesn't support that,
> > if I recall.)
> >
> > An MVP could just work the solr/benchmark benchmarks as they are, but I
> > could see utility in decoupling solr/benchmark from MiniSolrCloudCluster
> > (embedding Solr -> talking to Solr), especially to re-use an index over
> > multiple Solr versions.
> >
> > ~ David
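Since JMH's JSON output is the contract between the base layer and a higher layer, the higher layer's version-comparison step can be very small. A sketch, assuming standard JMH JSON result files (a list of runs, each with a `benchmark` name and a `primaryMetric.score`); the file names and the comparison policy here are illustrative, not anything solr/benchmark ships today:

```python
import json


def compare_jmh(baseline_path, candidate_path):
    """Return per-benchmark relative score deltas between two JMH JSON
    result files: (candidate - baseline) / baseline.

    Note: whether a positive delta is an improvement depends on the
    benchmark mode (throughput: higher is better; average time: lower is
    better), so callers must interpret the sign per benchmark.
    """
    def load(path):
        with open(path) as f:
            # Standard JMH JSON: a list of run objects.
            return {run["benchmark"]: run["primaryMetric"]["score"]
                    for run in json.load(f)}

    base = load(baseline_path)
    cand = load(candidate_path)
    # Only compare benchmarks present in both result sets.
    return {name: (cand[name] - score) / score
            for name, score in base.items() if name in cand}
```

For example, comparing `solr-9.json` against `solr-10.json` (hypothetical file names) would yield something like `{"QueryBench.search": -0.10}` for a 10% drop in the primary score.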
