I'm all for it. The plan looks extensive enough that you might want to write
it up as a SIP, so there is something we can all look at and refresh our
memories with as time passes.
I'm going to make the comment that makes me sound like a naysayer: our
biggest challenge in this space is not ideas on HOW to do load testing and
perf testing, but someone willing to DO the work. We have a number of
internal proprietary solutions, but nothing that has quite "jelled" yet as a
community solution. That means the act of perf testing is not part of our
muscle memory as a community; we don't do perf comparisons as part of a
release, for example.
I think what I can offer up, as it's part of my immediate paid work, is to
get the Search Benchmark Repo covering Solr 9, 10, and 11 setups with a
two-replica, no-shard configuration. I can commit to running the comparisons
as we cut release candidates. I also plan on doing a version that compares a
user-managed single node (no sharding) to a SolrCloud single node (no
sharding, embedded ZK).
I suspect that if you put together a robust SIP with well-defined tasks, then
as folks ask "How can I help?" we can point them at the SIP and the
associated JIRAs to work on items. There is clearly interest, but we all seem
to build different solutions. Maybe with a good SIP we can all build
different parts of ONE solution.
On Saturday, February 21, 2026 at 08:32:31 PM EST, David Smiley
<[email protected]> wrote:
FWIW I asked Claude to ponder this composability migration between the two
benchmark systems :
https://docs.google.com/document/d/1iNdtTZ90Q9cLzLrYIspLqTIV0ItfdY5EJ0_dNMYSn-k/edit?usp=sharing
I'm super impressed!
As a next step, though non-critical, I'd like to see solr/benchmark
decoupled from the source tree (moved to the Solr sandbox) and most
especially from assuming/limiting itself to embedded Solr -- although that
certainly needs to remain an option. The Gradle "Composite Builds
<https://docs.gradle.org/current/userguide/composite_builds.html>" (aka
includeBuild) feature can make it easy to continue to use the benchmark
module against a local source tree for testing WIP (a current advantage of
the status quo). I use includeBuild at work and love this fantastic Gradle
feature.
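
As a sketch of that idea (the repo name and checkout path here are
assumptions, not anything that exists today), a benchmark repo living outside
the Solr source tree could compose against a local checkout like so:

```groovy
// Hypothetical settings.gradle for a standalone benchmark repo.
// includeBuild substitutes artifacts from the local Solr checkout into
// this build, so work-in-progress changes there are picked up directly.
rootProject.name = 'solr-benchmarks'
includeBuild('../solr')
```

With that in place, a dependency on a Solr module resolves to the local
source tree rather than a published artifact, preserving the current
advantage of testing WIP code.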
A search-game repo that uses solr/benchmark would take responsibility for
starting/stopping Solr, probably via Docker. And would probably eventually
have a way of retaining a common search index so that identical
data/segments can be used across Solr versions being compared (rather
critical for doing performance comparisons). Although it'd mean we wouldn't
see new improvements in the latest index -- I think this is the right
trade-off. Hmmm... come to think of it, Solr's new index upgrader could be
used to incrementally upgrade a reference index to the latest while
retaining the same "index geometry". I'll think on that later; it's a
nice-to-have.
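
For instance (a hedged sketch only; the image tags, volume name, and layout
are assumptions), a compose file could pin the Solr version under test and
keep the index on a named volume, so the same data directory can be reused
when a different version is swapped in:

```yaml
# Hypothetical docker-compose.yml sketch. Changing the image tag while
# keeping the same named volume points different Solr versions at the
# same on-disk index (subject to Lucene's back-compat rules).
services:
  solr:
    image: solr:9.8            # swap the tag to compare another version
    ports:
      - "8983:8983"
    volumes:
      - bench-index:/var/solr  # the official image stores data here
volumes:
  bench-index:
```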
The useful real-world data / queries currently existing in benchmark-game
can be ported to solr/benchmark to form a new set of benchmarks.
On Fri, Feb 20, 2026 at 10:23 AM David Smiley <[email protected]> wrote:
>
>
> On Fri, Feb 20, 2026 at 8:23 AM David Eric Pugh via dev <
> [email protected]> wrote:
>
>> I'll be honest, the JMH stuff, I think I need to learn it for when I try
>> to do actual writing of code and want to understand performance, but I
>> don't think right now it's a generalizable perf tool? Can I use it to say
>> "Solr 10.1 has the same performance characteristics as 9.8.2"? That is the
>> question I'm trying to answer.
>>
>
> Nor do I think the code/technology in solr/benchmark should answer that
> question by itself. I think it's a well-scoped project that shouldn't try
> to address every use case in the field of benchmarking.
>
> My point is, we should seek complementary / composable things rather than
> non-interoperable things that overlap significantly in scope, and thus
> unfortunately compete with each other. That spreads
> our resources/investments thin and causes someone to put a benchmark in one
> place versus another when, ideally, there would be one natural place for
> Solr's benchmarks.
>
> I'm willing to put some time into this.
>
>
>> I think there are a lot of great ideas out there... Our challenge as a
>> community has been "can we actually move forward with any of them" and "how
>> do we support them". I'm totally up for any tool, and I think we need to
>> make sure perfection doesn't stop progress.
>> The gatling based stuff in https://github.com/apache/solr-sandbox just
>> seemed too cumbersome for me. Being able to compare across revisions
>> means storing data, and keeping the perf test environment the same, which I
>> think is pretty hard to do.
>>
>
> Frustratingly, these are all from-scratch, non-composable efforts.
>
>
>> I like the fact that the setup per version of Solr is stored in
>> https://github.com/epugh/search-benchmark-game/tree/master/engines and I
>> can run them on my laptop, or fire up a DigitalOcean droplet with lots of
>> CPUs and RAM and run it there... And the comparison between the versions
>> remains valid. It also just felt pretty "easy" to get started.
>> I am excited about being able to run some perf tests against single node
>> user-managed (standalone) mode and single node embedded ZK Solr cloud mode
>> and get a sense of performance impacts.
>> I *do* hope to not become a performance benchmarks guy ;-).
>>
>
> I do think "search-benchmark-game" is a promising contender to be a
> *layer* of an entire benchmark solution. The fact that there are multiple
> engines supported implies decoupling that's necessary for it to be a
> layer, versus something all-encompassing. As a layer, it should not be
> supplying data & queries; let the underlying low level benchmark do that.
>
> Note that solr/benchmark's jmh.sh can emit its results in JSON, which is
> key for consumability by a higher layer. (Gatling doesn't support that,
> if I recall.)
>
> An MVP could just run the solr/benchmark benchmarks as they are, but I
> could see utility in decoupling solr/benchmark from MiniSolrCloudCluster
> (embedding Solr -> talking to Solr), especially to re-use an index over
> multiple Solr versions.
>
> ~ David
>
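
On the point above about jmh.sh emitting JSON: a higher layer could consume
two such result files and compare primary scores between runs. A minimal
Python sketch, assuming JMH's standard JSON result shape (a top-level array
of objects with "benchmark" and "primaryMetric" keys); the benchmark class
name and scores below are made up for illustration:

```python
import json

def score_ratios(baseline_json: str, candidate_json: str) -> dict:
    """Map each benchmark name to candidate_score / baseline_score."""
    base = {r["benchmark"]: r["primaryMetric"]["score"]
            for r in json.loads(baseline_json)}
    cand = {r["benchmark"]: r["primaryMetric"]["score"]
            for r in json.loads(candidate_json)}
    # Only compare benchmarks present in both runs.
    return {name: cand[name] / base[name]
            for name in base.keys() & cand.keys()}

# Hypothetical excerpts of two JMH JSON result files.
baseline = json.dumps([
    {"benchmark": "org.apache.solr.bench.search.SimpleQuery.query",
     "primaryMetric": {"score": 100.0, "scoreUnit": "ops/s"}},
])
candidate = json.dumps([
    {"benchmark": "org.apache.solr.bench.search.SimpleQuery.query",
     "primaryMetric": {"score": 95.0, "scoreUnit": "ops/s"}},
])

ratios = score_ratios(baseline, candidate)
print(ratios)  # one entry, ratio 0.95
```

Something this small could sit in a higher layer (e.g. a search-game-style
repo) without the low-level benchmark knowing anything about it, which is
the composability being argued for here.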