What I'm inclined to spend my limited uncompensated time on here is
ensuring that solr/benchmark is a viable base layer for a higher
layer.  I have started on https://issues.apache.org/jira/browse/SOLR-18126

It's my hope that you will modify your fork of search-benchmark-game as I
described to leverage our existing framework.  If I can be helpful in
improving the base layer, let me know.  I could set Claude loose on this,
but I'd only want to kick off the effort, producing a PR that's hopefully
useful, without necessarily seeing it through.

I'd like to propose that search-benchmark-game (or do we now call it
solr-benchmark-game?) work with Solr as a single Solr Docker container per
version/env being compared.  This allows multiple shards, and I *think*
multiple replicas, even though multiple replicas on one node is
architecturally pointless.  I'm suggesting this to simplify and constrain
the scope, at least initially.
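Concretely, the one-container-per-version idea might look roughly like the
following; the image tags, ports, container names, and the -c (cloud mode)
argument are illustrative assumptions to be checked against the official
image docs, not part of the proposal:

```shell
# One SolrCloud container (embedded ZK) per version/env under comparison.
# Tags/ports are placeholders; -c asks the image to start in cloud mode.
docker run -d --name solr9-bench  -p 8983:8983 solr:9 -c
docker run -d --name solr10-bench -p 8984:8983 solr:10 -c
```

Each container could then host multiple shards (and replicas) of the
benchmark collection, while the comparison harness only needs to know one
host/port per version.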

On Sun, Feb 22, 2026 at 9:25 AM David Eric Pugh via dev <[email protected]>
wrote:

>  I'm all for it....    The plan looks extensive enough that you might want
> to write it up as a SIP so there is something that we can all look at and
> refresh our minds on as time passes.
> I'm going to make the comment that makes me sound like a naysayer: our
> biggest challenge in this space is not ideas on HOW to do load testing and
> perf testing, but someone willing to DO the work.  We have a number of
> internal proprietary solutions, but nothing that has quite "jelled" yet as
> a community solution.   Which means that the act of perf testing is not
> part of our muscle memory as a community.  We don't do perf comparisons as
> part of a release, for example.
> I think what I can offer up, as it's in my immediate paid work, is to get
> the Search Benchmark repo covering Solr 9, 10, and 11 setups with a
> two-replica, no-shard configuration.  I can commit to running the
> comparisons as we cut release candidates.  I also plan on doing a version
> that compares user-managed single node (no sharding) to SolrCloud single
> node (no sharding, embedded ZK).
> I suspect that if you put together a robust SIP with well-defined tasks,
> then as folks ask "How can I help?" we can point them at the SIP and
> the associated JIRAs to work on items.   There is clearly interest, but we
> all seem to build different solutions.  Maybe with a good SIP we can all
> build different parts of ONE solution.
>     On Saturday, February 21, 2026 at 08:32:31 PM EST, David Smiley <
> [email protected]> wrote:
>
>  FWIW I asked Claude to ponder this composability migration between the two
> benchmark systems:
>
> https://docs.google.com/document/d/1iNdtTZ90Q9cLzLrYIspLqTIV0ItfdY5EJ0_dNMYSn-k/edit?usp=sharing
> I'm super impressed!
>
> As a next step, even though non-critical, I'd like to see solr/benchmark
> decoupled from the source tree (move to the Solr sandbox) and most
> especially from assuming/limiting itself to embedded Solr -- although that
> certainly needs to remain an option.  The Gradle "Composite Builds
> <https://docs.gradle.org/current/userguide/composite_builds.html>" (aka
> includeBuild) feature can make it easy to continue to use the benchmark
> module against a local source tree for testing WIP (a current advantage of
> the status quo).  I use includeBuild at work and love this fantastic Gradle
> feature.
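> For reference, a composite build is essentially a one-liner in the
> consuming build's settings file; the relative path below is hypothetical,
> assuming a sibling checkout of the Solr source tree:
>
> ```groovy
> // settings.gradle of the hypothetical benchmark-consuming repo
> includeBuild("../solr")
> ```
>
> With that in place, dependencies on the benchmark module resolve against
> the local source tree instead of published artifacts.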
>
> A search-game repo that uses solr/benchmark would take responsibility for
> starting/stopping Solr, probably via Docker.  And would probably eventually
> have a way of retaining a common search index so that identical
> data/segments can be used across Solr versions being compared (rather
> critical for doing performance comparisons).  Although it'd mean we wouldn't
> see new improvements in the latest index -- I think this is the right
> trade-off.  Hmmm... come to think of it, Solr's new index upgrader could be
> used to incrementally upgrade a reference index to the latest while
> retaining the same "index geometry".  I'll think on that later; it's a
> nice-to-have.
>
> The useful real-world data / queries currently existing in benchmark-game
> can be ported to solr/benchmark to form a new set of benchmarks.
>
> On Fri, Feb 20, 2026 at 10:23 AM David Smiley <[email protected]> wrote:
>
> >
> >
> > On Fri, Feb 20, 2026 at 8:23 AM David Eric Pugh via dev <
> > [email protected]> wrote:
> >
> >>  I'll be honest, the JMH stuff I think I need to learn for when I try
> >> to do actual writing of code and want to understand performance, but I
> >> don't think right now it's a generalizable perf tool.  Can I use it to
> >> say "Solr 10.1 has the same performance characteristics as 9.8.2"?  Which
> >> is the question that I'm trying to answer.
> >>
> >
> > Nor do I think the code/technology in solr/benchmark should answer that
> > question by itself.  I think it's a well-scoped project that shouldn't
> > try to address every use-case in the field of benchmarking.
> >
> > My point is, we should seek complementary / composable things rather than
> > non-interoperable things that overlap significantly in scope, and thus
> > unfortunately compete with each other.  That spreads our
> > resources/investments thin and causes someone to put a benchmark in one
> > place versus another when, ideally, there would be one natural place for
> > Solr's benchmarks.
> >
> > I'm willing to put some time into this.
> >
> >
> >>  I think there are a lot of great ideas out there.  Our challenge as a
> >> community has been "can we actually move forward with any of them" and
> >> "how do we support them".  I'm totally up for any tool, and I think we
> >> need to make sure perfection doesn't stop progress.
> >> The Gatling-based stuff in https://github.com/apache/solr-sandbox just
> >> seemed too cumbersome for me.  Being able to compare across revisions
> >> means storing data and keeping the perf test environment the same, which
> >> I think is pretty hard to do.
> >>
> >
> > Frustratingly, these are all from-scratch, non-composable efforts.
> >
> >
> >> I like the fact that the setup per version of Solr is stored in
> >> https://github.com/epugh/search-benchmark-game/tree/master/engines and I
> >> can run them on my laptop, or fire up a DigitalOcean droplet with lots of
> >> CPUs and RAM and run it there...  And the comparison between the versions
> >> remains valid.  It also just felt pretty "easy" to get started.
> >> I am excited about being able to run some perf tests against single-node
> >> user-managed (standalone) mode and single-node embedded-ZK SolrCloud mode
> >> and get a sense of performance impacts.
> >> I *do* hope to not become a performance benchmarks guy ;-).
> >>
> >
> > I do think "search-benchmark-game" is a promising contender to be a
> > *layer* of an entire benchmark solution.  The fact that there are multiple
> > engines supported implies the decoupling that's necessary for it to be a
> > layer, versus something all-encompassing.  As a layer, it should not be
> > supplying data & queries; let the underlying low level benchmark do that.
> >
> > Note that solr/benchmark's jmh.sh can emit its results in JSON, which is
> > key for consumability by a higher layer.  (Gatling doesn't support that,
> > if I recall.)
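> >
> > To illustrate the consumability point: a higher layer only needs to parse
> > that JSON and diff the primary score per benchmark across two runs.  A
> > sketch (in Python; the file names are hypothetical, and the exact JMH
> > JSON field names should be verified against real output):
> >
> > ```python
> > import json
> >
> > def load_scores(path):
> >     # JMH JSON output is an array of results; keep name -> primary score.
> >     with open(path) as f:
> >         return {r["benchmark"]: r["primaryMetric"]["score"]
> >                 for r in json.load(f)}
> >
> > base = load_scores("solr-9.jmh.json")   # hypothetical file names
> > cand = load_scores("solr-10.jmh.json")
> > for name in sorted(base.keys() & cand.keys()):
> >     pct = (cand[name] - base[name]) / base[name] * 100
> >     print(f"{name}: {pct:+.1f}%")
> > ```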
> >
> > An MVP could just run the solr/benchmark benchmarks as they are, but I
> > could see utility in decoupling solr/benchmark from MiniSolrCloudCluster
> > (embedding Solr -> talking to Solr), especially to re-use an index over
> > multiple Solr versions.
> >
> > ~ David
> >
>
