> As a next step [...] I'd like to see solr/benchmark decoupled from [...]
> assuming/limiting itself to embedded Solr
JMH is great in its niche, but it's much more of a specialty tool than a
Swiss Army knife that can do a bunch of different things. JMH is for
microbenchmarking: for profiling individual method calls or codepaths within
the hosting JVM. And embedded Solr is how we do that. If the "Java code
being profiled" becomes a network call out to some other Solr running
elsewhere, the whole feature set and value of JMH kind of disappears.

Just a word of caution to be wary of "when you have a hammer..." thinking
here. It'd be great if every performance test could live in one place and
use the same framework we use in solr/benchmark, but that's just not the
reality.

On Sun, Feb 22, 2026 at 2:04 PM David Smiley <[email protected]> wrote:

> The extent to which I'm in the mood to use my limited uncompensated time
> here is ensuring that solr/benchmark is a viable base layer for a higher
> layer. I have commenced on https://issues.apache.org/jira/browse/SOLR-18126
>
> It's my hope that you will modify your fork of search-benchmark-game as I
> described to leverage our existing framework. If I can be helpful in
> improving the base layer, let me know. I could set Claude loose on this,
> but I'd only want to start the effort, resulting in a PR that's hopefully
> useful, without necessarily seeing it through.
>
> I'd like to propose that search-benchmark-game (or do we now call it
> solr-benchmark-game?) work with Solr as a single Solr Docker container per
> version/env being compared. This allows multiple shards, and I *think*
> multiple replicas, even though it's architecturally pointless. I'm
> suggesting this to help simplify/constrain, at least initially.
>
> On Sun, Feb 22, 2026 at 9:25 AM David Eric Pugh via dev <[email protected]>
> wrote:
>
> > I'm all for it.... The plan looks extensive enough, you might want to
> > write it up as a SIP so there is something that we can all look at, and
> > refresh our minds as time passes.
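To make the "one Docker container per version/env being compared" proposal concrete, here is a minimal sketch of how an orchestrating layer might compose the commands, assuming the official `solr:<version>` image tags and Solr's default 8983 HTTP port; the function name, version strings, and ports are illustrative, not from the thread:

```python
# Illustrative sketch: one "docker run" command per Solr version under
# comparison. Image tags, ports, and container names are assumptions.

def solr_container_cmd(version: str, host_port: int) -> list[str]:
    """Build the docker command to start one Solr container for one version."""
    name = f"solr-bench-{version}"
    return [
        "docker", "run", "-d",
        "--name", name,
        "-p", f"{host_port}:8983",  # map a distinct host port per version
        f"solr:{version}",          # official Solr image tag (assumed)
    ]

# One container per version/env being compared:
versions = ["9.8.2", "10.1.0"]
commands = [solr_container_cmd(v, 8983 + i) for i, v in enumerate(versions)]
for cmd in commands:
    print(" ".join(cmd))
```

The benchmark-game layer would run each command, replay the same data/queries against each container's host port, and tear the containers down afterwards.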
> > I'm going to make the comment that makes me sound like a naysayer: our
> > biggest challenge in this space is not ideas on HOW to do load testing
> > and perf testing, but someone willing to DO the work. We have a number
> > of internal proprietary solutions, but nothing that has quite "jelled"
> > yet as a community solution. Which means that the act of perf testing is
> > not part of our muscle memory as a community. We don't do perf
> > comparisons as part of release, for example.
> >
> > I think what I can offer up, as it's in my immediate paid work, is to
> > get the Search Benchmark Repo to cover Solr 9, 10, and 11 setups with a
> > two-replica, no-shard setup. I can commit to running the comparisons as
> > we cut release candidates. I also plan on doing a version that compares
> > user-managed 1 node no sharding to SolrCloud 1 node no sharding with
> > embedded ZK.
> >
> > I suspect that if you put together a robust SIP, with well defined
> > tasks, then as folks say "How can I help?" we can point them at the SIP
> > and the associated JIRAs to work on items. There is clearly interest,
> > but we all seem to build different solutions. Maybe with a good SIP we
> > can all build different parts of ONE solution.
> >
> > On Saturday, February 21, 2026 at 08:32:31 PM EST, David Smiley <
> > [email protected]> wrote:
> >
> > FWIW I asked Claude to ponder this composability migration between the
> > two benchmark systems:
> >
> > https://docs.google.com/document/d/1iNdtTZ90Q9cLzLrYIspLqTIV0ItfdY5EJ0_dNMYSn-k/edit?usp=sharing
> >
> > I'm super impressed!
> >
> > As a next step, even though non-critical, I'd like to see solr/benchmark
> > decoupled from the source tree (move to the Solr sandbox) and most
> > especially from assuming/limiting itself to embedded Solr -- although
> > that certainly needs to remain an option.
> > The Gradle "Composite Builds
> > <https://docs.gradle.org/current/userguide/composite_builds.html>" (aka
> > includeBuild) feature can make it easy to continue to use the benchmark
> > module against a local source tree for testing WIP (a current advantage
> > of the status quo). I use includeBuild at work and love this fantastic
> > Gradle feature.
> >
> > A search-game repo that uses solr/benchmark would take responsibility
> > for starting/stopping Solr, probably via Docker. And it would probably
> > eventually have a way of retaining a common search index so that
> > identical data/segments can be used across the Solr versions being
> > compared (rather critical for doing performance comparisons). Although
> > it'd mean we wouldn't see new improvements in the latest index, I think
> > this is the right trade-off. Hmmm... come to think of it, Solr's new
> > index upgrader could be used to incrementally upgrade a reference index
> > to the latest while retaining the same "index geometry". I'll think on
> > that later; it's a nice-to-have.
> >
> > The useful real-world data / queries currently existing in
> > benchmark-game can be ported to solr/benchmark to form a new set of
> > benchmarks.
> >
> > On Fri, Feb 20, 2026 at 10:23 AM David Smiley <[email protected]> wrote:
> >
> > > On Fri, Feb 20, 2026 at 8:23 AM David Eric Pugh via dev <
> > > [email protected]> wrote:
> > >
> > >> I'll be honest, the JMH stuff -- I think I need to learn it for when
> > >> I try to do actual writing of code and want to understand
> > >> performance, but I don't think right now it's a generalizable perf
> > >> tool? Can I use it to say "Solr 10.1 has the same performance
> > >> characteristics as 9.8.2"? Which is the question that I'm trying to
> > >> answer.
> > >
> > > Nor do I think the code/technology in solr/benchmark should answer
> > > that question by itself.
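The composite-build idea above could look roughly like the following in the `settings.gradle` of a hypothetical external benchmark repo; this is a sketch only, and the project name and the relative path to a local Solr checkout are illustrative assumptions, not from the thread:

```groovy
// settings.gradle -- sketch only; names and paths are illustrative
rootProject.name = 'solr-benchmark-game'

// Substitute published Solr artifacts with a local source checkout so the
// benchmark module can exercise work-in-progress code (the status-quo
// advantage David mentions), while a released-artifact build remains the
// default when the checkout isn't included.
includeBuild('../solr')
```

With this in place, Gradle transparently swaps the Solr dependencies for the locally built ones whenever the included build is present.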
> > > I think it's a well-scoped project that shouldn't try to address
> > > every use-case in the field of benchmarking.
> > >
> > > My point is, we should seek complementary / composable things rather
> > > than non-interoperable things that overlap significantly in scope and
> > > thus unfortunately compete with each other. That spreads our
> > > resources/investments thin and causes someone to put a benchmark in
> > > one place versus another when, ideally, there would be one natural
> > > place for Solr's benchmarks.
> > >
> > > I'm willing to put some time into this.
> > >
> > >> I think there are a lot of great ideas out there. Our challenge as a
> > >> community has been "can we actually move forward with any of them"
> > >> and "how do we support them". I'm totally up for any tool, and I
> > >> think we need to make sure perfection doesn't stop progress.
> > >>
> > >> The Gatling-based stuff in https://github.com/apache/solr-sandbox
> > >> just seemed too cumbersome for me. Being able to compare across
> > >> revisions means storing data and keeping the perf test environment
> > >> the same, which I think is pretty hard to do.
> > >
> > > Frustratingly, these are all from-scratch, non-composable efforts.
> > >
> > >> I like the fact that the setup per version of Solr is stored in
> > >> https://github.com/epugh/search-benchmark-game/tree/master/engines
> > >> and I can run them on my laptop, or fire up a DigitalOcean droplet
> > >> with lots of CPUs and RAM and run it there... and the comparison
> > >> between the versions remains valid. It also just felt pretty "easy"
> > >> to get started.
> > >>
> > >> I am excited about being able to run some perf tests against single
> > >> node user-managed (standalone) mode and single node embedded ZK
> > >> SolrCloud mode and get a sense of performance impacts.
> > >>
> > >> I *do* hope to not become a performance benchmarks guy ;-).
> > > I do think "search-benchmark-game" is a promising contender to be a
> > > *layer* of an entire benchmark solution. The fact that there are
> > > multiple engines supported implies decoupling that's necessary for it
> > > to be a layer, versus something all-encompassing. As a layer, it
> > > should not be supplying data & queries; let the underlying low-level
> > > benchmark do that.
> > >
> > > Note that solr/benchmark's jmh.sh can emit its results in JSON, which
> > > is key for consumability by a higher layer. (Gatling doesn't support
> > > that, if I recall.)
> > >
> > > An MVP could just work the solr/benchmark benchmarks as they are, but
> > > I could see utility in decoupling solr/benchmark from
> > > MiniSolrCloudCluster (embedding Solr -> talking to Solr), especially
> > > to re-use an index over multiple Solr versions.
> > >
> > > ~ David

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
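To make the "JSON output is key for a higher layer" point concrete, here is a minimal sketch of what such a layer might do with JMH's standard JSON report (an array of entries carrying a benchmark name and a primaryMetric score). The benchmark name, inline report contents, and the 5% tolerance are illustrative assumptions, not from the thread:

```python
import json

def load_scores(jmh_json_text: str) -> dict[str, float]:
    """Map benchmark name -> primaryMetric score from a JMH JSON report."""
    return {r["benchmark"]: r["primaryMetric"]["score"]
            for r in json.loads(jmh_json_text)}

def compare(baseline: dict[str, float], candidate: dict[str, float],
            tolerance: float = 0.05) -> dict[str, str]:
    """Flag benchmarks whose score moved more than `tolerance` (fractional)."""
    verdicts = {}
    for name, base in baseline.items():
        if name not in candidate:
            continue  # benchmark absent in the other version; skip
        delta = (candidate[name] - base) / base
        verdicts[name] = "ok" if abs(delta) <= tolerance else f"changed {delta:+.1%}"
    return verdicts

# Tiny inline reports standing in for two jmh.sh runs on different versions:
v9 = '[{"benchmark": "CloudIndexing.index", "primaryMetric": {"score": 100.0}}]'
v10 = '[{"benchmark": "CloudIndexing.index", "primaryMetric": {"score": 90.0}}]'
print(compare(load_scores(v9), load_scores(v10)))
```

A real layer would of course read the report files produced per Solr version and present the deltas, but the consumption pattern is this simple once the results are JSON.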
