Hi all, I work for a JVM vendor, and we're interested in obtaining / creating a set of Lucene benchmarks for internal use. We plan to use these for performance regression testing and general performance analysis (i.e. to make sure Lucene performs well on our JVM). I'm especially interested in benchmarks that demonstrate opportunities for improvements in our JIT compiler.
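For concreteness, below is a rough sketch of the kind of self-contained, CPU-bound index-then-search loop I have in mind, written against what I believe is the current Lucene API. The in-memory ByteBuffersDirectory, the synthetic documents, and the query mix are placeholders I made up (not anything representative); the point is just to keep disk IO out of the picture so the JIT and GC are the main bottleneck.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.ByteBuffersDirectory;
    import org.apache.lucene.store.Directory;

    public class TinyLuceneWorkload {
        public static void main(String[] args) throws Exception {
            StandardAnalyzer analyzer = new StandardAnalyzer();
            // In-memory directory so disk IO stays out of the measurement.
            Directory dir = new ByteBuffersDirectory();

            // Indexing phase (roughly what luindex exercises), over synthetic documents.
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                for (int i = 0; i < 100_000; i++) {
                    Document doc = new Document();
                    doc.add(new TextField("body",
                        "synthetic document number " + i + " with some shared terms",
                        Field.Store.NO));
                    writer.addDocument(doc);
                }
            }

            // Search phase (roughly what lusearch exercises).
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                QueryParser parser = new QueryParser("body", analyzer);
                long hitCount = 0;
                for (int i = 0; i < 10_000; i++) {
                    TopDocs hits = searcher.search(
                        parser.parse("document AND " + (i % 1000)), 10);
                    hitCount += hits.scoreDocs.length;
                }
                System.out.println("hits returned: " + hitCount);
            }
        }
    }

If loops like this are not representative of how Lucene is actually exercised in production, that is exactly the kind of feedback I'm looking for.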
While I imagine that the lucene/benchmark/ directory is probably the right place to start, I have a few high-level questions that are best answered by people on this mailing list:

- Are there realistic Lucene workloads that are bottlenecked on the JVM's performance (JIT, GC, etc.) and *not* on, e.g., disk or network IO? If so, what are some examples?

- How relevant are the DaCapo "luindex" and "lusearch" benchmarks today? Will porting them to the latest version of Lucene give me a benchmark representative of modern Lucene usage, or have Lucene's performance characteristics evolved in fundamental ways since DaCapo was published?

- What is the distribution of Lucene versions in production deployments? Do users tend to aggressively upgrade to the "latest and greatest" Lucene version, or is there usually a non-trivial lag?

Any other information that you think is useful or relevant is welcome.

Thanks!
-- Sanjoy