On Mon, Nov 23, 2015 at 2:42 PM, Sanjoy Das <[email protected]> wrote:
> Hi all,
>
> I work for a JVM vendor, and we're interested in obtaining / creating
> a set of Lucene benchmarks for internal use. We plan to use these for
> performance regression testing and general performance analysis
> (i.e. to make sure Lucene performs well on our JVM). I'm especially
> interested in benchmarks that demonstrate opportunities for
> improvements in our JIT compiler.
>
> While I imagine that the lucene/benchmark/ directory is probably the
> right place to start, I have a few high-level questions that are best
> answered by people on this mailing list:

Actually, I think http://people.apache.org/~mikemccand/lucenebench/
might be better for your purposes. The code is currently located here:
https://github.com/mikemccand/luceneutil

> - Are there realistic Lucene workloads that are bottle-necked on the
>   JVM's performance (JIT, GC etc.) and *not* e.g. disk / network IO?
>   If so, what are some examples?

At the above link you can see how the query graphs changed when the
JVM was upgraded. In some cases the changes are not positive. For
example, why did indexing throughput drop significantly when upgrading
from 1.8.0_25 to 1.8.0_40? (annotation BD in
http://people.apache.org/~mikemccand/lucenebench/indexing.html)

> - How relevant are the Dacapo "luindex" and "lusearch" benchmarks
>   today? Will porting them to the latest version of Lucene give me a
>   benchmark representative of modern Lucene usage, or has Lucene's
>   performance characteristics evolved in fundamental ways since Dacapo
>   was published?

Some things have changed since Lucene 2.4, such as much better
concurrency when indexing with multiple threads and the use of bulk
integer decompression methods instead of vByte compression. Support
for new data structures such as column-stride fields was also added,
and the use cases around those (e.g. faceted search) are probably not
represented.
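
To make the vByte vs. bulk decompression point above more concrete,
here is a minimal sketch of the two styles of integer decoding. It is
not Lucene's actual code: the class name IntCodecSketch and all of its
methods are made up for illustration, and the fixed-width packer
assumes every value fits in the given number of bits (1 to 32). The
vByte loop has a data-dependent branch on every byte, while the
fixed-width loop has a predictable shape that a JIT can more readily
unroll or vectorize, which is roughly why the change matters for JVM
benchmarking.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; not taken from Lucene.
class IntCodecSketch {

    // vByte: 7 payload bits per byte; a set high bit means "more bytes follow".
    static byte[] vByteEncode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            while ((v & ~0x7F) != 0) {
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v);
        }
        return out.toByteArray();
    }

    // Decodes `count` values; each byte needs a data-dependent branch.
    static int[] vByteDecode(byte[] bytes, int count) {
        int[] result = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int value = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            result[i] = value;
        }
        return result;
    }

    // Fixed-width packing: every value occupies exactly bitsPerValue bits.
    // Assumes 1 <= bitsPerValue <= 32 and that each value fits in that width.
    static long[] pack(int[] values, int bitsPerValue) {
        long[] packed = new long[(values.length * bitsPerValue + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int word = (int) (bitPos >>> 6);
            int offset = (int) (bitPos & 63);
            long v = values[i] & 0xFFFFFFFFL;
            packed[word] |= v << offset;
            if (offset + bitsPerValue > 64) {
                // The value straddles two words; spill the high bits.
                packed[word + 1] |= v >>> (64 - offset);
            }
        }
        return packed;
    }

    // Bulk decode: the same shift-and-mask pattern for every value.
    static int[] unpack(long[] packed, int bitsPerValue, int count) {
        int[] result = new int[count];
        long mask = (1L << bitsPerValue) - 1;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bitsPerValue;
            int word = (int) (bitPos >>> 6);
            int offset = (int) (bitPos & 63);
            long v = packed[word] >>> offset;
            if (offset + bitsPerValue > 64) {
                v |= packed[word + 1] << (64 - offset);
            }
            result[i] = (int) (v & mask);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {1, 127, 128, 300, 16384};
        byte[] vb = vByteEncode(values);
        long[] packed = pack(values, 15);
        System.out.println("vByte bytes: " + vb.length);
        System.out.println("packed longs: " + packed.length);
        System.out.println(Arrays.toString(vByteDecode(vb, values.length)));
        System.out.println(Arrays.toString(unpack(packed, 15, values.length)));
    }
}
```

A benchmark ported from an old Lucene version would mostly exercise
loops of the first kind, so it may say little about how well a JIT
handles the second.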
