Hi there,

After the recent discussions about indexing/searching speed under different parameters, it became even clearer that we need a comprehensive and repeatable benchmark.

I created a class that is my first hack at benchmarking various aspects of Lucene over a range of different parameters. Since it uses a standard, well-defined document collection, I hope its results will be more or less meaningful across different OS/hardware combinations.

I had a look at JUnitPerf, but found the API to be too limited for collecting complex time-series data, so I basically rolled my own benchmarking framework... If you know a better way to do it, I'm all ears.
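To illustrate what "collecting time-series data" can mean in practice, here is a minimal sketch of a hand-rolled timing harness. It is not taken from LuceneBenchmark.java: the class and method names (TimeSeriesBench, Sample, measure) are hypothetical, and the workload is a stand-in for real indexing or searching. The idea is to record elapsed time at regular intervals during a run rather than only one total at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical sketch of a time-series timing harness; not the actual benchmark code.
class TimeSeriesBench {

    /** One point in the series: items completed so far and elapsed time so far. */
    static final class Sample {
        final int itemsDone;
        final long elapsedNanos;

        Sample(int itemsDone, long elapsedNanos) {
            this.itemsDone = itemsDone;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /**
     * Runs the task totalItems times (passing the item number 0..totalItems-1)
     * and records a sample after every sampleEvery items, plus one after the
     * last item if it did not fall on a sampling boundary.
     */
    static List<Sample> measure(IntConsumer task, int totalItems, int sampleEvery) {
        if (sampleEvery <= 0) {
            throw new IllegalArgumentException("sampleEvery must be positive");
        }
        List<Sample> series = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 1; i <= totalItems; i++) {
            task.accept(i - 1);
            if (i % sampleEvery == 0 || i == totalItems) {
                series.add(new Sample(i, System.nanoTime() - start));
            }
        }
        return series;
    }

    public static void main(String[] args) {
        // Stand-in workload; a real run would add a document to an index here.
        StringBuilder sink = new StringBuilder();
        List<Sample> series = measure(i -> sink.append(i % 10), 100_000, 25_000);
        for (Sample s : series) {
            System.out.printf("%d items in %.3f ms%n", s.itemsDone, s.elapsedNanos / 1e6);
        }
    }
}
```

Plotting itemsDone against elapsedNanos shows whether throughput stays flat or degrades as the run progresses, which a single total time would hide.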

I'm going to package it into a self-running application (WebStart?), but for now you can try to compile and run it yourself. You can get it here:

        http://www.getopt.org/lb/LuceneBenchmark.java

It depends on the commons-compress.jar, specifically on the Tar functionality. This JAR is in commons-sandbox, so it may not be readily available - in that case you can get it here:

        http://www.getopt.org/lb/commons-compress.jar

(I will put an index page there, but for now use these direct links.)

CAVEAT: please NOTE WELL that this benchmark runs at 100% CPU and 100% disk I/O for SEVERAL HOURS, even on modern equipment (partial results are printed to System.out from time to time). You have been warned - so don't send me any fried mobos or melted drives for repairs, ok?

You can cut down the number of input parameter values to reduce the overall time, or use the mini* document collection (but this reduces the number of documents in the index). See the comments in the source.
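As a rough illustration of why trimming parameters helps: assuming the benchmark tries every combination of its parameter values, the number of runs is the product of the number of values per parameter. The sketch below only does that arithmetic. The parameter names and values are placeholders, not the ones used in LuceneBenchmark.java, and the ParameterGrid class is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: counts the runs in a full sweep over a parameter grid.
class ParameterGrid {

    /** Number of runs in a full sweep: the product of the value counts of all parameters. */
    static long combinations(Map<String, int[]> grid) {
        long total = 1;
        for (int[] values : grid.values()) {
            total *= values.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder parameters; the real benchmark's parameters may differ.
        Map<String, int[]> grid = new LinkedHashMap<>();
        grid.put("mergeFactor", new int[] {10, 20, 50, 100});
        grid.put("bufferedDocs", new int[] {10, 100, 1000});
        grid.put("compoundFile", new int[] {0, 1}); // 0 = off, 1 = on

        System.out.println("full sweep: " + combinations(grid) + " runs");    // 4 * 3 * 2 = 24

        // Halving the values of a single parameter halves the whole sweep.
        grid.put("mergeFactor", new int[] {10, 100});
        System.out.println("reduced sweep: " + combinations(grid) + " runs"); // 2 * 3 * 2 = 12
    }
}
```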

Comments and patches are welcome!

--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)

