Re: Benchmarking results

2006-04-09 Thread Marvin Humphrey
On Apr 7, 2006, at 10:05 AM, Doug Cutting wrote:
> Another axis that I don't think you're yet measuring is how things change as the index grows.

There are lots of axes that I haven't measured yet, but I do have to move on to other things sooner or later. :) Running decent scientific

Re: Benchmarking results

2006-04-09 Thread Marvin Humphrey
Hello, I have discovered a serious bug in the LuceneIndexer benchmarking app. All tests have been rerun, and the new numbers reflect a 13-15% improvement for Lucene. I apologize for having reported bad data. Here are some of the new results, both with and without the bug so that you can

Re: Benchmarking results

2006-04-07 Thread Doug Cutting
Marvin Humphrey wrote: However, having established that KinoSearch is in Lucene's league with regard to indexing speed, I'm not worried about absolute numbers, and the new benchmarker interface is slightly more stable, allowing more accurate comparative analysis of algorithmic efficiency.

Re: Benchmarking results

2006-04-07 Thread Grant Ingersoll
Marvin Humphrey wrote: On Apr 4, 2006, at 10:23 AM, Tatu Saloranta wrote: So in this case, what would give more comparable results (assuming you are interested in measuring likely server-side usage scenario, which is usually what Lucene is used for) Actually, I think the benchmark results i

Re: Benchmarking results

2006-04-06 Thread Marvin Humphrey
On Apr 4, 2006, at 10:23 AM, Tatu Saloranta wrote:
> So in this case, what would give more comparable results (assuming you are interested in measuring likely server-side usage scenario, which is usually what Lucene is used for)

My main interest with these tests is algorithmic performance. How

Re: Benchmarking results

2006-04-04 Thread Igor Bolotin
For faster HotSpot warm-up you can use the HotSpot VM option -XX:CompileThreshold=NN. This option controls the number of method invocations/branches before (re-)compiling. Defaults are: 10,000 for -server, 1,500 for -client. See the documentation here: http://java.sun.com/docs/hotspot/VMOptions.html In one of my pre
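
When experimenting with this option, it can help to confirm from inside the benchmark that the flag actually reached the JVM. The following is a minimal sketch, not part of the original benchmark code: the class name `CompileThresholdCheck` and the helper `findCompileThreshold` are hypothetical, and it only inspects the arguments the JVM was started with via the standard `RuntimeMXBean`.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: reports whether -XX:CompileThreshold=NN was passed to this JVM.
public class CompileThresholdCheck {
    private static final String FLAG = "-XX:CompileThreshold=";

    // Returns the first well-formed CompileThreshold value found in the given JVM arguments.
    static Optional<Integer> findCompileThreshold(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                try {
                    return Optional.of(Integer.parseInt(arg.substring(FLAG.length())));
                } catch (NumberFormatException e) {
                    // Malformed value: skip it and keep looking.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to main), e.g. -server, -Xmx..., -XX:...
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        Optional<Integer> threshold = findCompileThreshold(jvmArgs);
        if (threshold.isPresent()) {
            System.out.println("CompileThreshold set on the command line: " + threshold.get());
        } else {
            System.out.println("CompileThreshold not set on the command line; the VM default applies.");
        }
    }
}
```

Running it as `java -XX:CompileThreshold=1000 CompileThresholdCheck` should report the value; the number 1000 here is only an illustration, not a recommendation from the thread.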

Re: Benchmarking results

2006-04-04 Thread Tatu Saloranta
> The times for KinoSearch and Lucene are 5-run ...
> is due to cache reassignment.) Therefore, the same command was issued on the command line 6 times, separated by semicolons. The first iter was discarded, and the rest were averaged. ...
> The maximum memory consumption was meas
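
The quoted procedure (six runs, first discarded, remaining five averaged) can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's actual code: the class name `WarmupAverage`, the method `averageAfterWarmup`, and the timings in `main` are all made up for the example and are not measurements from the thread.

```java
import java.util.Arrays;

// Hypothetical helper: averages run times after discarding leading warm-up runs.
public class WarmupAverage {

    // Drops the first `discard` entries of runSeconds and returns the mean of the rest.
    static double averageAfterWarmup(double[] runSeconds, int discard) {
        if (discard < 0 || discard >= runSeconds.length) {
            throw new IllegalArgumentException(
                "discard must be >= 0 and smaller than the number of runs");
        }
        return Arrays.stream(runSeconds, discard, runSeconds.length)
                     .average()
                     .getAsDouble(); // safe: at least one run remains after the check above
    }

    public static void main(String[] args) {
        // Illustrative timings only (seconds); the first entry stands in for the discarded run.
        double[] runs = {9.0, 5.0, 5.5, 4.5, 5.0, 5.0};
        System.out.println("Mean of runs 2-6: " + averageAfterWarmup(runs, 1));
    }
}
```

Keeping the number of discarded runs as a parameter makes it easy to check how sensitive the reported mean is to the warm-up cutoff.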

RE: Benchmarking results

2006-04-04 Thread Pasha Bizhan
Hi,

> From: Marvin Humphrey [mailto:[EMAIL PROTECTED]
> The test corpus was Reuters-21578, Distribution 1.0. Reuters-21578 is available from David D. Lewis' professional home page, currently:
>
> http://www.research.att.com/~lewis

The correct link is http://www.daviddlewis.com/re