For faster HotSpot warm-up you can use the HotSpot VM option -XX:CompileThreshold=NN
This option controls the number of method invocations/branches before a method is (re-)compiled. The defaults are 10,000 for -server and 1,500 for -client. See the documentation here: http://java.sun.com/docs/hotspot/VMOptions.html

In one of my previous benchmarking projects we used -XX:CompileThreshold=100 to force compilation to happen as soon as possible. However, you still have to warm up the JVM before measuring performance. What we did was index a relatively small corpus into a different directory, without taking measurements, before running the actual benchmark in the same JVM session.

Igor

On 4/4/06, Tatu Saloranta <[EMAIL PROTECTED]> wrote:
> > The times for KinoSearch and Lucene are 5-run
> ...
> > is due to cache reassignment.) Therefore, the same
> > command was issued on the command line 6 times, separated by
> > semicolons. The first iter was discarded, and the rest were
> > averaged.
> ...
> > The maximum memory consumption was measured during auxiliary passes
> > (i.e. not averaged in), using the crude method of eyeballing RPRVT in
> > the output of top.
>
> Marvin, I think it is great that different implementations are
> compared, and your results are interesting. However, I think that
> the above methodology does not work well with Java (it may work
> better for/with Perl, but might have problems there as well).
> In this case it is maybe not quite as big a difference as for
> some other tests (since the test runs were almost a minute long),
> i.e. no order-of-magnitude difference, but it will be noticeable.
>
> The reason is that it is crucial NOT to run consecutive tests
> by restarting the JVM, unless you really want to measure one-shot,
> single-run command-line total times. The startup overhead and
> warm-up of HotSpot essentially mean that if you ran a second
> indexing right after the first one, it would be significantly
> faster, and not just due to caching effects.
> And consecutive runs would have run times that converge towards
> sustainable long-term performance -- in this case the second run
> may already be as fast as it will get, since it runs for a
> significant amount of time (I have noticed that a 30- or even
> 10-second warm-up is often sufficient). HotSpot only compiles
> Java bytecode when it determines a need, and figuring that out
> takes a while.
>
> So in this case, what would give more comparable results (assuming
> you are interested in measuring the likely server-side usage
> scenario, which is usually what Lucene is used for) would be to
> run all runs within the same JVM / execution (for Perl), and
> either take the fastest runs, or discard the first one and take
> the median or average.
>
> Would this be possible? I am not really concerned about "whose
> language is faster" here, but about the relevance of the results,
> using a methodology that gives realistic numbers for the usual
> use case. Chances are, the Perl-based version would also perform
> better (depending on how the Perl runtime optimizes things) if
> the tests were run under a single process.
>
> Anyway, the above is intended as constructive criticism, so once
> again thank you for doing these tests!
>
> -+ Tatu +-
>
> ps. Regarding memory usage: it is also quite tricky to measure
> reliably, since garbage collection only kicks in when it has to...
> so Java uses as much memory as it can (without expanding the
> heap)... plus, JVMs do not necessarily (or even usually) return
> unused chunks later on.
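The single-JVM methodology discussed above can be sketched roughly as follows. This is an illustrative harness, not code from either benchmark; `workload()` is a hypothetical stand-in for the real indexing job, and the run count mirrors the "6 runs, discard the first, average the rest" scheme from the original test:

```java
// Run the same task repeatedly in ONE JVM so HotSpot can warm up,
// discard the first (cold) iteration, and average the rest.
public class WarmupBenchmark {

    // Hypothetical stand-in workload; a real benchmark would index a
    // corpus here instead.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            sum += Integer.toString(i).hashCode();
        }
        return sum;
    }

    // Runs the task (iterations + 1) times and returns the average
    // wall-clock time in ms over all but the first, cold run.
    static double benchmark(int iterations) {
        long sink = 0;
        long totalNanos = 0;
        for (int i = 0; i <= iterations; i++) {
            long start = System.nanoTime();
            sink += workload();
            long elapsed = System.nanoTime() - start;
            if (i > 0) {                 // discard the warm-up run
                totalNanos += elapsed;
            }
        }
        if (sink == 42) System.out.println(sink); // keep the work observable
        return totalNanos / 1e6 / iterations;
    }

    public static void main(String[] args) {
        System.out.printf("avg over warmed-up runs: %.1f ms%n", benchmark(5));
    }
}
```

Run with, e.g., `java -XX:CompileThreshold=100 WarmupBenchmark` to pull compilation forward as Igor suggests; the first iteration will still typically be noticeably slower than the rest.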
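On the memory point in the ps: one common (still approximate) alternative to eyeballing `top` is to ask the JVM itself for its used heap around the workload. A minimal sketch, with the caveat Tatu raises built in -- `System.gc()` is only a hint, so the numbers are indicative, not exact:

```java
// In-process heap probe: request a GC, then read used heap before and
// after an allocation. Rough, but it measures live objects rather than
// the process footprint, which top's RPRVT conflates with unreturned heap.
public class HeapProbe {

    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint only; the JVM may ignore or defer it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeap();
        int[] data = new int[10_000_000]; // stand-in allocation (~40 MB)
        long after = usedHeap();
        // Referencing data afterwards keeps it live across the second GC.
        System.out.printf("array of %d ints retains roughly %d KB%n",
                data.length, (after - before) / 1024);
    }
}
```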