RE: Lucene Speed under diff JVMs
One more bit of info that I should have included: The randomly generated documents consisted of 2 fields, one Text with 3 words, and one UnStored with 500 words. Average word length was 7 characters. If Otis (he wrote it, I just made a tweak or two) doesn't mind, I'll post the source code. Dan -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail: mailto:[EMAIL PROTECTED]
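For anyone who wants documents of the same shape before the source is posted, here is a minimal sketch in plain Java. It is not Otis's class and it does not touch Lucene at all: it only builds the two strings that would go into the 3-word Text field and the 500-word UnStored field. The names RandomDocSketch, randomWord and randomText, the lowercase alphabet, the fixed seed, and the choice of a uniform word length between 4 and 10 characters (which averages 7) are all assumptions made for illustration.

```java
import java.util.Random;

/** Sketch only: builds plain strings; no Lucene classes are involved. */
class RandomDocSketch {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    /** One random lowercase word; length is uniform in [4, 10], so the mean is 7 (assumed distribution). */
    static String randomWord(Random rng) {
        int length = 4 + rng.nextInt(7); // nextInt(7) yields 0..6
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /** Space-separated text containing exactly wordCount random words. */
    static String randomText(Random rng, int wordCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wordCount; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(randomWord(rng));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so runs are repeatable
        String shortField = randomText(rng, 3);  // stands in for the 3-word Text field
        String longField = randomText(rng, 500); // stands in for the 500-word UnStored field
        System.out.println("short field: " + shortField);
        System.out.println("long field length in chars: " + longField.length());
    }
}
```

The two strings would then be handed to whatever indexing code is under test; that step is deliberately left out because it depends on the Lucene version in use.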
RE: Lucene Speed under diff JVMs
Otis doesn't mind.

--- Armbrust, Daniel C. [EMAIL PROTECTED] wrote:
> One more bit of info that I should have included: The randomly generated documents consisted of 2 fields, one Text with 3 words, and one UnStored with 500 words. Average word length was 7 characters. If Otis (he wrote it, I just made a tweak or two) doesn't mind, I'll post the source code.
>
> Dan
RE: Lucene Speed under diff JVMs
It doesn't surprise me that the IBM JDK is faster at indexing. In my experience, this JVM is better optimized for that case. I did some serious load testing with various JVM implementations from Sun and IBM and found the opposite when it came to searching: Lucene searches were fastest under Sun 1.4.1. That JVM was consequently able to handle a higher load (faster responses mean more queries/second). IBM was drastically slower at handling queries. I've never tried JRockit since I don't like the cost.

The index for my tests had 7 million records and 6 major fields. Queries were randomly chosen from a list of 2 million real user queries. The query load was meant to simulate real loads from a production site. This was all accomplished on a single 1U, 2-processor box with 1 GB of RAM running Red Hat Linux 7.2. Query times were very good compared to previous indexing methods.

Jonathan

-----Original Message-----
From: Armbrust, Daniel C. [mailto:[EMAIL PROTECTED]]
Sent: Thursday, December 05, 2002 2:47 PM
To: 'Lucene Users List'
Subject: Lucene Speed under diff JVMs

[snip quoted message]
Lucene Speed under diff JVMs
This may be of use to people who want to make Lucene index faster. Also, I'm curious as to what JVM most people run Lucene under, and whether anyone else has seen results like this:

I'm using the class that Otis wrote (see message from about 3 weeks ago) for testing the scalability of Lucene (more results on that later), and I first tried running it under different versions of Java to see where it runs the fastest. The class simply creates an index out of randomly generated documents.

All of the following were run on a dual-CPU 1 GHz PIII Windows 2000 machine that wasn't doing much else during the benchmark. The indexing program was single-threaded, so it used only one of the machine's processors.

java version 1.3.1_04
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_04-b02)
Java HotSpot(TM) Client VM (build 1.3.1_04-b02, mixed mode)
42 seconds/1000 documents

java version 1.4.1
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-b21)
Java HotSpot(TM) Client VM (build 1.4.1-b21, mixed mode)
42 seconds/1000 documents

Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01)
BEA WebLogic JRockit(R) Virtual Machine (build 8.0_Beta-1.4.1_01-win32-CROSIS-20021105-1617, Native Threads, Generational Concurrent Garbage Collector)
35 seconds/1000 documents

java version 1.3.1
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1)
Classic VM (build 1.3.1, J2RE 1.3.1 IBM Windows 32 build cn131-20020403 (JIT enabled: jitc))
27 seconds/1000 documents

As you can see, the IBM JVM pretty much smoked Sun's, and beat out JRockit as well. Just a hunch, but it wouldn't surprise me if search times were also faster under the IBM JDK. Has anyone else come to this conclusion?

Dan
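To put the four timings on a common footing, the small sketch below converts "seconds per 1000 documents" into documents per second and into a speedup relative to the Sun 1.3.1_04 run. The figures 42, 42, 35 and 27 are taken from the message; the class name JvmThroughput, the method names and the shortened JVM labels are made up for illustration. Working it by hand, the runs come to roughly 24, 24, 29 and 37 documents per second, which makes the IBM run about 1.56 times as fast as the Sun runs.

```java
/** Sketch only: restates the posted timings as throughput and relative speedup. */
class JvmThroughput {

    /** Documents indexed per second, given the seconds taken for a batch of docs. */
    static double docsPerSecond(int docs, double seconds) {
        return docs / seconds;
    }

    /** How many times faster a run is than a baseline run over the same batch. */
    static double speedup(double baselineSeconds, double seconds) {
        return baselineSeconds / seconds;
    }

    public static void main(String[] args) {
        // Labels are shortened by hand; timings are the ones quoted in the message.
        String[] labels = {
            "Sun HotSpot 1.3.1_04", "Sun HotSpot 1.4.1", "BEA JRockit 8.0 beta", "IBM Classic VM 1.3.1"
        };
        double[] secondsPer1000 = {42, 42, 35, 27};
        double baseline = secondsPer1000[0];

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-22s %6.1f docs/s  %.2fx vs %s%n",
                    labels[i],
                    docsPerSecond(1000, secondsPer1000[i]),
                    speedup(baseline, secondsPer1000[i]),
                    labels[0]);
        }
    }
}
```

These are single measurements, so the ratios describe only the runs that were posted and say nothing about how much they would vary from run to run.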
Re: Lucene Speed under diff JVMs
On Thu, 5 Dec 2002, Armbrust, Daniel C. wrote:

> I'm using the class that Otis wrote (see message from about 3 weeks ago) for testing the scalability of lucene (more results on that later) and I

May I ask where one can get the source code? I cannot find it in the archive.

Thank you
-g-
Re: Lucene Speed under diff JVMs
On Thu, 5 Dec 2002, Armbrust, Daniel C. wrote:

> I'm using the class that Otis wrote (see message from about 3 weeks ago) for testing the scalability of lucene (more results on that later) and I first tried running it under different versions of Java, to see where it runs the fastest. The class simply creates an index out of randomly generated documents. All of the following were running on a dual CPU 1 GHz PIII Windows 2000 machine that wasn't doing much else during the benchmark. The indexing program was single threaded, so it only used one of the processors of the machine.

[snip specific measurements]

> As you can see, the IBM jvm pretty much smoked Sun's. And beat out JRockit as well. Just a hunch, but it wouldn't surprise me if search times were also faster under the IBM jdk. Has anyone else come to this conclusion?

Just a brief note on performance measurements and statistical sampling: no offense, but if these are measurements of a single trial of 1000 documents for each JVM, they're not so different that I'd be willing to conclude that one JVM is notably faster for this task than another. The problem is compounded by the fact that it can be hard to tell just how much CPU is being taken up by OS tasks (and this can fluctuate quite a lot).

If you really want to quote statistics like this, using 5 or 10 trials would give a more accurate notion of the real performance differences (if any).

Casuistically :),

Joshua O'Madadhain

[EMAIL PROTECTED]  Per Obscurius  www.ics.uci.edu/~jmadden
Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for. -- Bill Watterson
My opinions are too rational and insightful to be those of any organization.
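The repeated-trials suggestion can be sketched in a few lines of plain Java: time the same task several times, then report the mean and the sample standard deviation so that differences between JVMs can be compared against run-to-run noise. This is only an illustration under stated assumptions. The class name TrialStats, its method names and the placeholder workload in main are hypothetical; in a real comparison the Runnable would wrap the indexing run, and the same program would be launched once under each JVM.

```java
/** Sketch only: times a task over several trials and summarizes the spread. */
class TrialStats {

    /** Arithmetic mean of the samples. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 in the denominator); needs at least 2 samples. */
    static double sampleStdDev(double[] xs) {
        if (xs.length < 2) {
            throw new IllegalArgumentException("need at least 2 trials");
        }
        double m = mean(xs);
        double sumSquares = 0.0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return Math.sqrt(sumSquares / (xs.length - 1));
    }

    /** Runs the task the given number of times and returns each wall-clock time in seconds. */
    static double[] timeTrials(Runnable task, int trials) {
        double[] seconds = new double[trials];
        for (int i = 0; i < trials; i++) {
            long start = System.nanoTime();
            task.run();
            seconds[i] = (System.nanoTime() - start) / 1e9;
        }
        return seconds;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload; replace with the indexing run being measured.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable; keeps the loop from being discarded");
            }
        };

        double[] seconds = timeTrials(placeholder, 10);
        System.out.printf("trials=%d mean=%.4fs stddev=%.4fs%n",
                seconds.length, mean(seconds), sampleStdDev(seconds));
    }
}
```

If the gap between two JVMs' means is small compared with the standard deviations, the single-trial numbers do not support calling one of them faster.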