This is great and valuable information, Toke(n)! Just the other day we recommended this multi-IndexSearcher to somebody concerned with low QPS rates their benchmarks revealed. They were hitting their index with a good number of threads and hitting synchronized blocks in Lucene. Multiple searchers is one way around that. Also, your sweet spot of 3 makes sense - keeps all of your cores fully busy.
You are our main SSD info supplier -- keep it coming! :) And let us know what numbers you get for 2.2 and 2.3, please. Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ----- Original Message ---- From: Toke Eskildsen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, January 17, 2008 5:31:56 AM Subject: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?) On Fri, 2008-01-11 at 11:34 +0100, Toke Eskildsen wrote: > As for shared searcher vs. individual searchers, there was just a > slight penalty for using individual searchers. Whoops! Seems like I need better QA for my test-code. I didn't use individual searchers for each thread when I thought I was. The slight penalty wrongly observed must have been due to measurement variations. With the corrected test, some interesting observations about our index can be made, which will definitely affect our configuration. In the following, the queries/second is an average over 350.000 queries. For each query, a search is performed and the content of a specific field is extracted for the first 20 hits. == System-summary == Dual-core Intel Xeon 5148 2.3 GHz, 8 GB RAM, Linux, Lucene 2.1, 37 GB/10 million documents index, queries taken from production system logs. == Conventional harddisks (2 * 15000 RPM in software RAID 1) == 1 thread, 1 searcher: 109 queries/sec 2 threads, 1 searcher: 118 queries/sec 2 threads, 2 searchers: 157 queries/sec 3 threads, 1 searcher: 111 queries/sec 3 threads, 3 searchers: 177 queries/sec 4 threads, 1 searcher: 108 queries/sec 4 threads, 4 searchers: 169 queries/sec == Solid State Drives (2 * 32 GB Samsung in software RAID 0) == 1 thread, 1 searcher: 193 queries/sec 2 threads, 1 searcher: 295 queries/sec 2 threads, 2 searchers: 357 queries/sec 3 threads, 1 searcher: 197 queries/sec 3 threads, 3 searchers: 369 queries/sec 4 threads, 1 searcher: 192 queries/sec 4 threads, 4 searchers: 302 queries/sec Graphs can be viewed at http://wiki.statsbiblioteket.dk/summa/Hardware For our setup it seems that the usual avoid-multiple-searchers advice is not valid, neither for conventional harddisks, nor Solid State Drives. The optimal configuration for our dual-core test machine is three threads with individual searchers. The obvious question is whether this can be extended to other cases. > As for threading, I noticed something strange: On the dual-core > machine, two threads gave better performance than one, while 4 threads > gave the same performance as one. As can be seen above, this strange picture is consistent. 1, 3 and 4 threads with shared searcher performs the same, independent of which storage the machine uses, while 2 threads performs markedly better. I've started the same test-suite for Lucene 2.2 and 2.3RC2. It should be finished in a day or two. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]