Not me, but it looks useful and something I could actually use (exactly to look at synchronization bottlenecks in situations where many threads are sharing a single IndexSearcher). Unfortunately, it looks like it works only with IBM's JVM: "Any platform running an IBM®-supplied Java™ SDK or JRE, Version 5.0 or above." :(
Otis

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Mark Miller <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, January 20, 2008 9:46:25 AM
Subject: Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

Anyone tried using this on Lucene yet? http://www.alphaworks.ibm.com/tech/jla

Michael McCandless wrote:
>
> These results are very interesting. With 3 threads on SSD your
> searches run 87% faster if you use 3 IndexSearchers instead of sharing
> a single one.
>
> This means, for your test, there are some crazy synchronization
> bottlenecks when searching, which I think we should ferret out and fix.
>
> Have you done any profiling to understand where the threads are
> waiting when you share one IndexSearcher? EG YourKit can tell you
> where the threads are waiting...
>
> I know there is synchronization used when reading bytes from the
> underlying file descriptor. We've investigated options to remove that
> (https://issues.apache.org/jira/browse/LUCENE-753) but those options
> seemed to hurt single threaded performance. I wonder if the patch on
> that issue closes some of this 87% performance loss?
>
> Does anyone know of other synchronization bottlenecks in searching?
>
> Mike
>
> Otis Gospodnetic wrote:
>
>> This is great and valuable information, Toke(n)!
>> Just the other day we recommended this multi-IndexSearcher to
>> somebody concerned with low QPS rates their benchmarks revealed.
>> They were hitting their index with a good number of threads and
>> hitting synchronized blocks in Lucene. Multiple searchers is one way
>> around that. Also, your sweet spot of 3 makes sense - keeps all of
>> your cores fully busy.
>>
>> You are our main SSD info supplier -- keep it coming! :) And let us
>> know what numbers you get for 2.2 and 2.3, please.
>>
>> Thanks,
>> Otis
>>
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>> ----- Original Message ----
>> From: Toke Eskildsen <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Thursday, January 17, 2008 5:31:56 AM
>> Subject: Multiple searchers (Was: CachingWrapperFilter: why cache per
>> IndexReader?)
>>
>> On Fri, 2008-01-11 at 11:34 +0100, Toke Eskildsen wrote:
>>> As for shared searcher vs. individual searchers, there was just a
>>> slight penalty for using individual searchers.
>>
>> Whoops! Seems like I need better QA for my test-code. I didn't use
>> individual searchers for each thread when I thought I was. The slight
>> penalty wrongly observed must have been due to measurement variations.
>>
>> With the corrected test, some interesting observations about our index
>> can be made, which will definitely affect our configuration. In the
>> following, the queries/second is an average over 350.000 queries.
>> For each query, a search is performed and the content of a specific
>> field is extracted for the first 20 hits.
>>
>> == System-summary ==
>> Dual-core Intel Xeon 5148 2.3 GHz, 8 GB RAM, Linux, Lucene 2.1,
>> 37 GB/10 million documents index, queries taken from production
>> system logs.
>>
>> == Conventional harddisks (2 * 15000 RPM in software RAID 1) ==
>> 1 thread, 1 searcher: 109 queries/sec
>> 2 threads, 1 searcher: 118 queries/sec
>> 2 threads, 2 searchers: 157 queries/sec
>> 3 threads, 1 searcher: 111 queries/sec
>> 3 threads, 3 searchers: 177 queries/sec
>> 4 threads, 1 searcher: 108 queries/sec
>> 4 threads, 4 searchers: 169 queries/sec
>>
>> == Solid State Drives (2 * 32 GB Samsung in software RAID 0) ==
>> 1 thread, 1 searcher: 193 queries/sec
>> 2 threads, 1 searcher: 295 queries/sec
>> 2 threads, 2 searchers: 357 queries/sec
>> 3 threads, 1 searcher: 197 queries/sec
>> 3 threads, 3 searchers: 369 queries/sec
>> 4 threads, 1 searcher: 192 queries/sec
>> 4 threads, 4 searchers: 302 queries/sec
>>
>> Graphs can be viewed at http://wiki.statsbiblioteket.dk/summa/Hardware
>>
>> For our setup it seems that the usual avoid-multiple-searchers advice
>> is not valid, neither for conventional harddisks, nor Solid State
>> Drives. The optimal configuration for our dual-core test machine is
>> three threads with individual searchers. The obvious question is
>> whether this can be extended to other cases.
>>
>>> As for threading, I noticed something strange: On the dual-core
>>> machine, two threads gave better performance than one, while 4
>>> threads gave the same performance as one.
>>
>> As can be seen above, this strange picture is consistent. 1, 3 and 4
>> threads with shared searcher performs the same, independent of which
>> storage the machine uses, while 2 threads performs markedly better.
>>
>> I've started the same test-suite for Lucene 2.2 and 2.3RC2. It should
>> be finished in a day or two.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
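
For anyone who wants to check the "87% faster" figure against Toke's numbers, here is a minimal sketch of the arithmetic. It only recomputes the relative gain of per-thread searchers over a shared searcher from the queries/sec values quoted in this thread. The class and method names (SearcherSpeedup, gainPercent) are made up for illustration and are not part of Lucene.

```java
// Illustrative only: SearcherSpeedup and gainPercent are hypothetical names,
// not Lucene API. The figures are the queries/sec values quoted in the thread.
class SearcherSpeedup {

    /** Percentage gain in queries/sec of per-thread searchers over one shared searcher. */
    static double gainPercent(double sharedQps, double perThreadQps) {
        return (perThreadQps / sharedQps - 1.0) * 100.0;
    }

    /** Prints the gain for each row of {threads, shared q/s, per-thread q/s}. */
    static void print(String label, int[][] rows) {
        System.out.println(label);
        for (int[] r : rows) {
            System.out.printf("  %d threads: %+.1f%%%n", r[0], gainPercent(r[1], r[2]));
        }
    }

    public static void main(String[] args) {
        // Each row: {threads, q/s with 1 shared searcher, q/s with one searcher per thread}.
        int[][] hdd = {{2, 118, 157}, {3, 111, 177}, {4, 108, 169}};
        int[][] ssd = {{2, 295, 357}, {3, 197, 369}, {4, 192, 302}};
        print("Conventional harddisks", hdd);
        print("Solid State Drives", ssd);
    }
}
```

The SSD row for 3 threads (197 vs. 369 queries/sec) is the one behind the 87% figure. The other rows give the same comparison for the remaining thread counts, so the gain can be read per configuration rather than from a single headline number.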