I've been testing 2.9 RC2 lately and comparing query performance to 2.3.2. I'm seeing a huge increase in throughput (2x-10x) on an index that was built with 2.3.2. The queries contain many BoostingTermQuery clauses and boolean clauses with a custom scorer. Using JProfiler, I see that the improvement comes from a large reduction in the number of calls to TermDocs.next and TermDocs.skipTo (about 65% fewer calls). Since the custom scorer is somewhat expensive and appears frequently in these queries, the improvement is quite noticeable.
Have there been a lot of optimizations in the IndexSearcher/skipTo/next logic since 2.3.2? Should I expect to see even more query improvement by building the index with 2.9? (pinch me!)

Good stuff!

Thanks,
Peter

On Fri, Aug 28, 2009 at 3:02 PM, Mark Miller <[email protected]> wrote:

> Hello Lucene users,
>
> On behalf of the Lucene dev community (a growing community far larger
> than just the committers) I would like to announce the second release
> candidate for Lucene 2.9.
>
> Please download and check it out – take it for a spin and kick the
> tires. If all goes well, we hope to release the final version of
> Lucene 2.9 in a little over a week.
>
> The following changes have been applied since the first release candidate:
> LUCENE-1867: replace collation/lib/icu4j.jar with a smaller icu jar
> LUCENE-1868: add Arabic stemmer to notice.txt
> LUCENE-1870: fix contrib dist - missing analyzers/db binaries - extra
> miscellaneous folder with misc readme in it
> LUCENE-1869: include 'file exists?' when we throw RuntimeException
> from fdx or tvx size mismatches during flush or merge
> LUCENE-1871: Add option to avoid needlessly wrapping TokenStream with
> CachingTokenFilter in Highlighter
>
> While we generally try and maintain full backwards compatibility
> between major versions, Lucene 2.9 has a variety of breaks that are
> spelled out in the 'Changes in backwards compatibility policy' section
> of CHANGES.txt.
>
> We recommend that you recompile your application with Lucene 2.9
> rather than attempting to “drop” it in. This will alert you to any
> issues you may have to fix if you are affected by one of the backward
> compatibility breaks. As always, it's a really good idea to thoroughly
> read CHANGES.txt before upgrading. Also, remember that this is a
> release candidate, and not the final Lucene 2.9 release.
>
> Lucene 2.9 comes with a bevy of new features, including:
>
> Per segment searching and caching (can lead to much faster reopen
> among other things)
>
> Near real-time search capabilities added to IndexWriter
>
> New queries, including NumericRangeQuery and NumericRangeFilter –
> fast, highly scalable alternatives to RangeQuery/RangeFilter for
> numeric searches.
>
> Smarter, more scalable multi-term queries (wildcard, range, etc)
>
> A freshly optimized Collector/Scorer API
>
> Improved Unicode support
>
> A new Attribute based TokenStream API
>
> A new QueryParser framework in contrib with a core QueryParser
> replacement impl included.
>
> ….
>
> And many, many more features, bug fixes, optimizations, and various
> improvements. You can find the full list of changes here:
>
> HTML version:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/Changes.html
>
> Text version:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/CHANGES.txt
>
> Many changes have also occurred in Lucene's contrib area:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/contrib/CHANGES.txt
>
> Download release candidate 2 here:
> http://people.apache.org/~markrmiller/staging-area/lucene2.9rc2/
>
> Be sure to report back with any issues you find!
>
> Thanks,
>
> Mark Miller
