I wanted to add a note about the motivation for these changes. Essentially, we should be able to scale up better with Solr/Lucene, not just scale out by sharding. With sharding one enters distributed-systems territory, with all its pitfalls and failure conditions -- not ideal, especially if your index is not large enough to warrant it. The next frontier seems to be utilizing multiple CPU cores to perform a search request's legwork, and the results are very promising: 2-3x speedups in many cases!
Would be great to get a sense of committer interest :) Let me know if I should just open a JIRA with patches.

On Sat, Oct 19, 2013 at 6:18 PM, Shikhar Bhushan <sbhus...@etsy.com> wrote:

> I got inspired by Uwe Schindler's talk "Is your index reader really atomic or maybe slow?" <http://www.lucenerevolution.org/sites/default/files/Schindler%20-%20IsYourIndexReaderReallyAtomicOrMaybeSlow_0.pdf> to look into the state of parallelism when searching across segments. It seems there has been some effort to do this, but we never went all the way.
>
> IndexSearcher has an optional constructor arg for an ExecutorService, which gets used for searching in parallel on call paths where one of the TopDocCollectors is created internally. The per-segment search happens in parallel, and then the TopDocs/TopFieldDocs results are merged, with locking around the merge step.
>
> What's the upside of performing parallel search? I benchmarked using luceneutil on WIKI_MEDIUM_10M: trunk vs. trunk with the IndexSearcher constructor hacked to initialize a ForkJoinPool <https://gist.github.com/anonymous/7048089>.
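As an aside, the current fan-out that IndexSearcher performs when handed an executor can be sketched with plain JDK types -- this is a minimal illustrative sketch, not Lucene's actual code: the `int[]` "segments" and list merge stand in for real segment readers and TopDocs merging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class PerSegmentSearch {
    // Stand-in for the merged results; Lucene merges real TopDocs instances.
    static final List<Integer> merged = new ArrayList<>();
    static final Object mergeLock = new Object();

    public static void main(String[] args) throws Exception {
        // Each int[] stands in for one index segment's matching doc ids.
        int[][] segments = { {1, 2, 3}, {4, 5}, {6, 7, 8, 9} };
        ExecutorService executor = new ForkJoinPool();
        List<Future<?>> futures = new ArrayList<>();
        for (int[] segment : segments) {
            futures.add(executor.submit(() -> {
                // "Search" this segment in parallel with the others...
                List<Integer> hits = new ArrayList<>();
                for (int doc : segment) hits.add(doc);
                // ...then fold results in under a lock, mirroring the locked
                // merge step described above.
                synchronized (mergeLock) {
                    merged.addAll(hits);
                }
            }));
        }
        for (Future<?> f : futures) f.get();
        executor.shutdown();
        System.out.println(merged.size()); // 9 hits across all segments
    }
}
```

ForkJoinPool implements ExecutorService, which is why it can be dropped into the IndexSearcher constructor slot in the benchmarked hack.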
> Report after iter 19:
>
>              Task    QPS baseline  StdDev   QPS patched  StdDev   Pct diff
>           Respell           63.05  (3.0%)         46.37  (2.9%)     -26.5% ( -31% -  -21%)
>          PKLookup          255.80  (2.6%)        194.11  (7.4%)     -24.1% ( -33% -  -14%)
>            Fuzzy2           54.66  (3.8%)         42.28  (3.5%)     -22.7% ( -28% -  -16%)
>            Fuzzy1           75.35  (2.9%)         59.23  (4.2%)     -21.4% ( -27% -  -14%)
>   MedSloppyPhrase          141.74  (3.6%)        124.56  (9.7%)     -12.1% ( -24% -    1%)
>        AndHighLow          922.70  (3.1%)        908.18 (10.0%)      -1.6% ( -14% -   11%)
>   LowSloppyPhrase           88.50  (4.2%)        106.64 (13.0%)      20.5% (   3% -   39%)
>           LowTerm          694.97  (2.7%)        949.30 (21.6%)      36.6% (  11% -   62%)
>       MedSpanNear          102.47  (3.0%)        160.94 (12.7%)      57.1% (  40% -   75%)
>      OrNotHighLow           98.58  (4.9%)        155.22 (18.1%)      57.5% (  32% -   84%)
>        AndHighMed          260.31  (1.7%)        448.19 (23.5%)      72.2% (  46% -   99%)
>         OrHighLow          101.68  (6.7%)        179.91 (27.8%)      76.9% (  39% -  119%)
>           MedTerm          200.49  (3.1%)        365.70 (32.8%)      82.4% (  45% -  122%)
>          HighTerm          145.36  (3.3%)        268.51 (27.7%)      84.7% (  51% -  119%)
>           Prefix3           81.59  (2.8%)        153.52 (24.8%)      88.2% (  58% -  119%)
>      OrHighNotLow           50.67  (6.9%)         95.91 (22.5%)      89.3% (  56% -  127%)
>         OrHighMed           93.37  (6.3%)        182.37 (24.3%)      95.3% (  60% -  134%)
>      OrHighNotMed           78.19  (6.7%)        153.30 (30.7%)      96.1% (  54% -  143%)
>        OrHighHigh           46.59  (7.0%)         92.33 (31.7%)      98.2% (  55% -  147%)
>     OrHighNotHigh           51.01  (5.7%)        105.32 (25.1%)     106.5% (  71% -  145%)
>          Wildcard           78.53  (3.7%)        168.42 (40.8%)     114.5% (  67% -  164%)
>     OrNotHighHigh           40.42  (5.5%)         93.45 (29.5%)     131.2% (  91% -  175%)
>      OrNotHighMed           24.40  (4.8%)         57.00 (22.0%)     133.6% ( 102% -  168%)
>      HighSpanNear            4.13  (3.6%)         10.09 (10.5%)     144.3% ( 125% -  164%)
>       LowSpanNear           14.27  (2.3%)         35.24 (17.0%)     147.0% ( 124% -  170%)
>            IntNRQ            8.73  (4.8%)         21.69 (23.6%)     148.6% ( 114% -  185%)
>         LowPhrase           29.11  (2.9%)         72.87 (22.4%)     150.3% ( 121% -  180%)
>         MedPhrase           33.19  (3.4%)         83.38 (40.4%)     151.3% ( 103% -  201%)
>  HighSloppyPhrase            6.46  (5.8%)         16.37 (21.9%)     153.5% ( 119% -  192%)
>        HighPhrase           10.00  (6.8%)         25.66 (14.4%)     156.7% ( 126% -  190%)
>       AndHighHigh           44.82  (1.1%)        115.77 (39.6%)     158.3% ( 116% -  201%)
>
> That looks pretty awesome! Any feedback from existing users?
>
> *Problems*
>
> - (relatively minor) Solr can't take advantage of this yet. SolrIndexSearcher disregards the possibility of propagating an ExecutorService up to the IndexSearcher constructor.
> - Even if it did, as far as I can tell Solr does not use the IndexSearcher search() variants that create TopDocCollectors internally and parallelize in the presence of the executor.
> - If arbitrary Collector args come into play, we don't/can't parallelize. Note that even if results are ultimately going to a TopDocCollector, it may be wrapped inside e.g. an EarlyTerminatingCollector or a TimeLimitingCollector, or both.
> - The special-casing with parallelism layered on top, as IndexSearcher does with TopDocCollectors, does not scale: there are lots of Collectors that could potentially lend themselves to parallelism.
>
> *Idea*
>
> I have been exploring a refactoring of Collectors that allows for parallelization at the level of the collection protocol. Some of the design decisions:
>
> - an easy migration path for collectors that want to remain serial
> - the parallelization should be composable (when Collectors wrap other Collectors)
> - allow collectors to pick the optimal strategy (e.g. there might be memory tradeoffs to be made) by advising the collector about whether a search will be parallelized
> - encourage use of lock-free parallelism by providing a segment-level done() method that is guaranteed to execute in a single-threaded manner (so if you accumulate some state at the segment level, it can be safely merged into the composite state without locking)
>
> The code is currently in a fork on github <https://github.com/shikhar/lucene-solr/compare/apache:trunk...trunk>.
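To make the design decisions above concrete, here is a rough sketch of the protocol as described; beyond subCollector() and done(), the names are my guesses from the prose (e.g. setParallelized() for the advisory hook), and Lucene's per-leaf types are replaced with plain ints so the sketch stands alone.

```java
public class ProtocolSketch {

    // Sketch of the proposed collection protocol: the Collector hands out one
    // SubCollector per segment, and done() is guaranteed to run in a
    // single-threaded manner, so per-segment state can be merged into the
    // composite state without locks.
    interface SubCollector {
        void collect(int doc); // may run concurrently with other segments' collects
        void done();           // single-threaded by contract: the safe merge point
    }

    interface Collector {
        SubCollector subCollector(int segmentOrdinal); // stand-in for a per-leaf context
        void setParallelized(); // hypothetical advisory hook: search will be parallel
    }

    // A hit-counting collector in this style: each SubCollector keeps a local
    // count and folds it into the shared total only inside done().
    static class HitCountCollector implements Collector {
        int total; // guarded by the single-threaded done() contract, not a lock
        public void setParallelized() { /* no memory tradeoff to make for counting */ }
        public SubCollector subCollector(int segmentOrdinal) {
            return new SubCollector() {
                int local; // single-writer, per-segment state
                public void collect(int doc) { local++; }
                public void done() { total += local; }
            };
        }
    }

    public static void main(String[] args) {
        HitCountCollector c = new HitCountCollector();
        c.setParallelized();
        int[][] segments = { {1, 2, 3}, {4, 5} }; // stand-in "matching doc ids"
        for (int ord = 0; ord < segments.length; ord++) {
            SubCollector sub = c.subCollector(ord);
            for (int doc : segments[ord]) sub.collect(doc);
            sub.done(); // the driver calls done() serially, honoring the contract
        }
        System.out.println(c.total); // prints 5
    }
}
```

A serial collector under this protocol is just one that returns the same SubCollector for every segment, which is what makes the migration path easy.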
> - Collector goes from being an abstract class to this interface <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/Collector.java>. A new interface SubCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/SubCollector.java> is introduced.
> - Existing subclasses of Collector can choose to either implement the new interface, or subclass a SerialCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/SerialCollector.java> abstract class, which maintains essentially the same API as before.
> - There is javadoc on the new interfaces, but the new collection protocol is probably clearer in connection with its usage from IndexSearcher <https://github.com/shikhar/lucene-solr/blob/b3098fc/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L571-L636>.
> - I tried to convert at least all of the collectors that wrap other collectors over to the new interface, rather than simply extending SerialCollector (also introduced a new WrappingCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/WrappingCollector.java> base class), so that the wrappers themselves don't block a search from being parallelizable. There are probably a few cases I have missed.
> - An example of a collector that is very easily parallelizable in the divide-and-conquer style: TotalHitCountCollector <https://github.com/shikhar/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollector.java>
> - An update of TopScoreDocCollector into a parallelizable Collector <https://github.com/shikhar/lucene-solr/commit/b3098fcfdeb302481e93aab93eb77d0cabc72f9b?w=1>, and a corresponding removal of the special-cased parallelism inside IndexSearcher.
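The divide-and-conquer shape mentioned for TotalHitCountCollector can be illustrated with a ForkJoinPool RecursiveTask over stand-in segments -- again a self-contained sketch rather than the fork's actual code: leaf tasks count hits in one segment each, and parents merge by simple addition, with no shared mutable state at all.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class DivideAndConquerCount {
    // Leaf tasks count hits within one segment; parent tasks sum the halves.
    static class CountTask extends RecursiveTask<Long> {
        final int[][] segments;
        final int from, to; // half-open range of segment ordinals

        CountTask(int[][] segments, int from, int to) {
            this.segments = segments;
            this.from = from;
            this.to = to;
        }

        protected Long compute() {
            if (to - from == 1) {
                long count = 0;
                for (int doc : segments[from]) count++; // "collect" each hit
                return count;
            }
            int mid = (from + to) >>> 1;
            CountTask left = new CountTask(segments, from, mid);
            CountTask right = new CountTask(segments, mid, to);
            left.fork();                          // run the left half in parallel
            return right.compute() + left.join(); // merge: plain addition
        }
    }

    public static void main(String[] args) {
        int[][] segments = { {1, 2}, {3, 4, 5}, {6}, {7, 8, 9, 10} };
        long total = new ForkJoinPool()
                .invoke(new CountTask(segments, 0, segments.length));
        System.out.println(total); // prints 10
    }
}
```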
> Benchmark results <https://gist.github.com/shikhar/7062026>
>
> *Next steps*
>
> I would love to get your feedback on the idea and implementation, and on how this could be incorporated into trunk. Code review would be much appreciated.
>
> Some follow-up work that needs to be done (help needed!):
>
> - Making TopFieldCollector a parallelizable Collector, as was done for TopScoreDocCollector, and removing the special-casing inside IndexSearcher.
> - Solr support -- perhaps making the Executor used by SolrIndexSearcher configurable from solrconfig.xml.
> - Making more collectors parallelizable -- lots of opportunities here. I guess this will be an ongoing process.
>
> Looking forward to your thoughts!
>
> Thanks,
>
> Shikhar Bhushan