[jira] [Comment Edited] (LUCENE-5299) Refactor Collector API for parallelism

Shikhar Bhushan (JIRA) Mon, 27 Oct 2014 14:02:01 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185751#comment-14185751
 ]


Shikhar Bhushan edited comment on LUCENE-5299 at 10/27/14 9:01 PM:
-------------------------------------------------------------------

Just an update that the code rebased against recent trunk lives at 
https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various 
tweaks, like being able to throttle per-request parallelism in 
{{ParallelSearchStrategy}}.
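
To illustrate what throttling per-request parallelism on a shared pool can look like, here is a minimal sketch. It is not the code from the branch: the class {{ThrottledSearchSketch}} and the method {{runThrottled}} are hypothetical names, and the approach (a bounded number of workers draining a shared task index) is just one way to cap how many of a request's per-segment tasks run at once without blocking on locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration only; not the ParallelSearchStrategy from the branch.
final class ThrottledSearchSketch {

    /**
     * Runs all tasks on the shared pool, but with at most maxParallelism of them
     * in flight at once for this request. Results keep the order of the tasks.
     */
    static <T> List<T> runThrottled(ForkJoinPool pool, int maxParallelism,
                                    List<Callable<T>> tasks) throws Exception {
        final int n = tasks.size();
        final AtomicReferenceArray<T> results = new AtomicReferenceArray<>(n);
        final AtomicInteger next = new AtomicInteger();

        // Each worker repeatedly claims the next unclaimed task index.
        Callable<Void> worker = () -> {
            int i;
            while ((i = next.getAndIncrement()) < n) {
                results.set(i, tasks.get(i).call());
            }
            return null;
        };

        // Submitting only min(maxParallelism, n) workers is what bounds the parallelism.
        int workers = Math.min(maxParallelism, n);
        List<Future<Void>> futures = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            futures.add(pool.submit(worker));
        }
        for (Future<Void> f : futures) {
            f.get(); // propagates any task failure
        }

        List<T> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(results.get(i));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(4);
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final int v = i;
            tasks.add(() -> v * v); // stand-in for a per-segment search
        }
        System.out.println(runThrottled(pool, 2, tasks));
        pool.shutdown();
    }
}
```

With this shape, a large pool can be shared across many concurrent requests while any single request occupies at most the configured number of pool threads.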

luceneutil benchmark numbers for the branch above plus a hacked IndexSearcher 
constructor that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}}, 
compared against trunk on a 32-core (with HT) Sandy Bridge server, with source 
{{wikimedium500k}}:

SEARCH_NUM_THREADS = 16
{noformat}
Report after iter 19:
                    Task QPS baseline      StdDev  QPS parcol      StdDev        Pct diff
                  Fuzzy1       81.91     (43.2%)       52.96     (39.7%)  -35.3% ( -82% -   83%)
                 LowTerm     2550.11     (11.9%)     1927.28      (5.6%)  -24.4% ( -37% -   -7%)
                 Respell       43.02     (39.4%)       35.23     (31.5%)  -18.1% ( -63% -   87%)
                  Fuzzy2       19.32     (25.1%)       16.40     (34.8%)  -15.1% ( -59% -   59%)
                 MedTerm     1679.37     (12.2%)     1743.27      (8.6%)    3.8% ( -15% -   28%)
                PKLookup      221.58      (8.3%)      257.36     (13.2%)   16.1% (  -4% -   41%)
              AndHighLow     1027.99     (11.6%)     1278.39     (15.9%)   24.4% (  -2% -   58%)
              AndHighMed      741.50     (10.0%)     1198.04     (27.5%)   61.6% (  21% -  110%)
               MedPhrase      709.04     (11.6%)     1203.02     (24.3%)   69.7% (  30% -  119%)
             LowSpanNear      601.13     (16.9%)     1127.30     (16.7%)   87.5% (  46% -  145%)
         LowSloppyPhrase      554.87     (10.8%)     1130.25     (30.5%)  103.7% (  56% -  162%)
               OrHighMed      408.55     (10.4%)      977.56     (20.1%)  139.3% (  98% -  189%)
               LowPhrase      364.36     (10.8%)      893.27     (41.0%)  145.2% (  84% -  220%)
               OrHighLow      355.78     (12.7%)      893.63     (19.6%)  151.2% ( 105% -  210%)
             AndHighHigh      390.73     (10.3%)     1004.70     (24.3%)  157.1% ( 111% -  213%)
                HighTerm      399.01     (11.8%)     1067.67     (12.1%)  167.6% ( 128% -  217%)
                Wildcard      754.76     (11.6%)     2067.96     (28.0%)  174.0% ( 120% -  241%)
            HighSpanNear      153.57     (14.8%)      463.54     (24.3%)  201.8% ( 141% -  282%)
              OrHighHigh      212.16     (12.4%)      665.56     (28.2%)  213.7% ( 154% -  290%)
              HighPhrase      170.49     (13.1%)      547.72     (17.3%)  221.3% ( 168% -  289%)
        HighSloppyPhrase       66.91     (10.1%)      219.59     (12.0%)  228.2% ( 187% -  278%)
         MedSloppyPhrase      128.73     (12.5%)      425.67     (20.3%)  230.7% ( 175% -  300%)
             MedSpanNear      130.31     (10.7%)      436.12     (18.2%)  234.7% ( 185% -  295%)
                 Prefix3      166.91     (14.9%)      652.64     (26.7%)  291.0% ( 217% -  390%)
                  IntNRQ      110.73     (15.0%)      467.72     (33.6%)  322.4% ( 238% -  436%)
{noformat}
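
As a reading aid for the tables, the "Pct diff" column is consistent with the relative change of the parcol QPS over the baseline QPS. A small sketch that recomputes it for two rows of the table above is given here. The class and method names are hypothetical, and the parenthesized range that luceneutil prints after the percentage is not reproduced.

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the "Pct diff" column; not part of luceneutil.
final class PctDiffSketch {

    /** Relative change of the candidate QPS over the baseline QPS, in percent. */
    static double pctDiff(double baselineQps, double candidateQps) {
        return 100.0 * (candidateQps - baselineQps) / baselineQps;
    }

    /** Formats with one decimal place, as in the tables. */
    static String format(double pct) {
        return String.format(Locale.ROOT, "%.1f%%", pct);
    }

    public static void main(String[] args) {
        // Values copied from the SEARCH_NUM_THREADS = 16 table.
        System.out.println("Fuzzy1 " + format(pctDiff(81.91, 52.96)));
        System.out.println("IntNRQ " + format(pctDiff(110.73, 467.72)));
    }
}
```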

SEARCH_NUM_THREADS=32
{noformat}
                    Task QPS baseline      StdDev  QPS parcol      StdDev        Pct diff
                 LowTerm     2401.88     (12.7%)     1799.27      (6.3%)  -25.1% ( -39% -   -6%)
                  Fuzzy2        6.52     (14.4%)        5.74     (24.0%)  -11.9% ( -43% -   30%)
                 Respell       45.13     (90.2%)       40.94     (83.5%)   -9.3% ( -96% - 1679%)
                PKLookup      232.02     (12.9%)      228.35     (12.4%)   -1.6% ( -23% -   27%)
                 MedTerm     1612.01     (14.0%)     1601.71     (10.9%)   -0.6% ( -22% -   28%)
                  Fuzzy1       14.19     (79.3%)       14.71    (177.6%)    3.7% (-141% - 1258%)
              AndHighLow     1205.65     (17.5%)     1254.76     (15.9%)    4.1% ( -24% -   45%)
             MedSpanNear      478.11     (25.4%)      946.72     (34.5%)   98.0% (  30% -  211%)
               OrHighLow      424.71     (14.5%)      941.39     (31.4%)  121.7% (  66% -  195%)
             AndHighHigh      377.82     (13.3%)      910.77     (32.2%)  141.1% (  84% -  215%)
                HighTerm      325.35     (11.3%)      855.63      (8.9%)  163.0% ( 128% -  206%)
              AndHighMed      346.57     (11.7%)      914.59     (26.4%)  163.9% ( 112% -  228%)
               MedPhrase      227.47     (13.1%)      621.50     (22.9%)  173.2% ( 121% -  240%)
         LowSloppyPhrase      265.21     (10.4%)      748.30     (49.2%)  182.2% ( 110% -  269%)
               OrHighMed      221.49     (12.2%)      632.55     (23.9%)  185.6% ( 133% -  252%)
               LowPhrase      190.34     (14.9%)      586.71     (22.6%)  208.2% ( 148% -  288%)
                 Prefix3      305.01     (15.9%)      948.63     (17.0%)  211.0% ( 153% -  289%)
         MedSloppyPhrase      229.15     (15.0%)      718.29     (41.4%)  213.5% ( 136% -  317%)
             LowSpanNear      102.98     (14.0%)      323.91     (37.1%)  214.5% ( 143% -  309%)
                Wildcard      249.66     (13.3%)      787.42     (17.0%)  215.4% ( 163% -  283%)
              OrHighHigh      124.76     (10.5%)      394.72     (35.0%)  216.4% ( 154% -  292%)
            HighSpanNear      119.23     (15.5%)      386.33     (57.5%)  224.0% ( 130% -  351%)
              HighPhrase       86.95     (14.4%)      293.00     (15.5%)  237.0% ( 180% -  311%)
        HighSloppyPhrase      136.37     (12.9%)      462.38     (21.7%)  239.1% ( 181% -  314%)
                  IntNRQ      100.48     (14.1%)      391.02     (14.2%)  289.1% ( 228% -  369%)
{noformat}

SEARCH_NUM_THREADS=64
{noformat}
Report after iter 19:
                    Task QPS baseline      StdDev  QPS parcol      StdDev        Pct diff
                PKLookup      213.67     (23.0%)       11.53      (6.5%)  -94.6% (-100% -  -84%)
                  Fuzzy1       48.00     (85.5%)       26.33     (74.5%)  -45.2% (-110% -  789%)
                  Fuzzy2        4.10     (16.8%)        2.92      (9.2%)  -28.8% ( -46% -   -3%)
                 Respell       15.21    (159.4%)       12.86    (118.6%)  -15.5% (-113% - -441%)
                 LowTerm     1247.47     (16.6%)     1187.85     (14.8%)   -4.8% ( -31% -   32%)
                 MedTerm      875.84     (11.7%)     1093.66     (19.5%)   24.9% (  -5% -   63%)
         LowSloppyPhrase      445.65     (12.3%)      668.59     (58.6%)   50.0% ( -18% -  137%)
              AndHighLow      429.62     (20.8%)      672.25     (51.2%)   56.5% ( -12% -  162%)
              AndHighMed      365.37     (18.8%)      609.35     (51.1%)   66.8% (  -2% -  168%)
               OrHighMed      253.66     (14.4%)      467.54     (68.7%)   84.3% (   1% -  195%)
               MedPhrase      351.70     (14.3%)      653.30     (31.6%)   85.8% (  34% -  153%)
               OrHighLow      288.46     (18.2%)      563.37     (36.4%)   95.3% (  34% -  183%)
               LowPhrase      288.58     (10.5%)      567.36     (35.3%)   96.6% (  45% -  159%)
             AndHighHigh      245.55     (14.5%)      528.54     (73.8%)  115.2% (  23% -  238%)
             LowSpanNear      192.64      (7.5%)      440.77     (71.6%)  128.8% (  46% -  224%)
             MedSpanNear      201.70     (14.4%)      487.17     (62.4%)  141.5% (  56% -  254%)
                HighTerm      285.68     (10.1%)      716.36     (30.6%)  150.8% ( 100% -  212%)
              HighPhrase       81.87     (17.0%)      215.48    (136.8%)  163.2% (   7% -  382%)
        HighSloppyPhrase      111.43     (11.9%)      306.32    (114.9%)  174.9% (  43% -  342%)
         MedSloppyPhrase       91.36     (15.9%)      257.01    (134.5%)  181.3% (  26% -  394%)
              OrHighHigh      126.72     (17.9%)      362.42     (40.4%)  186.0% ( 108% -  297%)
                Wildcard      401.61      (5.8%)     1170.84     (25.2%)  191.5% ( 151% -  236%)
            HighSpanNear       96.98     (26.3%)      302.07     (77.1%)  211.5% (  85% -  427%)
                 Prefix3      287.06     (13.3%)      990.66     (43.0%)  245.1% ( 166% -  347%)
                  IntNRQ      109.19     (13.8%)      429.31     (48.4%)  293.2% ( 203% -  412%)
{noformat}


> Refactor Collector API for parallelism
> --------------------------------------
>
>                 Key: LUCENE-5299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shikhar Bhushan
>         Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, 
> LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt
>
>
> h2. Motivation
> We should be able to scale up better with Solr/Lucene by utilizing multiple 
> CPU cores, and not have to resort to scaling out by sharding (with all the 
> associated distributed system pitfalls) when the index size does not warrant 
> it.
> Presently, IndexSearcher has an optional constructor arg for an 
> ExecutorService, which is used to search in parallel on call paths 
> where one of the TopDocCollectors is created internally. The 
> per-atomic-reader search happens in parallel, and then the 
> TopDocs/TopFieldDocs results are merged, with locking around the merge step.
> However, there are some problems with this approach:
> * If arbitrary Collector args come into play, we can't parallelize. Note that 
> even if results ultimately go to a TopDocCollector, it may be wrapped 
> inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
> * The special-casing with parallelism baked on top does not scale: there are 
> many Collectors that could potentially lend themselves to parallelism, and 
> special-casing means the parallelization has to be re-implemented if a 
> different permutation of collectors is to be used.
> h2. Proposal
> A refactoring of collectors that allows for parallelization at the level of 
> the collection protocol. 
> Some requirements that should guide the implementation:
> * easy migration path for collectors that need to remain serial
> * the parallelization should be composable (when collectors wrap other 
> collectors)
> * allow collectors to pick the optimal solution (e.g. there might be memory 
> tradeoffs to be made) by advising the collector about whether a search will 
> be parallelized, so that the serial use-case is not penalized.
> * encourage use of non-blocking constructs and lock-free parallelism; 
> blocking is not advisable in the hot spot of a search, and it also wastes 
> pooled threads.
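
The requirements above can be sketched as a small collection protocol. This is a hedged illustration, not the API from the attached patches and not an existing Lucene interface. All names here ({{ParallelizableCollector}}, {{SegmentCollector}}, {{setParallelized}}, {{forSegment}}, {{merge}}) are hypothetical, and segments are modeled as plain {{int[]}} arrays of doc ids. Each segment gets its own collector with no shared mutable state, a wrapper composes by delegating, and partial results are merged once at the end without locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntPredicate;

// Hypothetical sketch of a parallel-friendly collection protocol; not Lucene's API.
final class CollectorProtocolSketch {

    /** Collects hits for a single segment; used by one thread only. */
    interface SegmentCollector<R> {
        void collect(int doc);
        R result();
    }

    interface ParallelizableCollector<R> {
        /** Advice from the searcher, so the serial case need not pay for parallel support. */
        void setParallelized(boolean parallelized);
        SegmentCollector<R> forSegment(int segmentOrd);
        /** Combines per-segment results after all segments are done. */
        R merge(List<R> partials);
    }

    /** Counts hits; per-segment counters are independent, so no locking is needed. */
    static final class CountingCollector implements ParallelizableCollector<Integer> {
        public void setParallelized(boolean parallelized) { /* nothing to tune here */ }
        public SegmentCollector<Integer> forSegment(int segmentOrd) {
            return new SegmentCollector<Integer>() {
                private int count;
                public void collect(int doc) { count++; }
                public Integer result() { return count; }
            };
        }
        public Integer merge(List<Integer> partials) {
            int total = 0;
            for (int p : partials) total += p;
            return total;
        }
    }

    /** A wrapper that stays parallelizable by delegating every step to the wrapped collector. */
    static final class FilteringCollector<R> implements ParallelizableCollector<R> {
        private final ParallelizableCollector<R> inner;
        private final IntPredicate accept;
        FilteringCollector(ParallelizableCollector<R> inner, IntPredicate accept) {
            this.inner = inner;
            this.accept = accept;
        }
        public void setParallelized(boolean parallelized) { inner.setParallelized(parallelized); }
        public SegmentCollector<R> forSegment(int segmentOrd) {
            final SegmentCollector<R> delegate = inner.forSegment(segmentOrd);
            return new SegmentCollector<R>() {
                public void collect(int doc) { if (accept.test(doc)) delegate.collect(doc); }
                public R result() { return delegate.result(); }
            };
        }
        public R merge(List<R> partials) { return inner.merge(partials); }
    }

    /** Searches each segment as its own task and merges once at the end. */
    static <R> R search(int[][] segments, ParallelizableCollector<R> collector,
                        ExecutorService pool) throws Exception {
        collector.setParallelized(segments.length > 1);
        List<Future<R>> futures = new ArrayList<>();
        for (int ord = 0; ord < segments.length; ord++) {
            final int[] docs = segments[ord];
            final SegmentCollector<R> sc = collector.forSegment(ord);
            Callable<R> task = () -> {
                for (int doc : docs) sc.collect(doc);
                return sc.result();
            };
            futures.add(pool.submit(task));
        }
        List<R> partials = new ArrayList<>();
        for (Future<R> f : futures) partials.add(f.get());
        return collector.merge(partials);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        int[][] segments = { {1, 2, 3}, {4, 5}, {6} };
        System.out.println(search(segments, new CountingCollector(), pool));
        System.out.println(search(segments,
                new FilteringCollector<>(new CountingCollector(), d -> d % 2 == 0), pool));
        pool.shutdown();
    }
}
```

A collector that must remain serial could implement the same interface and simply be driven one segment at a time, which is one possible reading of the "easy migration path" requirement.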


