[ 
https://issues.apache.org/jira/browse/SOLR-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229939#comment-14229939
 ] 

Shalin Shekhar Mangar commented on SOLR-6554:
---------------------------------------------

Actually, the improvements in Overseer for stateFormat=1 (the default case) are 
much better than I expected. After the refactorings, the amILeader calls are 
very infrequent (2 versus 223 in the runs below), and the total time to 
process the queue drops by about 40%:

{code}
Overseer queue size: 20000 state requests

stateFormat = 1, With refactoring (trunk)
=========================================

216071 T12 oasc.OverseerTest.testPerformance Overseer loop finished processing:
216072 T12 oasc.OverseerTest.printTimingStats    totalTime: 201411.465265
216072 T12 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.004964922311489345
216073 T12 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
216073 T12 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
216073 T12 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 201411.465265
216073 T12 oasc.OverseerTest.printTimingStats    medianRequestTime: 201411.465265
216073 T12 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 201411.465265
216074 T12 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 201411.465265
216074 T12 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 201411.465265
216074 T12 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 201411.465265
216075 T12 oasc.OverseerTest.testPerformance op: am_i_leader, success: 2, failure: 0
216075 T12 oasc.OverseerTest.printTimingStats    totalTime: 9.377281
216075 T12 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.5969575423185497
216075 T12 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 12.529098642264385
216075 T12 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 19.324759776433687
216075 T12 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 4.6886405
216076 T12 oasc.OverseerTest.printTimingStats    medianRequestTime: 4.6886405
216076 T12 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 9.022041
216076 T12 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 9.022041
216076 T12 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 9.022041
216076 T12 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 9.022041
216077 T12 oasc.OverseerTest.testPerformance op: update_state, success: 135, failure: 0
216077 T12 oasc.OverseerTest.printTimingStats    totalTime: 61.333751
216077 T12 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 40.31065112174398
216077 T12 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 48.0
216078 T12 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 48.0
216078 T12 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.4543240814814815
216078 T12 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.364217
216078 T12 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.409896
216078 T12 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.9332719999999994
216079 T12 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 3.576287319999995
216079 T12 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 3.700744
216079 T12 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
216081 T12 oasc.OverseerTest.printTimingStats    totalTime: 13344.072646
216081 T12 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 5973.226142698651
216081 T12 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 4437.949777291698
216082 T12 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 3247.958438006491
216082 T12 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.6671702737863107
216083 T12 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.6112960000000001
216083 T12 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.65861125
216083 T12 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.9373918
216083 T12 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.179823900000002
216083 T12 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 6.713780613000015


stateFormat = 1, Without refactoring (branch_5x)
================================================

354435 T11 oasc.OverseerTest.testPerformance Overseer loop finished processing:
354437 T11 oasc.OverseerTest.printTimingStats    totalTime: 336777.887
354438 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0029692955509913457
354438 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
354438 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
354439 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 336777.887
354439 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 336777.887
354439 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 336777.887
354440 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 336777.887
354440 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 336777.887
354440 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 336777.887
354441 T11 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
354444 T11 oasc.OverseerTest.printTimingStats    totalTime: 13029.408
354444 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3570.0750281584515
354444 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3169.209724490217
354445 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2124.6849108211077
354445 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.6514378281085945
354445 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.59
354446 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.633
354446 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.8480999999999999
354446 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.9995200000000004
354447 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 1.736079000000002
354447 T11 oasc.OverseerTest.testPerformance op: update_state, success: 222, failure: 0
354448 T11 oasc.OverseerTest.printTimingStats    totalTime: 98.244
354448 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 39.622607985461286
354448 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 48.0
354448 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 48.0
354449 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.44254054054054054
354449 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.3835
354450 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.463
354450 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.7994499999999999
354450 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.2152900000000026
354451 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 2.452
354451 T11 oasc.OverseerTest.testPerformance op: am_i_leader, success: 223, failure: 0
354452 T11 oasc.OverseerTest.printTimingStats    totalTime: 43.33
354453 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 39.777330428482294
354453 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 57.7576718337744
354453 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 65.77963729636123
354453 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.194304932735426
354454 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.149
354454 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.188
354454 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.25839999999999996
354454 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.47591999999999895
354455 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 5.712
{code}

Do not compare these numbers with the ones I posted earlier, because this test 
was run on a different box. Also note that trunk was run on jdk1.8.0_25 and 
branch_5x on jdk1.7.0_25. I'm running the other tests and will report back 
shortly.
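
For anyone who wants to check the "about 40%" figure, it can be reproduced from the two totalTime values on the "Overseer loop finished processing" lines in the log above. The snippet below is only a sketch of that arithmetic; the variable and function names are made up for illustration and are not part of OverseerTest.

```python
# totalTime values copied from the "Overseer loop finished processing"
# sections of the log above (the log does not state the unit).
trunk_total = 201411.465265      # stateFormat = 1, with refactoring (trunk)
branch_5x_total = 336777.887     # stateFormat = 1, without refactoring (branch_5x)


def time_reduction(before: float, after: float) -> float:
    """Fraction of the original time that was saved."""
    return (before - after) / before


def speedup_ratio(before: float, after: float) -> float:
    """How many times faster the new run is than the old one."""
    return before / after


reduction = time_reduction(branch_5x_total, trunk_total)
ratio = speedup_ratio(branch_5x_total, trunk_total)

print(f"total time reduction: {reduction:.1%}")
print(f"speedup ratio: {ratio:.2f}x")
```

The same caveat applies as above: the two runs used different JDKs, so this ratio describes these two particular runs and nothing more.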

> Speed up overseer operations for collections with stateFormat > 1
> -----------------------------------------------------------------
>
>                 Key: SOLR-6554
>                 URL: https://issues.apache.org/jira/browse/SOLR-6554
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.0, Trunk
>            Reporter: Shalin Shekhar Mangar
>         Attachments: SOLR-6554-batching-refactor.patch, 
> SOLR-6554-batching-refactor.patch, SOLR-6554-batching-refactor.patch, 
> SOLR-6554-batching-refactor.patch, SOLR-6554.patch, SOLR-6554.patch, 
> SOLR-6554.patch, SOLR-6554.patch, SOLR-6554.patch, SOLR-6554.patch, 
> SOLR-6554.patch, SOLR-6554.patch
>
>
> Right now (after SOLR-5473 was committed), a node watches a collection only 
> if stateFormat=1 or if that node hosts at least one core belonging to that 
> collection.
> This means that the node which is the overseer operates on all collections 
> but watches only a few. So any read for an unwatched collection goes 
> directly to ZooKeeper, which slows down overseer operations.
> Let's have the overseer node always watch all collections and never remove 
> those watches (except when the collection itself is deleted).
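
The proposal in the description above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: `FakeStore` and `OverseerStateCache` are hypothetical names invented here, and neither is Solr code or the ZooKeeper client API. It shows that once a collection is watched, repeated reads are served locally, and that the watch goes away only when the collection is deleted.

```python
class FakeStore:
    """Hypothetical stand-in for ZooKeeper: counts remote reads and
    notifies watchers when a collection's state changes or is deleted."""

    def __init__(self):
        self._data = {}
        self._watchers = {}
        self.remote_reads = 0

    def read(self, name):
        self.remote_reads += 1          # every call here is a "remote" read
        return self._data.get(name)

    def write(self, name, state):
        self._data[name] = state
        for callback in self._watchers.get(name, []):
            callback(name, state)

    def delete(self, name):
        self._data.pop(name, None)
        # Deleting the collection is the only thing that drops its watchers.
        for callback in self._watchers.pop(name, []):
            callback(name, None)

    def watch(self, name, callback):
        self._watchers.setdefault(name, []).append(callback)


class OverseerStateCache:
    """Hypothetical overseer-side cache: watches every collection it reads
    and keeps the watch until the collection is deleted."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def _on_change(self, name, state):
        if state is None:               # collection was deleted
            self._cache.pop(name, None)
        else:
            self._cache[name] = state

    def get(self, name):
        if name not in self._cache:
            state = self._store.read(name)   # one remote read on first access
            if state is None:
                return None
            self._cache[name] = state
            self._store.watch(name, self._on_change)
        return self._cache[name]


store = FakeStore()
store.write("collection1", {"shards": 1})
cache = OverseerStateCache(store)

for _ in range(1000):
    cache.get("collection1")            # served from memory after the first call
reads_after_loop = store.remote_reads

store.write("collection1", {"shards": 2})
latest = cache.get("collection1")       # updated through the watch, no new remote read
```

In this toy model the 1000 lookups cost a single remote read, which is the effect the issue is after. A real implementation would also have to deal with watch re-registration and session expiry, which the sketch ignores.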


