[ https://issues.apache.org/jira/browse/SOLR-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-6554:
----------------------------------------
    Attachment: SOLR-6554.patch

I bit the bullet and refactored the overseer to be less of a mess than it is 
now.
# I grouped the cluster operations into cluster, collection, slice and replica 
and moved them to their own classes. 
# A new class, ZkStateWriter, is introduced; it uses ZkWriteCommand to update 
the clusterstate.
# Each overseer operation returns a ZkWriteCommand.
# The force update of cluster state is no longer required inside the main 
overseer loop. The state is read once at the start of the loop, and we then use 
ZK compare-and-set to update the cluster states using the versions already read.
# The above also means that there is no need to watch every collection with 
stateFormat > 1 on the overseer node anymore.
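
To make points 2 to 4 concrete, here is a minimal sketch of the idea, not the code in the patch. It is written in Python for brevity, uses an in-memory stand-in for ZooKeeper, and every name in it (VersionedStore, WriteCommand, StateWriter, set_replica_state, the -1 version for a missing node, the path layout) is an assumption made for illustration only. Operations compute new state and return a command without doing I/O; the writer applies each command with compare-and-set against the version it read at the start.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class BadVersionError(Exception):
    """Raised when the expected version does not match the stored one."""


class VersionedStore:
    """In-memory stand-in for ZooKeeper nodes: path -> (data, version)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Tuple[dict, int]] = {}

    def read(self, path: str) -> Tuple[dict, int]:
        # Assumption for this sketch: a missing node reads as empty, version -1.
        return self._nodes.get(path, ({}, -1))

    def compare_and_set(self, path: str, data: dict, expected: int) -> int:
        _, current = self.read(path)
        if current != expected:
            raise BadVersionError(f"{path}: expected {expected}, found {current}")
        self._nodes[path] = (data, current + 1)
        return current + 1


@dataclass(frozen=True)
class WriteCommand:
    """What an operation wants written: a collection and its new state."""
    collection: str
    state: dict


def set_replica_state(cluster: Dict[str, dict], collection: str,
                      replica: str, value: str) -> WriteCommand:
    """An operation: derives new state and returns a command, no I/O."""
    new_state = dict(cluster.get(collection, {}))
    new_state[replica] = value
    return WriteCommand(collection, new_state)


class StateWriter:
    """Reads state once up front, then writes using the versions it read."""

    def __init__(self, store: VersionedStore, collections: List[str]) -> None:
        self.store = store
        self.cluster: Dict[str, dict] = {}
        self.versions: Dict[str, int] = {}
        for name in collections:
            self.cluster[name], self.versions[name] = store.read(self._path(name))

    @staticmethod
    def _path(collection: str) -> str:
        # Illustrative path layout only.
        return f"/collections/{collection}/state.json"

    def write(self, cmd: WriteCommand) -> None:
        expected = self.versions.get(cmd.collection, -1)
        # Fails with BadVersionError if someone else wrote in the meantime.
        new_version = self.store.compare_and_set(
            self._path(cmd.collection), cmd.state, expected)
        self.cluster[cmd.collection] = cmd.state
        self.versions[cmd.collection] = new_version


store = VersionedStore()
writer = StateWriter(store, ["c1"])
stale_writer = StateWriter(store, ["c1"])  # read the same starting version
writer.write(set_replica_state(writer.cluster, "c1", "core_node1", "active"))
writer.write(set_replica_state(writer.cluster, "c1", "core_node2", "down"))
```

In this sketch the writer never re-reads or watches the node between writes; a second writer holding a stale version (stale_writer above) would get BadVersionError on its first write instead of silently overwriting newer state.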

Todo
# There are some nocommits that need to be taken care of.
# Implement batching for collections with stateFormat > 1
# A few newer operations, such as balanceSliceUnique, aren't implemented yet, 
so some tests fail.
# More cleanup

Here are some performance numbers using the new code vs branch_5x:
{code}
Overseer queue size: 20000 state requests

stateFormat = 1, With refactoring (trunk):
==========================================
250962 T13 oasc.OverseerTest.testPerformance Overseer loop finished processing:
250964 T13 oasc.OverseerTest.printTimingStats    totalTime: 241639.501565
250964 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0041383582263057345
250965 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
250965 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
250965 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 241639.501565
250966 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 241639.501565
250966 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 241639.501565
250966 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 241639.501565
250966 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 241639.501565
250967 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 241639.501565
250967 T13 oasc.OverseerTest.testPerformance op: am_i_leader, success: 163, failure: 0
250967 T13 oasc.OverseerTest.printTimingStats    totalTime: 27.778517
250967 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 40.51109030620299
250967 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 60.52909864226439
250968 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 67.32475977643367
250968 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.17042034969325154
250968 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.127852
250968 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.159049
250968 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.20707859999999995
250968 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.9894586799999168
250968 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 6.591979
250968 T13 oasc.OverseerTest.testPerformance op: update_state, success: 161, failure: 0
250969 T13 oasc.OverseerTest.printTimingStats    totalTime: 105.56181
250969 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 40.02790612569949
250969 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 48.0
250969 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 48.0
250970 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.6556634161490683
250970 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.57833
250970 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.7091475
250970 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.9294200000000001
250970 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 3.997095759999999
250970 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.075775
250971 T13 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
250972 T13 oasc.OverseerTest.printTimingStats    totalTime: 24677.266392
250972 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 4971.795405166019
250972 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 4190.29864341858
250972 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 3199.6744276725544
250972 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 1.2338016295185241
250973 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 1.1432815
250973 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 1.57099125
250973 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.8449494
250973 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.155266570000001
250973 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.228459292000002


stateFormat = 1, Without refactoring (branch_5x):
=================================================

281984 T11 oasc.OverseerTest.testPerformance Overseer loop finished processing:
281985 T11 oasc.OverseerTest.printTimingStats    totalTime: 256532.804054
281986 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0038981033718983866
281986 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
281986 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
281987 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 256532.804054
281987 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 256532.804054
281987 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 256532.804054
281988 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 256532.804054
281988 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 256532.804054
281988 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 256532.804054
281989 T11 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
281990 T11 oasc.OverseerTest.printTimingStats    totalTime: 14883.542675
281990 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 4679.183238623399
281990 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 4042.064261836551
281990 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 3041.931602459868
281990 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.7441399267536623
281991 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.6902204999999999
281991 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.9253057499999999
281991 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.0441081499999998
281991 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.2398173200000007
281991 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 1.4717788410000001
281991 T11 oasc.OverseerTest.testPerformance op: update_state, success: 171, failure: 0
281992 T11 oasc.OverseerTest.printTimingStats    totalTime: 108.786403
281992 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 40.00714188468187
281992 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 48.0
281992 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 48.0
281992 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.6361777953216374
281992 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.615385
281992 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.731113
281992 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.8640954000000003
281993 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.1159375600000034
281993 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.361583
281993 T11 oasc.OverseerTest.testPerformance op: am_i_leader, success: 172, failure: 0
281993 T11 oasc.OverseerTest.printTimingStats    totalTime: 23.880482
281993 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 40.23061331599097
281993 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 60.11834593132536
281994 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 67.11122870391561
281994 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.13884001162790696
281994 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.14114549999999998
281994 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.1714525
281994 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.24251024999999998
281994 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.2920699700000001
281994 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 0.303612


stateFormat = 2, 10 collections, With refactoring (trunk):
===========================================================
359321 T13 oasc.OverseerTest.testPerformance Overseer loop finished processing:
359322 T13 oasc.OverseerTest.printTimingStats    totalTime: 336222.016107
359323 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0029742060387324683
359323 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
359323 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
359324 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 336222.016107
359324 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 336222.016107
359324 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 336222.016107
359325 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 336222.016107
359325 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 336222.016107
359325 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 336222.016107
359325 T13 oasc.OverseerTest.testPerformance op: am_i_leader, success: 19898, failure: 0
359326 T13 oasc.OverseerTest.printTimingStats    totalTime: 2910.821076
359326 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3551.0282475508393
359326 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3007.215001818593
359327 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1628.5124596704984
359327 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.14628711810232184
359327 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.127051
359327 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.16182850000000001
359327 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.22852089999999997
359327 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.30072901000000013
359327 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 1.135954577000003
359327 T13 oasc.OverseerTest.testPerformance op: update_state, success: 19896, failure: 0
359328 T13 oasc.OverseerTest.printTimingStats    totalTime: 14968.90839
359328 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3551.180599528965
359329 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3006.8437316469813
359329 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1628.3818327713007
359329 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.7523576794330519
359329 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.7057765
359329 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.8617165
359329 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.03365505
359329 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.1740241200000008
359329 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.014327310000006
359330 T13 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
359330 T13 oasc.OverseerTest.printTimingStats    totalTime: 25670.317603
359330 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3572.411123496283
359330 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3451.4741046935114
359331 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2559.18706643615
359331 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 1.2834517075646217
359331 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 1.2092995
359331 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 1.59053175
359331 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.8313089999999999
359331 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.0310246800000007
359331 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 2.4514601340000004


stateFormat = 2, 10 collections, Without refactoring (branch_5x):
=================================================================

408300 T11 oasc.OverseerTest.testPerformance Overseer loop finished processing:
408302 T11 oasc.OverseerTest.printTimingStats    totalTime: 384185.906373
408302 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0026028952612532213
408302 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
408302 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
408303 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 384185.906373
408303 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 384185.906373
408303 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 384185.906373
408303 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 384185.906373
408303 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 384185.906373
408304 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 384185.906373
408304 T11 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
408306 T11 oasc.OverseerTest.printTimingStats    totalTime: 37886.743042
408306 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3127.565044455818
408306 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3017.721755102994
408307 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2191.9117215274555
408307 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 1.8942424399780011
408307 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 1.805034
408307 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 2.33467975
408307 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 2.72064335
408308 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 3.380797200000005
408308 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 6.764998426000004
408308 T11 oasc.OverseerTest.testPerformance op: update_state, success: 20011, failure: 0
408310 T11 oasc.OverseerTest.printTimingStats    totalTime: 15664.348411
408310 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3125.8333680772357
408310 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3008.770952686416
408311 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2188.5152728989856
408311 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.782786887761731
408311 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.7298905
408311 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.8860805
408311 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.0464787999999998
408312 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.2055892300000004
408312 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.504650318000003
408312 T11 oasc.OverseerTest.testPerformance op: am_i_leader, success: 20013, failure: 0
408313 T11 oasc.OverseerTest.printTimingStats    totalTime: 3008.698855
408313 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3125.585466114684
408313 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3006.128012707961
408313 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2187.5621767051953
408314 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.15033722355468945
408314 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.1373615
408314 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.17162775
408314 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.23810549999999997
408314 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.28888714000000004
408314 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 0.36018958200000006


stateFormat = 2, 100 collections, With refactoring (trunk):
===========================================================
353683 T13 oasc.OverseerTest.testPerformance Overseer loop finished processing:
353685 T13 oasc.OverseerTest.printTimingStats    totalTime: 344294.509037
353686 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0029044719408401407
353686 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
353686 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
353687 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 344294.509037
353687 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 344294.509037
353687 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 344294.509037
353687 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 344294.509037
353688 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 344294.509037
353688 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 344294.509037
353688 T13 oasc.OverseerTest.testPerformance op: am_i_leader, success: 19908, failure: 0
353690 T13 oasc.OverseerTest.printTimingStats    totalTime: 2899.55199
353690 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3471.041899147638
353690 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 2696.4544094117955
353690 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1206.322204620017
353690 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.1456475783604581
353691 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.1307565
353691 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.16647275
353691 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.24111169999999996
353691 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.29323824000000015
353691 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 2.1718927580000065
353691 T13 oasc.OverseerTest.testPerformance op: update_state, success: 19906, failure: 0
353693 T13 oasc.OverseerTest.printTimingStats    totalTime: 16055.477415
353693 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3471.500723620727
353693 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 2706.077181788675
353693 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1216.7129526167616
353693 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.8065647249572993
353694 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.7490195
353694 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.93685775
353694 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.1227754
353694 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.2775636000000001
353694 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 5.825008328000003
353694 T13 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
353695 T13 oasc.OverseerTest.printTimingStats    totalTime: 28055.800755
353695 T13 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 3509.7301625656414
353696 T13 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 3387.1127858679465
353696 T13 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 2517.7854888232214
353696 T13 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 1.4027199017549121
353696 T13 oasc.OverseerTest.printTimingStats    medianRequestTime: 1.283997
353696 T13 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 1.751609
353696 T13 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.98747225
353697 T13 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.2288728700000013
353697 T13 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 2.677476075


stateFormat = 2, 100 collections, Without refactoring (branch_5x):
==================================================================
642467 T11 oasc.OverseerTest.testPerformance Overseer loop finished processing:
642468 T11 oasc.OverseerTest.printTimingStats    totalTime: 592456.582081
642471 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 0.0016878842331991438
642471 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 0.0
642471 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 0.0
642471 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 592456.582081
642471 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 592456.582081
642471 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 592456.582081
642471 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 592456.582081
642472 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 592456.582081
642472 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 592456.582081
642472 T11 oasc.OverseerTest.testPerformance op: state, success: 20001, failure: 0
642473 T11 oasc.OverseerTest.printTimingStats    totalTime: 38316.042986
642473 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 2039.7101580674453
642473 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 2225.3413952254627
642473 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1666.2656764044962
642474 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 1.9157063639818008
642474 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 1.9428355
642474 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 2.363384
642474 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 2.6738654
642474 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 2.9951339600000004
642474 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.165273211000003
642474 T11 oasc.OverseerTest.testPerformance op: update_state, success: 20101, failure: 0
642475 T11 oasc.OverseerTest.printTimingStats    totalTime: 16713.560223
642476 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 2035.9152283337185
642476 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 2256.420345106015
642476 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1779.6120090712736
642476 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.8314790419879609
642476 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.800087
642476 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.92331125
642476 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 1.08387195
642476 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 1.1850831800000001
642476 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 1.7994323630000015
642477 T11 oasc.OverseerTest.testPerformance op: am_i_leader, success: 20103, failure: 0
642478 T11 oasc.OverseerTest.printTimingStats    totalTime: 3164.627444
642478 T11 oasc.OverseerTest.printTimingStats    avgRequestsPerMinute: 2035.9212322908609
642478 T11 oasc.OverseerTest.printTimingStats    5minRateRequestsPerMinute: 2258.1746414754753
642478 T11 oasc.OverseerTest.printTimingStats    15minRateRequestsPerMinute: 1785.9179009788268
642478 T11 oasc.OverseerTest.printTimingStats    avgTimePerRequest: 0.15742065582251405
642478 T11 oasc.OverseerTest.printTimingStats    medianRequestTime: 0.141227
642478 T11 oasc.OverseerTest.printTimingStats    75thPctlRequestTime: 0.17663375
642479 T11 oasc.OverseerTest.printTimingStats    95thPctlRequestTime: 0.23308624999999997
642479 T11 oasc.OverseerTest.printTimingStats    99thPctlRequestTime: 0.31163878
642479 T11 oasc.OverseerTest.printTimingStats    999thPctlRequestTime: 4.382885935000004
{code}

For the worst case (100 collections with stateFormat=2), the overseer loop 
processing time went down from 592456ms to 344294ms.

The processing time for the state operation with stateFormat=1 has gone up in 
my patch (from 14883ms to 24677ms in total), which I'll fix.
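
To put the totals in relative terms, here is a small sketch that computes the percentage change between branch_5x and trunk. The totalTime values are copied from the logs above; the dictionary layout and the helper names (percent_change, report) are my own for illustration and are not part of OverseerTest.

```python
# Overseer-loop totalTime in ms, copied from the logs: (branch_5x, trunk).
LOOP_TOTAL_MS = {
    "stateFormat=1": (256532.804054, 241639.501565),
    "stateFormat=2, 10 collections": (384185.906373, 336222.016107),
    "stateFormat=2, 100 collections": (592456.582081, 344294.509037),
}

# "state" operation totalTime in ms for stateFormat=1: (branch_5x, trunk).
STATE_OP_TOTAL_MS_FORMAT_1 = (14883.542675, 24677.266392)


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative is faster)."""
    return (after - before) / before * 100.0


def report() -> list:
    """Builds one summary line per scenario, plus one for the state operation."""
    lines = []
    for scenario, (before, after) in LOOP_TOTAL_MS.items():
        lines.append(f"{scenario}: loop time {percent_change(before, after):+.1f}%")
    before, after = STATE_OP_TOTAL_MS_FORMAT_1
    lines.append(f"stateFormat=1: state op time {percent_change(before, after):+.1f}%")
    return lines


for line in report():
    print(line)
```

The gain grows with the number of stateFormat=2 collections, while the stateFormat=1 state operation is the one place where the patched code is slower.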

> Speed up overseer operations for collections with stateFormat > 1
> -----------------------------------------------------------------
>
>                 Key: SOLR-6554
>                 URL: https://issues.apache.org/jira/browse/SOLR-6554
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 5.0, Trunk
>            Reporter: Shalin Shekhar Mangar
>         Attachments: SOLR-6554.patch
>
>
> Right now (after SOLR-5473 was committed), a node watches a collection only 
> if stateFormat=1 or if that node hosts at least one core belonging to that 
> collection.
> This means that a node which is the overseer operates on all collections but 
> watches only a few. So any read goes directly to zookeeper which slows down 
> overseer operations.
> Let's have the overseer node watch all collections always and never remove 
> those watches (except when the collection itself is deleted).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
