[ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271701#comment-14271701 ]
Ariel Weisberg edited comment on CASSANDRA-8457 at 1/9/15 7:08 PM:
-------------------------------------------------------------------

Took a stab at writing an adaptive approach to coalescing based on a moving average. Numbers look good for the workloads tested.

Code https://github.com/aweisberg/cassandra/compare/6be33289f34782e12229a7621022bb5ce66b2f1b...e48133c4d5acbaa6563ea48a0ca118c278b2f6f7

Testing in AWS, 14 servers 6 clients.

Using a fixed coalescing window at low concurrency there is a drop in performance from 6746 to 3929. With adaptive coalescing I got 6758.

At medium concurrency (5 threads per client, 6 clients) I got 31097 with coalescing disabled and 31120 with coalescing.

At high concurrency (500 threads per client, 6 clients) I got 479532 with coalescing and 166010 without. This is with a maximum coalescing window of 200 microseconds.

I added debug output to log when coalescing starts and stops and it's interesting. At the beginning of the benchmark things flap, but they don't flap madly. After a few minutes it settles. I also noticed a strange thing where CPU utilization at the start of a benchmark is 500% or so and then after a while it climbs. Like something somewhere is warming up or balancing. I recall seeing this in GCE as well.

I had one of the OutboundTcpConnections (first to get the permit) log a trace of all outgoing message times. I threw that into a histogram for informational purposes. 50% of messages are sent within 100 microseconds of each other and 92% are sent within one millisecond. This is without any coalescing.
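For readers who want a concrete picture of "adaptive coalescing based on a moving average", here is a minimal sketch. It is not the code in the linked branch: the class name, method names, and the rule "coalesce only when the average gap between recent messages is shorter than the maximum window" are all assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names and the decision rule are illustrative
// assumptions, NOT taken from the linked branch.
final class MovingAverageCoalescer {
    private final long maxWindowNanos; // upper bound on how long a send may be delayed
    private final long[] gaps;         // ring buffer of recent inter-message gaps (ns)
    private int next = 0;              // slot that the next gap will overwrite
    private int filled = 0;            // number of valid samples in the ring
    private long sum = 0;              // running sum of the valid samples
    private long lastNanos;            // timestamp of the previous message
    private boolean hasLast = false;   // false until the first message is seen

    MovingAverageCoalescer(long maxWindow, TimeUnit unit, int samples) {
        if (samples <= 0) throw new IllegalArgumentException("samples must be positive");
        this.maxWindowNanos = unit.toNanos(maxWindow);
        this.gaps = new long[samples];
    }

    // Record the time at which a message became ready to send.
    void recordMessage(long nowNanos) {
        if (hasLast) {
            long gap = Math.max(0L, nowNanos - lastNanos);
            if (filled == gaps.length) sum -= gaps[next]; // evict the oldest sample
            else filled++;
            gaps[next] = gap;
            sum += gap;
            next = (next + 1) % gaps.length;
        }
        lastNanos = nowNanos;
        hasLast = true;
    }

    // Simple moving average of the recorded gaps; "infinite" until a gap exists.
    long averageGapNanos() {
        return filled == 0 ? Long.MAX_VALUE : sum / filled;
    }

    // Waiting only pays off if another message is likely to arrive inside the
    // window, so coalescing stays off at low message rates.
    boolean shouldCoalesce() {
        return filled == gaps.length && averageGapNanos() < maxWindowNanos;
    }

    // How long the sender may wait for more messages before flushing.
    long windowNanos() {
        return shouldCoalesce() ? maxWindowNanos : 0L;
    }
}
```

A sender would call `recordMessage(System.nanoTime())` for each outgoing message and wait up to `windowNanos()` before flushing. Under this rule a low-concurrency workload with sparse messages never pays the window delay, which is the behaviour the low-concurrency numbers above suggest. The histogram that follows is the uncoalesced trace of inter-message times; its values are consistent with microseconds.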
{noformat}
Value Percentile TotalCount 1/(1-Percentile)

0.000 0.000000000000 5554 1.00
5.703 0.100000000000 124565 1.11
13.263 0.200000000000 249128 1.25
24.143 0.300000000000 373630 1.43
40.607 0.400000000000 498108 1.67
94.015 0.500000000000 622664 2.00
158.463 0.550000000000 684867 2.22
244.351 0.600000000000 747137 2.50
305.407 0.650000000000 809631 2.86
362.239 0.700000000000 871641 3.33
428.031 0.750000000000 933978 4.00
467.711 0.775000000000 965085 4.44
520.703 0.800000000000 996254 5.00
595.967 0.825000000000 1027359 5.71
672.767 0.850000000000 1058457 6.67
743.935 0.875000000000 1089573 8.00
780.799 0.887500000000 1105290 8.89
821.247 0.900000000000 1120774 10.00
868.351 0.912500000000 1136261 11.43
928.767 0.925000000000 1151889 13.33
1006.079 0.937500000000 1167421 16.00
1049.599 0.943750000000 1175260 17.78
1095.679 0.950000000000 1183041 20.00
1143.807 0.956250000000 1190779 22.86
1198.079 0.962500000000 1198542 26.67
1264.639 0.968750000000 1206301 32.00
1305.599 0.971875000000 1210228 35.56
1354.751 0.975000000000 1214090 40.00
1407.999 0.978125000000 1217975 45.71
1470.463 0.981250000000 1221854 53.33
1542.143 0.984375000000 1225759 64.00
1586.175 0.985937500000 1227720 71.11
1634.303 0.987500000000 1229643 80.00
1688.575 0.989062500000 1231596 91.43
1756.159 0.990625000000 1233523 106.67
1839.103 0.992187500000 1235464 128.00
1887.231 0.992968750000 1236430 142.22
1944.575 0.993750000000 1237409 160.00
2007.039 0.994531250000 1238384 182.86
2084.863 0.995312500000 1239358 213.33
2174.975 0.996093750000 1240326 256.00
2230.271 0.996484375000 1240818 284.44
2293.759 0.996875000000 1241292 320.00
2369.535 0.997265625000 1241785 365.71
2455.551 0.997656250000 1242271 426.67
2578.431 0.998046875000 1242752 512.00
2656.255 0.998242187500 1242999 568.89
2740.223 0.998437500000 1243244 640.00
2834.431 0.998632812500 1243482 731.43
2957.311 0.998828125000 1243725 853.33
3131.391 0.999023437500 1243969 1024.00
3235.839 0.999121093750 1244091 1137.78
3336.191 0.999218750000 1244212 1280.00
3471.359 0.999316406250 1244332 1462.86
3641.343 0.999414062500 1244455 1706.67
3837.951 0.999511718750 1244576 2048.00
4001.791 0.999560546875 1244636 2275.56
4136.959 0.999609375000 1244697 2560.00
4399.103 0.999658203125 1244758 2925.71
4628.479 0.999707031250 1244819 3413.33
5119.999 0.999755859375 1244880 4096.00
5439.487 0.999780273438 1244910 4551.11
5791.743 0.999804687500 1244940 5120.00
6582.271 0.999829101563 1244971 5851.43
7917.567 0.999853515625 1245001 6826.67
10027.007 0.999877929688 1245032 8192.00
11321.343 0.999890136719 1245047 9102.22
12607.487 0.999902343750 1245063 10240.00
14524.415 0.999914550781 1245077 11702.86
15785.983 0.999926757813 1245092 13653.33
16416.767 0.999938964844 1245108 16384.00
16793.599 0.999945068359 1245116 18204.44
17072.127 0.999951171875 1245123 20480.00
17465.343 0.999957275391 1245130 23405.71
18563.071 0.999963378906 1245138 27306.67
30883.839 0.999969482422 1245146 32768.00
33030.143 0.999972534180 1245149 36408.89
33587.199 0.999975585938 1245153 40960.00
35061.759 0.999978637695 1245157 46811.43
36241.407 0.999981689453 1245161 54613.33
37257.215 0.999984741211 1245165 65536.00
37322.751 0.999986267090 1245166 72817.78
37978.111 0.999987792969 1245168 81920.00
40534.015 0.999989318848 1245170 93622.86
47382.527 0.999990844727 1245172 109226.67
53510.143 0.999992370605 1245174 131072.00
54558.719 0.999993133545 1245175 145635.56
62586.879 0.999993896484 1245176 163840.00
63700.991 0.999994659424 1245177 187245.71
70320.127 0.999995422363 1245178 218453.33
107806.719 0.999996185303 1245179 262144.00
107806.719 0.999996566772 1245179 291271.11
1882193.919 0.999996948242 1245180 327680.00
1882193.919 0.999997329712 1245180 374491.43
2202009.599 0.999997711182 1245181 436906.67
2202009.599 0.999998092651 1245181 524288.00
2202009.599 0.999998283386 1245181 582542.22
2875195.391 0.999998474121 1245182 655360.00
2875195.391 0.999998664856 1245182 748982.86
2875195.391 0.999998855591 1245182 873813.33
2875195.391 0.999999046326 1245182 1048576.00
2875195.391 0.999999141693 1245182 1165084.44
148176371.711 0.999999237061 1245183 1310720.00
148176371.711 1.000000000000 1245183
#[Mean = 418.657, StdDeviation = 132779.859]
#[Max = 148176371.711, Total count = 1245183]
#[Buckets = 53, SubBuckets = 2048]
{noformat}

The histogram with coalescing enabled is different/better with 75% of messages within 100 microseconds and 99% within 1 millisecond. This is with a maximum coalescing window of 200 microseconds. The actual amount of delay may be greater. That's something I should measure.

{noformat}
Value Percentile TotalCount 1/(1-Percentile)

0.000 0.000000000000 21883 1.00
3.133 0.100000000000 354051 1.11
7.087 0.200000000000 707886 1.25
12.335 0.300000000000 1061740 1.43
20.015 0.400000000000 1415773 1.67
31.039 0.500000000000 1769759 2.00
38.111 0.550000000000 1946469 2.22
47.519 0.600000000000 2123791 2.50
61.119 0.650000000000 2300427 2.86
80.063 0.700000000000 2477510 3.33
104.703 0.750000000000 2654387 4.00
120.127 0.775000000000 2742690 4.44
138.495 0.800000000000 2831292 5.00
160.767 0.825000000000 2919792 5.71
187.775 0.850000000000 3008289 6.67
221.439 0.875000000000 3096699 8.00
241.535 0.887500000000 3141005 8.89
264.703 0.900000000000 3185423 10.00
291.583 0.912500000000 3229485 11.43
323.839 0.925000000000 3273829 13.33
363.519 0.937500000000 3317946 16.00
386.815 0.943750000000 3340119 17.78
413.183 0.950000000000 3362025 20.00
444.415 0.956250000000 3384278 22.86
481.791 0.962500000000 3406268 26.67
532.479 0.968750000000 3428529 32.00
564.735 0.971875000000 3439421 35.56
603.647 0.975000000000 3450589 40.00
649.215 0.978125000000 3461530 45.71
706.047 0.981250000000 3472600 53.33
781.823 0.984375000000 3483678 64.00
826.367 0.985937500000 3489196 71.11
875.519 0.987500000000 3494765 80.00
935.423 0.989062500000 3500258 91.43
1010.175 0.990625000000 3505780 106.67
1099.775 0.992187500000 3511323 128.00
1153.023 0.992968750000 3514076 142.22
1213.439 0.993750000000 3516840 160.00
1284.095 0.994531250000 3519596 182.86
1369.087 0.995312500000 3522358 213.33
1482.751 0.996093750000 3525127 256.00
1555.455 0.996484375000 3526510 284.44
1637.375 0.996875000000 3527884 320.00
1736.703 0.997265625000 3529268 365.71
1861.631 0.997656250000 3530653 426.67
2024.447 0.998046875000 3532036 512.00
2119.679 0.998242187500 3532728 568.89
2230.271 0.998437500000 3533420 640.00
2373.631 0.998632812500 3534110 731.43
2549.759 0.998828125000 3534799 853.33
2777.087 0.999023437500 3535489 1024.00
2918.399 0.999121093750 3535833 1137.78
3112.959 0.999218750000 3536186 1280.00
3334.143 0.999316406250 3536528 1462.86
3602.431 0.999414062500 3536873 1706.67
3940.351 0.999511718750 3537216 2048.00
4147.199 0.999560546875 3537388 2275.56
4378.623 0.999609375000 3537565 2560.00
4628.479 0.999658203125 3537734 2925.71
4952.063 0.999707031250 3537907 3413.33
5304.319 0.999755859375 3538080 4096.00
5505.023 0.999780273438 3538166 4551.11
5779.455 0.999804687500 3538253 5120.00
6070.271 0.999829101563 3538341 5851.43
6557.695 0.999853515625 3538427 6826.67
7262.207 0.999877929688 3538512 8192.00
7704.575 0.999890136719 3538556 9102.22
8343.551 0.999902343750 3538598 10240.00
9453.567 0.999914550781 3538641 11702.86
11845.631 0.999926757813 3538684 13653.33
15335.423 0.999938964844 3538728 16384.00
16941.055 0.999945068359 3538750 18204.44
17498.111 0.999951171875 3538771 20480.00
17956.863 0.999957275391 3538792 23405.71
18776.063 0.999963378906 3538814 27306.67
19726.335 0.999969482422 3538837 32768.00
20004.863 0.999972534180 3538846 36408.89
20889.599 0.999975585938 3538857 40960.00
21544.959 0.999978637695 3538868 46811.43
22593.535 0.999981689453 3538879 54613.33
23969.791 0.999984741211 3538890 65536.00
25804.799 0.999986267090 3538895 72817.78
28983.295 0.999987792969 3538900 81920.00
32522.239 0.999989318848 3538906 93622.86
34242.559 0.999990844727 3538911 109226.67
35651.583 0.999992370605 3538917 131072.00
36601.855 0.999993133545 3538919 145635.56
36962.303 0.999993896484 3538922 163840.00
37453.823 0.999994659424 3538925 187245.71
38240.255 0.999995422363 3538927 218453.33
39288.831 0.999996185303 3538930 262144.00
42303.487 0.999996566772 3538931 291271.11
46891.007 0.999996948242 3538933 327680.00
47546.367 0.999997329712 3538934 374491.43
49414.143 0.999997711182 3538935 436906.67
60096.511 0.999998092651 3538937 524288.00
60096.511 0.999998283386 3538937 582542.22
65044.479 0.999998474121 3538938 655360.00
65110.015 0.999998664856 3538939 748982.86
65110.015 0.999998855591 3538939 873813.33
1919942.655 0.999999046326 3538940 1048576.00
1919942.655 0.999999141693 3538940 1165084.44
2202009.599 0.999999237061 3538941 1310720.00
2202009.599 0.999999332428 3538941 1497965.71
2202009.599 0.999999427795 3538941 1747626.67
3147825.151 0.999999523163 3538942 2097152.00
3147825.151 0.999999570847 3538942 2330168.89
3147825.151 0.999999618530 3538942 2621440.00
3147825.151 0.999999666214 3538942 2995931.43
3147825.151 0.999999713898 3538942 3495253.33
186562641.919 0.999999761581 3538943 4194304.00
186562641.919 1.000000000000 3538943
#[Mean = 158.540, StdDeviation = 99162.712]
#[Max = 186562641.919, Total count = 3538943]
#[Buckets = 53, SubBuckets = 2048]
{noformat}

> nio MessagingService
> --------------------
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Ariel Weisberg
> Labels: performance
> Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big
> contributor to context switching, especially for larger clusters. Let's look
> at switching to nio, possibly via Netty.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)