[ 
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061499#comment-14061499
 ] 

graham sanderson commented on CASSANDRA-7546:
---------------------------------------------

I have attached some code and a pseudo-test that compares the new and old 
behavior using a simulation of writing hints to a single partition.

Here is the output on a fast 16-core box:

{code}
    [junit] --------------------------------------------------
    [junit] 1 THREAD; ELEMENT SIZE 64
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 64) partitions = 1
    [junit]  original code:
    [junit]   Duration = 1015ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 35 ms for 3 collections
    [junit]   Approx allocation = 564MB vs 8MB; ratio to raw data size = 
70.47914095238096
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 849ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 32 ms for 3 collections
    [junit]   Approx allocation = 587MB vs 8MB; ratio to raw data size = 
73.31190857142857
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 64) partitions = 16
    [junit]  original code:
    [junit]   Duration = 623ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 22 ms for 2 collections
    [junit]   Approx allocation = 446MB vs 8MB; ratio to raw data size = 
55.77215714285714
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 564ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 22 ms for 2 collections
    [junit]   Approx allocation = 481MB vs 8MB; ratio to raw data size = 
60.09202095238095
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 64) partitions = 256
    [junit]  original code:
    [junit]   Duration = 436ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 9 ms for 1 collections
    [junit]   Approx allocation = 331MB vs 8MB; ratio to raw data size = 
41.34096380952381
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 403ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 10 ms for 1 collections
    [junit]   Approx allocation = 348MB vs 8MB; ratio to raw data size = 
43.445909523809526
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 64) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 333ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 11 ms for 1 collections
    [junit]   Approx allocation = 274MB vs 8MB; ratio to raw data size = 
34.251781904761906
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 333ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 11 ms for 1 collections
    [junit]   Approx allocation = 285MB vs 8MB; ratio to raw data size = 
35.67829714285714
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] --------------------------------------------------
    [junit] 100 THREADS; ELEMENT SIZE 64
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 64) partitions = 1
    [junit]  original code:
    [junit]   Duration = 1730ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 99 ms for 30 collections
    [junit]   Approx allocation = 9842MB vs 8MB; ratio to raw data size = 
1228.6645866666668
    [junit]   loopRatio (closest to 1 best) 17.41481 raw 100000/1741481 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 1300ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 16 ms for 1 collections
    [junit]   Approx allocation = 561MB vs 8MB; ratio to raw data size = 
70.0673819047619
    [junit]   loopRatio (closest to 1 best) 1.00004 raw 258/260 counted 2/2 
sync 99741/99742 up 1 down 1
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 64) partitions = 16
    [junit]  original code:
    [junit]   Duration = 215ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 24 ms for 2 collections
    [junit]   Approx allocation = 763MB vs 8MB; ratio to raw data size = 
95.24857523809524
    [junit]   loopRatio (closest to 1 best) 1.88702 raw 100000/188702 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 208ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 9 ms for 1 collections
    [junit]   Approx allocation = 560MB vs 8MB; ratio to raw data size = 
69.98852571428571
    [junit]   loopRatio (closest to 1 best) 1.32446 raw 50845/67230 counted 
17730/18424 sync 40221/46792 up 10636 down 10329
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 64) partitions = 256
    [junit]  original code:
    [junit]   Duration = 180ms maxConcurrency = 97
    [junit]   GC for PS Scavenge: 14 ms for 1 collections
    [junit]   Approx allocation = 328MB vs 8MB; ratio to raw data size = 
41.03978761904762
    [junit]   loopRatio (closest to 1 best) 1.01959 raw 100000/101959 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 183ms maxConcurrency = 95
    [junit]   GC for PS Scavenge: 12 ms for 1 collections
    [junit]   Approx allocation = 338MB vs 8MB; ratio to raw data size = 
42.207682857142856
    [junit]   loopRatio (closest to 1 best) 1.01961 raw 98172/100033 counted 
1852/1855 sync 41/73 up 1825 down 1818
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 64) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 180ms maxConcurrency = 96
    [junit]   GC for PS Scavenge: 13 ms for 1 collections
    [junit]   Approx allocation = 274MB vs 8MB; ratio to raw data size = 
34.29566095238095
    [junit]   loopRatio (closest to 1 best) 1.00353 raw 100000/100353 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 179ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 13 ms for 1 collections
    [junit]   Approx allocation = 285MB vs 8MB; ratio to raw data size = 
35.591366666666666
    [junit]   loopRatio (closest to 1 best) 1.00391 raw 99609/99998 counted 
389/389 sync 2/4 up 388 down 387
    [junit] 
    [junit] 
    [junit] --------------------------------------------------
    [junit] 1 THREAD; ELEMENT SIZE 256
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 256) partitions = 1
    [junit]  original code:
    [junit]   Duration = 960ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 29 ms for 2 collections
    [junit]   Approx allocation = 564MB vs 26MB; ratio to raw data size = 
21.439910434782607
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 976ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 27 ms for 2 collections
    [junit]   Approx allocation = 560MB vs 26MB; ratio to raw data size = 
21.31138724637681
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 256) partitions = 16
    [junit]  original code:
    [junit]   Duration = 673ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 9 ms for 1 collections
    [junit]   Approx allocation = 453MB vs 26MB; ratio to raw data size = 
17.215506086956523
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 589ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 10 ms for 1 collections
    [junit]   Approx allocation = 455MB vs 26MB; ratio to raw data size = 
17.307048695652174
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 256) partitions = 256
    [junit]  original code:
    [junit]   Duration = 238ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 10 ms for 1 collections
    [junit]   Approx allocation = 342MB vs 26MB; ratio to raw data size = 
12.99539536231884
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 271ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 10 ms for 1 collections
    [junit]   Approx allocation = 341MB vs 26MB; ratio to raw data size = 
12.96123536231884
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 256) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 233ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 10 ms for 1 collections
    [junit]   Approx allocation = 284MB vs 26MB; ratio to raw data size = 
10.803777971014492
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 230ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 11 ms for 1 collections
    [junit]   Approx allocation = 284MB vs 26MB; ratio to raw data size = 
10.822605507246378
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] --------------------------------------------------
    [junit] 100 THREADS; ELEMENT SIZE 256
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 256) partitions = 1
    [junit]  original code:
    [junit]   Duration = 1996ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 120 ms for 32 collections
    [junit]   Approx allocation = 10206MB vs 26MB; ratio to raw data size = 
387.7607713043478
    [junit]   loopRatio (closest to 1 best) 17.4912 raw 100000/1749120 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 1532ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 15 ms for 1 collections
    [junit]   Approx allocation = 581MB vs 26MB; ratio to raw data size = 
22.088286666666665
    [junit]   loopRatio (closest to 1 best) 1.00004 raw 341/343 counted 2/2 
sync 99658/99659 up 1 down 1
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 256) partitions = 16
    [junit]  original code:
    [junit]   Duration = 220ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 24 ms for 2 collections
    [junit]   Approx allocation = 770MB vs 26MB; ratio to raw data size = 
29.258727826086957
    [junit]   loopRatio (closest to 1 best) 1.87623 raw 100000/187623 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 216ms maxConcurrency = 98
    [junit]   GC for PS Scavenge: 28 ms for 2 collections
    [junit]   Approx allocation = 581MB vs 26MB; ratio to raw data size = 
22.077911884057972
    [junit]   loopRatio (closest to 1 best) 1.33551 raw 52282/69043 counted 
18308/19001 sync 38617/45507 up 10826 down 10513
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 256) partitions = 256
    [junit]  original code:
    [junit]   Duration = 182ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 13 ms for 1 collections
    [junit]   Approx allocation = 361MB vs 26MB; ratio to raw data size = 
13.740559130434782
    [junit]   loopRatio (closest to 1 best) 1.01958 raw 100000/101958 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 181ms maxConcurrency = 98
    [junit]   GC for PS Scavenge: 12 ms for 1 collections
    [junit]   Approx allocation = 361MB vs 26MB; ratio to raw data size = 
13.729368985507246
    [junit]   loopRatio (closest to 1 best) 1.01977 raw 98122/100015 counted 
1886/1891 sync 39/71 up 1857 down 1853
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 256) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 181ms maxConcurrency = 99
    [junit]   GC for PS Scavenge: 14 ms for 1 collections
    [junit]   Approx allocation = 303MB vs 26MB; ratio to raw data size = 
11.513563768115942
    [junit]   loopRatio (closest to 1 best) 1.00402 raw 100000/100402 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 205ms maxConcurrency = 96
    [junit]   GC for PS Scavenge: 31 ms for 1 collections
    [junit]   Approx allocation = 302MB vs 26MB; ratio to raw data size = 
11.490827826086957
    [junit]   loopRatio (closest to 1 best) 1.00394 raw 99610/100003 counted 
389/389 sync 1/2 up 392 down 388
    [junit] 
    [junit] 
    [junit] --------------------------------------------------
    [junit] 1 THREAD; ELEMENT SIZE 1024
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 1024) partitions = 1
    [junit]  original code:
    [junit]   Duration = 832ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 22 ms for 2 collections
    [junit]   Approx allocation = 591MB vs 99MB; ratio to raw data size = 
5.942641762452107
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 391ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 30 ms for 3 collections
    [junit]   Approx allocation = 631MB vs 99MB; ratio to raw data size = 
6.343681455938698
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 1024) partitions = 16
    [junit]  original code:
    [junit]   Duration = 896ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 27 ms for 2 collections
    [junit]   Approx allocation = 519MB vs 99MB; ratio to raw data size = 
5.215628199233716
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 321ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 20 ms for 2 collections
    [junit]   Approx allocation = 552MB vs 99MB; ratio to raw data size = 
5.5507787739463605
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 1024) partitions = 256
    [junit]  original code:
    [junit]   Duration = 312ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 23 ms for 2 collections
    [junit]   Approx allocation = 415MB vs 99MB; ratio to raw data size = 
4.177359923371648
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 293ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 23 ms for 2 collections
    [junit]   Approx allocation = 399MB vs 99MB; ratio to raw data size = 
4.008342911877395
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] Threads = 1 elements = 100000 (of size 1024) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 268ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 23 ms for 2 collections
    [junit]   Approx allocation = 354MB vs 99MB; ratio to raw data size = 
3.560803908045977
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 278ms maxConcurrency = 1
    [junit]   GC for PS Scavenge: 26 ms for 2 collections
    [junit]   Approx allocation = 351MB vs 99MB; ratio to raw data size = 
3.5285575478927202
    [junit]   loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0 
sync 0/0 up 0 down 0
    [junit] 
    [junit] 
    [junit] --------------------------------------------------
    [junit] 100 THREADS; ELEMENT SIZE 1024
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 1024) partitions = 1
    [junit]  original code:
    [junit]   Duration = 2377ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 143 ms for 38 collections
    [junit]   Approx allocation = 12034MB vs 99MB; ratio to raw data size = 
120.87394314176245
    [junit]   loopRatio (closest to 1 best) 18.17784 raw 100000/1817784 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 1305ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 32 ms for 2 collections
    [junit]   Approx allocation = 486MB vs 99MB; ratio to raw data size = 
4.88428275862069
    [junit]   loopRatio (closest to 1 best) 1.00009 raw 173/180 counted 2/2 
sync 99826/99827 up 1 down 1
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 1024) partitions = 16
    [junit]  original code:
    [junit]   Duration = 225ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 34 ms for 3 collections
    [junit]   Approx allocation = 853MB vs 99MB; ratio to raw data size = 
8.572829348659004
    [junit]   loopRatio (closest to 1 best) 1.89489 raw 100000/189489 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 231ms maxConcurrency = 99
    [junit]   GC for PS Scavenge: 42 ms for 3 collections
    [junit]   Approx allocation = 631MB vs 99MB; ratio to raw data size = 
6.3455461302681995
    [junit]   loopRatio (closest to 1 best) 1.33684 raw 50954/67619 counted 
18453/19130 sync 39872/46935 up 10747 down 10462
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 1024) partitions = 256
    [junit]  original code:
    [junit]   Duration = 202ms maxConcurrency = 95
    [junit]   GC for PS Scavenge: 32 ms for 2 collections
    [junit]   Approx allocation = 411MB vs 99MB; ratio to raw data size = 
4.134808582375479
    [junit]   loopRatio (closest to 1 best) 1.01988 raw 100000/101988 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 219ms maxConcurrency = 100
    [junit]   GC for PS Scavenge: 35 ms for 2 collections
    [junit]   Approx allocation = 408MB vs 99MB; ratio to raw data size = 
4.102220459770115
    [junit]   loopRatio (closest to 1 best) 1.01909 raw 98202/100024 counted 
1815/1819 sync 38/66 up 1789 down 1785
    [junit] 
    [junit] 
    [junit] Threads = 100 elements = 100000 (of size 1024) partitions = 1024
    [junit]  original code:
    [junit]   Duration = 202ms maxConcurrency = 96
    [junit]   GC for PS Scavenge: 31 ms for 2 collections
    [junit]   Approx allocation = 368MB vs 99MB; ratio to raw data size = 
3.699853869731801
    [junit]   loopRatio (closest to 1 best) 1.0039 raw 100000/100390 counted 
0/0 sync 0/0 up 0 down 0
    [junit] 
    [junit]  modified code: 
    [junit]   Duration = 206ms maxConcurrency = 95
    [junit]   GC for PS Scavenge: 30 ms for 2 collections
    [junit]   Approx allocation = 360MB vs 99MB; ratio to raw data size = 
3.6165709578544063
    [junit]   loopRatio (closest to 1 best) 1.0037 raw 99638/100005 counted 
363/363 sync 1/2 up 367 down 362
    [junit] 
    [junit] 
    [junit] ==================================================
{code}
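
As a reading aid for the loopRatio lines above: the reported value appears to be the total number of loop attempts (the second number in each of the raw, counted and sync pairs, summed) divided by the number of elements written. This reading is inferred from the printed numbers only, not from the attached code, and the method and parameter names below are hypothetical. A minimal sketch of the arithmetic:

```java
class LoopRatioSketch {
    // Assumed reading: total attempts across the three reported paths,
    // divided by the number of elements the test wrote.
    static double loopRatio(long elements, long rawAttempts,
                            long countedAttempts, long syncAttempts) {
        return (double) (rawAttempts + countedAttempts + syncAttempts) / elements;
    }

    public static void main(String[] args) {
        // 100 threads, element size 64, 1 partition, original code
        // (the log reports loopRatio 17.41481).
        System.out.println(loopRatio(100000, 1741481, 0, 0));
        // Same case, modified code (the log reports loopRatio 1.00004).
        System.out.println(loopRatio(100000, 260, 2, 99742));
    }
}
```

If this reading is right, a ratio near 17 in the single-partition, 100-thread case means roughly 17 attempts were made per element written.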

> AtomicSortedColumns.addAllWithSizeDelta has a spin lock that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: graham sanderson
>         Attachments: suggestion1.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine, the worse it 
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild (order(s) of magnitude 
> more than expected) memory allocation rates can be seen (especially when the 
> updates being hinted are small updates to different partitions which can 
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.
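
To illustrate the read, clone/update, then CAS pattern described in the issue, here is a minimal sketch. It is not the code from AtomicSortedColumns: the Snapshot class, the add method and the run helper are hypothetical stand-ins, and an int array stands in for the partition state. It only shows why every lost race costs a full clone that immediately becomes garbage.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

class CasSpinSketch {
    // Immutable snapshot standing in for a partition's state (hypothetical).
    static final class Snapshot {
        final int[] values;
        Snapshot(int[] values) { this.values = values; }

        // Clone-and-update: allocates a new array on every call.
        Snapshot with(int v) {
            int[] copy = Arrays.copyOf(values, values.length + 1);
            copy[values.length] = v;
            return new Snapshot(copy);
        }
    }

    final AtomicReference<Snapshot> ref =
            new AtomicReference<>(new Snapshot(new int[0]));
    final AtomicLong attempts = new AtomicLong();

    void add(int v) {
        while (true) {
            attempts.incrementAndGet();
            Snapshot current = ref.get();           // read
            Snapshot updated = current.with(v);     // clone/update (allocates)
            if (ref.compareAndSet(current, updated)) {
                return;                             // CAS won
            }
            // CAS lost: 'updated' is discarded and the loop allocates again.
        }
    }

    // Returns { final element count, total loop attempts }.
    static long[] run(int threads, int perThread) {
        CasSpinSketch sketch = new CasSpinSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) sketch.add(i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { sketch.ref.get().values.length, sketch.attempts.get() };
    }

    public static void main(String[] args) {
        long[] result = run(4, 500);
        System.out.println("elements = " + result[0] + ", attempts = " + result[1]);
    }
}
```

The element count always equals threads * perThread, while the attempt count can only be equal or higher. The gap between the two is the number of clones that were allocated and thrown away, which is the kind of wasted allocation the issue describes under contention.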



--
This message was sent by Atlassian JIRA
(v6.2#6252)
