[
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061499#comment-14061499
]
graham sanderson commented on CASSANDRA-7546:
---------------------------------------------
I have attached some code and a pseudo-test that compares the new and old
behavior using a simulation of writing hints to a single partition.
Here is the output on a fast 16-core box:
{code}
[junit] --------------------------------------------------
[junit] 1 THREAD; ELEMENT SIZE 64
[junit]
[junit] Threads = 1 elements = 100000 (of size 64) partitions = 1
[junit] original code:
[junit] Duration = 1015ms maxConcurrency = 1
[junit] GC for PS Scavenge: 35 ms for 3 collections
[junit] Approx allocation = 564MB vs 8MB; ratio to raw data size =
70.47914095238096
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 849ms maxConcurrency = 1
[junit] GC for PS Scavenge: 32 ms for 3 collections
[junit] Approx allocation = 587MB vs 8MB; ratio to raw data size =
73.31190857142857
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 64) partitions = 16
[junit] original code:
[junit] Duration = 623ms maxConcurrency = 1
[junit] GC for PS Scavenge: 22 ms for 2 collections
[junit] Approx allocation = 446MB vs 8MB; ratio to raw data size =
55.77215714285714
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 564ms maxConcurrency = 1
[junit] GC for PS Scavenge: 22 ms for 2 collections
[junit] Approx allocation = 481MB vs 8MB; ratio to raw data size =
60.09202095238095
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 64) partitions = 256
[junit] original code:
[junit] Duration = 436ms maxConcurrency = 1
[junit] GC for PS Scavenge: 9 ms for 1 collections
[junit] Approx allocation = 331MB vs 8MB; ratio to raw data size =
41.34096380952381
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 403ms maxConcurrency = 1
[junit] GC for PS Scavenge: 10 ms for 1 collections
[junit] Approx allocation = 348MB vs 8MB; ratio to raw data size =
43.445909523809526
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 64) partitions = 1024
[junit] original code:
[junit] Duration = 333ms maxConcurrency = 1
[junit] GC for PS Scavenge: 11 ms for 1 collections
[junit] Approx allocation = 274MB vs 8MB; ratio to raw data size =
34.251781904761906
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 333ms maxConcurrency = 1
[junit] GC for PS Scavenge: 11 ms for 1 collections
[junit] Approx allocation = 285MB vs 8MB; ratio to raw data size =
35.67829714285714
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] --------------------------------------------------
[junit] 100 THREADS; ELEMENT SIZE 64
[junit]
[junit] Threads = 100 elements = 100000 (of size 64) partitions = 1
[junit] original code:
[junit] Duration = 1730ms maxConcurrency = 100
[junit] GC for PS Scavenge: 99 ms for 30 collections
[junit] Approx allocation = 9842MB vs 8MB; ratio to raw data size =
1228.6645866666668
[junit] loopRatio (closest to 1 best) 17.41481 raw 100000/1741481 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 1300ms maxConcurrency = 100
[junit] GC for PS Scavenge: 16 ms for 1 collections
[junit] Approx allocation = 561MB vs 8MB; ratio to raw data size =
70.0673819047619
[junit] loopRatio (closest to 1 best) 1.00004 raw 258/260 counted 2/2
sync 99741/99742 up 1 down 1
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 64) partitions = 16
[junit] original code:
[junit] Duration = 215ms maxConcurrency = 100
[junit] GC for PS Scavenge: 24 ms for 2 collections
[junit] Approx allocation = 763MB vs 8MB; ratio to raw data size =
95.24857523809524
[junit] loopRatio (closest to 1 best) 1.88702 raw 100000/188702 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 208ms maxConcurrency = 100
[junit] GC for PS Scavenge: 9 ms for 1 collections
[junit] Approx allocation = 560MB vs 8MB; ratio to raw data size =
69.98852571428571
[junit] loopRatio (closest to 1 best) 1.32446 raw 50845/67230 counted
17730/18424 sync 40221/46792 up 10636 down 10329
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 64) partitions = 256
[junit] original code:
[junit] Duration = 180ms maxConcurrency = 97
[junit] GC for PS Scavenge: 14 ms for 1 collections
[junit] Approx allocation = 328MB vs 8MB; ratio to raw data size =
41.03978761904762
[junit] loopRatio (closest to 1 best) 1.01959 raw 100000/101959 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 183ms maxConcurrency = 95
[junit] GC for PS Scavenge: 12 ms for 1 collections
[junit] Approx allocation = 338MB vs 8MB; ratio to raw data size =
42.207682857142856
[junit] loopRatio (closest to 1 best) 1.01961 raw 98172/100033 counted
1852/1855 sync 41/73 up 1825 down 1818
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 64) partitions = 1024
[junit] original code:
[junit] Duration = 180ms maxConcurrency = 96
[junit] GC for PS Scavenge: 13 ms for 1 collections
[junit] Approx allocation = 274MB vs 8MB; ratio to raw data size =
34.29566095238095
[junit] loopRatio (closest to 1 best) 1.00353 raw 100000/100353 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 179ms maxConcurrency = 100
[junit] GC for PS Scavenge: 13 ms for 1 collections
[junit] Approx allocation = 285MB vs 8MB; ratio to raw data size =
35.591366666666666
[junit] loopRatio (closest to 1 best) 1.00391 raw 99609/99998 counted
389/389 sync 2/4 up 388 down 387
[junit]
[junit]
[junit] --------------------------------------------------
[junit] 1 THREAD; ELEMENT SIZE 256
[junit]
[junit] Threads = 1 elements = 100000 (of size 256) partitions = 1
[junit] original code:
[junit] Duration = 960ms maxConcurrency = 1
[junit] GC for PS Scavenge: 29 ms for 2 collections
[junit] Approx allocation = 564MB vs 26MB; ratio to raw data size =
21.439910434782607
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 976ms maxConcurrency = 1
[junit] GC for PS Scavenge: 27 ms for 2 collections
[junit] Approx allocation = 560MB vs 26MB; ratio to raw data size =
21.31138724637681
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 256) partitions = 16
[junit] original code:
[junit] Duration = 673ms maxConcurrency = 1
[junit] GC for PS Scavenge: 9 ms for 1 collections
[junit] Approx allocation = 453MB vs 26MB; ratio to raw data size =
17.215506086956523
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 589ms maxConcurrency = 1
[junit] GC for PS Scavenge: 10 ms for 1 collections
[junit] Approx allocation = 455MB vs 26MB; ratio to raw data size =
17.307048695652174
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 256) partitions = 256
[junit] original code:
[junit] Duration = 238ms maxConcurrency = 1
[junit] GC for PS Scavenge: 10 ms for 1 collections
[junit] Approx allocation = 342MB vs 26MB; ratio to raw data size =
12.99539536231884
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 271ms maxConcurrency = 1
[junit] GC for PS Scavenge: 10 ms for 1 collections
[junit] Approx allocation = 341MB vs 26MB; ratio to raw data size =
12.96123536231884
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 256) partitions = 1024
[junit] original code:
[junit] Duration = 233ms maxConcurrency = 1
[junit] GC for PS Scavenge: 10 ms for 1 collections
[junit] Approx allocation = 284MB vs 26MB; ratio to raw data size =
10.803777971014492
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 230ms maxConcurrency = 1
[junit] GC for PS Scavenge: 11 ms for 1 collections
[junit] Approx allocation = 284MB vs 26MB; ratio to raw data size =
10.822605507246378
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] --------------------------------------------------
[junit] 100 THREADS; ELEMENT SIZE 256
[junit]
[junit] Threads = 100 elements = 100000 (of size 256) partitions = 1
[junit] original code:
[junit] Duration = 1996ms maxConcurrency = 100
[junit] GC for PS Scavenge: 120 ms for 32 collections
[junit] Approx allocation = 10206MB vs 26MB; ratio to raw data size =
387.7607713043478
[junit] loopRatio (closest to 1 best) 17.4912 raw 100000/1749120 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 1532ms maxConcurrency = 100
[junit] GC for PS Scavenge: 15 ms for 1 collections
[junit] Approx allocation = 581MB vs 26MB; ratio to raw data size =
22.088286666666665
[junit] loopRatio (closest to 1 best) 1.00004 raw 341/343 counted 2/2
sync 99658/99659 up 1 down 1
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 256) partitions = 16
[junit] original code:
[junit] Duration = 220ms maxConcurrency = 100
[junit] GC for PS Scavenge: 24 ms for 2 collections
[junit] Approx allocation = 770MB vs 26MB; ratio to raw data size =
29.258727826086957
[junit] loopRatio (closest to 1 best) 1.87623 raw 100000/187623 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 216ms maxConcurrency = 98
[junit] GC for PS Scavenge: 28 ms for 2 collections
[junit] Approx allocation = 581MB vs 26MB; ratio to raw data size =
22.077911884057972
[junit] loopRatio (closest to 1 best) 1.33551 raw 52282/69043 counted
18308/19001 sync 38617/45507 up 10826 down 10513
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 256) partitions = 256
[junit] original code:
[junit] Duration = 182ms maxConcurrency = 100
[junit] GC for PS Scavenge: 13 ms for 1 collections
[junit] Approx allocation = 361MB vs 26MB; ratio to raw data size =
13.740559130434782
[junit] loopRatio (closest to 1 best) 1.01958 raw 100000/101958 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 181ms maxConcurrency = 98
[junit] GC for PS Scavenge: 12 ms for 1 collections
[junit] Approx allocation = 361MB vs 26MB; ratio to raw data size =
13.729368985507246
[junit] loopRatio (closest to 1 best) 1.01977 raw 98122/100015 counted
1886/1891 sync 39/71 up 1857 down 1853
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 256) partitions = 1024
[junit] original code:
[junit] Duration = 181ms maxConcurrency = 99
[junit] GC for PS Scavenge: 14 ms for 1 collections
[junit] Approx allocation = 303MB vs 26MB; ratio to raw data size =
11.513563768115942
[junit] loopRatio (closest to 1 best) 1.00402 raw 100000/100402 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 205ms maxConcurrency = 96
[junit] GC for PS Scavenge: 31 ms for 1 collections
[junit] Approx allocation = 302MB vs 26MB; ratio to raw data size =
11.490827826086957
[junit] loopRatio (closest to 1 best) 1.00394 raw 99610/100003 counted
389/389 sync 1/2 up 392 down 388
[junit]
[junit]
[junit] --------------------------------------------------
[junit] 1 THREAD; ELEMENT SIZE 1024
[junit]
[junit] Threads = 1 elements = 100000 (of size 1024) partitions = 1
[junit] original code:
[junit] Duration = 832ms maxConcurrency = 1
[junit] GC for PS Scavenge: 22 ms for 2 collections
[junit] Approx allocation = 591MB vs 99MB; ratio to raw data size =
5.942641762452107
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 391ms maxConcurrency = 1
[junit] GC for PS Scavenge: 30 ms for 3 collections
[junit] Approx allocation = 631MB vs 99MB; ratio to raw data size =
6.343681455938698
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 1024) partitions = 16
[junit] original code:
[junit] Duration = 896ms maxConcurrency = 1
[junit] GC for PS Scavenge: 27 ms for 2 collections
[junit] Approx allocation = 519MB vs 99MB; ratio to raw data size =
5.215628199233716
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 321ms maxConcurrency = 1
[junit] GC for PS Scavenge: 20 ms for 2 collections
[junit] Approx allocation = 552MB vs 99MB; ratio to raw data size =
5.5507787739463605
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 1024) partitions = 256
[junit] original code:
[junit] Duration = 312ms maxConcurrency = 1
[junit] GC for PS Scavenge: 23 ms for 2 collections
[junit] Approx allocation = 415MB vs 99MB; ratio to raw data size =
4.177359923371648
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 293ms maxConcurrency = 1
[junit] GC for PS Scavenge: 23 ms for 2 collections
[junit] Approx allocation = 399MB vs 99MB; ratio to raw data size =
4.008342911877395
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] Threads = 1 elements = 100000 (of size 1024) partitions = 1024
[junit] original code:
[junit] Duration = 268ms maxConcurrency = 1
[junit] GC for PS Scavenge: 23 ms for 2 collections
[junit] Approx allocation = 354MB vs 99MB; ratio to raw data size =
3.560803908045977
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 278ms maxConcurrency = 1
[junit] GC for PS Scavenge: 26 ms for 2 collections
[junit] Approx allocation = 351MB vs 99MB; ratio to raw data size =
3.5285575478927202
[junit] loopRatio (closest to 1 best) 1.0 raw 100000/100000 counted 0/0
sync 0/0 up 0 down 0
[junit]
[junit]
[junit] --------------------------------------------------
[junit] 100 THREADS; ELEMENT SIZE 1024
[junit]
[junit] Threads = 100 elements = 100000 (of size 1024) partitions = 1
[junit] original code:
[junit] Duration = 2377ms maxConcurrency = 100
[junit] GC for PS Scavenge: 143 ms for 38 collections
[junit] Approx allocation = 12034MB vs 99MB; ratio to raw data size =
120.87394314176245
[junit] loopRatio (closest to 1 best) 18.17784 raw 100000/1817784 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 1305ms maxConcurrency = 100
[junit] GC for PS Scavenge: 32 ms for 2 collections
[junit] Approx allocation = 486MB vs 99MB; ratio to raw data size =
4.88428275862069
[junit] loopRatio (closest to 1 best) 1.00009 raw 173/180 counted 2/2
sync 99826/99827 up 1 down 1
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 1024) partitions = 16
[junit] original code:
[junit] Duration = 225ms maxConcurrency = 100
[junit] GC for PS Scavenge: 34 ms for 3 collections
[junit] Approx allocation = 853MB vs 99MB; ratio to raw data size =
8.572829348659004
[junit] loopRatio (closest to 1 best) 1.89489 raw 100000/189489 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 231ms maxConcurrency = 99
[junit] GC for PS Scavenge: 42 ms for 3 collections
[junit] Approx allocation = 631MB vs 99MB; ratio to raw data size =
6.3455461302681995
[junit] loopRatio (closest to 1 best) 1.33684 raw 50954/67619 counted
18453/19130 sync 39872/46935 up 10747 down 10462
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 1024) partitions = 256
[junit] original code:
[junit] Duration = 202ms maxConcurrency = 95
[junit] GC for PS Scavenge: 32 ms for 2 collections
[junit] Approx allocation = 411MB vs 99MB; ratio to raw data size =
4.134808582375479
[junit] loopRatio (closest to 1 best) 1.01988 raw 100000/101988 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 219ms maxConcurrency = 100
[junit] GC for PS Scavenge: 35 ms for 2 collections
[junit] Approx allocation = 408MB vs 99MB; ratio to raw data size =
4.102220459770115
[junit] loopRatio (closest to 1 best) 1.01909 raw 98202/100024 counted
1815/1819 sync 38/66 up 1789 down 1785
[junit]
[junit]
[junit] Threads = 100 elements = 100000 (of size 1024) partitions = 1024
[junit] original code:
[junit] Duration = 202ms maxConcurrency = 96
[junit] GC for PS Scavenge: 31 ms for 2 collections
[junit] Approx allocation = 368MB vs 99MB; ratio to raw data size =
3.699853869731801
[junit] loopRatio (closest to 1 best) 1.0039 raw 100000/100390 counted
0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 206ms maxConcurrency = 95
[junit] GC for PS Scavenge: 30 ms for 2 collections
[junit] Approx allocation = 360MB vs 99MB; ratio to raw data size =
3.6165709578544063
[junit] loopRatio (closest to 1 best) 1.0037 raw 99638/100005 counted
363/363 sync 1/2 up 367 down 362
[junit]
[junit]
[junit] ==================================================
{code}
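As a reading aid for the loopRatio lines: the printed figures are consistent with loopRatio being the total number of update attempts (the sum of the right-hand numbers of the raw, counted and sync pairs) divided by the element count. For example, 67230 + 18424 + 46792 = 132446, and 132446 / 100000 = 1.32446. This is inferred from the numbers above, not taken from the attached code, and the class and method names below are hypothetical.

```java
// Sketch only: recomputes loopRatio from the counters printed in the log.
// The formula is an assumption inferred from the output, not the test's source.
final class LoopRatioSketch {
    // elements: number of elements written (100000 in every run above)
    // attempts: right-hand numbers of the raw, counted and sync pairs
    static double loopRatio(long elements, long... attempts) {
        long total = 0;
        for (long a : attempts) {
            total += a;
        }
        return (double) total / elements;
    }

    public static void main(String[] args) {
        // 100 threads, element size 64, 1 partition, original code
        System.out.println(loopRatio(100000, 1741481, 0, 0));
        // 100 threads, element size 64, 16 partitions, modified code
        System.out.println(loopRatio(100000, 67230, 18424, 46792));
    }
}
```

A ratio near 1 means almost every attempt succeeded first time; the original code's 17.4 on a single partition means roughly 16 discarded attempts for every element stored.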
> AtomicSortedColumns.addAllWithSizeDelta has a spin lock that allocates memory
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-7546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
> Project: Cassandra
> Issue Type: Bug
> Reporter: graham sanderson
> Attachments: suggestion1.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update,
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some
> fairly staggering memory growth (the more cores on your machine the worse it
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same
> partition, hinting today does, and in this case wild (order(s) of magnitude
> more than expected) memory allocation rates can be seen (especially when the
> updates being hinted are small updates to different partitions, which can
> happen very fast on their own); see CASSANDRA-7545.
> It would be best to eliminate/reduce/limit the spinning memory allocation
> whilst not slowing down the very common un-contended case.
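The read, clone/update, CAS pattern described above, and one way of limiting the wasted clones, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in suggestion1.txt and not Cassandra's AtomicSortedColumns: the class, its TreeMap state and the lock fallback are all hypothetical.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for one partition's state; NOT Cassandra's AtomicSortedColumns.
final class CasPartitionSketch {
    private final AtomicReference<TreeMap<String, byte[]>> ref =
            new AtomicReference<>(new TreeMap<>());
    private final Object lock = new Object();
    final AtomicLong attempts = new AtomicLong(); // clone+CAS attempts, successful or not

    // One read / clone+update / CAS attempt; allocates a full copy every time.
    private boolean tryOnce(String name, byte[] value) {
        attempts.incrementAndGet();
        TreeMap<String, byte[]> current = ref.get();
        TreeMap<String, byte[]> modified = new TreeMap<>(current);
        modified.put(name, value);
        return ref.compareAndSet(current, modified);
    }

    // Pattern from the issue: spin until the CAS wins; each lost race discards a clone.
    void addSpinning(String name, byte[] value) {
        while (!tryOnce(name, value)) {
            // retry with a fresh clone
        }
    }

    // Assumed mitigation: one lock-free attempt, then retries queue behind a monitor
    // so that far fewer threads clone at the same time. The CAS is still required
    // because first attempts by other threads do not take the lock.
    void addWithFallback(String name, byte[] value) {
        if (tryOnce(name, value)) {
            return; // un-contended case: no lock taken
        }
        synchronized (lock) {
            while (!tryOnce(name, value)) {
                // retry while holding the lock
            }
        }
    }

    int size() {
        return ref.get().size();
    }

    // Runs `threads` writers, each adding `perThread` distinct keys to one partition.
    static CasPartitionSketch run(int threads, int perThread, boolean fallback) {
        CasPartitionSketch p = new CasPartitionSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    String key = id + ":" + i;
                    if (fallback) {
                        p.addWithFallback(key, new byte[8]);
                    } else {
                        p.addSpinning(key, new byte[8]);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return p;
    }
}
```

Comparing `run(n, k, false).attempts` with `run(n, k, true).attempts` on a multi-core machine gives a rough feel for how many clones each variant throws away; the exact counts depend on scheduling and will vary between runs.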
--
This message was sent by Atlassian JIRA
(v6.2#6252)