[
https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061514#comment-14061514
]
graham sanderson edited comment on CASSANDRA-7546 at 7/15/14 1:36 AM:
----------------------------------------------------------------------
The stateful learning behavior here seems like a good thing... always attempting
the lock-free loop first, only to fail before falling back to any form of
synchronization, would mean a ratio of at least 2 in the highly contended case.
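To make that shape concrete, here is a minimal sketch of the adaptive idea - hypothetical names only, not the attached suggestion1.txt: try the lock-free read/clone/CAS path while it keeps succeeding, and once it fails, remember the contention and serialize further updates on a monitor.
{code}
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Sketch only (hypothetical names, not the attached patch).
final class AdaptiveRef<T>
{
    private final AtomicReference<T> ref;
    private volatile boolean contended = false; // the learned state

    AdaptiveRef(T initial) { ref = new AtomicReference<>(initial); }

    void update(UnaryOperator<T> cloneAndApply)
    {
        if (!contended)
        {
            T current = ref.get();
            T next = cloneAndApply.apply(current);  // the allocation
            if (ref.compareAndSet(current, next))
                return;                             // uncontended: ratio stays ~1
            contended = true;                       // "up": switch to locking
        }
        synchronized (this)
        {
            // Still CAS in a loop: a straggler may race us on the lock-free
            // path, but nearly everyone queues on the monitor, so retries
            // (and hence wasted clones) stay rare.
            while (true)
            {
                T current = ref.get();
                if (ref.compareAndSet(current, cloneAndApply.apply(current)))
                    break;
            }
            contended = false; // naive "down"; a real version would decay
        }
    }
}
{code}
The raw/counted/sync and up/down figures in the output below appear to count the different paths taken and the transitions between them.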
From looking at the numbers above, you'll see that this code:
- maintains a ratio of 1 as expected in the uncontended cases
- maintains a ratio of 1 in the highly contended cases, versus as high as 17 on
this box with the original code (roughly 17 clone attempts per successful
insertion, hence the massive memory allocation). e.g.
{code}
[junit] Threads = 100 elements = 100000 (of size 64) partitions = 1
[junit] original code:
[junit] Duration = 1730ms maxConcurrency = 100
[junit] GC for PS Scavenge: 99 ms for 30 collections
[junit] Approx allocation = 9842MB vs 8MB; ratio to raw data size = 1228.6645866666668
[junit] loopRatio (closest to 1 best) 17.41481 raw 100000/1741481 counted 0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 1300ms maxConcurrency = 100
[junit] GC for PS Scavenge: 16 ms for 1 collections
[junit] Approx allocation = 561MB vs 8MB; ratio to raw data size = 70.0673819047619
[junit] loopRatio (closest to 1 best) 1.00004 raw 258/260 counted 2/2 sync 99741/99742 up 1 down 1
{code}
- seems to max out at about 1.3 for the cases in between, generally lower than
or very close to the original code's ratio. e.g.
{code}
[junit] Threads = 100 elements = 100000 (of size 256) partitions = 16
[junit] original code:
[junit] Duration = 220ms maxConcurrency = 100
[junit] GC for PS Scavenge: 24 ms for 2 collections
[junit] Approx allocation = 770MB vs 26MB; ratio to raw data size = 29.258727826086957
[junit] loopRatio (closest to 1 best) 1.87623 raw 100000/187623 counted 0/0 sync 0/0 up 0 down 0
[junit]
[junit] modified code:
[junit] Duration = 216ms maxConcurrency = 98
[junit] GC for PS Scavenge: 28 ms for 2 collections
[junit] Approx allocation = 581MB vs 26MB; ratio to raw data size = 22.077911884057972
[junit] loopRatio (closest to 1 best) 1.33551 raw 52282/69043 counted 18308/19001 sync 38617/45507 up 10826 down 10513
{code}
> AtomicSortedColumns.addAllWithSizeDelta has a spin lock that allocates memory
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-7546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
> Project: Cassandra
> Issue Type: Bug
> Reporter: graham sanderson
> Attachments: suggestion1.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update,
> then CAS the state of the partition.
> Under heavy contention for updating a single partition, this can cause some
> fairly staggering memory growth (the more cores on your machine, the worse
> it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same
> partition, hinting today does, and in this case wild (order(s) of magnitude
> more than expected) memory allocation rates can be seen (especially when the
> updates being hinted are small updates to different partitions, which can
> happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation
> whilst not slowing down the very common uncontended case.
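For reference, the spin described above has this shape - a minimal sketch with hypothetical names; the real code is in AtomicSortedColumns.addAllWithSizeDelta. Every attempt builds a complete cloned-and-updated copy, and a failed CAS throws that copy away, so the garbage produced per successful update grows with the number of concurrently spinning cores.
{code}
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Sketch only (hypothetical names): the read, clone/update, CAS spin.
final class SpinningPartition<T>
{
    private final AtomicReference<T> state;

    SpinningPartition(T initial) { state = new AtomicReference<>(initial); }

    void addAll(UnaryOperator<T> cloneAndApply)
    {
        while (true)
        {
            T current = state.get();
            T updated = cloneAndApply.apply(current); // garbage if the CAS fails
            if (state.compareAndSet(current, updated))
                return;
            // contended: re-read and clone the whole partition again
        }
    }
}
{code}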