[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069981#comment-14069981 ]

graham sanderson commented on CASSANDRA-7546:
---------------------------------------------

My last piece of speculation: these single-partition hint trees are probably 
getting thousands of nodes big, and we probably have hundreds of concurrent 
mutator threads for them. It may just be that we are hitting a "sweet spot" 
of allocation rate such that none of the on-CPU threads makes enough 
progress to reach its CAS before we need to GC. At that point they must all 
safepoint, and afterwards I assume they get no preferential dibs on running 
next, so we see a much higher ratio of wastage than in my synthetic test, 
where it was largely proportional to the number of cores rather than the 
number of threads. In this nasty case (enough cores to do lots of concurrent 
work, but enough work per core to cause enough allocation to force a GC 
before any of them finish the task at hand) you get the worst of both the 
locking and the spinning worlds.
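
To make the wastage concrete, here is a minimal sketch (made-up names, not the actual AtomicSortedColumns code) of the read/clone/CAS pattern being discussed: the clone is allocated on every iteration of the retry loop, so every lost CAS throws away a full copy, and total allocation scales with the number of retries across all spinning threads:

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a lock-free read/clone/CAS update. Under contention, each
// failed compareAndSet discards an entire freshly allocated copy of the
// map, which is the spin-loop allocation this ticket is about.
public class SpinCloneDemo {
    private final AtomicReference<NavigableMap<String, Long>> ref =
            new AtomicReference<>(new TreeMap<>());

    public void addDelta(String key, long delta) {
        while (true) {
            NavigableMap<String, Long> current = ref.get();
            // Clone allocated on EVERY iteration; a lost CAS wastes it all.
            NavigableMap<String, Long> copy = new TreeMap<>(current);
            copy.merge(key, delta, Long::sum);
            if (ref.compareAndSet(current, copy)) {
                return;
            }
        }
    }

    public long get(String key) {
        return ref.get().getOrDefault(key, 0L);
    }
}
```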

Anyway, let me know if you want me to take another stab at the patch, 
including doing the one-time allocation outside the loop (or on the first 
pass). You are more familiar with the code, but it is always good to learn.
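
As a hedged illustration of the "one-time allocation outside the loop" idea (again made-up names, not a proposed patch): in a Treiber-style prepend, the new node can be built once before spinning, and only the snapshot-dependent link is rewritten on each retry, so a lost CAS costs no allocation:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: hoisting the allocation out of the CAS retry loop. The node is
// created once per push; retries only reassign its `next` pointer.
public class HoistedAllocDemo {
    static final class Node {
        final long value;
        Node next; // written before the CAS publishes the node
        Node(long value) { this.value = value; }
    }

    private final AtomicReference<Node> head = new AtomicReference<>();

    public void push(long value) {
        Node node = new Node(value); // one-time allocation, outside the loop
        while (true) {
            Node current = head.get();
            node.next = current;     // only snapshot-dependent state changes
            if (head.compareAndSet(current, node)) {
                return;
            }
        }
    }

    public long sum() {
        long s = 0;
        for (Node n = head.get(); n != null; n = n.next) {
            s += n.value;
        }
        return s;
    }
}
```

This only works because the hoisted object does not need to be rebuilt from the failed snapshot; state derived from the snapshot itself (like the cloned tree above) cannot be reused this way.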

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_alt.txt, 
> suggestion1.txt, suggestion1_21.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, 
> then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some 
> fairly staggering memory growth (the more cores on your machine the worst it 
> gets).
> Whilst many usage patterns don't do highly concurrent updates to the same 
> partition, hinting today does, and in this case wild memory allocation 
> rates (order(s) of magnitude more than expected) can be seen, especially 
> when the updates being hinted are small updates to different partitions, 
> which can happen very fast on their own - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation 
> whilst not slowing down the very common un-contended case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)