[ 
https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HUANG DUICAN updated CASSANDRA-14953:
-------------------------------------
     Attachment: 2.PNG
                 1.PNG
    Description: 
We observed a large backlog of pending writes on a Cassandra node in our 
production environment, and our application experienced many write failures.
 According to system.log, MemtableReclaimMemory tasks were pending first, and 
then a large number of MutationStage tasks piled up at a certain moment.
 Finally, the heap filled up, GC times reached tens of seconds, and nodetool 
reported the node as DN, although the Cassandra process was still running. 
After we killed and restarted the node, the problem disappeared.

 

Also, the number of active MemtableReclaimMemory threads seems to stay at 1 
(see 1.PNG).

A large number of MutationStage tasks piled up at a certain moment 
(see 2.PNG).
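
The thread pool figures behind these observations can be pulled out of the status lines quoted in this report with a short script. This is only a sketch: the column order (Active, Pending, Completed, Blocked, All Time Blocked) is assumed from the header that Cassandra's StatusLogger normally prints, and the names `PoolStats` and `parse_pool_line` are hypothetical helpers, not part of Cassandra.

```python
import re
from typing import NamedTuple, Optional


class PoolStats(NamedTuple):
    """One thread pool status line (column order is an assumption)."""
    name: str
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


# Matches "- <PoolName> <5 integers>" at the end of a line; any prefix
# before the dash (for example a leftover "66") is ignored.
POOL_LINE = re.compile(r"-\s*(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_pool_line(line: str) -> Optional[PoolStats]:
    """Return the parsed pool stats, or None for non-pool lines (e.g. GC lines)."""
    match = POOL_LINE.search(line)
    if match is None:
        return None
    name, *numbers = match.groups()
    return PoolStats(name, *map(int, numbers))


if __name__ == "__main__":
    # First and last MutationStage samples from the log excerpt in this report.
    first = parse_pool_line(" - MutationStage 128 11931622 1983820772 0 0")
    last = parse_pool_line(" - MutationStage 128 12023677 1983820772 0 0")
    print("pending growth:", last.pending - first.pending)        # 92055
    print("completed growth:", last.completed - first.completed)  # 0
```

On the samples in this report, the MutationStage pending count keeps growing while its completed count does not change.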

 

Long GC times:

 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Old Generation GC in 87121ms. G1 Old Gen: 51175946656 -> 50082999760;
 - MutationStage 128 11931622 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Young Generation GC in 969ms. G1 Eden Space: 1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584;
 - MutationStage 128 11953653 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Old Generation GC in 84785ms. G1 Old Gen: 51173518800 -> 50180911432;
 - MutationStage 128 11967484 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Young Generation GC in 611ms. G1 Eden Space: 989855744 -> 0; G1 Old Gen: 50180911432 -> 51153989960;
 - MutationStage 128 11975849 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Old Generation GC in 85845ms. G1 Old Gen: 51170767176 -> 50238295416;
 - MutationStage 128 11978192 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Young Generation GC in 602ms. G1 Eden Space: 939524096 -> 0; G1 Old Gen: 50238295416 -> 51161042296;
 - MutationStage 128 11994295 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Old Generation GC in 85307ms. G1 Old Gen: 51177819512 -> 50288829624; Metaspace: 36544536 -> 36525696
 - MutationStage 128 12001932 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MutationStage 128 12004395 1983820772 0 0
 - CounterMutationStage 0 0 0 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - MemtableReclaimMemory 1 156 24565 0 0
 - G1 Young Generation GC in 610ms. G1 Eden Space: 889192448 -> 0; G1 Old Gen: 50288829624 -> 51178022072;
 - MutationStage 128 12023677 1983820772 0 0
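
To make the GC figures above easier to read, the following sketch computes how much the old generation shrinks (or grows) in each collection. It assumes each GC entry has been rejoined onto a single line and tolerates the JIRA `{color}` markup; `old_gen_change` is a hypothetical helper name, not a Cassandra or JVM API.

```python
import re
from typing import Optional, Tuple

# JIRA colour markup such as {color:#FF0000} ... {color}.
COLOR_MARKUP = re.compile(r"\{color(?::#[0-9A-Fa-f]{6})?\}")

# Captures the GC kind, the duration in ms, and the old gen sizes in bytes.
GC_LINE = re.compile(
    r"G1 (Old|Young) Generation GC in (\d+)ms\..*?G1 Old Gen: (\d+) -> (\d+)"
)


def old_gen_change(line: str) -> Optional[Tuple[str, int, int]]:
    """Return (kind, duration_ms, bytes_freed_from_old_gen) or None.

    bytes_freed is negative when the old generation grew during the GC.
    """
    match = GC_LINE.search(COLOR_MARKUP.sub("", line))
    if match is None:
        return None
    kind = match.group(1)
    duration_ms = int(match.group(2))
    before, after = int(match.group(3)), int(match.group(4))
    return kind, duration_ms, before - after


if __name__ == "__main__":
    samples = [
        " - G1 Old Generation GC in 87121ms. G1 Old Gen: 51175946656 -> 50082999760;",
        " - G1 Young Generation GC in {color:#FF0000}969ms{color}. G1 Eden Space: "
        "1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584;",
    ]
    for sample in samples:
        kind, ms, freed = old_gen_change(sample)
        print(f"{kind:5s} GC: {ms} ms, old gen freed {freed} bytes")
```

Applied to the entries above, each Old Generation collection takes roughly 85 seconds yet frees only about 0.9 to 1.1 GB out of about 51 GB, and each following Young Generation collection grows the old generation again by a similar amount.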

Why is this happening? 

  was:
We observed a large backlog of pending writes on a Cassandra node in our 
production environment, and our application experienced many write failures.
According to system.log, MemtableReclaimMemory tasks were pending first, and 
then a large number of MutationStage tasks piled up at a certain moment.
Finally, the heap filled up, GC times reached tens of seconds, and nodetool 
reported the node as DN, although the Cassandra process was still running. 
After we killed and restarted the node, the problem disappeared.

 

Also, the number of active MemtableReclaimMemory threads seems to stay at 1.

A large number of MutationStage tasks piled up at a certain moment.

Long GC times:

Why is this happening? 


> Failed to reclaim the memory and too many MemtableReclaimMemory pending task
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14953
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14953
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Memtable
>         Environment: version : cassandra 2.1.15
> jdk: 8
> os:suse
>            Reporter: HUANG DUICAN
>            Priority: Major
>         Attachments: 1.PNG, 2.PNG, cassandra_20190105.zip
>
>



