[ 
https://issues.apache.org/jira/browse/CASSANDRA-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901497#action_12901497
 ] 

Peter Schuller commented on CASSANDRA-1014:
-------------------------------------------

I have not read anything about this other than what is in this ticket, and the 
beginning of it is old, so this may be moot, but a couple of things:

* The first graph attached (1014-2Gheap.png) looks to me like the JVM is only 
doing young generation collections and is simply not ever doing a concurrent 
mark/sweep phase. That would be a VM bug (or broken VM options).

* Is the 60 MB vs. 368 MB the difference between a CMS full collection and a 
stop-the-world full collection? I.e., was it 368 MB right after a full CMS 
sweep? That need not indicate a VM bug: CMS maintains its old gen in a 
non-compacting, non-copying fashion, so the CMS old gen is susceptible to 
fragmentation overhead, whereas a full stop-the-world GC is, AFAIK, a 
compacting collection. A factor of 6.1 still seems like a lot, but I don't 
know how the CMS free space management works. If the 6.1 is explained by 
fragmentation, my initial guess would be that large allocations are the 
triggering factor.
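
A minimal sketch of how both points could be checked from inside a JVM, using 
the standard java.lang.management beans. The class name GcCheck and its helper 
methods are made up for illustration and are not part of Cassandra; the sketch 
measures its own JVM, so to say anything about a Cassandra node the same beans 
would have to be read from that process (for example over JMX). It assumes 
that System.gc() results in a stop-the-world full collection, which is only a 
request to the VM and can be changed or disabled by options such as 
-XX:+ExplicitGCInvokesConcurrent or -XX:+DisableExplicitGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class GcCheck {

    // Collection count per collector, keyed by the name the JVM reports.
    // A count of -1 means the collector does not expose a count.
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<String, Long>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    // Sum of used bytes across all heap memory pools (young and old gen).
    static long heapUsedBytes() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null if the pool is no longer valid
                used += usage.getUsed();
            }
        }
        return used;
    }

    // Ratio of two sizes, e.g. heap after a CMS sweep vs. after a full GC.
    static double ratio(long larger, long smaller) {
        return (double) larger / (double) smaller;
    }

    public static void main(String[] args) {
        // If the old-gen collector's count never moves while the young-gen
        // count keeps growing, the concurrent phase is probably not running.
        System.out.println("collections so far: " + collectionCounts());

        long before = heapUsedBytes();
        System.gc(); // a request only; see the assumptions above
        long after = heapUsedBytes();

        System.out.println("heap used before explicit GC: " + before + " bytes");
        System.out.println("heap used after explicit GC:  " + after + " bytes");
        if (after > 0) {
            System.out.println("before/after ratio: " + ratio(before, after));
        }

        // The figures from the ticket: 368 MB vs. 60 MB.
        System.out.println("368 MB / 60 MB = " + ratio(368, 60));
    }
}
```

On a HotSpot VM running with CMS, I would expect the collector names to show up 
as something like ParNew and ConcurrentMarkSweep, but treat the names as 
VM-specific. A large gap between the two heap figures would be consistent with 
the fragmentation guess above, though it does not prove it.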

> GC storming, possible memory leak
> ---------------------------------
>
>                 Key: CASSANDRA-1014
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1014
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>         Environment: debian lenny amd64 OpenJDK 64-Bit Server VM (build 
> 1.6.0_0-b11, mixed mode)
>            Reporter: Brandon Williams
>             Fix For: 0.7.0
>
>         Attachments: 1014-2Gheap.png, 1014-commitlog-v2.tar.gz, 
> 1014-table.diff, 724-0001.png, gc2.png
>
>
> There appears to be a GC issue due to memory pressure in the 0.6 branch.  You 
> can see this by starting the server and performing many inserts.  Quickly the 
> jvm will consume most of its heap, and pauses for stop-the-world GC will 
> begin.  With verbose GC turned on, this can be observed as follows:
> [GC [ParNew (promotion failed): 79703K->79703K(84544K), 0.0622980 
> secs][CMS[CMS-concurrent-mark: 3.678/5.031 secs] [Times: user=10.35 sys=4.22, 
> real=5.03 secs]
>  (concurrent mode failure): 944529K->492222K(963392K), 2.8264480 secs] 
> 990745K->492222K(1047936K), 2.8890500 secs] [Times: user=2.90 sys=0.04, 
> real=2.90 secs]
> After enough inserts (around 75-100 million) the server will GC storm and 
> then OOM.
> jbellis and I narrowed this down to patch 0001 in CASSANDRA-724.  Switching 
> LBQ with ABQ made no difference, however using batch mode instead of periodic 
> for the commitlog does prevent the issue from occurring.  The attached 
> screenshot shows the heap usage in jconsole: first while the issue is 
> occurring, then after a restart, and then for the same number of inserts when 
> it does not occur.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.