[
https://issues.apache.org/jira/browse/CASSANDRA-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901497#action_12901497
]
Peter Schuller commented on CASSANDRA-1014:
-------------------------------------------
I have not read anything about this beyond what is in this ticket, and the
start of the ticket is old, so this may be moot, but a couple of things:
* The first graph attached (1014-2Gheap.png) looks to me like the JVM is only
doing young generation collections and is simply not ever doing a concurrent
mark/sweep phase. That would be a VM bug (or broken VM options).
* Is the 60 MB vs. 368 MB the difference between a CMS full collection and a
stop-the-world full collection? I.e., was it 368 right after a full CMS sweep?
If so, it need not indicate a VM bug: CMS maintains its old gen in a
non-compacting, non-copying fashion, so the CMS old gen is susceptible to
fragmentation overhead, whereas a full stop-the-world GC, AFAIK, compacts the
heap. A factor of 6.1 does seem like a lot, though I don't know how the CMS
free space management works. If the 6.1 is explained by fragmentation, my
initial guess would be that large allocations are the triggering factor.
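
Both points above could be checked from inside the JVM. What follows is a
minimal sketch, not something taken from the ticket: the class name
GcCollectorReport and its methods are hypothetical, and it uses only the
standard java.lang.management API. It lists each collector with its
collection count, and computes the 368/60 ratio discussed above. With
ParNew+CMS the collectors are typically reported as "ParNew" and
"ConcurrentMarkSweep"; an old-gen count that stays at zero while the heap
fills would be consistent with the first point.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCollectorReport {

    /** One line per registered collector: name, collection count, total time. */
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount()/getCollectionTime() return -1 if undefined.
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    /** Ratio of old-gen occupancy after a CMS sweep to that after a compacting GC. */
    static double occupancyRatio(double afterCmsMb, double afterFullGcMb) {
        return afterCmsMb / afterFullGcMb;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
        // Figures from the ticket: 368 MB vs. 60 MB.
        System.out.println("ratio=" + occupancyRatio(368, 60));
    }
}
```

The ratio by itself does not distinguish fragmentation from a leak; it only
restates the two figures from the ticket as the factor of about 6.1.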
> GC storming, possible memory leak
> ---------------------------------
>
> Key: CASSANDRA-1014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1014
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.6
> Environment: debian lenny amd64 OpenJDK 64-Bit Server VM (build
> 1.6.0_0-b11, mixed mode)
> Reporter: Brandon Williams
> Fix For: 0.7.0
>
> Attachments: 1014-2Gheap.png, 1014-commitlog-v2.tar.gz,
> 1014-table.diff, 724-0001.png, gc2.png
>
>
> There appears to be a GC issue due to memory pressure in the 0.6 branch. You
> can see this by starting the server and performing many inserts. Quickly the
> jvm will consume most of its heap, and pauses for stop-the-world GC will
> begin. With verbose GC turned on, this can be observed as follows:
> [GC [ParNew (promotion failed): 79703K->79703K(84544K), 0.0622980
> secs][CMS[CMS-concurrent-mark: 3.678/5.031 secs] [Times: user=10.35 sys=4.22,
> real=5.03 secs]
> (concurrent mode failure): 944529K->492222K(963392K), 2.8264480 secs]
> 990745K->492222K(1047936K), 2.8890500 secs] [Times: user=2.90 sys=0.04,
> real=2.90 secs]
> After enough inserts (around 75-100 million) the server will GC storm and
> then OOM.
> jbellis and I narrowed this down to patch 0001 in CASSANDRA-724. Switching
> LBQ with ABQ made no difference, however using batch mode instead of periodic
> for the commitlog does prevent the issue from occurring. The attached
> screenshot shows the heap usage in jconsole first when the issue is
> exhibiting, a restart, and then the same amount of inserts when it does not.