[ 
https://issues.apache.org/jira/browse/CASSANDRA-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890347#action_12890347
 ] 

Brandon Williams commented on CASSANDRA-1014:
---------------------------------------------

To summarize the current status, since there's a lot of noise in this ticket:

With a 1GB heap and constant inserts, the server begins to GC storm around 
the 100M row mark and eventually OOMs.  Increasing the heap size doesn't help; 
it just takes longer to reproduce.  The old gen grows slowly until it is full 
and the collector can't keep up.  If you stop the inserts and force a STW GC, 
memory usage returns to normal.  Analyzing a heap dump in MAT is not very 
helpful: most of the heap is attributed to 'other', and tracing the GC roots 
of those objects is fruitless.  Switching collectors doesn't improve the 
situation; ParOld and G1 both produce the same behavior.  The GC options 
committed earlier in this ticket helped but did not solve the problem.  
The commitlog sync mode (batch or periodic) doesn't matter, though batch 
takes longer to exhibit the issue.
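
For anyone who wants to repeat the "force a STW GC and see whether old gen 
drops" check, here is a minimal Java sketch.  It is an illustration only, not 
code from Cassandra: the class name OldGenProbe is made up, it inspects the 
JVM it runs in (for a running server you would read the same memory pool 
remotely over JMX, e.g. with jconsole), the pool-name matching is an 
assumption that covers the usual HotSpot names ("CMS Old Gen", "PS Old Gen", 
"G1 Old Gen", "Tenured Gen"), and System.gc() is only a request that the JVM 
may ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Hypothetical helper, not part of Cassandra.
final class OldGenProbe {

    /** Returns bytes used in the old/tenured heap pool, or -1 if no such pool is found. */
    static long oldGenUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: the old generation pool is named "... Old Gen" or "Tenured Gen".
            if (pool.getType() == MemoryType.HEAP
                    && (name.contains("Old Gen") || name.contains("Tenured"))) {
                return pool.getUsage().getUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = oldGenUsedBytes();
        System.gc(); // requests a full collection; the JVM is free to ignore it
        long after = oldGenUsedBytes();
        System.out.printf("old gen used: before=%d bytes, after=%d bytes%n", before, after);
    }
}
```

If the "after" figure falls back to a low baseline once inserts have stopped, 
that matches the behavior described above: the memory is collectable, and the 
collector simply is not keeping up under load.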

> GC storming, possible memory leak
> ---------------------------------
>
>                 Key: CASSANDRA-1014
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1014
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>         Environment: debian lenny amd64 OpenJDK 64-Bit Server VM (build 
> 1.6.0_0-b11, mixed mode)
>            Reporter: Brandon Williams
>            Assignee: Jonathan Ellis
>             Fix For: 0.6.4
>
>         Attachments: 1014-2Gheap.png, 1014-commitlog-v2.tar.gz, 
> 1014-table.diff, 724-0001.png, gc2.png
>
>
> There appears to be a GC issue due to memory pressure in the 0.6 branch.  You 
> can see this by starting the server and performing many inserts.  The JVM 
> quickly consumes most of its heap, and pauses for stop-the-world GC 
> begin.  With verbose GC turned on, this can be observed as follows:
> [GC [ParNew (promotion failed): 79703K->79703K(84544K), 0.0622980 
> secs][CMS[CMS-concurrent-mark: 3.678/5.031 secs] [Times: user=10.35 sys=4.22, 
> real=5.03 secs]
>  (concurrent mode failure): 944529K->492222K(963392K), 2.8264480 secs] 
> 990745K->492222K(1047936K), 2.8890500 secs] [Times: user=2.90 sys=0.04, 
> real=2.90 secs]
> After enough inserts (around 75-100 million) the server will GC storm and 
> then OOM.
> jbellis and I narrowed this down to patch 0001 in CASSANDRA-724.  Replacing 
> LBQ with ABQ made no difference; however, using batch mode instead of periodic 
> for the commitlog does prevent the issue from occurring.  The attached 
> screenshot shows the heap usage in jconsole: first while the issue is 
> occurring, then a restart, and then the same number of inserts when it 
> does not occur.
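
The verbose GC line quoted above is easier to read once the 
"before->after(capacity)" figures are pulled out.  The following Java sketch 
does that with a regular expression.  The class name GcTransition is invented 
for illustration, and the interpretation of the three figures (ParNew young 
gen, CMS old gen, whole heap, in that order) is my reading of the standard 
HotSpot log layout rather than anything stated in the ticket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading "<before>K-><after>K(<capacity>K)" figures.
final class GcTransition {
    private static final Pattern TRANSITION =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    final long beforeKb;
    final long afterKb;
    final long capacityKb;

    GcTransition(long beforeKb, long afterKb, long capacityKb) {
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.capacityKb = capacityKb;
    }

    /** Fraction of the region still occupied after the collection. */
    double occupancyAfter() {
        return (double) afterKb / capacityKb;
    }

    /** Extracts every before->after(capacity) triple, in order of appearance. */
    static List<GcTransition> parseAll(String log) {
        List<GcTransition> out = new ArrayList<>();
        Matcher m = TRANSITION.matcher(log);
        while (m.find()) {
            out.add(new GcTransition(
                    Long.parseLong(m.group(1)),
                    Long.parseLong(m.group(2)),
                    Long.parseLong(m.group(3))));
        }
        return out;
    }

    public static void main(String[] args) {
        // The log line from the issue description, joined onto one line.
        String log = "[GC [ParNew (promotion failed): 79703K->79703K(84544K), 0.0622980 secs]"
                + "[CMS[CMS-concurrent-mark: 3.678/5.031 secs] [Times: user=10.35 sys=4.22, real=5.03 secs]"
                + " (concurrent mode failure): 944529K->492222K(963392K), 2.8264480 secs]"
                + " 990745K->492222K(1047936K), 2.8890500 secs]";
        for (GcTransition t : parseAll(log)) {
            System.out.printf("%dK -> %dK of %dK (%.1f%% occupied afterwards)%n",
                    t.beforeKb, t.afterKb, t.capacityKb, 100.0 * t.occupancyAfter());
        }
    }
}
```

On the quoted line, the first figure shows the young gen reclaiming nothing 
(promotion failed), and the second shows the old gen at roughly half of its 
capacity even after the stop-the-world collection.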

