[ 
https://issues.apache.org/jira/browse/CASSANDRA-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964894#action_12964894
 ] 

Jonathan Ellis commented on CASSANDRA-1776:
-------------------------------------------

Get a stack trace while the node is in that tight loop (e.g. with jstack), then use the steps at 
http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=/com.ibm.java.doc.igaa/_1vg0001475cb4a-1190e2e0f74-8000_1007.html
 to find out which Cassandra/JVM threads are taking up all that CPU so we can 
figure out what is going on.
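
A common way to do this on Linux is to list per-thread CPU usage with
"top -H -p <pid>", then match the busiest thread ids against the "nid=0x..."
field of the jstack output, which holds the same native thread id in
hexadecimal. Below is a minimal sketch of that matching step in Python. It
assumes a HotSpot-style jstack header line; the helper names and the thread
names in the usage note are hypothetical, not taken from the attached logs.

```python
import re

# Matches a jstack thread header line, for example (hypothetical thread):
#   "GossipStage:1" prio=10 tid=0x00007f2a4c0e1000 nid=0x4e2f runnable [0x...]
THREAD_HEADER = re.compile(r'^"(?P<name>.*)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)')


def threads_by_native_id(jstack_output):
    """Map each native thread id (as a decimal int) to its Java thread name."""
    mapping = {}
    for line in jstack_output.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            # jstack prints the native id in hex; top -H prints it in decimal.
            mapping[int(match.group("nid"), 16)] = match.group("name")
    return mapping


def name_hot_threads(jstack_output, hot_tids):
    """Return (tid, hex nid, thread name) for each thread id taken from top -H."""
    names = threads_by_native_id(jstack_output)
    return [(tid, hex(tid), names.get(tid, "<not in dump>")) for tid in hot_tids]
```

For example, if top -H shows thread 20015 using the most CPU,
name_hot_threads(dump_text, [20015]) reports its hex id 0x4e2f and the name of
the matching thread in the dump, whose stack trace can then be read directly.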

> Untrapped exceptions in ThreadPool have a variety of ill effects
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-1776
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1776
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6.5
>            Reporter: Edward Capriolo
>         Attachments: logs
>
>
> I have seen a variety of conditions that keep the Cassandra process running 
> even though it has mostly failed. At times the node stays up and keeps 
> sending gossip messages, so other nodes think it is up. In the worst case, a 
> node gets into a tight loop that fully utilizes 16 cores of a system while 
> sending gossip messages that cause cascading issues across the cluster. 
> I have seen untrapped OOM errors. The interesting part of the attached log 
> is that we are not using super columns. I also have machines that come out 
> of a 40-second garbage collection (I assume they gossip themselves as UP), 
> then go back into another garbage collection, and the cycle repeats.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
