[
https://issues.apache.org/jira/browse/CASSANDRA-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023232#comment-13023232
]
Thibaut commented on CASSANDRA-2543:
------------------------------------
Ok thanks.
I will try the above to confirm that it is the GC thread taking up 100% CPU
(which I suspect).
The GC did indeed run very often (grepped from the old log file):
INFO [ScheduledTasks:1] 2011-04-22 15:07:24,742 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3133 ms, 1537221688 reclaimed leaving 3830570400 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:28,978 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2901 ms, 2172093472 reclaimed leaving 3219903248 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:33,946 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3138 ms, 1715631768 reclaimed leaving 3581355456 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:38,318 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2860 ms, 2467501736 reclaimed leaving 2874209912 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:43,466 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3059 ms, 1491300856 reclaimed leaving 3875493760 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:47,529 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2780 ms, 2521501408 reclaimed leaving 2824239464 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:52,983 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3330 ms, 1287324312 reclaimed leaving 3964465048 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:57,147 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3071 ms, 1950115232 reclaimed leaving 3337420088 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:01,785 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3069 ms, 1748108592 reclaimed leaving 3444685288 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:06,397 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3011 ms, 1779092920 reclaimed leaving 3637964728 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:10,715 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2950 ms, 2042754088 reclaimed leaving 3223819496 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:15,528 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3217 ms, 956021816 reclaimed leaving 4371838768 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:19,380 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 3007 ms, 2216336016 reclaimed leaving 3213552904 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:24,154 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2973 ms, 1964499424 reclaimed leaving 3287557528 used;
max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:08:29,088 GCInspector.java (line 128) GC
for ConcurrentMarkSweep: 2988 ms, 1964657952 reclaimed leaving 3418170312 used;
max is 5605687296
I am still curious why the node didn't recover, even though there were almost
no requests at all from our application.
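
To put a number on "very often", the GCInspector entries can be parsed and the reported collection time compared with the wall-clock span between entries. The following is only a rough sketch: the regular expression is written against the log format shown above, the sample contains three of the entries re-joined onto single lines, and the names `parse_gc_events` and `gc_time_fraction` are made up for this illustration. The log does not say whether the reported milliseconds are stop-the-world pauses or concurrent collector work, so the result is "time the collector reported" rather than "time the application was stopped".

```python
import re
from datetime import datetime

# Three entries copied from the log above, re-joined onto single lines.
LOG = """\
INFO [ScheduledTasks:1] 2011-04-22 15:07:24,742 GCInspector.java (line 128) GC for ConcurrentMarkSweep: 3133 ms, 1537221688 reclaimed leaving 3830570400 used; max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:28,978 GCInspector.java (line 128) GC for ConcurrentMarkSweep: 2901 ms, 2172093472 reclaimed leaving 3219903248 used; max is 5605687296
INFO [ScheduledTasks:1] 2011-04-22 15:07:33,946 GCInspector.java (line 128) GC for ConcurrentMarkSweep: 3138 ms, 1715631768 reclaimed leaving 3581355456 used; max is 5605687296
"""

# Pattern assumed from the entries above; other Cassandra versions may log differently.
GC_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) GCInspector\.java \(line \d+\) "
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms, (?P<reclaimed>\d+) reclaimed "
    r"leaving (?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_events(text):
    """Return one dict per GCInspector entry found in text."""
    events = []
    for m in GC_LINE.finditer(text):
        events.append({
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
            "collector": m.group("collector"),
            "ms": int(m.group("ms")),
            "reclaimed": int(m.group("reclaimed")),
            "used": int(m.group("used")),
            "max": int(m.group("max")),
        })
    return events


def gc_time_fraction(events):
    """Reported GC time of all entries after the first, divided by the
    wall-clock span between the first and the last entry."""
    span_s = (events[-1]["time"] - events[0]["time"]).total_seconds()
    gc_s = sum(e["ms"] for e in events[1:]) / 1000.0
    return gc_s / span_s


events = parse_gc_events(LOG)
print(f"{len(events)} entries, GC time fraction: {gc_time_fraction(events):.2f}")
```

A fraction close to 1 would mean that the collector reports being busy for almost the entire interval between two log entries.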
> Node not responding, bringing down cluster, marked as up
> --------------------------------------------------------
>
> Key: CASSANDRA-2543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2543
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.4
> Reporter: Thibaut
> Fix For: 0.7.6
>
> Attachments: jstack
>
>
> I have one node which constantly hangs and brings down the entire cluster
> (not giving any answers).
> If I restart the node, it will hang again after a certain amount of time. I
> have no indication
> It's marked as up when executing the nodetool ring command.
> Executing the ring command on the node itself (without any traffic on the
> cluster) takes at least 2 minutes. The node uses about 50%-100% of CPU
> across all CPUs.
> Netstats doesn't list anything interesting:
> /software/cassandra/bin/nodetool -h localhost netstats
> Mode: Normal
> Not sending any streams.
> Not receiving any streams.
> Pool Name Active Pending Completed
> Commands n/a 0 51064
> Responses n/a 0 530479
> I attached the jstack of the node. There are no indications that the node has
> faulty hardware.
> /usr/bin/java -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42
> -Xms5254M -Xmx5254M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss128k
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
> -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=8080
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dlog4j.configuration=log4j-server.properties
> -Dlog4j.defaultInitOverride=true -Dcassandra-foreground=yes -cp
> /software/cassandra/bin/../conf:/software/cassandra/bin/../build/classes:/software/cassandra/bin/../lib/antlr-3.1.3.jar:/software/cassandra/bin/../lib/apache-cassandra-0.7.4.jar:/software/cassandra/bin/../lib/avro-1.4.0-fixes.jar:/software/cassandra/bin/../lib/avro-1.4.0-sources-fixes.jar:/software/cassandra/bin/../lib/commons-cli-1.1.jar:/software/cassandra/bin/../lib/commons-codec-1.2.jar:/software/cassandra/bin/../lib/commons-collections-3.2.1.jar:/software/cassandra/bin/../lib/commons-lang-2.4.jar:/software/cassandra/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:/software/cassandra/bin/../lib/guava-r05.jar:/software/cassandra/bin/../lib/high-scale-lib.jar:/software/cassandra/bin/../lib/jackson-core-asl-1.4.0.jar:/software/cassandra/bin/../lib/jackson-mapper-asl-1.4.0.jar:/software/cassandra/bin/../lib/jetty-6.1.21.jar:/software/cassandra/bin/../lib/jetty-util-6.1.21.jar:/software/cassandra/bin/../lib/jline-0.9.94.jar:/software/cassandra/bin/../lib/json-simple-1.1.jar:/software/cassandra/bin/../lib/jug-2.0.0.jar:/software/cassandra/bin/../lib/libthrift-0.5.jar:/software/cassandra/bin/../lib/log4j-1.2.16.jar:/software/cassandra/bin/../lib/servlet-api-2.5-20081211.jar:/software/cassandra/bin/../lib/slf4j-api-1.6.1.jar:/software/cassandra/bin/../lib/slf4j-log4j12-1.6.1.jar:/software/cassandra/bin/../lib/snakeyaml-1.6.jar
> org.apache.cassandra.thrift.CassandraDaemon
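
The JVM flags quoted above allow a rough estimate of the occupancy at which CMS starts a cycle: with -Xms equal to -Xmx, the old generation is approximately the heap minus the young generation (-Xmn), and -XX:CMSInitiatingOccupancyFraction=75 together with -XX:+UseCMSInitiatingOccupancyOnly starts a cycle once the old generation is 75% full. The sketch below makes several assumptions that should be stated plainly: the "used" figures in the GC log are taken to be whole-heap usage rather than old-generation usage, so the comparison is only approximate; the helper name `cms_trigger_bytes` is hypothetical; and the log reports "max is 5605687296", which does not match -Xmx5254M, so the old log may come from a run with different heap settings.

```python
MIB = 1024 * 1024


def cms_trigger_bytes(heap_mib, young_mib, occupancy_percent):
    """Approximate old-generation occupancy (in bytes) at which CMS starts,
    assuming old generation = heap - young generation."""
    old_gen_bytes = (heap_mib - young_mib) * MIB
    return old_gen_bytes * occupancy_percent // 100


# Values taken from the quoted flags: -Xmx5254M -Xmn400M, occupancy fraction 75.
threshold = cms_trigger_bytes(5254, 400, 75)

# "leaving ... used" values from the fifteen GCInspector entries in the comment.
used_after_gc = [
    3830570400, 3219903248, 3581355456, 2874209912, 3875493760,
    2824239464, 3964465048, 3337420088, 3444685288, 3637964728,
    3223819496, 4371838768, 3213552904, 3287557528, 3418170312,
]

above = [u for u in used_after_gc if u > threshold]
print(f"threshold: {threshold} bytes")
print(f"{len(above)} of {len(used_after_gc)} post-GC readings exceed it")
```

If usage right after a collection is already at or near the threshold, the next cycle would be expected to start almost immediately, which would be consistent with the back-to-back ConcurrentMarkSweep entries. This is a possible explanation under the assumptions above, not a confirmed diagnosis.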