[jira] Commented: (CASSANDRA-2054) Cpu Spike to > 100%.

Thibaut (JIRA) Tue, 25 Jan 2011 14:04:09 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986715#action_12986715 ]


Thibaut commented on CASSANDRA-2054:
------------------------------------

java -version
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)


 /usr/bin/java -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms3022M 
-Xmx3022M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedOops -Xss128k 
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=8080 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dlog4j.configuration=log4j-server.properties -cp 
/software/cassandra/bin/../conf:/software/cassandra/bin/../build/classes:/software/cassandra/bin/../lib/antlr-3.1.3.jar:/software/cassandra/bin/../lib/apache-cassandra-2011-01-24_06-01-26.jar:/software/cassandra/bin/../lib/avro-1.4.0-fixes.jar:/software/cassandra/bin/../lib/avro-1.4.0-sources-fixes.jar:/software/cassandra/bin/../lib/commons-cli-1.1.jar:/software/cassandra/bin/../lib/commons-codec-1.2.jar:/software/cassandra/bin/../lib/commons-collections-3.2.1.jar:/software/cassandra/bin/../lib/commons-lang-2.4.jar:/software/cassandra/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:/software/cassandra/bin/../lib/guava-r05.jar:/software/cassandra/bin/../lib/high-scale-lib.jar:/software/cassandra/bin/../lib/ivy-2.1.0.jar:/software/cassandra/bin/../lib/jackson-core-asl-1.4.0.jar:/software/cassandra/bin/../lib/jackson-mapper-asl-1.4.0.jar:/software/cassandra/bin/../lib/jetty-6.1.21.jar:/software/cassandra/bin/../lib/jetty-util-6.1.21.jar:/software/cassandra/bin/../lib/jline-0.9.94.jar:/software/cassandra/bin/../lib/json-simple-1.1.jar:/software/cassandra/bin/../lib/jug-2.0.0.jar:/software/cassandra/bin/../lib/libthrift-0.5.jar:/software/cassandra/bin/../lib/log4j-1.2.16.jar:/software/cassandra/bin/../lib/servlet-api-2.5-20081211.jar:/software/cassandra/bin/../lib/slf4j-api-1.6.1.jar:/software/cassandra/bin/../lib/slf4j-log4j12-1.6.1.jar:/software/cassandra/bin/../lib/snakeyaml-1.6.jar
 org.apache.cassandra.thrift.CassandraDaemon


I added "-XX:+UseCompressedOops". (About 40-50 column families; 
throughput_in_mb reduced from 64 to 16.)


> Cpu Spike to > 100%. 
> ---------------------
>
>                 Key: CASSANDRA-2054
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2054
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Thibaut
>         Attachments: jstackerror.txt
>
>
> I see sudden spikes of cpu usage where cassandra will take up an enormous 
> amount of cpu (uptime load > 1000). 
> My application executes both reads and writes.
> I tested this with 
> https://hudson.apache.org/hudson/job/Cassandra-0.7/193/artifact/cassandra/build/apache-cassandra-2011-01-24_06-01-26-bin.tar.gz.
> I disabled JNA, but this didn't help.
> Jstack won't work anymore when this happens:
> -bash-4.1# jstack 27699 > /tmp/jstackerror
> 27699: Unable to open socket file: target process not responding or HotSpot 
> VM not loaded
> The -F option can be used when the target process is not responding
> Also, my entire application comes to a halt as long as the node is in this 
> state: the node is still marked as up but won't respond to any requests 
> (cassandra is taking up all the cpu on the first node).
> /software/cassandra/bin/nodetool -h localhost ring
> Address       Status State   Load      Owns    Token
>                                                ffffffffffffffff
> 192.168.0.1   Up     Normal  3.48 GB   5.00%   0cc
> 192.168.0.2   Up     Normal  3.48 GB   5.00%   199
> 192.168.0.3   Up     Normal  3.67 GB   5.00%   266
> 192.168.0.4   Up     Normal  2.55 GB   5.00%   333
> 192.168.0.5   Up     Normal  2.58 GB   5.00%   400
> 192.168.0.6   Up     Normal  2.54 GB   5.00%   4cc
> 192.168.0.7   Up     Normal  2.59 GB   5.00%   599
> 192.168.0.8   Up     Normal  2.58 GB   5.00%   666
> 192.168.0.9   Up     Normal  2.33 GB   5.00%   733
> 192.168.0.10  Down   Normal  2.39 GB   5.00%   7ff
> 192.168.0.11  Up     Normal  2.4 GB    5.00%   8cc
> 192.168.0.12  Up     Normal  2.74 GB   5.00%   999
> 192.168.0.13  Up     Normal  3.17 GB   5.00%   a66
> 192.168.0.14  Up     Normal  3.25 GB   5.00%   b33
> 192.168.0.15  Up     Normal  3.01 GB   5.00%   c00
> 192.168.0.16  Up     Normal  2.48 GB   5.00%   ccc
> 192.168.0.17  Up     Normal  2.41 GB   5.00%   d99
> 192.168.0.18  Up     Normal  2.3 GB    5.00%   e66
> 192.168.0.19  Up     Normal  2.27 GB   5.00%   f33
> 192.168.0.20  Up     Normal  2.32 GB   5.00%   ffffffffffffffff
> The interesting part is that after a while (seconds or minutes), I have seen 
> cassandra nodes return to a normal state again (without restart). I have also 
> never seen this happen on 2 nodes at the same time in the cluster (the node 
> where it happens differs, but there seems to be a pattern for it to happen on 
> the first node most of the time).
> In the above case, I restarted node 192.168.0.10 and the first node returned 
> to normal state. (I don't know if there is a correlation)
> I attached the jstack of the node in trouble (taken as soon as I could access 
> it with jstack, but I suspect this is the jstack from when the node was 
> running normally again).
> The heap usage is still moderate:
> /software/cassandra/bin/nodetool -h localhost info
> 0cc
> Gossip active    : true
> Load             : 3.49 GB
> Generation No    : 1295949691
> Uptime (seconds) : 42843
> Heap Memory (MB) : 1570.58 / 3005.38
> I will enable the GC logging tomorrow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
