[
https://issues.apache.org/jira/browse/CASSANDRA-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941545#comment-14941545
]
Govindaraj commented on CASSANDRA-10439:
----------------------------------------
[root@mesoscass-cmcb-01 gvenka008c]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
27  0      0 311864 275348 5494052    0    0    28   138    4    8 16  2 82  1  0
[root@mesoscass-cmcb-01 gvenka008c]# iostat
Linux 2.6.32-504.12.2.el6.x86_64 (mesoscass-cmcb-01.sys.comcast.net)  10/02/2015  _x86_64_  (8 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.54    2.02    1.88    0.95    0.12   81.50
Device:      tps   Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
vda         1.92         7.86        39.04    2135978    10606072
vdb        13.65        84.00      1818.10   22823858   493983616
vdc         2.64       353.92       332.02   96161082    90211760
vdd         0.00         0.00         0.00        236           0
> Cassandra Read Request Latency
> ------------------------------
>
> Key: CASSANDRA-10439
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10439
> Project: Cassandra
> Issue Type: Task
> Environment: PROD
> Reporter: Govindaraj
> Attachments: Screen Shot 2015-10-02 at 1.40.31 PM.png, Screen Shot
> 2015-10-02 at 12.08.06 PM.png, Screen Shot 2015-10-02 at 2.28.31 PM.png,
> Screen Shot 2015-10-02 at 2.28.39 PM.png, Screen Shot 2015-10-02 at 2.28.48
> PM.png, Screen Shot 2015-10-02 at 2.28.54 PM.png, Screen Shot 2015-10-02 at
> 2.39.57 PM.png
>
>
> Hi Team,
> Our PROD environment has two Data Centers configured as below.
> DC1 - Has 3 Cassandra nodes (dse 4.7.0)
> DC2 - Has 3 Cassandra nodes (dse 4.7.0)
> We are seeing the below issues repeatedly
> 1. Repeated alerts for Cassandra Read Request Latency. ReadStage has a large
> pending queue.
> # nodetool tpstats
> Pool Name        Active   Pending   Completed   Blocked   All time blocked
> MutationStage         0         0     8493347         0                  0
> ReadStage            32      3699        5835         0                  0
> During the same period we also see high CPU load on the Cassandra nodes:
> top
> top - 14:41:00 up 6 days, 21:58, 2 users, load average: 33.75, 27.80, 21.41
> Tasks: 226 total, 1 running, 225 sleeping, 0 stopped, 0 zombie
> Cpu(s): 96.8%us, 1.8%sy, 1.1%ni, 0.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 16331716k total, 16016864k used, 314852k free, 189380k buffers
> Swap: 0k total, 0k used, 0k free, 5786164k cached
> 2. We stopped compaction on the nodes where CPU load was high. The load came
> down immediately and the ReadStage queue cleared: all pending read tasks
> were processed.
> Can you please suggest what might be causing this issue? Also how can we
> troubleshoot and fix it?
> Thanks,
> Venky
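
For anyone wanting to catch this kind of ReadStage backlog automatically, here is a minimal sketch in Python that scans `nodetool tpstats` output for pools whose Pending count exceeds a threshold. The function names and the threshold of 100 are illustrative assumptions, not part of Cassandra or DSE, and the parser assumes the column layout shown in the report above (pool name, then Active, Pending, Completed).

```python
import re

# Matches a thread-pool row: name followed by at least three integer columns
# (Active, Pending, Completed). Header and "Message type" rows do not match.
ROW = re.compile(r"^([\w-]+)\s+(\d+)\s+(\d+)\s+(\d+)")

def parse_tpstats(text):
    """Parse `nodetool tpstats` text into {pool: {active, pending, completed}}."""
    pools = {}
    for line in text.splitlines():
        # Drop mail quoting ("> ") in case the output was pasted from an email.
        line = line.lstrip("> ").strip()
        m = ROW.match(line)
        if m:
            name, active, pending, completed = m.groups()
            pools[name] = {
                "active": int(active),
                "pending": int(pending),
                "completed": int(completed),
            }
    return pools

def backlogged(pools, max_pending=100):
    """Return pool names whose pending count exceeds max_pending (assumed threshold)."""
    return [name for name, stats in pools.items() if stats["pending"] > max_pending]

# Sample taken from the tpstats output quoted in this issue.
sample = """\
Pool Name        Active   Pending   Completed   Blocked   All time blocked
MutationStage         0         0     8493347         0                  0
ReadStage            32      3699        5835         0                  0
"""

if __name__ == "__main__":
    print(backlogged(parse_tpstats(sample)))
```

In practice the text would come from running `nodetool tpstats` on each node at a fixed interval. A suitable threshold depends on the workload and is left as a parameter.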