[ 
https://issues.apache.org/jira/browse/CASSANDRA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792890#action_12792890
 ] 

Ramzi Rabah commented on CASSANDRA-646:
---------------------------------------

If we want readLatency to cover only the time spent reading from disk and to 
exclude network delay, can we at least log the exception on the server when a 
TimedOutException occurs (as we did in version 0.4)? The reason I mention this 
is that we had a large number of TimedOutExceptions being thrown and the 
system was suffering badly, yet the server logs and cfstats showed everything 
running perfectly fine. Only when we dug into the client logs did we notice 
the timeouts. This makes monitoring the health of the system harder when many 
clients are connected to the Cassandra servers and you have to check each 
client's logs separately. 
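
To make the request concrete, here is a minimal sketch of the kind of 
server-side handling I have in mind. It is not Cassandra code: the class 
TimeoutLoggingSketch, the method readWithTimeout, and the timeouts counter are 
hypothetical names, and it uses the JDK's java.util.concurrent.TimeoutException 
as a stand-in for TimedOutException. The point is only that the timeout gets 
logged and counted on the server before it is passed on to the client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical sketch, not Cassandra code.
public class TimeoutLoggingSketch {
    private static final Logger LOG = Logger.getLogger("TimeoutLoggingSketch");

    // Server-side count of timed-out operations; a stats tool could expose it.
    static final AtomicLong timeouts = new AtomicLong();

    // Runs the read on the given pool and waits at most timeoutMs for a result.
    // On timeout: count it, log it on the server, then rethrow to the caller.
    static <T> T readWithTimeout(Callable<T> read, long timeoutMs, ExecutorService pool)
            throws Exception {
        Future<T> future = pool.submit(read);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            timeouts.incrementAndGet();
            LOG.warning("read timed out after " + timeoutMs + " ms");
            future.cancel(true); // interrupt the still-running read
            throw e;
        }
    }
}
```

With something like this, a single server log (or a counter next to the 
latency numbers) would show the timeouts without anyone having to collect logs 
from every client.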

> Fix few minor problems in nodeprobe cfstats
> -------------------------------------------
>
>                 Key: CASSANDRA-646
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-646
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 0.5
>            Reporter: Ramzi Rabah
>            Priority: Minor
>
> nodeprobe cfstats reports the read latency/write latency as NaN at the 
> keyspace level even though it obviously is not.
> For example:
> Keyspace: Keyspace1
>         Read Count: 392
>         Read Latency: NaN ms.
>         Write Count: 262
>         Write Latency: NaN ms.
>         Pending Tasks: 0
>                 Column Family: MyCF
>                 Memtable Columns Count: 143
>                 Memtable Data Size: 123433
>                 Memtable Switch Count: 2
>                 Read Count: 392
>                 Read Latency: 0.533 ms.
>                 Write Count: 262
>                 Write Latency: 0.000 ms.
>                 Pending Tasks: 0
>                 Column Family: Standard2
>                 Memtable Columns Count: 0
>                 Memtable Data Size: 0
>                 Memtable Switch Count: 0
>                 Read Count: 0
>                 Read Latency: NaN ms.
>                 Write Count: 0
>                 Write Latency: NaN ms.
>                 Pending Tasks: 0
> The problem here is that there is more than one cf, and one of them has a 
> read latency/write latency of NaN. This causes the keyspace read 
> latency/write latency to be NaN instead of the average across all cfs. 
> Another problem with cfstats is that it does not account for the delays when 
> a read/write times out, so it does not accurately reflect the health of the 
> system when it is under heavy stress. 
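
The NaN propagation described above can be reproduced with a small sketch. 
This is not the actual nodeprobe code: the class KeyspaceLatency, the CfStats 
holder, and both methods are hypothetical, and the count-weighted average is 
only one possible fix. It shows that any arithmetic involving a NaN per-cf 
latency yields NaN, and that skipping column families with no operations 
avoids it.

```java
// Hypothetical sketch, not the nodeprobe implementation.
public class KeyspaceLatency {
    // Per-column-family sample: operation count and mean latency in ms.
    // A cf with no operations reports NaN (0 / 0) as its latency.
    static final class CfStats {
        final long count;
        final double latencyMs;

        CfStats(long count, double latencyMs) {
            this.count = count;
            this.latencyMs = latencyMs;
        }
    }

    // Plain mean over all cfs: a single NaN entry makes the result NaN.
    static double naiveAverage(CfStats[] cfs) {
        double sum = 0.0;
        for (CfStats cf : cfs) {
            sum += cf.latencyMs;
        }
        return sum / cfs.length;
    }

    // Count-weighted mean that skips cfs with no operations.
    // Returns NaN only when the whole keyspace has seen no operations.
    static double weightedAverage(CfStats[] cfs) {
        long totalCount = 0;
        double totalTimeMs = 0.0;
        for (CfStats cf : cfs) {
            if (cf.count == 0 || Double.isNaN(cf.latencyMs)) {
                continue;
            }
            totalCount += cf.count;
            totalTimeMs += cf.count * cf.latencyMs;
        }
        return totalCount == 0 ? Double.NaN : totalTimeMs / totalCount;
    }

    public static void main(String[] args) {
        // Read figures from the example output: MyCF and Standard2.
        CfStats[] cfs = { new CfStats(392, 0.533), new CfStats(0, Double.NaN) };
        System.out.println("naive:    " + naiveAverage(cfs));
        System.out.println("weighted: " + weightedAverage(cfs));
    }
}
```

Weighting by operation count also keeps a busy cf from being averaged equally 
with a nearly idle one.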

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
