[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095171#comment-14095171 ]
Cyril Scetbon edited comment on CASSANDRA-7731 at 8/13/14 6:49 AM:
-------------------------------------------------------------------

[~snazy] On the first part, you're totally right about the use of exponentially decaying reservoirs; I didn't see that at first. Cool. You could use better names for the variables, but as long as the code does what it should, that's fine with me :) The message in CFSTATS is clear about this. Renaming the variables to be clearer could help developers too.

On the second part, that's a yes. I think we really need to know the last max for the number of live and tombstone cells read. We hit two development bugs related to this, and monitoring it could really help! So using two more histograms (with biased=true) for those max values should help and is a must-have.

Can you also confirm that calling CFSTATS does not reset internal counters at the end of the call? I understand that it doesn't for the histograms above, but what about the others?
> Average live/tombstone cells per slice
> --------------------------------------
>
> Key: CASSANDRA-7731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7731
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Cyril Scetbon
> Assignee: Robert Stupp
> Priority: Minor
>
> I think you should not say that slice statistics are valid for the [last five minutes|https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in the CFSTATS command of nodetool. I've read the Yammer documentation for Histograms, and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] the histogram. The only thing I can see is that the last snapshot used to provide the median (or whatever you'd use instead) is based on 1028 values.
> I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query. That's not possible for now without, for example, activating DEBUG for SliceQueryFilter and tweaking the threshold. Currently, since nodetool cfstats returns the median, we miss it when a small share of the queries scan a lot of live/tombstone cells!
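
To illustrate the last point of the description, here is a minimal sketch of why a median hides a small share of expensive reads while a max exposes them. It is plain Python and does not use the Yammer metrics library; the function name `summarize` and the workload numbers are hypothetical, chosen only for the example.

```python
import statistics


def summarize(cells_per_read):
    """Return (median, maximum) of the tombstone cells scanned per read."""
    return statistics.median(cells_per_read), max(cells_per_read)


# Hypothetical workload: 990 reads scan 5 tombstone cells each,
# while 10 reads (1% of the total) scan 50,000 each.
sample = [5] * 990 + [50_000] * 10

median, maximum = summarize(sample)
print(f"median={median} max={maximum}")
```

With this sample the median stays at 5, so the 1% of reads that scan 50,000 cells only become visible through the max.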