RE: Questions about high read latency and related metrics
Hello C* community,

I have been experimenting a bit with my lab node. Based on how the metrics progress over time, I am assuming the following:

1. The EstimatedPartitionSizeHistogram metric derives from read operations. Cassandra reports values to this metric as it serves read queries.
2. The PartitionSize metrics derive from compaction activity. Cassandra reports values to these metrics as it compacts sstables.

I am not sure whether these assumptions are valid, but they at least provide a good explanation for the progress of the stats I observed.

Thanks a lot and CU on the next topic.

BR
MK

From: Michail Kotsiouros via user
Sent: Thursday, May 11, 2023 14:08
To: user@cassandra.apache.org
Subject: RE: Questions about high read latency and related metrics

Hello Erick,

No, the Max/Min/Mean vs Histogram difference is clear. What confuses me is the description of those metrics: "Size of the compacted partition (in bytes)" vs "estimated partition size". I am after what is measured by each metric. To be more specific:

What metric should be considered when we want to see the partition size over time?
Does "compacted partition" mean that only the partitions which have undergone a compaction in the respective sstables are taken into account for the PartitionSize metrics?
What does "estimated" mean in the EstimatedPartitionSizeHistogram metric?

Excuse me if those questions sound trivial.

BR
MK

From: Erick Ramirez
Sent: Thursday, May 11, 2023 13:16
To: user@cassandra.apache.org; Michail Kotsiouros
Subject: Re: Questions about high read latency and related metrics

Is it the concept of histograms that's not clear? Something else?
RE: Questions about high read latency and related metrics
Hello Erick,

No, the Max/Min/Mean vs Histogram difference is clear. What confuses me is the description of those metrics: "Size of the compacted partition (in bytes)" vs "estimated partition size". I am after what is measured by each metric. To be more specific:

What metric should be considered when we want to see the partition size over time?
Does "compacted partition" mean that only the partitions which have undergone a compaction in the respective sstables are taken into account for the PartitionSize metrics?
What does "estimated" mean in the EstimatedPartitionSizeHistogram metric?

Excuse me if those questions sound trivial.

BR
MK

From: Erick Ramirez
Sent: Thursday, May 11, 2023 13:16
To: user@cassandra.apache.org; Michail Kotsiouros
Subject: Re: Questions about high read latency and related metrics

Is it the concept of histograms that's not clear? Something else?
Re: Questions about high read latency and related metrics
Is it the concept of histograms that's not clear? Something else?
RE: Questions about high read latency and related metrics
Hello Erick,

Thanks a lot for the immediate reply, but the difference between those two metrics is still not clear to me.

BR
MK

From: Erick Ramirez
Sent: Thursday, May 11, 2023 13:04
To: user@cassandra.apache.org
Subject: Re: Questions about high read latency and related metrics

The min/max/mean partition sizes are sizes in bytes and are the same statistics reported by nodetool tablestats. EstimatedPartitionSizeHistogram is the distribution of partition sizes within specified ranges (percentiles) and is the same histogram reported by nodetool tablehistograms (in the Partition Size column).

Cheers!
Re: Questions about high read latency and related metrics
The min/max/mean partition sizes are sizes in bytes and are the same statistics reported by nodetool tablestats. EstimatedPartitionSizeHistogram is the distribution of partition sizes within specified ranges (percentiles) and is the same histogram reported by nodetool tablehistograms (in the Partition Size column).

Cheers!
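To make the "distribution within ranges" idea concrete, here is a minimal Python sketch of how a percentile can be read off a bucketed histogram. It is an illustration only, not Cassandra's code: the function names bucket_offsets and percentile are hypothetical, the sample counts are made up, and the bucket boundaries are assumed to grow by roughly 20% per bucket, which matches the 1, 2, 3, ... 8, 10, 12, 14, 17 style values typically seen in nodetool tablehistograms output.

```python
import math


def bucket_offsets(n):
    """Return n bucket upper bounds, each about 20% larger than the last.

    Assumption: this growth rule is used for illustration; check the
    Cassandra source for the exact scheme used by your version.
    """
    offsets = [1]
    while len(offsets) < n:
        last = offsets[-1]
        nxt = int(last * 1.2 + 0.5)  # round half up
        if nxt == last:              # always advance by at least 1
            nxt += 1
        offsets.append(nxt)
    return offsets


def percentile(offsets, counts, p):
    """Return the upper bound of the bucket containing the p-th percentile.

    counts[i] is the number of partitions whose size fell into the bucket
    ending at offsets[i]. The result is an estimate: every partition in a
    bucket is reported as the bucket's upper bound.
    """
    total = sum(counts)
    target = math.ceil(total * p)
    if target == 0:
        return 0
    seen = 0
    for offset, count in zip(offsets, counts):
        seen += count
        if seen >= target:
            return offset
    return offsets[-1]


# Hypothetical data: 10 partitions spread over the last three buckets.
offsets = bucket_offsets(10)
counts = [0, 0, 0, 0, 0, 0, 0, 5, 3, 2]
p50 = percentile(offsets, counts, 0.50)
p99 = percentile(offsets, counts, 0.99)
```

Because each value is rounded up to its bucket boundary, the reported sizes are approximate, which is one plausible reading of "estimated" in the metric name.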
Questions about high read latency and related metrics
Hello Cassandra community,

I see the following metrics in JMX (org.apache.cassandra.metrics.Table...):

Metric Name                        Type    Description
MinPartitionSize                   Gauge   Size of the smallest compacted partition (in bytes).
MaxPartitionSize                   Gauge   Size of the largest compacted partition (in bytes).
MeanPartitionSize                  Gauge   Size of the average compacted partition (in bytes).

And:

Metric Name                        Type    Description
EstimatedPartitionSizeHistogram    Gauge   Histogram of estimated partition size (in bytes).

Could you please help me clarify the difference between those two metrics? We suspect that the partition size growth caused by the application data model has an impact on read latency. Which would be the appropriate metric to monitor: PartitionSize or EstimatedPartitionSizeHistogram?

BR
Michail Kotsiouros