C. Scott Andreas updated CASSANDRA-6486:
    Component/s: Metrics

> Latency Measurement
> -------------------
>                 Key: CASSANDRA-6486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6486
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Metrics
>            Reporter: Benedict
>            Priority: Minor
> Latency measurement in Cassandra is currently suboptimal. Exactly what the 
> latency measurements tell you isn't intuitively clear due to their 
> exponentially decaying sampling, but they amount to some view of the latency 
> per (unweighted) operation over approximately the past 10-minute period, with 
> greater weight given to more recent operations. This has some obvious flaws, 
> the most notable being that, due to probabilistic sampling, large outlier 
> events (e.g. GC pauses) can easily be lost over a multi-minute time horizon, 
> and even when caught they are unlikely to appear even in the 99.9th 
> percentile, since they account for a numerically tiny fraction of events.
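> To make the outlier problem concrete, some back-of-envelope arithmetic (the 
> 1028-entry reservoir is the Metrics library default; the op rate and horizon 
> are illustrative assumptions, not measurements):
> {code:java}
> // Illustration only (not Cassandra code): how easily a single outlier
> // disappears from a fixed-size reservoir under uniform retention.
> public class OutlierLossSketch
> {
>     public static void main(String[] args)
>     {
>         int reservoirSize = 1028;   // Metrics' default reservoir size
>         double opsPerSec = 10_000;  // assumed op rate
>         double horizonSec = 600;    // ~10 minute decay horizon
>         double totalEvents = opsPerSec * horizonSec;
>
>         // A single GC-length outlier holds one of reservoirSize slots out
>         // of totalEvents candidates, so the chance it is still sampled:
>         System.out.printf("P(outlier retained) ~ %.4f%%%n",
>                           100 * reservoirSize / totalEvents);
>
>         // Even when retained, one sample is ~1/1028 of the mass, so it sits
>         // right at the 99.9th percentile boundary and vanishes below it:
>         System.out.printf("mass of one sample ~ %.3f%%%n",
>                           100.0 / reservoirSize);
>     }
> }
> {code}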
> I'm generally thinking about how we might improve on this, and want to dump 
> my ideas here for discussion. I think the following things should be targeted:
> 1) Ability to see uniform latency measurements for different time horizons 
> stretching back from the present, e.g. last 1s, 1m, 1hr and 1day
> 2) Ability to bound the error margin of statistics for all of these intervals
> 3) Protect against losing outlier measurements
> 4) Possibly offer the ability to weight statistics, so that longer latencies 
> are not underplayed even when they are captured
> 5) Preferably non-blocking, memory efficient, and relatively garbage-free
> (3) and (4) are the trickiest, as a theoretically sound and general approach 
> isn't immediately obvious. There are a number of possibilities that spring to 
> mind:
> 1) ensure that we have enough sample points that we are probabilistically 
> guaranteed not to lose them; but over large time horizons this is problematic 
> due to memory constraints, and it doesn't address (4);
> 2) count large events multiple times (or sub-slices of the events), based on 
> e.g. average op-rate. I am not a fan of this idea because it makes possibly 
> bad assumptions about behaviour and doesn't seem very theoretically sound;
> 3) weight the probability of retaining an event by its length (a sketch 
> follows this list). The problem with this approach is that it ties you into 
> (4) without offering the current view of statistics (i.e. unweighted 
> operations), and it also doesn't lend itself to efficient implementation
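> For reference, (3) would amount to something like the classic weighted 
> reservoir scheme (Efraimidis/Spirakis), where each event draws a key 
> u^(1/w) and we keep the k largest keys; names and sizes are illustrative:
> {code:java}
> import java.util.PriorityQueue;
> import java.util.concurrent.ThreadLocalRandom;
>
> // Sketch of option (3): retain events with probability proportional to
> // their duration. Min-heap on key, so the smallest key is evicted first.
> class WeightedSample
> {
>     private final int capacity;
>     private final PriorityQueue<double[]> heap =
>         new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
>
>     WeightedSample(int capacity) { this.capacity = capacity; }
>
>     void offer(long latencyMicros)
>     {
>         double u = ThreadLocalRandom.current().nextDouble();
>         double key = Math.pow(u, 1.0 / latencyMicros); // weight = duration
>         if (heap.size() < capacity)
>             heap.add(new double[]{ key, latencyMicros });
>         else if (key > heap.peek()[0])
>         {
>             heap.poll();
>             heap.add(new double[]{ key, latencyMicros });
>         }
>     }
> }
> {code}
> The per-event Math.pow and heap maintenance are part of why this doesn't 
> lend itself to an efficient implementation on the hot path.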
> I'm currently leaning towards a fourth approach, which attempts to hybridise 
> uniform sampling and histogram behaviour by separating the sample space into 
> ranges, each some multiple of the size of the last (say 2x). Each range has a 
> uniform sample of the events that occurred in that range, plus a count of 
> total events. Ideally the size of the sample will vary with the number of 
> events occurring in each range, but with a lower bound calculated to ensure 
> we do not lose events.
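> A rough sketch of the structure (names and sizes hypothetical):
> {code:java}
> import java.util.concurrent.ThreadLocalRandom;
>
> // Latencies are bucketed into ranges, each covering 2x the span of the
> // previous; every range keeps a total count plus a uniform reservoir
> // sample of the events that fell into it.
> class RangedSample
> {
>     static final int RANGES = 20;     // ~1us up to ~0.5s; larger values
>                                       // clamp into the top range
>     static final int MIN_SAMPLE = 16; // lower bound so outliers survive; a
>                                       // real version would grow this with
>                                       // the range's event rate
>
>     final long[] counts = new long[RANGES];
>     final long[][] samples = new long[RANGES][MIN_SAMPLE];
>
>     void record(long latencyMicros)
>     {
>         int r = rangeOf(latencyMicros);
>         long n = ++counts[r];
>         if (n <= MIN_SAMPLE)
>             samples[r][(int) n - 1] = latencyMicros; // fill phase
>         else
>         {
>             // standard reservoir replacement: keep w.p. MIN_SAMPLE / n
>             long i = ThreadLocalRandom.current().nextLong(n);
>             if (i < MIN_SAMPLE)
>                 samples[r][(int) i] = latencyMicros;
>         }
>     }
>
>     static int rangeOf(long latencyMicros)
>     {
>         // log2 bucketing: range index = position of the highest set bit
>         int bit = 63 - Long.numberOfLeadingZeros(Math.max(1, latencyMicros));
>         return Math.min(bit, RANGES - 1);
>     }
> }
> {code}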
> This approach lends itself to all 5 goals above:
> 1) by maintaining the same structure for each time horizon, and uniformly 
> sampling from all of the directly lower order time horizons to maintain it;
> 2) by imposing minimum sample sizes for each range;
> 3) ditto (2);
> 4) by producing time/frequency-weighted statistics using the samples and 
> counts from each range (see the sketch after this list);
> 5) with thread-local, array-based timers that the owning thread synchronises 
> with the global timer once every minimum period
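> For (4), a frequency-weighted percentile can then be derived from the counts 
> and samples alone. A sketch, where sampleSizes tracks how many slots of each 
> range's sample are filled (names hypothetical):
> {code:java}
> import java.util.Arrays;
>
> // Each sampled value in range r stands in for counts[r] / sampleSizes[r]
> // events, restoring the range's full mass when computing percentiles.
> class WeightedStats
> {
>     static double percentile(long[] counts, long[][] samples,
>                              int[] sampleSizes, double q)
>     {
>         double target = q * Arrays.stream(counts).sum();
>         double seen = 0;
>         double last = Double.NaN; // falls back to the largest sample seen
>         for (int r = 0; r < counts.length; r++) // ranges ordered by size
>         {
>             if (counts[r] == 0)
>                 continue;
>             long[] s = Arrays.copyOf(samples[r], sampleSizes[r]);
>             Arrays.sort(s);
>             double perSample = (double) counts[r] / sampleSizes[r];
>             for (long v : s)
>             {
>                 seen += perSample; // mass carried by this one sample
>                 last = v;
>                 if (seen >= target)
>                     return v;
>             }
>         }
>         return last;
>     }
> }
> {code}
> A time-weighted variant would additionally scale each sample's mass by its 
> duration, which is what stops long operations being underplayed.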
> This also builds reasonably nicely on the timers I have already written for 
> CASSANDRA-6199, so some of the work is already done.
> Thoughts / discussion would be welcome, especially if you think I've missed 
> another obvious approach.
