[ https://issues.apache.org/jira/browse/CASSANDRA-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848498#comment-13848498 ]

Benedict commented on CASSANDRA-6486:
-------------------------------------

So, after thinking about this a little more, I may be leaning towards a 
slightly modified approach that avoids the per-thread allocation and dynamic 
resizing of ranges in favour of a single global reservoir that is updated 
directly by each thread. This has the disadvantage that the intervals you're 
timing are more difficult to define, but we really don't need that kind of 
precision when measuring many-microsecond and above events. 

I'm currently thinking of using a rolling collection of sample-histograms (say 
10 per timer) to provide a rolling window over the desired measurement 
interval; on retiring the oldest sample-histogram, its contents can be merged 
into the next tier of intervals we're measuring.

Alternatively we could take the same approach but with just a regular 
histogram. I currently prefer the sampled approach because, even with larger 
windows than a histogram, the distribution for any window is more likely to 
closely approximate a normal distribution, and so should give a more accurate 
picture of latencies for the interval even with a very small sample size.
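
Roughly the shape I have in mind, as a sketch only (names, sizes, and the 
synchronisation strategy are all illustrative, and concurrency control is 
glossed over entirely):

{code}
import java.util.Random;

// Sketch only: a rolling window of fixed-size uniform reservoirs
// ("sample-histograms"), say 10 per timer; retiring the oldest folds its
// samples into the next, coarser tier.
final class SampleHistogram
{
    final long[] samples;
    long count;
    private final Random random = new Random();

    SampleHistogram(int sampleSize)
    {
        samples = new long[sampleSize];
    }

    // reservoir sampling (Vitter's Algorithm R): keep a uniform sample of
    // everything recorded into this sub-interval
    void record(long latencyNanos)
    {
        long n = ++count;
        if (n <= samples.length)
            samples[(int) (n - 1)] = latencyNanos;
        else
        {
            long i = (long) (random.nextDouble() * n);
            if (i < samples.length)
                samples[(int) i] = latencyNanos;
        }
    }
}

final class RollingTimer
{
    private final SampleHistogram[] window; // rolling sub-intervals for this tier
    private final RollingTimer nextTier;    // coarser interval (e.g. 1m -> 1h -> 1d), or null
    private int current;                    // index of the sub-interval being filled

    RollingTimer(int subIntervals, int sampleSize, RollingTimer nextTier)
    {
        window = new SampleHistogram[subIntervals];
        for (int i = 0; i < subIntervals; i++)
            window[i] = new SampleHistogram(sampleSize);
        this.nextTier = nextTier;
    }

    void record(long latencyNanos)
    {
        window[current].record(latencyNanos);
    }

    // called once per sub-interval: advance the window, retiring the oldest
    // sample-histogram and merging its samples into the next tier before reuse
    void roll()
    {
        current = (current + 1) % window.length;
        SampleHistogram retired = window[current];
        if (nextTier != null)
        {
            int filled = (int) Math.min(retired.count, retired.samples.length);
            for (int i = 0; i < filled; i++)
                nextTier.record(retired.samples[i]);
        }
        window[current] = new SampleHistogram(retired.samples.length);
    }
}
{code}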



> Latency Measurement
> -------------------
>
>                 Key: CASSANDRA-6486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6486
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Benedict
>
> Latency measurement in Cassandra is currently suboptimal. Exactly what the 
> latency measurements tell you isn't intuitively clear because of their 
> exponential decay, but they amount to some view of the latency per 
> (unweighted) operation over approximately the past 10 minutes, with greater 
> weight given to more recent operations. This has some obvious flaws, the 
> most notable being that, due to probabilistic sampling, large outlier events 
> (e.g. GC) can easily be lost over a multi-minute time horizon, and even when 
> caught they are unlikely to appear even in the 99.9th percentile, because 
> they account for only a tiny fraction of events numerically.
> I'm generally thinking about how we might improve on this, and want to dump 
> my ideas here for discussion. I think the following things should be targeted:
> 1) Ability to see uniform latency measurements for different time horizons 
> stretching back from the present, e.g. last 1s, 1m, 1hr and 1day
> 2) Ability to bound the error margin of statistics for all of these intervals
> 3) Protect against losing outlier measurements
> 4) Possibly offer the ability to weight statistics, so that longer latencies 
> are not underplayed even if they are counted
> 5) Preferably non-blocking, memory efficient, and relatively garbage-free
> (3) and (4) are the trickiest, as a theoretically sound and general approach 
> isn't immediately obvious. There are a number of possibilities that spring to 
> mind:
> 1) ensure that we have enough sample points that we are probabilistically 
> guaranteed to not lose them, but over large time horizons this is problematic 
> due to memory constraints, and it doesn't address (4);
> 2) count large events multiple times (or sub-slices of the events), based on 
> e.g. average op-rate. I am not a fan of this idea because it makes possibly 
> bad assumptions about behaviour and doesn't seem very theoretically sound;
> 3) weight the probability of retaining an event by its length. The problem 
> with this approach is that it ties you into (4) without offering the current 
> view of statistics (i.e. unweighted operations), and it also doesn't lend 
> itself to efficient implementation.
> I'm currently leaning towards a fourth approach, which attempts to hybridise 
> uniform sampling and histogram behaviour by separating the sample space into 
> ranges, each some multiple of the last (say 2x the size). Each range has a 
> uniform sample of events that occurred in that range, plus a count of total 
> events. Ideally the size of the sample will be variable based on the number 
> of events occurring in any range, but with a lower bound calculated to 
> ensure we do not lose events.
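> To make the shape concrete, a rough sketch of the structure (sizes and names 
> are purely illustrative, and thread-safety is ignored here):
> {code}
> import java.util.Random;
>
> // illustrative sketch only: the sample space is split into ranges, each
> // covering latencies up to 2x the previous bound; each range keeps a total
> // count plus a uniform sample of the events that occurred in it
> final class RangeSampledReservoir
> {
>     private final long[][] samples;    // samples[r] = uniform sample for range r
>     private final long[] counts;       // counts[r]  = total events seen in range r
>     private final long minBoundNanos;  // upper bound of range 0; each range doubles it
>     private final Random random = new Random();
>
>     RangeSampledReservoir(int ranges, int samplesPerRange, long minBoundNanos)
>     {
>         samples = new long[ranges][samplesPerRange];
>         counts = new long[ranges];
>         this.minBoundNanos = minBoundNanos;
>     }
>
>     void record(long latencyNanos)
>     {
>         int r = rangeOf(latencyNanos);
>         long n = ++counts[r];
>         if (n <= samples[r].length)
>             samples[r][(int) (n - 1)] = latencyNanos;
>         else
>         {
>             // reservoir sampling keeps the per-range sample uniform
>             long i = (long) (random.nextDouble() * n);
>             if (i < samples[r].length)
>                 samples[r][(int) i] = latencyNanos;
>         }
>     }
>
>     private int rangeOf(long latencyNanos)
>     {
>         int r = 0;
>         long bound = minBoundNanos;
>         while (latencyNanos >= bound && r < counts.length - 1)
>         {
>             bound <<= 1;  // each range is 2x the size of the last
>             r++;
>         }
>         return r;
>     }
> }
> {code}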
> This approach lends itself to all 5 goals above:
> 1) by maintaining the same structure for each time horizon, and uniformly 
> sampling from all of the directly lower order time horizons to maintain it;
> 2) by imposing minimum sample sizes for each range;
> 3) ditto (2);
> 4) by producing time/frequency-weighted statistics using the samples and 
> counts from each range;
> 5) with thread-local array-based timers that are synchronised with the 
> global timer once every minimum period, by the owning thread (see the 
> sketch below)
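> For (5), roughly the sort of thread-local shape I mean (again a sketch only; 
> the buffer size, sync period and names are illustrative, and the global 
> structure's own coordination is elided):
> {code}
> // illustrative sketch only: each thread records into its own small array and
> // the owning thread folds it into the shared structure once per minimum
> // period, so the hot path stays unsynchronised and allocation-free
> final class ThreadLocalRecorder
> {
>     static final class Buffer
>     {
>         final long[] pending = new long[128];
>         int size;
>         long lastSyncNanos = System.nanoTime();
>     }
>
>     private final ThreadLocal<Buffer> buffers = new ThreadLocal<Buffer>()
>     {
>         protected Buffer initialValue() { return new Buffer(); }
>     };
>
>     private final RangeSampledReservoir global;  // shared structure from the sketch above
>     private final long syncPeriodNanos;
>
>     ThreadLocalRecorder(RangeSampledReservoir global, long syncPeriodNanos)
>     {
>         this.global = global;
>         this.syncPeriodNanos = syncPeriodNanos;
>     }
>
>     void record(long latencyNanos)
>     {
>         Buffer b = buffers.get();
>         if (b.size == b.pending.length || System.nanoTime() - b.lastSyncNanos >= syncPeriodNanos)
>             sync(b);
>         b.pending[b.size++] = latencyNanos;
>     }
>
>     // only the owning thread touches its buffer; the shared structure would
>     // still need its own (ideally non-blocking) coordination, elided here
>     private void sync(Buffer b)
>     {
>         synchronized (global)
>         {
>             for (int i = 0; i < b.size; i++)
>                 global.record(b.pending[i]);
>         }
>         b.size = 0;
>         b.lastSyncNanos = System.nanoTime();
>     }
> }
> {code}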
> This also extends the timers I have already written for CASSANDRA-6199 
> reasonably nicely, so some of the work is already done.
> Thoughts / discussion would be welcome, especially if you think I've missed 
> another obvious approach.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
