Benoît Canet <benoit.ca...@nodalink.com> writes:

> On Mon, Sep 08, 2014 at 05:09:38PM +0200, Paolo Bonzini wrote:
>> On 08/09/2014 16:49, Benoît Canet wrote:
>> >> > - create two windows, with twice the suggested expiration period, and
>> >> >   return min/avg/max from the oldest window.  Example:
>> >> >
>> >> >       t=0          |t=1          |t=2          |t=3          |t=4
>> >> >       wnd0: [0,1)  |wnd0: [1,3)  |             |wnd0: [3,5)  |
>> >> >       wnd1: [0,2)  |             |wnd1: [2,4)  |             |
>> >> >
>> >> >   Values are returned from:
>> >> >
>> >> >       wnd0---------|wnd1---------|wnd0---------|wnd1---------|
>> >
>> > This is neat.
>>
>> Alternatively, you can make it probabilistically correct:
>>
>>     t=0            |t=0.66           |t=1.33             |t=2              |t=2.66
>>                    |wnd0: [0.66,2)   |                   |wnd0: [2,3.33)   |
>>     wnd1: [0,0.66) |                 |wnd1: [1.33,2.66)  |                 |
>>
>> Return from:
>>
>>     wnd1-----------|wnd1-------------|wnd0---------------|wnd1-------------|wnd0
>>
>> So you always have at least 2/3 of a second's worth of data, and on
>> average exactly 1 second's worth of data.
>>
>> The problem is the delay in getting data, which can be big for the minute-
>> and hour-based statistics.  Suppose you have a spike that lasts 10 seconds;
>> it might not show in the minute-based statistics for as much as 30 seconds
>> after it ends (the window switches every 40 seconds).
>>
>> For min/max you could return min(min0, min1) and max(max0, max1).  Only the
>> average has this problem.
>>
>> Exponential smoothing doesn't have this problem.  IIRC uptime uses that.
>
> I am writing this so cloud end users can programmatically get information
> about their VMs' disk statistics.
>
> Cloud end users are known to use their cloud API to script the elasticity
> of their architecture.
> Some code will poll system statistics to decide if new instances must be
> launched or existing instances must be pruned.
> This means introducing a delay in the accounting code would slow down their
> decisions.
>
> min and max are also useful to know since they give an idea of the deviation.
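To make the first two-window scheme quoted above concrete, here is a rough
sketch in Python. It is only an illustration of the idea, not code from any
posted patch: the names TwoWindowStats, add() and query() are made up, and
timestamps are passed in explicitly (an assumption, to keep the example
deterministic) instead of being read from a clock.

```python
import math


class Window:
    """Accumulates count/sum/min/max until it is reset."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.count = 0
        self.total = 0.0
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


class TwoWindowStats:
    """Two windows of length 2*period, staggered by one period.

    Hypothetical helper: values are always reported from the oldest
    window, which is the one that will be reset next.
    """

    def __init__(self, period, start=0.0):
        self.period = period
        self.start = start
        self.windows = [Window(), Window()]
        self.boundaries_seen = 0  # period boundaries already processed

    def _rotate(self, now):
        slot = int((now - self.start) // self.period)
        # Boundary k resets window (k - 1) % 2.  After a long idle gap
        # only the last two boundaries matter, so skip the earlier ones.
        for k in range(self.boundaries_seen + 1, slot + 1)[-2:]:
            self.windows[(k - 1) % 2].reset()
        self.boundaries_seen = max(self.boundaries_seen, slot)

    def add(self, value, now):
        self._rotate(now)
        for window in self.windows:
            window.add(value)

    def query(self, now):
        """Return (min, avg, max) from the oldest window, or None if empty."""
        self._rotate(now)
        oldest = self.windows[self.boundaries_seen % 2]
        if oldest.count == 0:
            return None
        return (oldest.minimum, oldest.total / oldest.count, oldest.maximum)
```

With period=1.0, a value added at t=0.5 and another at t=1.5 are both visible
to a query at t=1.5, because wnd1 still covers [0,2). After t=2 the first
value drops out, since wnd0 was restarted at t=1.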
For what it's worth, the algorithm in the Dr. Dobb's article Paolo referenced can also compute a standard deviation. Can we figure out what users really want: standard deviation, min/max, or both? [...]
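To give an idea of how cheap "both" would be, here is a small Python sketch
that tracks min, max, mean and standard deviation in a single pass. It uses
Welford's online algorithm, which I am assuming is close in spirit to the
Dr. Dobb's one but may differ in detail. The class name RunningStats and its
methods are made up for illustration.

```python
import math


class RunningStats:
    """Single-pass min/max/mean/stddev using Welford's online update."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared differences from the current mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the old and the new mean; this is the numerically stable form.
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def stddev(self):
        """Sample standard deviation; 0.0 with fewer than two samples."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))
```

Each update costs a handful of arithmetic operations and needs no sample
history, so the accumulator could sit next to the existing min/max
bookkeeping in each window.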