Re: [Qemu-devel] [PATCH v2 3/3] util: Add an utility infrastructure used to compute an average on a time slice

Markus Armbruster Mon, 15 Sep 2014 04:15:08 -0700

Benoît Canet <benoit.ca...@nodalink.com> writes:

> On Mon, Sep 08, 2014 at 05:09:38PM +0200, Paolo Bonzini wrote:
>> Il 08/09/2014 16:49, Benoît Canet ha scritto:
>> >> > - create two windows, with twice the suggested expiration period, and
>> >> > return min/avg/max from the oldest window.  Example
>> >> > 
>> >> >        t=0          |t=1          |t=2          |t=3          |t=4
>> >> >        wnd0: [0,1)  |wnd0: [1,3)  |             |wnd0: [3,5)  |
>> >> >        wnd1: [0,2)  |             |wnd1: [2,4)  |             |
>> >> > 
>> >> > Values are returned from:
>> >> > 
>> >> >        wnd0---------|wnd1---------|wnd0---------|wnd1---------|
>> > 
>> > This is neat.
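
For illustration, the two staggered windows described above could be sketched roughly as follows. This is only a sketch under assumptions: the names (TimedAverage, ta_account, ta_avg) and the integer tick-based clock are hypothetical and are not taken from the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the scheme, not the patch's code. */
typedef struct {
    int64_t expiration;   /* time at which this window is reset */
    uint64_t sum;         /* sum of the values accounted so far */
    uint64_t count;       /* number of values accounted so far */
} Window;

typedef struct {
    int64_t period;       /* the suggested expiration period */
    Window wnd[2];
} TimedAverage;

static void ta_init(TimedAverage *ta, int64_t now, int64_t period)
{
    assert(period > 0);
    ta->period = period;
    /* Stagger the windows: wnd0 first expires after one period,
     * wnd1 after two.  Afterwards each one lives for two periods. */
    ta->wnd[0] = (Window){ .expiration = now + period };
    ta->wnd[1] = (Window){ .expiration = now + 2 * period };
}

/* Reset every window whose expiration time has passed. */
static void ta_expire(TimedAverage *ta, int64_t now)
{
    int64_t span = 2 * ta->period;

    for (int i = 0; i < 2; i++) {
        Window *w = &ta->wnd[i];
        if (now >= w->expiration) {
            w->sum = 0;
            w->count = 0;
            /* Skip whole spans so the windows stay on their grid. */
            w->expiration += span * ((now - w->expiration) / span + 1);
        }
    }
}

/* Account a new value in both windows. */
static void ta_account(TimedAverage *ta, int64_t now, uint64_t value)
{
    ta_expire(ta, now);
    for (int i = 0; i < 2; i++) {
        ta->wnd[i].sum += value;
        ta->wnd[i].count++;
    }
}

/* Average from the oldest window, i.e. the one that expires first. */
static uint64_t ta_avg(TimedAverage *ta, int64_t now)
{
    const Window *w;

    ta_expire(ta, now);
    w = ta->wnd[0].expiration <= ta->wnd[1].expiration
        ? &ta->wnd[0] : &ta->wnd[1];
    return w->count ? w->sum / w->count : 0;
}
```

With a period of 10 ticks, a value accounted at t=1 is still visible through wnd1 at t=15, after wnd0 has already been reset at t=10.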
>> 
>> Alternatively, you can make it probabilistically correct:
>> 
>>     t=0            |t=0.66           |t=1.33             |t=2              |t=2.66
>>                    |wnd0: [0.66,2)   |                   |wnd0: [2,3.33)   |
>>     wnd1: [0,0.66) |                 |wnd1: [1.33,2.66)  |                 |
>> 
>> Return from:
>> 
>>     wnd1-----------|wnd1-------------|wnd0---------------|wnd1-------------|wnd0
>> 
>> So you always have 2/3 seconds worth of data, and on average exactly 1 second
>> worth of data.
>> 
>> The problem is the delay in getting data, which can be big for the minute-
>> and hour-based statistics.  Suppose you have a spike that lasts 10 seconds,
>> it might not show in the minute-based statistics for as much as 30 seconds
>> after it ends (the window switches every 40 seconds).
>> 
>> For min/max you could return min(min0, min1) and max(max0, max1).  Only the
>> average has this problem.
>> 
>> Exponential smoothing doesn't have this problem.  IIRC uptime uses that.
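
To make the exponential smoothing idea concrete, here is a minimal sketch. The names (EmaState, ema_update) and the choice of a fixed per-sample weight alpha are assumptions for illustration only; nothing here comes from the patch or from how uptime computes its numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; a sketch, not code from the patch. */
typedef struct {
    double avg;     /* current smoothed value */
    double alpha;   /* weight given to each new sample, 0 < alpha <= 1 */
    bool primed;    /* false until the first sample has been seen */
} EmaState;

static void ema_init(EmaState *s, double alpha)
{
    assert(alpha > 0.0 && alpha <= 1.0);
    s->avg = 0.0;
    s->alpha = alpha;
    s->primed = false;
}

/* Fold one sample into the average and return the new value. */
static double ema_update(EmaState *s, double sample)
{
    if (!s->primed) {
        /* Start from the first sample instead of from zero. */
        s->avg = sample;
        s->primed = true;
    } else {
        /* Move the average a fraction alpha towards the sample. */
        s->avg += s->alpha * (sample - s->avg);
    }
    return s->avg;
}
```

Every sample affects the result immediately, so there is no window-switch delay. The price is that old samples never drop out completely, they only fade.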
>
> I am writing this so cloud end users can programmatically get information
> about their VMs' disk statistics.
>
> Cloud end users are known to use their cloud API to script the elasticity of
> their architecture.
> Some code will poll system statistics to decide if new instances must be
> launched or existing instances must be pruned.
> This means introducing a delay in the accounting code would slow down their
> decisions.
>
> Min and max are also useful to know since they give an idea of the deviation.


For what it's worth, the algorithm in the Dr. Dobb's article Paolo referenced
can compute a standard deviation.  Can we figure out what users really
want, standard deviation, min/max, or both?
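
To show that tracking both is cheap, here is a minimal single-pass sketch that keeps min, max, mean and variance together. It assumes Welford's online update, which may or may not be the method the referenced article uses, and the names (RunningStats, stats_add, stats_variance) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; assumes Welford's online algorithm. */
typedef struct {
    size_t n;       /* number of samples seen */
    double mean;    /* running mean */
    double m2;      /* sum of squared deviations from the mean */
    double min;     /* smallest sample, valid once n > 0 */
    double max;     /* largest sample, valid once n > 0 */
} RunningStats;

static void stats_init(RunningStats *s)
{
    s->n = 0;
    s->mean = 0.0;
    s->m2 = 0.0;
    s->min = 0.0;
    s->max = 0.0;
}

/* Fold one sample into the statistics in O(1) time and space. */
static void stats_add(RunningStats *s, double x)
{
    double delta;

    assert(x == x);    /* reject NaN, which would poison the state */

    if (s->n == 0 || x < s->min) {
        s->min = x;
    }
    if (s->n == 0 || x > s->max) {
        s->max = x;
    }

    delta = x - s->mean;
    s->n++;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean);   /* uses the updated mean */
}

/* Population variance; the standard deviation is its square root. */
static double stats_variance(const RunningStats *s)
{
    return s->n ? s->m2 / (double)s->n : 0.0;
}
```

For the samples 2, 4, 4, 4, 5, 5, 7, 9 this gives a mean of 5 and a variance of 4, so a standard deviation of 2, with min 2 and max 9.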

[...]
