* Vincent Guittot <vincent.guit...@linaro.org> wrote:

> On Thu, 25 Apr 2019 at 19:44, Ingo Molnar <mi...@kernel.org> wrote:
> >
> >
> > * Ingo Molnar <mi...@kernel.org> wrote:
> >
> > >
> > > * Peter Zijlstra <pet...@infradead.org> wrote:
> > >
> > > > On Wed, Apr 17, 2019 at 08:29:32PM +0200, Ingo Molnar wrote:
> > > > > Assuming PeterZ & Rafael & Quentin don't hate the whole thermal load
> > > > > tracking approach.
> > > >
> > > > I seem to remember competing proposals, and have forgotten everything
> > > > about them; the cover letter also didn't have references to them or
> > > > mention them in any way.
> > > >
> > > > As to the averaging and period, I personally prefer a PELT signal with
> > > > the windows lined up, if that really is too short a window, then a PELT
> > > > like signal with a natural multiple of the PELT period would make sense,
> > > > such that the windows still line up nicely.
> > > >
> > > > Mixing different averaging methods and non-aligned windows just makes me
> > > > uncomfortable.
> > >
> > > Yeah, so the problem with PELT is that while it nicely approximates
> > > variable-period decay calculations with plain additions, shifts and table
> > > lookups (i.e. accelerates pow()), AFAICS the most important decay
> > > parameter is fixed: the speed of decay, the dampening factor, which is
> > > set to a half-life of 32 periods:
> > >
> > >   Documentation/scheduler/sched-pelt.c
> > >
> > >   #define HALFLIFE 32
> > >
> > > Right?
> > >
> > > Thara's numbers suggest that there's high sensitivity to the speed of
> > > decay. By using PELT we'd be using whatever averaging speed there is
> > > within PELT.
> > >
> > > Now we could make that parametric of course, but that would both
> > > complicate the PELT lookup code (one more dimension) and would negatively
> > > affect code generation in a number of places.
> >
> > I missed the other solution, which is what you suggested: by
> > increasing/reducing the PELT window size we can effectively shift decay
> > speed and use just a single lookup table.
> >
> > I.e. instead of the fixed period size of 1024 in accumulate_sum(), use
> > decay_load() directly but use a different (longer) window size from 1024
> > usecs to calculate 'periods', and make it a multiple of 1024.
> 
> Can't we also scale the 'now' parameter of ___update_load_sum()?
> If we right-shift it before calling ___update_load_sum(), it should be
> the same as using a half-life of 64, 128, 256ms ...
> The main drawback would be a loss of precision, but we are in the range
> of 2, 4, 8us compared to the 1ms window.
> 
> This is quite similar to how we scale the utilization with frequency and uarch.

Yeah, that would work too.
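
To illustrate the arithmetic only, here is a rough userspace sketch of why
right-shifting the clock stretches the half-life. It is not the kernel code:
scaled_periods() and effective_halflife_ns() are made-up helper names, the
period is assumed to be 1024 * 1024 ns, and the model only counts whole
periods instead of accumulating a load sum.

```c
#include <stdint.h>

#define PELT_PERIOD_SHIFT 20	/* assumed: one period = 1024 * 1024 ns, ~1ms */
#define PELT_HALFLIFE     32	/* periods, as HALFLIFE in sched-pelt.c */

/*
 * Hypothetical helper: number of full PELT periods seen between two clock
 * readings once both have been right-shifted by 'shift'. Shifting by one
 * bit halves the number of periods the signal is decayed over.
 */
static uint64_t scaled_periods(uint64_t last_ns, uint64_t now_ns,
			       unsigned int shift)
{
	uint64_t delta = (now_ns >> shift) - (last_ns >> shift);

	return delta >> PELT_PERIOD_SHIFT;
}

/*
 * Hypothetical helper: wall-clock time after which the signal has halved.
 * shift = 0, 1, 2, 3 gives roughly 32, 64, 128, 256 ms.
 */
static uint64_t effective_halflife_ns(unsigned int shift)
{
	return ((uint64_t)PELT_HALFLIFE << PELT_PERIOD_SHIFT) << shift;
}
```

With shift = 1, a wall-clock span of effective_halflife_ns(1) still counts as
32 periods, i.e. exactly one half-life of decay, while the time resolution
drops by the same factor of two.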

Thanks,

        Ingo
