On 28-Jun 15:51, Vincent Guittot wrote:
> On Fri, 28 Jun 2019 at 14:38, Peter Zijlstra <pet...@infradead.org> wrote:
> >
> > On Fri, Jun 28, 2019 at 11:08:14AM +0100, Patrick Bellasi wrote:
> > > On 26-Jun 13:40, Vincent Guittot wrote:
> > > > Hi Patrick,
> > > >
> > > > On Thu, 20 Jun 2019 at 17:06, Patrick Bellasi <patrick.bell...@arm.com> wrote:
> > > > >
> > > > > The estimated utilization for a task is currently defined based on:
> > > > >  - enqueued: the utilization value at the end of the last activation
> > > > >  - ewma:     an exponential moving average whose samples are the
> > > > >              enqueued values
> > > > >
> > > > > According to this definition, when a task suddenly changes its
> > > > > bandwidth requirements from small to big, the EWMA will need to
> > > > > collect multiple samples before converging up to track the new big
> > > > > utilization.
> > > > >
> > > > > Moreover, after the PELT scale invariance update [1], in the above
> > > > > scenario we can see that the utilization of the task has a
> > > > > significant drop from the first big activation to the following
> > > > > one. That's implied by the new "time-scaling"
> > > >
> > > > Could you give us more details about this? I'm not sure I understand
> > > > what changes between the 1st big activation and the following one.
> > >
> > > We are after a solution for the problem Douglas Raillard discussed at
> > > OSPM, specifically the "Task util drop after 1st idle" highlighted in
> > > slide 6 of his presentation:
> > >
> > >   http://retis.sssup.it/ospm-summit/Downloads/02_05-Douglas_Raillard-How_can_we_make_schedutil_even_more_effective.pdf
> >
> > So I see the problem, and I don't hate the patch, but I'm still
> > struggling to understand how exactly it relates to the time-scaling
> > stuff. AFAICT the fundamental problem here is layering two averages.
>
> AFAICT, it's not related to the time-scaling.
>
> In fact, the big 1st activation happens because the task runs at a low
> OPP and hasn't enough time to finish its running phase before the time
> to begin the next one comes. This means that the task will run several
> computation phases in one go, so it is no longer a 75% task.
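As a side note, to put numbers on the "multiple samples before
converging" point quoted above, here is a minimal user-space sketch of
a util_est-style EWMA update. It assumes the 1/4 sample weight used by
util_est (UTIL_EST_WEIGHT_SHIFT == 2); names and values are
illustrative, not the exact kernel code:

  /* ewma_ramp.c: sketch of a util_est-style EWMA ramp-up. */
  #include <stdio.h>

  #define UTIL_EST_WEIGHT_SHIFT 2  /* each new sample weighs 1/4 */

  static unsigned int ewma_update(unsigned int ewma, unsigned int enqueued)
  {
      /* ewma += (enqueued - ewma) / 4, diff is always >= 0 here */
      return ewma + (((int)enqueued - (int)ewma) >> UTIL_EST_WEIGHT_SHIFT);
  }

  int main(void)
  {
      unsigned int ewma = 51;      /* ~5% of SCHED_CAPACITY_SCALE (1024) */
      unsigned int enqueued = 768; /* ~75% task after the step change */
      int i;

      for (i = 1; i <= 12; i++) {
          ewma = ewma_update(ewma, enqueued);
          printf("activation %2d: ewma = %u\n", i, ewma);
      }
      return 0;
  }

With those numbers the EWMA needs roughly eight activations to get
within ~10% of the new 75% value, which is the convergence delay this
patch aims to cut in the upward direction.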
But in that case, running multiple activations back to back, should we
not expect the util_avg to exceed the 75% mark?

> From a PELT PoV, the task is far larger than a 75% task, and so is
> its utilization, because it runs far longer (even after scaling time
> with frequency).

Which thus should match my expectation above, no?

> Once the cpu reaches a high enough OPP to allow a sleep phase between
> each running phase, the task's load tracking comes back to the normal
> slope increase (the one that would have happened if the task had
> jumped from 5% to 75% while already running at max OPP).

Indeed, I can see from the plots a change in slope. But there is also
that big drop after the first big activation: 375 units in 1.1ms.

Is that expected? I guess yes, since we fix the clock_pelt with the
lost_idle_time.

> > The second (EWMA in our case) will always lag/delay the input of the
> > first (PELT).
> >
> > The time-scaling thing might make matters worse, because that helps
> > PELT ramp up faster, but that is not the primary issue.
> >
> > Or am I missing something?

--
#include <best/regards.h>

Patrick Bellasi
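P.S. A quick back-of-the-envelope check on that 375-units-in-1.1ms
drop, assuming the util was close to 1024 after the back-to-back run
and the standard 32ms PELT half-life (again a user-space sketch, not
kernel code):

  /* decay_check.c: sanity check the drop against plain PELT decay. */
  #include <math.h>
  #include <stdio.h>

  int main(void)
  {
      double u0 = 1024.0;        /* assumed util after the long run */
      double halflife_ms = 32.0; /* utilization halves every 32ms idle */

      /* drop from plain wall-clock decay over 1.1ms of idle */
      double wall_drop = u0 * (1.0 - pow(0.5, 1.1 / halflife_ms));
      /* idle time plain decay would need to shed 375 units */
      double t_ms = halflife_ms * log2(u0 / (u0 - 375.0));

      printf("wall-clock decay over 1.1ms: ~%.0f units\n", wall_drop);
      printf("idle time to decay 375 units: ~%.1f ms\n", t_ms);
      return 0;
  }

Pure wall-clock decay over 1.1ms accounts for only ~24 units, while a
375-unit drop corresponds to ~21ms worth of decay, which is consistent
with clock_pelt being pushed forward by the lost_idle_time.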