Hi Chris, On Tue, 9 Jul 2019 at 17:23, Chris Redpath <chris.redp...@arm.com> wrote: > > Hi Peter, > > On 09/07/2019 14:50, Peter Zijlstra wrote: > > On Tue, Jul 09, 2019 at 12:57:59PM +0100, Chris Redpath wrote: > >> The ancient workaround to avoid the cost of updating rq clocks in the > >> middle of a migration causes some issues on asymmetric CPU capacity > >> systems where we use task utilization to determine which cpus fit a task. > >> On quiet systems we can inflate task util after a migration which > >> causes misfit to fire and force-migrate the task. > >> > >> This occurs when: > >> > >> (a) a task has util close to the non-overutilized capacity limit of a > >> particular cpu (cpu0 here); and > >> (b) the prev_cpu was quiet otherwise, such that rq clock is > >> sufficiently out of date (cpu1 here). > >> > >> e.g. > >> _____ > >> cpu0: ________________________| |______________ > >> > >> |<- misfit happens > >> ______ ___ ___ > >> cpu1: ____| |______________|___| |_________| > >> > >> ->| |<- wakeup migration time > >> last rq clock update > >> > >> When the task util is in just the right range for the system, we can end > >> up migrating an unlucky task back and forth many times until we are lucky > >> and the source rq happens to be updated close to the migration time. > >> > >> In order to address this, lets update both rq_clock and cfs_rq where > >> this could be an issue. > > > > Can you quantify how much of a problem this really is? It is really sad, > > but this is already the second place where we take rq->lock on > > migration. We worked so hard to avoid having to acquire it :/ > > > > I think you're familiar with the way we test the EAS and misfit stuff, > but some might not be, so I'll just outline them. > > We have performance and placement tests for a suite of simple synthetic > scenarios selected to trigger the EAS & misfit mechanisms. The > performance tests use rt-app's slack metric, and we try to minimise > negative slack (i.e. missed deadlines). > > In the placement tests we estimate the minimum energy consumed to run a > particular synthetic test job and we calculate the energy consumed in > the actual execution according to a trace. We pass the test if our > estimate of actual is less than ideal+20%. > > We enter this code quite often in our testing, most individual runs of a > test which has small tasks involved have at least one hit where we make > a change to the clock with this patch in.
Do you have a rt-app file that you can share ? > > That said - despite the relatively high number of hits only about 5% of > runs see enough additional energy consumed to trigger a test failure. We > do try to keep a quiet system as much as possible and only run for a few > seconds so the impact we see in testing is also probably higher than in > the real world. Yeah, I'm curious to see the impact on a real system which have a 60fps screen update like an android phone > > I totally appreciate the reluctance to add this - I don't much like it > either, but I was hoping that sticking it behind the asym_cpucapacity > key might be a reasonable compromise. > > At least only those people who select a CPU using task util and capacity > see this cost, and we have smaller systems so in theory the cost is smaller. > > I'm very open to exploring alternatives :) > > >> Signed-off-by: Chris Redpath <chris.redp...@arm.com> > >> --- > >> kernel/sched/fair.c | 15 +++++++++++++++ > >> 1 file changed, 15 insertions(+) > >> > >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > >> index b798fe7ff7cd..51791db26a2a 100644 > >> --- a/kernel/sched/fair.c > >> +++ b/kernel/sched/fair.c > >> @@ -6545,6 +6545,21 @@ static void migrate_task_rq_fair(struct task_struct > >> *p, int new_cpu) > >> * wakee task is less decayed, but giving the wakee more load > >> * sounds not bad. > >> */ > >> + if (static_branch_unlikely(&sched_asym_cpucapacity) && > >> + p->state == TASK_WAKING) { > > > > nit: indent fail. > > > > oops, will tweak it > > --Chris > IMPORTANT NOTICE: The contents of this email and any attachments are > confidential and may also be privileged. If you are not the intended > recipient, please notify the sender immediately and do not disclose the > contents to any other person, use it for any purpose, or store or copy the > information in any medium. Thank you.