On 24/03/15 09:13, Wanpeng Li wrote: > Hi Juri, > On Tue, Mar 24, 2015 at 09:27:09AM +0000, Juri Lelli wrote: >> Hi, >> >> On 23/03/2015 08:55, Peter Zijlstra wrote: >>> On Mon, Mar 23, 2015 at 08:25:04AM +0100, Ingo Molnar wrote: >>>>>>>>>>> + if (cpu >= nr_cpu_ids) { >>>>>>>>>>> + if (dl_bandwidth_enabled()) { >>>>>>>>>>> + /* >>>>>>>>>>> + * Fail to find any suitable >>>>>>>>>>> cpu. >>>>>>>>>>> + * The task will never come >>>>>>>>>>> back! >>>>>>>>>>> + */ >>>>>>>>>>> + WARN_ON(1); >>>>>>>>>> >>>>>>>>>> Can this condition happen to users with a non-buggy kernel? >>> >>>> I still haven't seen a satisfactory answer to this question. Please >>>> don't resend patches without clearing questions raised during review. >>> >>> So I had a look on Friday, it _should_ not happen, but it does due to a >>> second bug Juri is currently chasing down. >>> >> >> Right, it should not happen. It happens because hotplug operations are >> destructive w.r.t. cpusets. Peter, how about we move the check you put >> in sched_cpu_inactive() to cpuset_cpu_inactive()? This way, if we fail, >> we don't need to destroy/rebuild the domains. > > I remember you mentioned that there is a bug through IRC last week, if this > patch solve it? >
It seems to fix it. With the previous check we correctly fail to turn off a cpu with -dl task running only the first time. After that the bandwidth information associated with it was gone and subsequent hotplug operations on the same cpu would turn it off. Thanks, - Juri > Regards, > Wanpeng Li > >> >> Thanks, >> >> - Juri >> >> >From 65e8033e05f8b70116747062d00d5a5c266699fb Mon Sep 17 00:00:00 2001 >> From: Juri Lelli <juri.le...@gmail.com> >> Date: Tue, 24 Mar 2015 07:47:03 +0000 >> Subject: [PATCH] sched/core: check for available -dl bandwidth in >> cpuset_cpu_inactive >> >> Signed-off-by: Juri Lelli <juri.le...@arm.com> >> --- >> kernel/sched/core.c | 56 >> ++++++++++++++++++++++++++--------------------------- >> 1 file changed, 28 insertions(+), 28 deletions(-) >> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c >> index 50927eb..3723ad0 100644 >> --- a/kernel/sched/core.c >> +++ b/kernel/sched/core.c >> @@ -5318,36 +5318,13 @@ static int sched_cpu_active(struct notifier_block >> *nfb, >> static int sched_cpu_inactive(struct notifier_block *nfb, >> unsigned long action, void *hcpu) >> { >> - unsigned long flags; >> - long cpu = (long)hcpu; >> - struct dl_bw *dl_b; >> - >> switch (action & ~CPU_TASKS_FROZEN) { >> case CPU_DOWN_PREPARE: >> - set_cpu_active(cpu, false); >> - >> - /* explicitly allow suspend */ >> - if (!(action & CPU_TASKS_FROZEN)) { >> - bool overflow; >> - int cpus; >> - >> - rcu_read_lock_sched(); >> - dl_b = dl_bw_of(cpu); >> - >> - raw_spin_lock_irqsave(&dl_b->lock, flags); >> - cpus = dl_bw_cpus(cpu); >> - overflow = __dl_overflow(dl_b, cpus, 0, 0); >> - raw_spin_unlock_irqrestore(&dl_b->lock, flags); >> - >> - rcu_read_unlock_sched(); >> - >> - if (overflow) >> - return notifier_from_errno(-EBUSY); >> - } >> + set_cpu_active((long)hcpu, false); >> return NOTIFY_OK; >> + default: >> + return NOTIFY_DONE; >> } >> - >> - return NOTIFY_DONE; >> } >> >> static int __init migration_init(void) >> @@ -7001,7 +6978,6 @@ static int cpuset_cpu_active(struct notifier_block >> *nfb, unsigned long action, >> */ >> >> case CPU_ONLINE: >> - case CPU_DOWN_FAILED: >> cpuset_update_active_cpus(true); >> break; >> default: >> @@ -7013,8 +6989,32 @@ static int cpuset_cpu_active(struct notifier_block >> *nfb, unsigned long action, >> static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long >> action, >> void *hcpu) >> { >> - switch (action) { >> + unsigned long flags; >> + long cpu = (long)hcpu; >> + struct dl_bw *dl_b; >> + >> + switch (action & ~CPU_TASKS_FROZEN) { >> case CPU_DOWN_PREPARE: >> + /* explicitly allow suspend */ >> + if (!(action & CPU_TASKS_FROZEN)) { >> + bool overflow; >> + int cpus; >> + >> + rcu_read_lock_sched(); >> + dl_b = dl_bw_of(cpu); >> + >> + raw_spin_lock_irqsave(&dl_b->lock, flags); >> + cpus = dl_bw_cpus(cpu); >> + overflow = __dl_overflow(dl_b, cpus, 0, 0); >> + raw_spin_unlock_irqrestore(&dl_b->lock, flags); >> + >> + rcu_read_unlock_sched(); >> + >> + if (overflow) { >> + trace_printk("hotplug failed for cpu %lu", cpu); >> + return notifier_from_errno(-EBUSY); >> + } >> + } >> cpuset_update_active_cpus(false); >> break; >> case CPU_DOWN_PREPARE_FROZEN: >> -- >> 2.3.0 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/