On Monday 18 May 2015 08:48:04 Peter Zijlstra wrote:
> On Sun, May 17, 2015 at 05:48:44PM +0200, Gabriele Mazzotta wrote:
> > Hi,
> > 
> > I've recently noticed that if I suspend and resume my laptop, I can no
> > longer execute turbostat. This is what I get when I try to start it:
> > # turbostat
> > Could not migrate to CPU 1
> > turbostat: re-initialized with num_cpus 4
> > Could not migrate to CPU 1
> > 
> > Since everything works as expected with v4.0, I ran a bisection and
> > found that commit 3c18d447b3b36a8d ("sched/core: Check for available
> > DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
> > 
> > I don't know if there's something else affected by that change, but
> > I can consistently reproduce the bug with turbostat.
> 
> 
> This should be fixed by the below commit which is already in Linus'
> tree.

Thank you for the quick reply. As I said in my reply to Ingo's mail,
which arrived just before yours, the commit below fixes the problem.

Thanks,
Gabriele

> ---
> commit 533445c6e53368569e50ab3fb712230c03d523f3
> Author: Omar Sandoval <[email protected]>
> Date:   Mon May 4 03:09:36 2015 -0700
> 
>     sched/core: Fix regression in cpuset_cpu_inactive() for suspend
>     
>     Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in
>     cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that
>     caused a regression in setting a CPU inactive during suspend. I ran into
>     this when a program was failing pthread_setaffinity_np() with EINVAL after
>     a suspend+wake up.
>     
>     A simple reproducer:
>     
>       $ ./a.out
>       sched_setaffinity: Success
>       $ systemctl suspend
>       $ ./a.out
>       sched_setaffinity: Invalid argument
>     
>     ... where ./a.out is:
>     
>       #define _GNU_SOURCE
>       #include <errno.h>
>       #include <sched.h>
>       #include <stdio.h>
>       #include <stdlib.h>
>       #include <string.h>
>       #include <unistd.h>
>     
>       int main(void)
>       {
>               long num_cores;
>               cpu_set_t cpu_set;
>               int ret;
>     
>               num_cores = sysconf(_SC_NPROCESSORS_ONLN);
>               CPU_ZERO(&cpu_set);
>               CPU_SET(num_cores - 1, &cpu_set);
>               errno = 0;
>               ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
>               perror("sched_setaffinity");
>               return ret ? EXIT_FAILURE : EXIT_SUCCESS;
>       }
>     
>     The mistake is that suspend is handled in the action ==
>     CPU_DOWN_PREPARE_FROZEN case of the switch statement in
>     cpuset_cpu_inactive().
>     
>     However, the commit in question masked out CPU_TASKS_FROZEN
>     from the action, making this case dead.
>     
>     The fix is straightforward.
>     
>     Signed-off-by: Omar Sandoval <[email protected]>
>     Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>     Cc: Borislav Petkov <[email protected]>
>     Cc: H. Peter Anvin <[email protected]>
>     Cc: Juri Lelli <[email protected]>
>     Cc: Thomas Gleixner <[email protected]>
>     Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()")
>     Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osan...@osandov.com
>     Signed-off-by: Ingo Molnar <[email protected]>
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 34db9bf892a3..57bd333bc4ab 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6999,27 +6999,23 @@ static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action,
>       unsigned long flags;
>       long cpu = (long)hcpu;
>       struct dl_bw *dl_b;
> +     bool overflow;
> +     int cpus;
>  
> -     switch (action & ~CPU_TASKS_FROZEN) {
> +     switch (action) {
>       case CPU_DOWN_PREPARE:
> -             /* explicitly allow suspend */
> -             if (!(action & CPU_TASKS_FROZEN)) {
> -                     bool overflow;
> -                     int cpus;
> -
> -                     rcu_read_lock_sched();
> -                     dl_b = dl_bw_of(cpu);
> +             rcu_read_lock_sched();
> +             dl_b = dl_bw_of(cpu);
>  
> -                     raw_spin_lock_irqsave(&dl_b->lock, flags);
> -                     cpus = dl_bw_cpus(cpu);
> -                     overflow = __dl_overflow(dl_b, cpus, 0, 0);
> -                     raw_spin_unlock_irqrestore(&dl_b->lock, flags);
> +             raw_spin_lock_irqsave(&dl_b->lock, flags);
> +             cpus = dl_bw_cpus(cpu);
> +             overflow = __dl_overflow(dl_b, cpus, 0, 0);
> +             raw_spin_unlock_irqrestore(&dl_b->lock, flags);
>  
> -                     rcu_read_unlock_sched();
> +             rcu_read_unlock_sched();
>  
> -                     if (overflow)
> -                             return notifier_from_errno(-EBUSY);
> -             }
> +             if (overflow)
> +                     return notifier_from_errno(-EBUSY);
>               cpuset_update_active_cpus(false);
>               break;
>       case CPU_DOWN_PREPARE_FROZEN:

