suspend-related scheduler fix
Thanks to the evil Intel AMT serial, I've been able to figure out what
made my x220 hang when suspending.  Turns out that the fix matthew@
committed almost two weeks ago uncovered another bug that made
sched_stop_secondary_cpus() spin forever if there was a process
running on a secondary cpu but nothing left on the run queue.  In that
case sched_choosecpu() short-circuits and returns the current cpu.

The fix is obvious: don't short-circuit when we're on a cpu that
should stop.

ok?


Index: kern_sched.c
===================================================================
RCS file: /home/cvs/src/sys/kern/kern_sched.c,v
retrieving revision 1.33
diff -u -p -r1.33 kern_sched.c
--- kern_sched.c	13 Jul 2014 21:44:58 -0000	1.33
+++ kern_sched.c	26 Jul 2014 11:44:26 -0000
@@ -275,6 +275,7 @@ sched_chooseproc(void)
 				while ((p = TAILQ_FIRST(&spc->spc_qs[queue]))) {
 					remrunqueue(p);
 					p->p_cpu = sched_choosecpu(p);
+					KASSERT(p->p_cpu != curcpu());
 					setrunqueue(p);
 				}
 			}
@@ -408,6 +409,7 @@ sched_choosecpu(struct proc *p)
 	 */
 	if (cpuset_isset(&set, p->p_cpu) ||
 	    (p->p_cpu == curcpu() && p->p_cpu->ci_schedstate.spc_nrun == 0 &&
+	    (p->p_cpu->ci_schedstate.spc_schedflags & SPCF_SHOULDHALT) == 0 &&
 	    curproc == p)) {
 		sched_wasidle++;
 		return (p->p_cpu);
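To make the short-circuit easier to see in isolation, here is a rough userland sketch of just the second half of that condition. It is not kernel code: struct sketch_cpu, SKETCH_SHOULDHALT, stay_put_old(), stay_put_new() and choose_cpu() are all made-up names for illustration, the cpuset_isset() half of the test is left out, and the process is assumed to already be on the current cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SPCF_SHOULDHALT; the value is arbitrary. */
#define SKETCH_SHOULDHALT	0x1

/* Hypothetical, heavily reduced model of a cpu's scheduler state. */
struct sketch_cpu {
	int id;
	int nrun;	/* runnable procs queued, in the spirit of spc_nrun */
	int schedflags;	/* in the spirit of spc_schedflags */
};

/* Pre-patch test: stay put if nothing is queued and we are the running proc. */
static bool
stay_put_old(const struct sketch_cpu *cur, bool proc_is_current)
{
	return cur->nrun == 0 && proc_is_current;
}

/* Post-patch test: same, but never stay on a cpu that has been told to halt. */
static bool
stay_put_new(const struct sketch_cpu *cur, bool proc_is_current)
{
	return cur->nrun == 0 &&
	    (cur->schedflags & SKETCH_SHOULDHALT) == 0 &&
	    proc_is_current;
}

/*
 * Return the id of the chosen cpu: the current one if the short-circuit
 * applies, otherwise fallback_id, which stands in for whatever the rest
 * of the selection logic would have picked.
 */
static int
choose_cpu(const struct sketch_cpu *cur, bool proc_is_current,
    int fallback_id, bool patched)
{
	bool stay;

	assert(cur != NULL);
	stay = patched ? stay_put_new(cur, proc_is_current) :
	    stay_put_old(cur, proc_is_current);
	return stay ? cur->id : fallback_id;
}
```

With a cpu { .id = 1, .nrun = 0, .schedflags = SKETCH_SHOULDHALT } and the process currently running on it, choose_cpu(&cpu, true, 0, false) hands back cpu 1 again, which is the situation where the halting cpu can never get rid of the process. With patched set to true the same call falls through to the fallback cpu 0. A cpu without the flag keeps the short-circuit in both variants.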
Re: suspend-related scheduler fix
This cures my x220 as well. OK bcook@

On Jul 26, 2014 6:56 AM, Mark Kettenis <mark.kette...@xs4all.nl> wrote:

> Thanks to the evil Intel AMT serial, I've been able to figure out what
> made my x220 hang when suspending.  Turns out that the fix matthew@
> committed almost two weeks ago uncovered another bug that made
> sched_stop_secondary_cpus() spin forever if there was a process
> running on a secondary cpu but nothing left on the run queue.  In that
> case sched_choosecpu() short-circuits and returns the current cpu.
>
> The fix is obvious: don't short-circuit when we're on a cpu that
> should stop.
>
> ok?
>
>
> Index: kern_sched.c
> ===================================================================
> RCS file: /home/cvs/src/sys/kern/kern_sched.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 kern_sched.c
> --- kern_sched.c	13 Jul 2014 21:44:58 -0000	1.33
> +++ kern_sched.c	26 Jul 2014 11:44:26 -0000
> @@ -275,6 +275,7 @@ sched_chooseproc(void)
>  				while ((p = TAILQ_FIRST(&spc->spc_qs[queue]))) {
>  					remrunqueue(p);
>  					p->p_cpu = sched_choosecpu(p);
> +					KASSERT(p->p_cpu != curcpu());
>  					setrunqueue(p);
>  				}
>  			}
> @@ -408,6 +409,7 @@ sched_choosecpu(struct proc *p)
>  	 */
>  	if (cpuset_isset(&set, p->p_cpu) ||
>  	    (p->p_cpu == curcpu() && p->p_cpu->ci_schedstate.spc_nrun == 0 &&
> +	    (p->p_cpu->ci_schedstate.spc_schedflags & SPCF_SHOULDHALT) == 0 &&
>  	    curproc == p)) {
>  		sched_wasidle++;
>  		return (p->p_cpu);
Re: suspend-related scheduler fix
On Sat, Jul 26, 2014 at 4:56 AM, Mark Kettenis <mark.kette...@xs4all.nl> wrote:
> Thanks to the evil Intel AMT serial, I've been able to figure out what
> made my x220 hang when suspending.  Turns out that the fix matthew@
> committed almost two weeks ago uncovered another bug that made
> sched_stop_secondary_cpus() spin forever if there was a process
> running on a secondary cpu but nothing left on the run queue.  In that
> case sched_choosecpu() short-circuits and returns the current cpu.
>
> The fix is obvious: don't short-circuit when we're on a cpu that
> should stop.
>
> ok?

ok guenther@