Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-10 Thread Michael Ellerman
On Tue, 2014-12-09 at 12:54 -0800, Linus Torvalds wrote: > On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard wrote: > > Hi Ingo, > > > >> At that point I thought the previous task_cpu() was somewhat ingrained > >> in the scheduler and came up with the patch. If not, we could go on a > >> hunt to

Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-10 Thread Thomas Gleixner
On Tue, 9 Dec 2014, Linus Torvalds wrote: > On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard wrote: > > Hi Ingo, > > > >> At that point I thought the previous task_cpu() was somewhat ingrained > >> in the scheduler and came up with the patch. If not, we could go on a > >> hunt to see what else

Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-10 Thread Thomas Gleixner
On Tue, 9 Dec 2014, Linus Torvalds wrote: On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard an...@samba.org wrote: Hi Ingo, At that point I thought the previous task_cpu() was somewhat ingrained in the scheduler and came up with the patch. If not, we could go on a hunt to see what else

Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-10 Thread Michael Ellerman
On Tue, 2014-12-09 at 12:54 -0800, Linus Torvalds wrote: On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard an...@samba.org wrote: Hi Ingo, At that point I thought the previous task_cpu() was somewhat ingrained in the scheduler and came up with the patch. If not, we could go on a hunt to

Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-09 Thread Linus Torvalds
On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard wrote: > Hi Ingo, > >> At that point I thought the previous task_cpu() was somewhat ingrained >> in the scheduler and came up with the patch. If not, we could go on a >> hunt to see what else needs fixing. > > I had another look. The scheduled does

Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-09 Thread Linus Torvalds
On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard an...@samba.org wrote: Hi Ingo, At that point I thought the previous task_cpu() was somewhat ingrained in the scheduler and came up with the patch. If not, we could go on a hunt to see what else needs fixing. I had another look. The scheduled

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Lai Jiangshan
On 12/08/2014 09:54 PM, Steven Rostedt wrote: > On Mon, 8 Dec 2014 14:27:01 +1100 > Anton Blanchard wrote: > >> I have a busy ppc64le KVM box where guests sometimes hit the infamous >> "kernel BUG at kernel/smpboot.c:134!" issue during boot: >>

[PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Anton Blanchard
infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops output confirms it: CPU: 0 Comm: watchdog/130 The problem is that we aren't ensuring the CPU active and onli

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Steven Rostedt
On Mon, 8 Dec 2014 14:27:01 +1100 Anton Blanchard wrote: > I have a busy ppc64le KVM box where guests sometimes hit the infamous > "kernel BUG at kernel/smpboot.c:134!" issue during boot: > > BUG_ON(td->cpu != smp_processor_id()); > > Basically a per CPU hotplu

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Anton Blanchard
Hi Ingo, > So we cannot call set_task_cpu() because in the normal life time > of a task the ->cpu value gets set on wakeup. So if a task is > blocked right now, and its affinity changes, it ought to get a > correct ->cpu selected on wakeup. The affinity mask and the > current value of ->cpu

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Ingo Molnar
* Anton Blanchard wrote: > I have a busy ppc64le KVM box where guests sometimes hit the > infamous "kernel BUG at kernel/smpboot.c:134!" issue during > boot: > > BUG_ON(td->cpu != smp_processor_id()); > > Basically a per CPU hotplug thread scheduled o

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Ingo Molnar
* Anton Blanchard an...@samba.org wrote: I have a busy ppc64le KVM box where guests sometimes hit the infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops output

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Anton Blanchard
Hi Ingo, So we cannot call set_task_cpu() because in the normal life time of a task the ->cpu value gets set on wakeup. So if a task is blocked right now, and its affinity changes, it ought to get a correct ->cpu selected on wakeup. The affinity mask and the current value of ->cpu getting

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Steven Rostedt
On Mon, 8 Dec 2014 14:27:01 +1100 Anton Blanchard an...@samba.org wrote: I have a busy ppc64le KVM box where guests sometimes hit the infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled

[PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Anton Blanchard
BUG at kernel/smpboot.c:134! issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops output confirms it: CPU: 0 Comm: watchdog/130 The problem is that we aren't ensuring the CPU active and online bits are set before

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-08 Thread Lai Jiangshan
On 12/08/2014 09:54 PM, Steven Rostedt wrote: On Mon, 8 Dec 2014 14:27:01 +1100 Anton Blanchard an...@samba.org wrote: I have a busy ppc64le KVM box where guests sometimes hit the infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Anton Blanchard
Hi Linus, > The __set_task_cpu() function does various other things too: > > set_task_rq(p, cpu); > #ifdef CONFIG_SMP > /* > * After ->cpu is set up to a new value, task_rq_lock(p, ...) > can be > * successfuly executed on another CPU. We must ensure that >

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Linus Torvalds
On Sun, Dec 7, 2014 at 7:27 PM, Anton Blanchard wrote: > > Since we cannot call set_task_cpu (the task is in a sleeping state), > just do an explicit set of task_thread_info(p)->cpu. Scheduler people: is this sufficient and ok? The __set_task_cpu() function does various other things too:

[PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Anton Blanchard
I have a busy ppc64le KVM box where guests sometimes hit the infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops output confirms it: CPU: 0 Comm: watchdog/1

[PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Anton Blanchard
I have a busy ppc64le KVM box where guests sometimes hit the infamous "kernel BUG at kernel/smpboot.c:134!" issue during boot: BUG_ON(td->cpu != smp_processor_id()); Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops output confirms it: CPU: 0 Comm: watchdog/130 The issue

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Linus Torvalds
On Sun, Dec 7, 2014 at 7:27 PM, Anton Blanchard an...@samba.org wrote: Since we cannot call set_task_cpu (the task is in a sleeping state), just do an explicit set of task_thread_info(p)->cpu. Scheduler people: is this sufficient and ok? The __set_task_cpu() function does various other things

Re: [PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

2014-12-07 Thread Anton Blanchard
Hi Linus, The __set_task_cpu() function does various other things too: set_task_rq(p, cpu); #ifdef CONFIG_SMP /* * After ->cpu is set up to a new value, task_rq_lock(p, ...) can be * successfuly executed on another CPU. We must ensure that updates of

Re: [LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-06 Thread Peter Zijlstra
On Thu, Nov 06, 2014 at 02:07:58AM +0800, Yuyang Du wrote: > Hi Peter and Thomas, > > LKP found a bug, and it was bisected to my rewrite patch: > http://article.gmane.org/gmane.linux.kernel/1818393/ > > But I really don't have a clue about why the patch can introduce > such a bug, as the patch

Re: [LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-06 Thread Peter Zijlstra
On Thu, Nov 06, 2014 at 02:07:58AM +0800, Yuyang Du wrote: Hi Peter and Thomas, LKP found a bug, and it was bisected to my rewrite patch: http://article.gmane.org/gmane.linux.kernel/1818393/ But I really don't have a clue about why the patch can introduce such a bug, as the patch does not

Re: [LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-05 Thread Yuyang Du
| 1 | > | BUG:kernel_test_crashed | 0 > | 3 | > +---+++ > > > [3.205664] masked ExtINT on CPU#98 > [3.205664] CPU98: Thermal LVT vector (0xfa) already installed > [3.234545] [ cut here ]-

Re: [LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-05 Thread Yuyang Du
] masked ExtINT on CPU#98 [3.205664] CPU98: Thermal LVT vector (0xfa) already installed [3.234545] [ cut here ] [3.235000] kernel BUG at kernel/smpboot.c:134! [3.235000] invalid opcode: [#1] SMP [3.235000] Modules linked in: [3.235000] CPU: 0 PID

[LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-03 Thread kernel test robot
installed [3.234545] [ cut here ] [3.235000] kernel BUG at kernel/smpboot.c:134! [3.235000] invalid opcode: [#1] SMP [3.235000] Modules linked in: [3.235000] CPU: 0 PID: 789 Comm: watchdog/98 Not tainted 3.17.0-rc7-g6fe1f1b #7 [3.235000] Har

[LKP] [sched] kernel BUG at kernel/smpboot.c:134!

2014-11-03 Thread kernel test robot
[3.234545] [ cut here ] [3.235000] kernel BUG at kernel/smpboot.c:134! [3.235000] invalid opcode: [#1] SMP [3.235000] Modules linked in: [3.235000] CPU: 0 PID: 789 Comm: watchdog/98 Not tainted 3.17.0-rc7-g6fe1f1b #7 [3.235000] Hardware name: Intel

kernel BUG at kernel/smpboot.c:134!

2014-09-22 Thread Brian Norris
d 4700 to 4800 cycles): ... [ 164.737561] CPU1: Booted secondary processor [ 164.785821] CPU1: shutdown [ 164.73] [ cut here ] [ 164.793537] kernel BUG at kernel/smpboot.c:134! [ 164.793540] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 164.793547] Modules

kernel BUG at kernel/smpboot.c:134!

2014-09-22 Thread Brian Norris
to 4800 cycles): ... [ 164.737561] CPU1: Booted secondary processor [ 164.785821] CPU1: shutdown [ 164.73] [ cut here ] [ 164.793537] kernel BUG at kernel/smpboot.c:134! [ 164.793540] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 164.793547] Modules linked

kernel BUG at kernel/smpboot.c:134

2014-06-23 Thread Subbaraman Narayanamurthy
056.489245] Code: e594a000 eb085236 e15a 0a00 (e7f001f2) [57056.489259] [ cut here ] [57056.492840] kernel BUG at kernel/smpboot.c:134! [57056.513236] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [57056.519055] Modules linked in: wlan(O) mhi(O) [57056.523394] CPU:

kernel BUG at kernel/smpboot.c:134

2014-06-23 Thread Subbaraman Narayanamurthy
: e594a000 eb085236 e15a 0a00 (e7f001f2) [57056.489259] [ cut here ] [57056.492840] kernel BUG at kernel/smpboot.c:134! [57056.513236] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [57056.519055] Modules linked in: wlan(O) mhi(O) [57056.523394] CPU: 0 PID: 14 Comm

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Thomas Gleixner
On Mon, 8 Apr 2013, Borislav Petkov wrote: > On Mon, Apr 08, 2013 at 11:24:14AM +0200, Thomas Gleixner wrote: > > + /* Rebind ourself to the target cpu */ > > + if (test_bit(KTHREAD_IS_PER_CPU, &self->flags)) { > > + set_cpus_allowed_ptr(currrent, cpumask_of(self->cpu)); > > "currrent" is

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Borislav Petkov
On Mon, Apr 08, 2013 at 11:24:14AM +0200, Thomas Gleixner wrote: > On Sun, 7 Apr 2013, Borislav Petkov wrote: > > > On Sun, Apr 07, 2013 at 11:20:10AM +0200, Thomas Gleixner wrote: > > > And it's even more bogus because the cpu to which we would bind in > > > kthread_create_on_cpu() is not yet

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Thomas Gleixner
On Sun, 7 Apr 2013, Borislav Petkov wrote: > On Sun, Apr 07, 2013 at 11:20:10AM +0200, Thomas Gleixner wrote: > > And it's even more bogus because the cpu to which we would bind in > > kthread_create_on_cpu() is not yet online. > > In case you guys are wondering about reproducibility, I saw the

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Thomas Gleixner
On Sun, 7 Apr 2013, Borislav Petkov wrote: On Sun, Apr 07, 2013 at 11:20:10AM +0200, Thomas Gleixner wrote: And it's even more bogus because the cpu to which we would bind in kthread_create_on_cpu() is not yet online. In case you guys are wondering about reproducibility, I saw the same

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Borislav Petkov
On Mon, Apr 08, 2013 at 11:24:14AM +0200, Thomas Gleixner wrote: On Sun, 7 Apr 2013, Borislav Petkov wrote: On Sun, Apr 07, 2013 at 11:20:10AM +0200, Thomas Gleixner wrote: And it's even more bogus because the cpu to which we would bind in kthread_create_on_cpu() is not yet online.

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-08 Thread Thomas Gleixner
On Mon, 8 Apr 2013, Borislav Petkov wrote: On Mon, Apr 08, 2013 at 11:24:14AM +0200, Thomas Gleixner wrote: + /* Rebind ourself to the target cpu */ + if (test_bit(KTHREAD_IS_PER_CPU, &self->flags)) { + set_cpus_allowed_ptr(currrent, cpumask_of(self->cpu)); "currrent" is a typo,

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-07 Thread Borislav Petkov
guest too (don't ask why? :-)) And yes, this was without kvm (software emulation only in qemu). [0.395000] [ cut here ] [0.395000] kernel BUG at kernel/smpboot.c:134! [0.395000] invalid opcode: [#1] PREEMPT SMP [0.395000] Modules linked in: [0.

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-07 Thread Thomas Gleixner
On Sat, 6 Apr 2013, Thomas Gleixner wrote: > This is Hillfs proposed patch: > > > --- a/kernel/kthread.c Sat Jan 19 13:03:52 2013 > > +++ b/kernel/kthread.c Sat Jan 19 13:17:54 2013 > > @@ -306,6 +306,7 @@ struct task_struct *kthread_create_on_cp > > return p; > >

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-07 Thread Thomas Gleixner
On Sat, 6 Apr 2013, Thomas Gleixner wrote: This is Hillfs proposed patch: --- a/kernel/kthread.c Sat Jan 19 13:03:52 2013 +++ b/kernel/kthread.c Sat Jan 19 13:17:54 2013 @@ -306,6 +306,7 @@ struct task_struct *kthread_create_on_cp return p;

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-07 Thread Borislav Petkov
too (don't ask why? :-)) And yes, this was without kvm (software emulation only in qemu). [0.395000] [ cut here ] [0.395000] kernel BUG at kernel/smpboot.c:134! [0.395000] invalid opcode: [#1] PREEMPT SMP [0.395000] Modules linked in: [0.395000] Pid

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-06 Thread Thomas Gleixner
On Sat, 6 Apr 2013, Srivatsa S. Bhat wrote: > Hi Dave, > > On 04/06/2013 03:13 AM, Dave Hansen wrote: > > Hey Thomas, > > > > I seem to be running in to smpboot_thread_fn()'s > > > > BUG_ON(td->cpu != smp_processor_id()); That should be WARN_ON of course. Stupid me. > > pretty

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-06 Thread Srivatsa S. Bhat
at >> [ 790.223270] ----[ cut here ]---- >> [ 790.223966] kernel BUG at kernel/smpboot.c:134! >> [ 790.224739] invalid opcode: [#1] SMP >> [ 790.225671] Modules linked in: >> [ 790.226428] CPU 81 >> [ 790.226909] Pid: 3909, comm: migrat

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-06 Thread Srivatsa S. Bhat
] kernel BUG at kernel/smpboot.c:134! [ 790.224739] invalid opcode: [#1] SMP [ 790.225671] Modules linked in: [ 790.226428] CPU 81 [ 790.226909] Pid: 3909, comm: migration/135 Tainted: GW 3.9.0-rc5-00184-gb6a9b7f-dirty #118 FUJITSU-SV PRIMEQUEST 1800E2/SB [ 790.228775

Re: kernel BUG at kernel/smpboot.c:134!

2013-04-06 Thread Thomas Gleixner
On Sat, 6 Apr 2013, Srivatsa S. Bhat wrote: Hi Dave, On 04/06/2013 03:13 AM, Dave Hansen wrote: Hey Thomas, I seem to be running in to smpboot_thread_fn()'s BUG_ON(td->cpu != smp_processor_id()); That should be WARN_ON of course. Stupid me. pretty regularly, both at boot

kernel BUG at kernel/smpboot.c:134!

2013-04-05 Thread Dave Hansen
ing it more often at higher cpu counts, but it doesn't trigger on bringing up a particular CPU as far as I can tell. This is on a pull of mainline from today, e0a77f263. Any ideas? > [ 790.223270] [ cut here ] > [ 790.223966] kernel BUG at kernel/smpb

kernel BUG at kernel/smpboot.c:134!

2013-04-05 Thread Dave Hansen
it more often at higher cpu counts, but it doesn't trigger on bringing up a particular CPU as far as I can tell. This is on a pull of mainline from today, e0a77f263. Any ideas? [ 790.223270] [ cut here ] [ 790.223966] kernel BUG at kernel/smpboot.c:134! [ 790.224739