This is a note to let you know that I've just added the patch titled

    x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU

to the 2.6.39-stable tree which can be found at:
    
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     x86-cpu-hotplug-prevent-softirq-wakeup-on-wrong-cpu.patch
and it can be found in the queue-2.6.39 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <[email protected]> know about it.


From fd8a7de177b6f56a0fc59ad211c197a7df06b1ad Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <[email protected]>
Date: Tue, 20 Jul 2010 14:34:50 +0200
Subject: x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU

From: Thomas Gleixner <[email protected]>

commit fd8a7de177b6f56a0fc59ad211c197a7df06b1ad upstream.

After a newly plugged CPU sets the cpu_online bit it enables
interrupts and goes idle. The cpu which brought up the new cpu waits
for the cpu_online bit and when it observes it, it sets the cpu_active
bit for this cpu. The cpu_active bit is the relevant one for the
scheduler to consider the cpu as a viable target.

With forced threaded interrupt handlers which imply forced threaded
softirqs we observed the following race:

cpu 0                         cpu 1

bringup(cpu1);
                              set_cpu_online(smp_processor_id(), true);
                              local_irq_enable();
while (!cpu_online(cpu1));
                              timer_interrupt()
                                -> wake_up(softirq_thread_cpu1);
                                     -> enqueue_on(softirq_thread_cpu1, cpu0);

                                                                        ^^^^

cpu_notify(CPU_ONLINE, cpu1);
  -> sched_cpu_active(cpu1)
     -> set_cpu_active(cpu1, true);

When an interrupt happens before the cpu_active bit is set by the cpu
which brought up the newly onlined cpu, the scheduler refuses to
enqueue the woken thread on the newly onlined cpu, although the thread
is bound to it, because the cpu_active bit is not yet set. It selects
a fallback runqueue instead, which is neither expected nor desirable
behaviour.
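
The fallback decision described above can be pictured with a small
userspace sketch. This is not the scheduler's code: select_target_cpu()
and the plain bool array standing in for cpu_active_mask are
hypothetical names, and the "first active cpu" fallback is an assumed
simplification of what the real scheduler does.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of picking a runqueue for a thread bound to one cpu.
 * active[] is a hypothetical stand-in for cpu_active_mask.
 * Returns the bound cpu if it is active, otherwise the first active
 * cpu as a fallback, or -1 if no cpu is active at all.
 */
int select_target_cpu(const bool active[], int nr_cpus, int bound_cpu)
{
	assert(bound_cpu >= 0 && bound_cpu < nr_cpus);

	/* Preferred case: the cpu the thread is bound to is active. */
	if (active[bound_cpu])
		return bound_cpu;

	/* Bound cpu is online but not yet active: fall back. */
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		if (active[cpu])
			return cpu;

	return -1;
}
```

With cpu0 active and cpu1 online but not yet active, a wakeup of the
thread bound to cpu1 lands on cpu0 in this model. Once cpu1 is marked
active, the same call returns cpu1.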

So far this has only been observed with forced hard/softirq threading,
but in theory this could happen without forced threaded hard/softirqs
as well. It's probably unobservable as it would take a massive
interrupt storm on the newly onlined cpu which causes the softirq loop
to wake up the softirq thread and an even longer delay of the cpu
which waits for the cpu_online bit.
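
The timing dependence, and the effect of the wait loop this patch adds
to start_secondary(), can be replayed in a deterministic userspace
sketch. Everything here is an assumption made for illustration:
replay_bringup(), struct bringup_state and the notifier_delay knob do
not exist in the kernel, and the two cpus are stepped in lock step
rather than run concurrently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; the fields only mirror the kernel concepts. */
struct bringup_state {
	bool online;      /* models the cpu_online bit of cpu1 */
	bool active;      /* models the cpu_active bit of cpu1 */
	int  enqueued_on; /* cpu that got the softirq thread, -1 if none */
};

/*
 * Replay the bring-up of cpu1 in lock step with cpu0 and return the
 * cpu on which cpu1's softirq thread is enqueued by the first
 * interrupt. notifier_delay models how many steps cpu0 needs after it
 * saw the online bit before it sets the active bit.
 */
int replay_bringup(bool wait_for_active, int notifier_delay)
{
	struct bringup_state s = { false, false, -1 };
	int cpu1_phase = 0; /* 0: set online, 1: enable irqs, 2: irq, 3: done */
	bool cpu0_saw_online = false;

	assert(notifier_delay >= 0);

	while (cpu1_phase < 3 || !s.active) {
		/* One step of the newly plugged cpu (cpu1). */
		switch (cpu1_phase) {
		case 0:
			s.online = true;
			cpu1_phase = 1;
			break;
		case 1:
			/* The fix: keep spinning until cpu0 marked us active. */
			if (wait_for_active && !s.active)
				break;
			cpu1_phase = 2; /* models local_irq_enable() */
			break;
		case 2:
			/* Timer interrupt wakes the thread bound to cpu1. */
			s.enqueued_on = s.active ? 1 : 0;
			cpu1_phase = 3;
			break;
		default:
			break;
		}

		/* One step of the cpu doing the bring-up (cpu0). */
		if (!cpu0_saw_online)
			cpu0_saw_online = s.online;
		else if (notifier_delay > 0)
			notifier_delay--;
		else
			s.active = true;
	}
	return s.enqueued_on;
}
```

In this model replay_bringup(false, 2) ends with the thread on cpu0,
while replay_bringup(true, 2) ends with it on cpu1. With a delay of 0
the unpatched ordering also happens to pick cpu1, which matches the
observation that the race needs the activating cpu to lag behind.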

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Peter Zijlstra <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
 arch/x86/kernel/smpboot.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -285,6 +285,19 @@ notrace static void __cpuinit start_seco
        per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
        x86_platform.nmi_init();
 
+       /*
+        * Wait until the cpu which brought this one up marked it
+        * online before enabling interrupts. If we don't do that then
+        * we can end up waking up the softirq thread before this cpu
+        * reached the active state, which makes the scheduler unhappy
+        * and schedule the softirq thread on the wrong cpu. This is
+        * only observable with forced threaded interrupts, but in
+        * theory it could also happen w/o them. It's just way harder
+        * to achieve.
+        */
+       while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
+               cpu_relax();
+
        /* enable local interrupts */
        local_irq_enable();
 


Patches currently in stable-queue which might be from [email protected] are

queue-2.6.39/x86-cpu-hotplug-prevent-softirq-wakeup-on-wrong-cpu.patch
queue-2.6.39/x86-devicetree-add-missing-early_init_dt_setup_initrd_arch.patch
queue-2.6.39/genirq-fix-descriptor-init-on-non-sparse-irqs.patch

_______________________________________________
stable mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/stable
