Commit:     a263898f628dd21e59210b547986c154788f628e
Parent:     57a04513cb35086d54bcb2cb92e6627fc8fa0fae
Author:     Ingo Molnar <[EMAIL PROTECTED]>
AuthorDate: Sun Dec 30 11:58:17 2007 +0100
Committer:  Linus Torvalds <[EMAIL PROTECTED]>
CommitDate: Sun Jan 6 12:39:42 2008 -0800

    CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU
    make randconfig bootup testing found that the cpufreq code
    crashes on bootup, if the powernow-k8 driver is enabled and
    if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
    First lockdep found out that there's an inconsistent unlock
     [ BUG: bad unlock balance detected! ]
     swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
     [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
     but there are no more locks to release!
    Call Trace:
     [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
     [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
     [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
     [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
     [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
    then shortly afterwards the cpufreq code crashed on an assert:
     ------------[ cut here ]------------
     kernel BUG at drivers/cpufreq/cpufreq.c:1068!
     invalid opcode: 0000 [1] SMP
     Call Trace:
      [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
      [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
      [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
     ---[ end trace 1e9219be2b4431de ]---
    the bug was caused by maxcpus=1 bootup, which brought up the
    secondary core as !cpu_online() but !cpu_is_offline() either,
    which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h):
      /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
      static inline int cpu_is_offline(int cpu) { return 0; }
    but the cpufreq code uses cpu_online() and cpu_is_offline() in
    a mixed way - the low-level drivers use cpu_online(), while
    the cpufreq core uses cpu_is_offline(). This opened up the
    possibility to add the non-initialized sysdev device of the
    secondary core:
     cpufreq-core: trying to register driver powernow-k8
     cpufreq-core: adding CPU 0
     powernow-k8: BIOS error - no PSB or ACPI _PSS objects
     cpufreq-core: initialization failed
     cpufreq-core: adding CPU 1
     cpufreq-core: initialization failed
    which then blew up. The fix is to make cpu_is_offline() always
    the negation of cpu_online(). With that fix applied the kernel
    boots up fine without crashing:
     Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
     powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ 
processors (1 cpu cores) (version 2.20.00)
     powernow-k8: BIOS error - no PSB or ACPI _PSS objects
     initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
     initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
     Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()
    We could fix this by making CPU enumeration aware of max_cpus, but that
    would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
    possibility was quite confusing and a continuous source of bugs too.
    Most distributions have kernels with CPU hotplug enabled, so this bug
    remained hidden for a long time.
    Bug forensics:
    The broken cpu_is_offline() API variant was introduced via:
     commit a59d2e4e6977e7b94e003c96a41f07e96cddc340
     Author: Rusty Russell <[EMAIL PROTECTED]>
     Date:   Mon Mar 8 06:06:03 2004 -0800
         [PATCH] minor cleanups for hotplug CPUs
    ( this predates linux-2.6.git, this commit is available from Thomas's
      historic git tree. )
    Then 1.5 years later the cpufreq code made use of it:
     commit c32b6b8e524d2c337767d312814484d9289550cf
     Author: Ashok Raj <[EMAIL PROTECTED]>
     Date:   Sun Oct 30 14:59:54 2005 -0800
         [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers
     +       if (cpu_is_offline(cpu))
     +               return 0;
    which is a correct use of the subtly broken new API. v2.6.15 then
    shipped with this bug included.
    then it took two more years for random-kernel qa to hit it.
    Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
    Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
 include/linux/cpu.h     |    4 ----
 include/linux/cpumask.h |    2 ++
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index b79c575..92f2029 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -107,7 +107,6 @@ extern void unlock_cpu_hotplug(void);
 #define register_hotcpu_notifier(nb)   register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
 int cpu_down(unsigned int cpu);
-#define cpu_is_offline(cpu) unlikely(!cpu_online(cpu))
 #else          /* CONFIG_HOTPLUG_CPU */
@@ -122,9 +121,6 @@ static inline void cpuhotplug_mutex_unlock(struct mutex 
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)   ({ (void)(nb); 0; })
 #define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
-/* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
-static inline int cpu_is_offline(int cpu) { return 0; }
 #endif         /* CONFIG_HOTPLUG_CPU */
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 23f5514..85bd790 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -397,6 +397,8 @@ extern cpumask_t cpu_present_map;
 #define cpu_present(cpu)       ((cpu) == 0)
+#define cpu_is_offline(cpu)    unlikely(!cpu_online(cpu))
 #ifdef CONFIG_SMP
 extern int nr_cpu_ids;
 #define any_online_cpu(mask) __any_online_cpu(&(mask))
To unsubscribe from this list: send the line "unsubscribe git-commits-head" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at

Reply via email to