cpu hotplug strangeness in 2.6.24-rc2 (was Re: cpu hotplug support broken in 2.6.23-rc3)
Hi!

> > > Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> > > file:
> >
> > There is a list of maintainers in the Documentation/cpu-hotplug.txt,
> > which includes maintainers for different platforms as well.
> >
> > It's a good idea to add that info to the MAINTAINERS file as well.
>
> Yes, please.

Just an update... In 2.6.24-rc2, cpu hotplug basically works, _but_: if I do

echo 0 > online; echo 0 > online;

at same cpu, I get error, and can't up anything any more. It is not serious, but it is not pretty, either.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: cpu hotplug support broken in 2.6.23-rc3
Hi!

> > > Venki sent me an initial patch, but it has issues with the notify
> > > ordering. Find below my "cache the broadcast flags" version for testing.
> >
> > Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> > a closer look.
>
> I finally tracked it down. There were several ways to turn the box into
> a brick. Sigh !
>
> Can you please test the combo patch below ?

Sorry, I was on holidays. I assume this is in -rc9 or so, already? Yes, seems so.

Unfortunately, cpu hotplug seems to be still behaving strangely in -rc9. I can echo 0 > online (and cpu will go down). I do echo 0 > online, again, and I get -EBUSY. Good. But I try to do echo 1 > online, and get -EBUSY, too... and that's bad :-(.

[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 0 > online
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 0 > online
-bash: echo: write error: Device or resource busy
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 1 > online
-bash: echo: write error: Device or resource busy
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# uname -a
Linux amd 2.6.23-rc9 #507 SMP Tue Oct 2 09:58:40 CEST 2007 i686 GNU/Linux

Kernel says:

Oct 2 11:42:12 amd log1n[1436]: ROOT LOGIN on `tty1'
Oct 2 11:42:56 amd kernel: CPU 1 is now offline
Oct 2 11:42:56 amd kernel: SMP alternatives: switching to UP code

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
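For anyone who wants to replay the sequence above without a shell, here is a minimal userspace sketch in C. The names set_cpu_online() and replay_sequence() are made up for this illustration; only the sysfs file /sys/devices/system/cpu/cpu1/online comes from the report. It assumes the caller has permission to write that file, and it reports the errno the kernel hands back for each step, so a healthy kernel would be expected to show an error only for the second offline request.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "0" or "1" to a CPU's sysfs "online" file.
 * Returns 0 on success or a negative errno value (for example -EBUSY).
 */
int set_cpu_online(const char *path, int online)
{
	const char *val = online ? "1" : "0";
	ssize_t n;
	int fd, ret = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, val, 1);
	if (n < 0)
		ret = -errno;
	else if (n != 1)
		ret = -EIO;	/* short write, should not happen for 1 byte */

	close(fd);
	return ret;
}

/* Replays the reported sequence: offline, offline again, online. */
void replay_sequence(const char *path)
{
	const int steps[] = { 0, 0, 1 };
	size_t i;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
		int ret = set_cpu_online(path, steps[i]);

		printf("echo %d > online: %s\n", steps[i],
		       ret ? strerror(-ret) : "ok");
	}
}
```

Calling replay_sequence("/sys/devices/system/cpu/cpu1/online") as root should print one line per step; the interesting case is whether the last step (bringing the CPU back up) reports "Device or resource busy".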
Re: cpu hotplug support broken in 2.6.23-rc3
On Sat, 15 Sep 2007 15:28:23 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> > > http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
> >
> > That patch fixes the resume-from-ram and suspend-to-ram regressions on the
> > Vaio.
> >
> > I dropped the timekeeping.c hunks because they are an older version of
> > timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> > had.
> >
> > Is this good to go? Needs a bit of changelogging.
>
> Changelog it in the git tree. Please pull from there:

who, me?

> The following changes since commit 53a3f3087be361dacfc02e7a85b6d6142a41ce8a:
>   Linus Torvalds (1):
>         Merge branch 'for-linus' of master.kernel.org:/.../cooloney/blackfin-2.6
>
> are available in the git repository at:
>
>   ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt.git for-2.6.23
>
> Thomas Gleixner (6):
>       timekeeping: access rtc outside of xtime lock
>       timekeeping: Prevent time going backwards on resume
>       ACPI: Reevaluate C/P/T states when a cpu becomes online
>       clockevents: Enforce oneshot broadcast when broadcast mask is set on resume
>       clockevents: do not shutdown the oneshot broadcast device
>       clockevents: prevent stale tick update on offline cpu

please send it to Linus?
Re: cpu hotplug support broken in 2.6.23-rc3
On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> > http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
>
> That patch fixes the resume-from-ram and suspend-to-ram regressions on the
> Vaio.
>
> I dropped the timekeeping.c hunks because they are an older version of
> timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> had.
>
> Is this good to go? Needs a bit of changelogging.

Changelog it in the git tree. Please pull from there:

The following changes since commit 53a3f3087be361dacfc02e7a85b6d6142a41ce8a:
  Linus Torvalds (1):
        Merge branch 'for-linus' of master.kernel.org:/.../cooloney/blackfin-2.6

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt.git for-2.6.23

Thomas Gleixner (6):
      timekeeping: access rtc outside of xtime lock
      timekeeping: Prevent time going backwards on resume
      ACPI: Reevaluate C/P/T states when a cpu becomes online
      clockevents: Enforce oneshot broadcast when broadcast mask is set on resume
      clockevents: do not shutdown the oneshot broadcast device
      clockevents: prevent stale tick update on offline cpu

 drivers/acpi/processor_core.c |   21
 kernel/time/tick-broadcast.c  |   24
 kernel/time/tick-sched.c      |   12
 kernel/time/timekeeping.c     |   10
 4 files changed, 58 insertions(+), 9 deletions(-)
Re: cpu hotplug support broken in 2.6.23-rc3
On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> On Sat, 15 Sep 2007 11:49:41 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:
>
> I dropped the timekeeping.c hunks because they are an older version of
> timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> had.

Err, no. The timekeeping hunk is redone due to the lockdep fix which I made.

Thanks,

	tglx
Re: cpu hotplug support broken in 2.6.23-rc3
On Sat, 15 Sep 2007 11:49:41 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-09-14 at 15:15 +0200, Thomas Gleixner wrote:
> > > Venki sent me an initial patch, but it has issues with the notify
> > > ordering. Find below my "cache the broadcast flags" version for testing.
> >
> > Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> > a closer look.
>
> I finally tracked it down. There were several ways to turn the box into
> a brick. Sigh !
>
> Can you please test the combo patch below ?
>
> The details are available from the for-2.6.23 branch of my hrt git repo:
>
> http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
>

That patch fixes the resume-from-ram and suspend-to-ram regressions on the Vaio.

I dropped the timekeeping.c hunks because they are an older version of timekeeping-prevent-time-going-backwards-on-resume.patch which I already had.

Is this good to go? Needs a bit of changelogging.

 drivers/acpi/processor_core.c |   21
 kernel/time/tick-broadcast.c  |   24
 kernel/time/tick-sched.c      |   12
 3 files changed, 49 insertions(+), 8 deletions(-)

diff -puN drivers/acpi/processor_core.c~cpu-hotplug-support-broken-in-2623-rc3 drivers/acpi/processor_core.c
--- a/drivers/acpi/processor_core.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/drivers/acpi/processor_core.c
@@ -724,6 +724,25 @@ static void acpi_processor_notify(acpi_h
 	return;
 }
 
+static int acpi_cpu_soft_notify(struct notifier_block *nfb,
+		unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	struct acpi_processor *pr = processors[cpu];
+
+	if (action == CPU_ONLINE && pr) {
+		acpi_processor_ppc_has_changed(pr);
+		acpi_processor_cst_has_changed(pr);
+		acpi_processor_tstate_has_changed(pr);
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block acpi_cpu_notifier =
+{
+	.notifier_call = acpi_cpu_soft_notify,
+};
+
 static int acpi_processor_add(struct acpi_device *device)
 {
 	struct acpi_processor *pr = NULL;
@@ -987,6 +1006,7 @@ void acpi_processor_install_hotplug_noti
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	register_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 static
@@ -999,6 +1019,7 @@ void acpi_processor_uninstall_hotplug_no
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	unregister_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 /*
diff -puN kernel/time/tick-broadcast.c~cpu-hotplug-support-broken-in-2623-rc3 kernel/time/tick-broadcast.c
--- a/kernel/time/tick-broadcast.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/kernel/time/tick-broadcast.c
@@ -382,12 +382,23 @@ static int tick_broadcast_set_event(ktim
 
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
+	int cpu = smp_processor_id();
+
+	/*
+	 * If the CPU is marked for broadcast, enforce oneshot
+	 * broadcast mode. The jinxed VAIO does not resume otherwise.
+	 * No idea why it ends up in a lower C State during resume
+	 * without notifying the clock events layer.
+	 */
+	if (cpu_isset(cpu, tick_broadcast_mask))
+		cpu_set(cpu, tick_broadcast_oneshot_mask);
+
 	clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
 
 	if(!cpus_empty(tick_broadcast_oneshot_mask))
 		tick_broadcast_set_event(ktime_get(), 1);
 
-	return cpu_isset(smp_processor_id(), tick_broadcast_oneshot_mask);
+	return cpu_isset(cpu, tick_broadcast_oneshot_mask);
 }
 
 /*
@@ -549,20 +560,17 @@ void tick_broadcast_switch_to_oneshot(vo
  */
 void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
 {
-	struct clock_event_device *bc;
 	unsigned long flags;
 	unsigned int cpu = *cpup;
 
 	spin_lock_irqsave(&tick_broadcast_lock, flags);
 
-	bc = tick_broadcast_device.evtdev;
+	/*
+	 * Clear the broadcast mask flag for the dead cpu, but do not
+	 * stop the broadcast device!
+	 */
 	cpu_clear(cpu, tick_broadcast_oneshot_mask);
 
-	if (tick_broadcast_device.mode == TICKDEV_MODE_ONESHOT) {
-		if (bc && cpus_empty(tick_broadcast_oneshot_mask))
-			clockevents_set_mode(bc, CLOCK_EVT_MODE_SHUTDOWN);
-	}
-
 	spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 
diff -puN kernel/time/tick-sched.c~cpu-hotplug-support-broken-in-2623-rc3 kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/kernel/time/tick-sched.c
@@ -160,6 +160,18 @@ void tick_nohz_stop_sched_tick(void)
 	cpu = smp_processor_id();
 	ts = &per_cpu(tick_cpu_sched, cpu);
 
+	/*
+	 * If this cpu is offline and it is the one which updates
+	 *
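The effect of the tick_shutdown_broadcast_oneshot() hunk can be illustrated with a small userspace model. This is only a sketch: struct bc_state, enum bc_mode and both function names are invented for the illustration, the mask is a plain unsigned long instead of a cpumask, and the locking plus the checks on tick_broadcast_device.mode and on a NULL device are left out. It shows the one behavioural difference visible in the diff: the old code put the broadcast device into shutdown once the oneshot mask became empty, the new code only clears the dead CPU's bit.

```c
/* Hypothetical stand-ins for the clock event modes used in the patch. */
enum bc_mode { BC_MODE_ONESHOT, BC_MODE_SHUTDOWN };

/* Hypothetical model of the broadcast device state. */
struct bc_state {
	unsigned long oneshot_mask;	/* one bit per CPU relying on broadcast */
	enum bc_mode mode;
};

/* Behaviour removed by the patch: the last CPU out stops the device. */
void shutdown_oneshot_old(struct bc_state *bc, unsigned int cpu)
{
	bc->oneshot_mask &= ~(1UL << cpu);
	if (bc->oneshot_mask == 0)
		bc->mode = BC_MODE_SHUTDOWN;
}

/* Behaviour after the patch: clear the bit, leave the device alone. */
void shutdown_oneshot_new(struct bc_state *bc, unsigned int cpu)
{
	bc->oneshot_mask &= ~(1UL << cpu);
}
```

With CPU 1 as the only bit set, the old variant leaves the model in BC_MODE_SHUTDOWN after CPU 1 goes away, while the new variant keeps BC_MODE_ONESHOT, so a CPU that needs the broadcast later still finds a running device. Whether that alone explains every "brick" scenario mentioned in the thread is not something this model can show.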
Re: cpu hotplug support broken in 2.6.23-rc3
Pavel,

On Fri, 2007-09-14 at 15:15 +0200, Thomas Gleixner wrote:
> > Venki sent me an initial patch, but it has issues with the notify
> > ordering. Find below my "cache the broadcast flags" version for testing.
>
> Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> a closer look.

I finally tracked it down. There were several ways to turn the box into a brick. Sigh !

Can you please test the combo patch below ?

The details are available from the for-2.6.23 branch of my hrt git repo:

http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23

Thanks,

	tglx

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/kernel/time/timekeeping.c	2007-09-15 11:43:03.0 +0200
@@ -217,6 +217,7 @@ static void change_clocksource(void)
 }
 #else
 static inline void change_clocksource(void) { }
+static inline s64 __get_nsec_offset(void) { return 0; }
 #endif
 
 /**
@@ -280,6 +281,8 @@ void __init timekeeping_init(void)
 static int timekeeping_suspended;
 /* time in seconds when suspend began */
 static unsigned long timekeeping_suspend_time;
+/* xtime offset when we went into suspend */
+static s64 timekeeping_suspend_nsecs;
 
 /**
  * timekeeping_resume - Resumes the generic timekeeping subsystem.
@@ -305,6 +308,8 @@ static int timekeeping_resume(struct sys
 		wall_to_monotonic.tv_sec -= sleep_length;
 		total_sleep_time += sleep_length;
 	}
+	/* Make sure that we have the correct xtime reference */
+	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
 	/* re-base the last cycle value */
 	clock->cycle_last = clocksource_read(clock);
 	clock->error = 0;
@@ -325,9 +330,12 @@ static int timekeeping_suspend(struct sy
 {
 	unsigned long flags;
 
+	timekeeping_suspend_time = read_persistent_clock();
+
 	write_seqlock_irqsave(&xtime_lock, flags);
+	/* Get the current xtime offset */
+	timekeeping_suspend_nsecs = __get_nsec_offset();
 	timekeeping_suspended = 1;
-	timekeeping_suspend_time = read_persistent_clock();
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
 	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
Index: linux-2.6/drivers/acpi/processor_core.c
===================================================================
--- linux-2.6.orig/drivers/acpi/processor_core.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/drivers/acpi/processor_core.c	2007-09-15 11:43:03.0 +0200
@@ -724,6 +724,25 @@ static void acpi_processor_notify(acpi_h
 	return;
 }
 
+static int acpi_cpu_soft_notify(struct notifier_block *nfb,
+		unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	struct acpi_processor *pr = processors[cpu];
+
+	if (action == CPU_ONLINE && pr) {
+		acpi_processor_ppc_has_changed(pr);
+		acpi_processor_cst_has_changed(pr);
+		acpi_processor_tstate_has_changed(pr);
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block acpi_cpu_notifier =
+{
+	.notifier_call = acpi_cpu_soft_notify,
+};
+
 static int acpi_processor_add(struct acpi_device *device)
 {
 	struct acpi_processor *pr = NULL;
@@ -987,6 +1006,7 @@ void acpi_processor_install_hotplug_noti
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	register_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 static
@@ -999,6 +1019,7 @@ void acpi_processor_uninstall_hotplug_no
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	unregister_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 /*
Index: linux-2.6/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-broadcast.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/kernel/time/tick-broadcast.c	2007-09-15 11:43:03.0 +0200
@@ -382,12 +382,23 @@ static int tick_broadcast_set_event(ktim
 
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
+	int cpu = smp_processor_id();
+
+	/*
+	 * If the CPU is marked for broadcast, enforce oneshot
+	 * broadcast mode. The jinxed VAIO does not resume otherwise.
+	 * No idea why it ends up in a lower C State during resume
+	 * without notifying the clock events layer.
+	 */
+	if (cpu_isset(cpu, tick_broadcast_mask))
+		cpu_set(cpu, tick_broadcast_oneshot_mask);
+
 	clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
 
 	if(!cpus_empty(tick_broadcast_oneshot_mask))
 		tick_broadcast_set_event(ktime_get(), 1);
 
-	return cpu_isset(smp_processor_id(), tick_broadcast_oneshot_mask);
+
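The timekeeping.c hunks in the patch above save the sub-second offset returned by __get_nsec_offset() at suspend time and add it back to xtime at resume. The arithmetic can be sketched in userspace C as follows. Everything here is an assumption made for illustration: struct ts, ts_add_ns() and resume_xtime() are invented names, ts_add_ns() only mimics what an "add nanoseconds to a timespec" helper has to do (carry into the seconds field), the offset is assumed to be non-negative, and the seconds adjustment by the sleep length is modelled on the context lines of the resume hunk.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical userspace stand-in for a kernel timespec. */
struct ts {
	int64_t sec;
	int64_t nsec;
};

/* Add a non-negative nanosecond offset, carrying into the seconds. */
void ts_add_ns(struct ts *t, int64_t ns)
{
	ns += t->nsec;
	t->sec += ns / NSEC_PER_SEC;
	t->nsec = ns % NSEC_PER_SEC;
}

/*
 * Model of the resume path: advance the wall time saved at suspend by
 * the time spent asleep, then fold in the nanosecond offset that had
 * accumulated past the last xtime update when suspend began.
 */
struct ts resume_xtime(struct ts xtime_at_suspend, int64_t suspend_nsecs,
		       int64_t sleep_secs)
{
	struct ts now = xtime_at_suspend;

	now.sec += sleep_secs;
	ts_add_ns(&now, suspend_nsecs);
	return now;
}
```

Without the saved offset, the resumed clock would restart from the last xtime update and could report a time slightly earlier than the last value read before suspend, which is presumably what the "Prevent time going backwards on resume" commit title refers to.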
RE: cpu hotplug support broken in 2.6.23-rc3
On Fri, 2007-09-14 at 11:49 -0700, Pallipadi, Venkatesh wrote:
> >>
> >> Is there a patch you want me to test? Or does Len have anything to
> >> play with?
> >
> >Venki sent me an initial patch, but it has issues with the notify
> >ordering. Find below my "cache the broadcast flags" version
> >for testing.
> >
>
> While writing that patch, I knew solution could not be that simple :(.
> Does the patch work for online offline case at least?
> Will look at the Suspend/Resume ordering part in that case.

Yup, the online/offline part works and it helped me to decode the other reason (/me needs a dark brown paperbag) why Pavel noticed that his box turned into a brick.

I'll send out a full series of fixups (including your online/offline one) tomorrow morning. I want to give that some more testing.

Vs. the resume reevaluation: I don't think it's an urgent problem. It's only my VAIO which does not tell the kernel after resume that the power supply source has changed. All my other boxen do that and we never had a complaint about that from other folks.

	tglx
RE: cpu hotplug support broken in 2.6.23-rc3
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of
>Thomas Gleixner
>Sent: Friday, September 14, 2007 5:51 AM
>To: Pavel Machek
>Cc: Rafael J. Wysocki; Jeff Chua; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; kernel list; Len Brown
>Subject: Re: cpu hotplug support broken in 2.6.23-rc3
>
>Pavel,
>
>On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
>> > I have a yet untested fix, which preserves the broadcast state across
>> > the offline state, but Len is looking into it as well, whether we can
>> > just reevaluate the power states (and the broadcast flags) when a cpu
>> > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
>> > that.
>>
>> Is there a patch you want me to test? Or does Len have anything to
>> play with?
>
>Venki sent me an initial patch, but it has issues with the notify
>ordering. Find below my "cache the broadcast flags" version
>for testing.
>

While writing that patch, I knew solution could not be that simple :(.
Does the patch work for online offline case at least?
Will look at the Suspend/Resume ordering part in that case.

Thanks,
Venki
Re: cpu hotplug support broken in 2.6.23-rc3
On Fri, 2007-09-14 at 14:50 +0200, Thomas Gleixner wrote:
> Pavel,
>
> On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
> > > I have a yet untested fix, which preserves the broadcast state across
> > > the offline state, but Len is looking into it as well, whether we can
> > > just reevaluate the power states (and the broadcast flags) when a cpu
> > > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> > > that.
> >
> > Is there a patch you want me to test? Or does Len have anything to
> > play with?
>
> Venki sent me an initial patch, but it has issues with the notify
> ordering. Find below my "cache the broadcast flags" version for testing.

Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
a closer look.

	tglx
Re: cpu hotplug support broken in 2.6.23-rc3
Pavel,

On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
> > I have a yet untested fix, which preserves the broadcast state across
> > the offline state, but Len is looking into it as well, whether we can
> > just reevaluate the power states (and the broadcast flags) when a cpu
> > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> > that.
>
> Is there a patch you want me to test? Or does Len have anything to
> play with?

Venki sent me an initial patch, but it has issues with the notify
ordering. Find below my "cache the broadcast flags" version for testing.

Thanks,

	tglx

---
 kernel/time/tick-broadcast.c |   21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Index: linux-2.6/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-broadcast.c	2007-09-14 13:22:29.000000000 +0200
+++ linux-2.6/kernel/time/tick-broadcast.c	2007-09-14 13:22:29.000000000 +0200
@@ -261,10 +261,25 @@ void tick_broadcast_on_off(unsigned long
 	int cpu = get_cpu();
 
 	if (!cpu_isset(*oncpu, cpu_online_map)) {
-		printk(KERN_ERR "tick-braodcast: ignoring broadcast for "
-		       "offline CPU #%d\n", *oncpu);
-	} else {
+		unsigned long flags;
+
+		spin_lock_irqsave(&tick_broadcast_lock, flags);
+		/*
+		 * We need to cache the broadcast flag for offline
+		 * CPUs. ACPI currently does not reevaluate the
+		 * broadcast flag when a CPU goes online again. Adding
+		 * a cpu notifier to ACPI is probably the correct
+		 * solution, but it is hard to get this correct due to
+		 * notify ordering problems. So caching the flag is
+		 * the safe solution for now.
+		 */
+		if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ON)
+			cpu_set(*oncpu, tick_broadcast_mask);
+		else
+			cpu_clear(*oncpu, tick_broadcast_mask);
+		spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+	} else {
 
 		if (cpu == *oncpu)
 			tick_do_broadcast_on_off(&reason);
 		else
Re: cpu hotplug support broken in 2.6.23-rc3
Hi!

> > > What was the last known to work version ?
> >
> > I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
> > timeframe... so I'm not sure if it ever worked for me.
> >
> > I can confirm it is working in 2.6.23-rc5 with highres disabled, and
> > broken with highres enabled. NOHZ turns "waits for keypress during
> > unplug/replug" into "just plain hangs".
>
> Ok, I can reproduce it and I tracked down what happens:
>
> When the CPU goes offline, the clock event source for this CPU (lapic)
> is removed from the clock events framework. This also clears the
> information that the CPU is using C-States which stop the local APIC
> timer.
>
> Now you put the CPU online again and the local APIC timer is used, but
> the C-State information is not evaluated again in ACPI. This means that
> the clock events code does not know that the APIC might stop. In the
> worst case this will happen and make the CPU wait for timer interrupts
> forever.
>
> The problem only appears when you are on battery (c3/c4 available) or on
> those broken machines, where C2 is in reality C3 (e.g. akpm's VAIO)
>
> I have a yet untested fix, which preserves the broadcast state across
> the offline state, but Len is looking into it as well, whether we can
> just reevaluate the power states (and the broadcast flags) when a cpu
> becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> that.

Is there a patch you want me to test? Or does Len have anything to
play with?
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: cpu hotplug support broken in 2.6.23-rc3
On Tue, 2007-09-04 at 09:27 +0200, Pavel Machek wrote:
> > On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > > > Ok, so it gets weirder. I have now machine in "hung" state; other
> > > > consoles still work, but there are no timers - sleep 1 hangs forever.
> > > >
> > > > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> > > >
> > > > So I indeed suspect difference-in-kconfig to trigger this, and will
> > > > try disabling noidlehz.
> > >
> > > I would unset CONFIG_HIGH_RES_TIMERS for starters.
> > >
> > > Well, I guess Thomas should know about that. ;-)
> >
> > What was the last known to work version ?
>
> I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
> timeframe... so I'm not sure if it ever worked for me.
>
> I can confirm it is working in 2.6.23-rc5 with highres disabled, and
> broken with highres enabled. NOHZ turns "waits for keypress during
> unplug/replug" into "just plain hangs".

Ok, I can reproduce it and I tracked down what happens:

When the CPU goes offline, the clock event source for this CPU (lapic)
is removed from the clock events framework. This also clears the
information that the CPU is using C-States which stop the local APIC
timer.

Now you put the CPU online again and the local APIC timer is used, but
the C-State information is not evaluated again in ACPI. This means that
the clock events code does not know that the APIC might stop. In the
worst case this will happen and make the CPU wait for timer interrupts
forever.

The problem only appears when you are on battery (c3/c4 available) or on
those broken machines, where C2 is in reality C3 (e.g. akpm's VAIO)

I have a yet untested fix, which preserves the broadcast state across
the offline state, but Len is looking into it as well, whether we can
just reevaluate the power states (and the broadcast flags) when a cpu
becomes online again. If Len can do that easily for 2.6.23, I'd prefer
that.

	tglx
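The lost broadcast state described above can probably be observed from user space. The following is only a sketch under assumptions: it assumes /proc/timer_list is available and prints a line of the form "tick_broadcast_mask: 00000003" (the exact format differs between kernel versions, and reading the file may need root). The helper name cpu_in_broadcast_mask is made up for illustration, and masks wider than the shell's integer arithmetic are not handled.

```shell
# Sketch: succeed if the given CPU's bit is set in tick_broadcast_mask.
# Assumption: the file (default /proc/timer_list) contains a line like
#   tick_broadcast_mask: 00000003
# Returns 0 if the bit is set, 1 if it is clear, 2 if no mask line is found.
cpu_in_broadcast_mask() {
    cpu=$1
    file=${2:-/proc/timer_list}
    # First matching line only; keep the hex digits and drop any commas.
    # The anchored pattern does not match tick_broadcast_oneshot_mask.
    mask=$(sed -n 's/^tick_broadcast_mask:[[:space:]]*\([0-9a-fA-F,]*\).*$/\1/p' "$file" \
        | head -n 1 | tr -d ',')
    [ -n "$mask" ] || return 2
    [ $(( (0x$mask >> $cpu) & 1 )) -eq 1 ]
}
```

Running `cpu_in_broadcast_mask 1` on battery before taking cpu1 offline and again after bringing it back would show whether the bit survived the cycle; a bit that was set before and is clear afterwards would be consistent with the failure described here.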
Re: cpu hotplug support broken in 2.6.23-rc3
> On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > > Ok, so it gets weirder. I have now machine in "hung" state; other
> > > consoles still work, but there are no timers - sleep 1 hangs forever.
> > >
> > > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> > >
> > > So I indeed suspect difference-in-kconfig to trigger this, and will
> > > try disabling noidlehz.
> >
> > I would unset CONFIG_HIGH_RES_TIMERS for starters.
> >
> > Well, I guess Thomas should know about that. ;-)
>
> What was the last known to work version ?

I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
timeframe... so I'm not sure if it ever worked for me.

I can confirm it is working in 2.6.23-rc5 with highres disabled, and
broken with highres enabled. NOHZ turns "waits for keypress during
unplug/replug" into "just plain hangs".
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: cpu hotplug support broken in 2.6.23-rc3
On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > Ok, so it gets weirder. I have now machine in "hung" state; other
> > consoles still work, but there are no timers - sleep 1 hangs forever.
> >
> > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> >
> > So I indeed suspect difference-in-kconfig to trigger this, and will
> > try disabling noidlehz.
>
> I would unset CONFIG_HIGH_RES_TIMERS for starters.
>
> Well, I guess Thomas should know about that. ;-)

What was the last known to work version ?

	tglx
Re: highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]
On 9/3/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

> It gets weirder. With "nohz=off" on commandline, I have to press any
> key (generate interrupt?) for echo 1 > online to finish. 2.6.23-rc5
> kernel... but hotplug/unplug works reliably now.
>
> With nohz=off highres=off I can unplug/replug cpus as much as I
> want... running in tight loop now.

Yes. CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS have to be unset or
suspend-to-disk would just hang, unless you type something on the
keyboard, and then you can suspend to disk. It seems interrupts are not
missing.

Thanks,
Jeff.
Re: cpu hotplug support broken in 2.6.23-rc3
On Wed 2007-08-29 13:38:27, Gautham R Shenoy wrote:
> Hi Pavel,
> On Mon, Aug 27, 2007 at 12:43:50PM +0200, Pavel Machek wrote:
> > Hi!
> >
> > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > x60, i386 architecture).
> >
>
> That's strange.
>
> I've been running cpu offline/online tests with kern bench,
> cpufreq-ondemand and a few rt-tasks running in the background
> and it has worked for me.
> Something like 100 iterations without a problem. But these were on
> machines with 4-8 cpus. So may be this could be something specific to
> the dual cpu machine.

Seems like it is specific to nohz/highrestimers.

> Can you post the .config? I'll try to recreate it?

Will send privately.

> > Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> > file:
> >
>
> There is a list of maintainers in the Documentation/cpu-hotplug.txt,
> which includes maintainers for different platforms as well.
>
> It's a good idea to add that info to the MAINTAINERS file as well.

Yes, please.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]
Hi!

> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
>
> Mine still survives with this ... with sleep 1 ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done
>
> and this as well ... without sleep ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done
>
> I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> cloud lkml. If anyone wants the config, please let me know. Is mime
> "attachment" acceptable now on lkml?

It gets weirder. With "nohz=off" on commandline, I have to press any
key (generate interrupt?) for echo 1 > online to finish. 2.6.23-rc5
kernel... but hotplug/unplug works reliably now.

With nohz=off highres=off I can unplug/replug cpus as much as I
want... running in tight loop now.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
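When comparing results like these between machines, it helps to record which of the two boot options were actually in effect. A minimal sketch, assuming the options appear on the kernel command line as nohz=... and highres=... and that /proc/cmdline is readable; the function name timer_boot_opts is made up for illustration.

```shell
# Sketch: print the nohz= and highres= settings from a kernel command line.
# Reads /proc/cmdline by default; pass another file as $1 to try it out.
# An option that is not present is reported as "default".
timer_boot_opts() {
    file=${1:-/proc/cmdline}
    for opt in nohz highres; do
        # One parameter per line, keep the value of the last occurrence.
        val=$(tr ' ' '\n' < "$file" | sed -n "s/^$opt=//p" | tail -n 1)
        echo "$opt=${val:-default}"
    done
}
```

This only reports boot parameters; it says nothing about whether CONFIG_NO_HZ or CONFIG_HIGH_RES_TIMERS were compiled in.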
Re: cpu hotplug support broken in 2.6.23-rc3
On Monday, 3 September 2007 05:47, Pavel Machek wrote:
> Hi!
>
> > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > so cycles at one point.
> >
> > Mine still survives with this ... with sleep 1 ...
> >
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done
> >
> > and this as well ... without sleep ...
> >
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done
> >
> > I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> > cloud lkml. If anyone wants the config, please let me know. Is mime
> > "attachment" acceptable now on lkml?
>
> Ok, so it gets weirder. I have now machine in "hung" state; other
> consoles still work, but there are no timers - sleep 1 hangs forever.
>
> sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
>
> So I indeed suspect difference-in-kconfig to trigger this, and will
> try disabling noidlehz.

I would unset CONFIG_HIGH_RES_TIMERS for starters.

Well, I guess Thomas should know about that. ;-)

Greetings,
Rafael
Re: cpu hotplug support broken in 2.6.23-rc3
Hi!

> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
>
> Mine still survives with this ... with sleep 1 ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done
>
> and this as well ... without sleep ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done
>
> I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> cloud lkml. If anyone wants the config, please let me know. Is mime
> "attachment" acceptable now on lkml?

Ok, so it gets weirder. I have now machine in "hung" state; other
consoles still work, but there are no timers - sleep 1 hangs forever.

sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.

So I indeed suspect difference-in-kconfig to trigger this, and will
try disabling noidlehz.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
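For anyone repeating the offline/online loop quoted above, a slightly more defensive variant might look like the sketch below. It is not taken from the thread: the function name hotplug_cycle and the CPU_SYSFS override are made up for illustration (the override only exists so the loop can be tried against a scratch directory), and on a real system it needs root and writes to /sys/devices/system/cpu.

```shell
# Sketch: toggle one CPU offline/online a number of times via sysfs and
# stop at the first write that fails or does not take effect.
# Usage: hotplug_cycle CPU [COUNT]     (COUNT defaults to 100)
# CPU_SYSFS is a hypothetical override for the sysfs cpu directory.
hotplug_cycle() {
    cpu=$1
    count=${2:-100}
    ctl="${CPU_SYSFS:-/sys/devices/system/cpu}/cpu$cpu/online"
    i=0
    while [ "$i" -lt "$count" ]; do
        state=$(( i % 2 ))              # 0 = offline, 1 = online
        echo "$state" > "$ctl" || return 1
        # Read the state back; a mismatch means the transition failed.
        [ "$(cat "$ctl")" = "$state" ] || return 1
        i=$(( i + 1 ))
    done
    # Leave the CPU online, without writing 1 a second time in a row.
    [ "$(cat "$ctl")" = "1" ] || echo 1 > "$ctl"
}
```

A hang of the kind reported here cannot be detected this way, because the write to the online file simply never returns; the loop only catches transitions that fail with an error.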
Re: cpu hotplug support broken in 2.6.23-rc3
Hi Pavel,

On Mon, Aug 27, 2007 at 12:43:50PM +0200, Pavel Machek wrote:
> Hi!
>
> Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> x60, i386 architecture).
>

That's strange. I've been running cpu offline/online tests with kern
bench, cpufreq-ondemand and a few rt-tasks running in the background
and it has worked for me. Something like 100 iterations without a
problem. But these were on machines with 4-8 cpus. So may be this
could be something specific to the dual cpu machine.

Can you post the .config? I'll try to recreate it?

It's really strange since you mention that all it took was an echo 1/0
into the sysfs file to break it.

> Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> file:
>

There is a list of maintainers in the Documentation/cpu-hotplug.txt,
which includes maintainers for different platforms as well.

It's a good idea to add that info to the MAINTAINERS file as well.

Thanks and Regards
gautham.
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
Re: cpu hotplug support broken in 2.6.23-rc3
On 8/28/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

> Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> so cycles at one point.

Mine still survives with this ... with sleep 1 ...

  # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done

and this as well ... without sleep ...

  # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done

I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
cloud lkml. If anyone wants the config, please let me know. Is mime
"attachment" acceptable now on lkml?

Thanks,
Jeff.
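[Archive note] The one-liners above write blindly to the sysfs file. A slightly more defensive variant, given here only as a sketch under stated assumptions, reads the value back after each write and takes the file path as a parameter so the loop can be dry-run against an ordinary file first. The function name `toggle_online` is hypothetical; the only interface assumed is the `online` file already used in the thread.

```shell
#!/bin/sh
# Sketch only: alternate 0/1 writes to an "online" control file and
# verify each write by reading the value back.
# toggle_online is a hypothetical helper name, not an existing tool.

# toggle_online FILE CYCLES
# Prints the cycle number before each write so that, if the machine
# hangs, the last line on the console shows how far the test got.
toggle_online() {
    to_file="$1"
    to_cycles="$2"
    to_i=0
    while [ "$to_i" -lt "$to_cycles" ]; do
        to_want=$((to_i % 2))          # 0 first (offline), then 1 (online)
        echo "cycle $to_i: writing $to_want"
        echo "$to_want" > "$to_file" || return 1
        to_got=$(cat "$to_file")
        if [ "$to_got" != "$to_want" ]; then
            echo "cycle $to_i: wrote $to_want but read back $to_got" >&2
            return 1
        fi
        to_i=$((to_i + 1))
    done
    echo "completed $to_cycles cycles"
}
```

On a test machine this would be run as root, for example `toggle_online /sys/devices/system/cpu/cpu1/online 100`, which mirrors the 100 iterations used above. With an even cycle count the last write is 1, so the CPU is left online.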
Re: cpu hotplug support broken in 2.6.23-rc3
2007/8/28, Rafael J. Wysocki <[EMAIL PROTECTED]>:
> On Monday, 27 August 2007 23:58, Pavel Machek wrote:
> > On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> > > On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > > > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > > > Hi!
> > > > > > >
> > > > > > > Trying to do few onlines/offlines reliably hangs my machine
> > > > > > > (thinkpad x60, i386 architecture).
> > > > >
> > > > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > > > and my system still survives.
> > > >
> > > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > > so cycles at one point.
> > > >
> > > > ...or maybe difference is in the .config, or maybe I broken something
> > > > in my kernel sources

I have been doing plenty of CPU offline/online tests these days and they
work fine. But there is no cpufreq driver which supports my machine, so
my tests didn't cover the cpu hotplug code in cpufreq.

If you have a cpufreq driver and it is built as a module, it is worth
trying the same test after unloading the cpufreq driver in order to
narrow down the problem area.
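[Archive note] To follow the suggestion above, one first needs to know whether any cpufreq code is loaded as a module at all. A minimal sketch, assuming the usual /proc/modules format (module name in the first column): the helper name `list_modules_matching` is made up for illustration, and the pattern `cpufreq` is a guess, since some frequency drivers carry other names.

```shell
#!/bin/sh
# Sketch only: list loaded modules whose name matches a pattern.
# list_modules_matching is a hypothetical helper name.

# list_modules_matching PATTERN [FILE]
# FILE defaults to /proc/modules; the first field of each line is the
# module name. Prints matching names, one per line.
list_modules_matching() {
    awk -v pat="$1" '$1 ~ pat { print $1 }' "${2:-/proc/modules}"
}
```

Running `list_modules_matching cpufreq` on the affected machine shows the candidates; each could then be unloaded as root (for example with `modprobe -r`) before repeating the online/offline test. If nothing is printed, cpufreq is either built in or absent, and this way of narrowing down the problem does not apply.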
Re: cpu hotplug support broken in 2.6.23-rc3
On Monday, 27 August 2007 23:58, Pavel Machek wrote:
> On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> > On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > > Hi!
> > > > > >
> > > > > > Trying to do few onlines/offlines reliably hangs my machine
> > > > > > (thinkpad x60, i386 architecture).
> > > >
> > > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > > and my system still survives.
> > >
> > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > so cycles at one point.
> > >
> > > ...or maybe difference is in the .config, or maybe I broken something
> > > in my kernel sources
> >
> > Well, something seems to be wrong with the CPU hotplug, but it's insanely
> > difficult to reproduce on my boxes.
> >
> > I bet on one of the notifiers blocking while waiting on a frozen task.
>
> It happens reliably for me, with this script... and randomly, when I
> just echo 0/1 > online from commandline... so it should not be
> anything with the frozen tasks.

That suggests the CPU hotplug just deadlocks internally. Can you put some
printk's into _cpu_down() and see where exactly it hangs?

> echo test > /sys/power/disk
> echo disk > /sys/power/state
>
> reliably hangs on resume in the attached script. It works ok with
> nosmp.

Which step hangs it? Or is it at random?

Rafael

> #!/bin/bash
> killall klogd
>
> echo -n "testing refrigerator (testproc)..."
> echo testproc > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
>
> sleep 2
> echo -n "testing drivers (test)..."
> echo test > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
>
> sleep 2
> echo -n "testing swsusp (reboot)..."
> echo reboot > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
>
> sleep 2
> echo -n "testing s2ram..."
> s2ram
> echo "okay"
>
> sleep 2
> echo -n "testing swsusp (shutdown)..."
> echo shutdown > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
>
> sleep 2
> echo -n "testing swsusp (platform)..."
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
>
> sleep 2
> echo -n "testing s2ram..."
> s2ram
> echo "okay"
>

--
"Premature optimization is the root of all evil." - Donald Knuth
Re: cpu hotplug support broken in 2.6.23-rc3
On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > Hi!
> > > > >
> > > > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > > > x60, i386 architecture).
> > >
> > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > and my system still survives.
> >
> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
> >
> > ...or maybe difference is in the .config, or maybe I broken something
> > in my kernel sources
>
> Well, something seems to be wrong with the CPU hotplug, but it's insanely
> difficult to reproduce on my boxes.
>
> I bet on one of the notifiers blocking while waiting on a frozen task.

It happens reliably for me, with this script... and randomly, when I
just echo 0/1 > online from commandline... so it should not be
anything with the frozen tasks.

echo test > /sys/power/disk
echo disk > /sys/power/state

reliably hangs on resume in the attached script. It works ok with
nosmp.

								Pavel

#!/bin/bash
killall klogd

echo -n "testing refrigerator (testproc)..."
echo testproc > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing drivers (test)..."
echo test > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing swsusp (reboot)..."
echo reboot > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing s2ram..."
s2ram
echo "okay"

sleep 2
echo -n "testing swsusp (shutdown)..."
echo shutdown > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing swsusp (platform)..."
echo platform > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing s2ram..."
s2ram
echo "okay"

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: cpu hotplug support broken in 2.6.23-rc3
On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > > x60, i386 architecture).
> >
> > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > and my system still survives.
>
> Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> so cycles at one point.
>
> ...or maybe difference is in the .config, or maybe I broken something
> in my kernel sources

Well, something seems to be wrong with the CPU hotplug, but it's insanely
difficult to reproduce on my boxes.

I bet on one of the notifiers blocking while waiting on a frozen task.

Greetings,
Rafael
Re: cpu hotplug support broken in 2.6.23-rc3
On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > Hi!
> > >
> > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > x60, i386 architecture).
>
> I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> and my system still survives.

Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
so cycles at one point.

...or maybe difference is in the .config, or maybe I broken something
in my kernel sources

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: cpu hotplug support broken in 2.6.23-rc3
Hi,

On 27/08/07, Jeff Chua <[EMAIL PROTECTED]> wrote:
> On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > Hi!
> > >
> > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > x60, i386 architecture).
>
> I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> and my system still survives.

So maybe diff between your and Pavel's config file will give an answer.
Any details about the software environment?

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/
Re: cpu hotplug support broken in 2.6.23-rc3
On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > Hi!
> >
> > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > x60, i386 architecture).

I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
and my system still survives.

Jeff.
Re: cpu hotplug support broken in 2.6.23-rc3
On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> Hi!
>
> Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> x60, i386 architecture).
>
> Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> file:
>
> [EMAIL PROTECTED]:/data/l/linux$ grep CPU MAINTAINERS
> CPU FREQUENCY DRIVERS
> CPUID/MSR DRIVER
> CPUSETS
> i386 SETUP CODE / CPU ERRATA WORKAROUNDS
> SCx200 CPU SUPPORT

...plus it actually breaks suspend, and it is regression from 2.6.22.

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
cpu hotplug support broken in 2.6.23-rc3
Hi!

Trying to do few onlines/offlines reliably hangs my machine (thinkpad
x60, i386 architecture).

Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
file:

[EMAIL PROTECTED]:/data/l/linux$ grep CPU MAINTAINERS
CPU FREQUENCY DRIVERS
CPUID/MSR DRIVER
CPUSETS
i386 SETUP CODE / CPU ERRATA WORKAROUNDS
SCx200 CPU SUPPORT

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html