cpu hotplug strangeness in 2.6.24-rc2 (was Re: cpu hotplug support broken in 2.6.23-rc3)

2007-11-15 Thread Pavel Machek
Hi!

> > > Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> > > file:
> > > 
> > 
> > There is a list of maintainers in the Documentation/cpu-hotplug.txt, 
> > which includes maintainers for different platforms as well.
> > 
> > It's a good idea to add that info to the MAINTAINERS file as well.
> 
> Yes, please.

Just an update... In 2.6.24-rc2, cpu hotplug basically works, _but_:

if I do echo 0 > online; echo 0 > online; on the same cpu, I get an error
and can't bring anything back up any more. It is not serious, but it is not
pretty, either.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: cpu hotplug support broken in 2.6.23-rc3

2007-10-02 Thread Pavel Machek
Hi!

> > > Venki sent me an initial patch, but it has issues with the notify
> > > ordering. Find below my "cache the broadcast flags" version for testing.
> > 
> > Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> > a closer look.
> 
> I finally tracked it down. There were several ways to turn the box into
> a brick. Sigh !
> 
> Can you please test the combo patch below ?

Sorry, I was on holidays. I assume this is in -rc9 or so, already?
Yes, seems so.

Unfortunately, cpu hotplug seems to be still behaving strangely in
-rc9. I can echo 0 > online (and cpu will go down). I do echo 0 >
online, again, and I get -EBUSY. Good. But I try to do echo 1 >
online, and get -EBUSY, too... and that's bad :-(.

[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 0 > online
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 0 > online
-bash: echo: write error: Device or resource busy
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# echo 1 > online
-bash: echo: write error: Device or resource busy
[EMAIL PROTECTED]:/sys/devices/system/cpu/cpu1# uname -a
Linux amd 2.6.23-rc9 #507 SMP Tue Oct 2 09:58:40 CEST 2007 i686 GNU/Linux

Kernel says:

Oct  2 11:42:12 amd log1n[1436]: ROOT LOGIN on `tty1'
Oct  2 11:42:56 amd kernel: CPU 1 is now offline
Oct  2 11:42:56 amd kernel: SMP alternatives: switching to UP code

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
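
The sequence reported above can also be replayed from a small C program rather than a shell. The sketch below is an illustration only: write_online() and replay_hotplug_sequence() are hypothetical helper names, not kernel or libc APIs, and the path passed in is assumed to be a cpuN/online file under /sys/devices/system/cpu. Each step reports the errno it got, so an EBUSY ("Device or resource busy") on the final write of 1 is easy to spot.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a single character ('0' or '1') to a sysfs-style control file.
 * Returns 0 on success, otherwise the errno from open()/write().
 * Hypothetical helper, named here for illustration only.
 */
static int write_online(const char *path, char value)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno;

	int err = 0;
	ssize_t n = write(fd, &value, 1);
	if (n < 0)
		err = errno;	/* e.g. EBUSY, as in the transcript */
	else if (n != 1)
		err = EIO;	/* short write without an errno */

	close(fd);
	return err;
}

/* Replay the reported sequence: offline, offline again, then online. */
static void replay_hotplug_sequence(const char *path)
{
	const char steps[] = { '0', '0', '1' };

	for (size_t i = 0; i < sizeof(steps); i++) {
		int err = write_online(path, steps[i]);

		printf("echo %c > %s: %s\n", steps[i], path,
		       err ? strerror(err) : "ok");
	}
}
```

Calling replay_hotplug_sequence("/sys/devices/system/cpu/cpu1/online") as root would mirror the transcript; what it prints depends on the kernel being tested.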


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-15 Thread Andrew Morton
On Sat, 15 Sep 2007 15:28:23 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> > > http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
> > > 
> > 
> > That patch fixes the resume-from-ram and suspend-to-ram regressions on the
> > Vaio.
> > 
> > I dropped the timekeeping.c hunks because they are an older version of
> > timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> > had.
> > 
> > Is this good to go?  Needs a bit of changelogging.
> 
> Changelog it in the git tree. Please pull from there:

who, me?

> The following changes since commit 53a3f3087be361dacfc02e7a85b6d6142a41ce8a:
>   Linus Torvalds (1):
>         Merge branch 'for-linus' of master.kernel.org:/.../cooloney/blackfin-2.6
> 
> are available in the git repository at:
> 
>   ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt.git for-2.6.23
> 
> Thomas Gleixner (6):
>   timekeeping: access rtc outside of xtime lock
>   timekeeping: Prevent time going backwards on resume
>   ACPI: Reevaluate C/P/T states when a cpu becomes online
>   clockevents: Enforce oneshot broadcast when broadcast mask is set on resume
>   clockevents: do not shutdown the oneshot broadcast device
>   clockevents: prevent stale tick update on offline cpu

please send it to Linus?


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-15 Thread Thomas Gleixner
On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> > http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
> > 
> 
> That patch fixes the resume-from-ram and suspend-to-ram regressions on the
> Vaio.
> 
> I dropped the timekeeping.c hunks because they are an older version of
> timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> had.
> 
> Is this good to go?  Needs a bit of changelogging.

Changelog it in the git tree. Please pull from there:

The following changes since commit 53a3f3087be361dacfc02e7a85b6d6142a41ce8a:
  Linus Torvalds (1):
        Merge branch 'for-linus' of master.kernel.org:/.../cooloney/blackfin-2.6

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt.git for-2.6.23

Thomas Gleixner (6):
  timekeeping: access rtc outside of xtime lock
  timekeeping: Prevent time going backwards on resume
  ACPI: Reevaluate C/P/T states when a cpu becomes online
  clockevents: Enforce oneshot broadcast when broadcast mask is set on resume
  clockevents: do not shutdown the oneshot broadcast device
  clockevents: prevent stale tick update on offline cpu

 drivers/acpi/processor_core.c |   21 +
 kernel/time/tick-broadcast.c  |   24
 kernel/time/tick-sched.c      |   12
 kernel/time/timekeeping.c     |   10 +-
 4 files changed, 58 insertions(+), 9 deletions(-)
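
To make the fifth commit in this list concrete ("do not shutdown the oneshot broadcast device"), here is a user-space toy model, not kernel code. The struct and function names are invented for illustration, CPU numbers are assumed to fit in a 32-bit mask, and the "old" variant only mirrors the cpus_empty()/CLOCK_EVT_MODE_SHUTDOWN check that the tick-broadcast.c hunk of the combo patch removes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the oneshot broadcast state; all names are illustrative. */
struct toy_broadcast {
	uint32_t oneshot_mask;	/* one bit per CPU relying on the broadcast device */
	bool device_running;	/* false once the device has been shut down */
};

/* Behaviour before the patch: shut the device down when the mask empties. */
void toy_shutdown_old(struct toy_broadcast *bc, unsigned int cpu)
{
	bc->oneshot_mask &= ~(UINT32_C(1) << cpu);
	if (bc->oneshot_mask == 0)
		bc->device_running = false;
}

/* Behaviour after the patch: clear the dead CPU's bit, leave the device alone. */
void toy_shutdown_new(struct toy_broadcast *bc, unsigned int cpu)
{
	bc->oneshot_mask &= ~(UINT32_C(1) << cpu);
}
```

With only CPU 1 set in the mask, taking CPU 1 offline through the old variant leaves device_running false, while the new variant leaves it true.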




Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-15 Thread Thomas Gleixner
On Sat, 2007-09-15 at 03:18 -0700, Andrew Morton wrote:
> On Sat, 15 Sep 2007 11:49:41 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:
>
> I dropped the timekeeping.c hunks because they are an older version of
> timekeeping-prevent-time-going-backwards-on-resume.patch which I already
> had.

Err, no. The timekeeping hunk is redone due to the lockdep fix which I
made.

Thanks,

tglx
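
For readers following the disagreement over the timekeeping.c hunk: as posted in the combo patch, it reads the persistent clock before taking xtime_lock, records the sub-second offset at suspend, and adds that offset back on resume. A minimal user-space sketch of the last two steps follows. All toy_* names are hypothetical, nanosecond values are assumed to be non-negative, and advancing the seconds field by the sleep length is an assumption about code the hunk does not show.

```c
#include <stdint.h>

#define TOY_NSEC_PER_SEC INT64_C(1000000000)

struct toy_time {
	int64_t sec;
	int64_t nsec;
};

struct toy_clock {
	struct toy_time xtime;	/* wall time at second/nanosecond granularity */
	int64_t suspend_nsecs;	/* offset recorded when suspend began */
};

/* Add nanoseconds and carry whole seconds, like a timespec normalisation. */
void toy_add_ns(struct toy_time *t, int64_t ns)
{
	t->nsec += ns;
	while (t->nsec >= TOY_NSEC_PER_SEC) {
		t->nsec -= TOY_NSEC_PER_SEC;
		t->sec++;
	}
}

/* At suspend: remember how far past the last xtime update we were. */
void toy_suspend(struct toy_clock *c, int64_t offset_ns)
{
	c->suspend_nsecs = offset_ns;
}

/* At resume: account for the sleep, then re-apply the saved offset. */
void toy_resume(struct toy_clock *c, int64_t slept_sec)
{
	c->xtime.sec += slept_sec;	/* assumption, see lead-in */
	toy_add_ns(&c->xtime, c->suspend_nsecs);
}
```

Without the saved offset, the time reported after resume could lag the last value handed out before suspend by up to the unrecorded nanoseconds, which is the "time going backwards" case named in the commit title.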




Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-15 Thread Andrew Morton
On Sat, 15 Sep 2007 11:49:41 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-09-14 at 15:15 +0200, Thomas Gleixner wrote:
> > > Venki sent me an initial patch, but it has issues with the notify
> > > ordering. Find below my "cache the broadcast flags" version for testing.
> > 
> > Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> > a closer look.
> 
> I finally tracked it down. There were several ways to turn the box into
> a brick. Sigh !
> 
> Can you please test the combo patch below ?
> 
> The details are available from the for-2.6.23 branch of my hrt git repo:
> 
> http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23
> 

That patch fixes the resume-from-ram and suspend-to-ram regressions on the
Vaio.

I dropped the timekeeping.c hunks because they are an older version of
timekeeping-prevent-time-going-backwards-on-resume.patch which I already
had.

Is this good to go?  Needs a bit of changelogging.


 drivers/acpi/processor_core.c |   21 +
 kernel/time/tick-broadcast.c  |   24
 kernel/time/tick-sched.c      |   12
 3 files changed, 49 insertions(+), 8 deletions(-)

diff -puN drivers/acpi/processor_core.c~cpu-hotplug-support-broken-in-2623-rc3 drivers/acpi/processor_core.c
--- a/drivers/acpi/processor_core.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/drivers/acpi/processor_core.c
@@ -724,6 +724,25 @@ static void acpi_processor_notify(acpi_h
 	return;
 }
 
+static int acpi_cpu_soft_notify(struct notifier_block *nfb,
+		unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	struct acpi_processor *pr = processors[cpu];
+
+	if (action == CPU_ONLINE && pr) {
+		acpi_processor_ppc_has_changed(pr);
+		acpi_processor_cst_has_changed(pr);
+		acpi_processor_tstate_has_changed(pr);
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block acpi_cpu_notifier =
+{
+	.notifier_call = acpi_cpu_soft_notify,
+};
+
 static int acpi_processor_add(struct acpi_device *device)
 {
 	struct acpi_processor *pr = NULL;
@@ -987,6 +1006,7 @@ void acpi_processor_install_hotplug_noti
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	register_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 static
@@ -999,6 +1019,7 @@ void acpi_processor_uninstall_hotplug_no
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	unregister_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 /*
diff -puN kernel/time/tick-broadcast.c~cpu-hotplug-support-broken-in-2623-rc3 kernel/time/tick-broadcast.c
--- a/kernel/time/tick-broadcast.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/kernel/time/tick-broadcast.c
@@ -382,12 +382,23 @@ static int tick_broadcast_set_event(ktim
 
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
+	int cpu = smp_processor_id();
+
+	/*
+	 * If the CPU is marked for broadcast, enforce oneshot
+	 * broadcast mode. The jinxed VAIO does not resume otherwise.
+	 * No idea why it ends up in a lower C State during resume
+	 * without notifying the clock events layer.
+	 */
+	if (cpu_isset(cpu, tick_broadcast_mask))
+		cpu_set(cpu, tick_broadcast_oneshot_mask);
+
 	clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
 
 	if(!cpus_empty(tick_broadcast_oneshot_mask))
 		tick_broadcast_set_event(ktime_get(), 1);
 
-	return cpu_isset(smp_processor_id(), tick_broadcast_oneshot_mask);
+	return cpu_isset(cpu, tick_broadcast_oneshot_mask);
 }
 
 /*
@@ -549,20 +560,17 @@ void tick_broadcast_switch_to_oneshot(vo
  */
 void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
 {
-	struct clock_event_device *bc;
 	unsigned long flags;
 	unsigned int cpu = *cpup;
 
 	spin_lock_irqsave(&tick_broadcast_lock, flags);
 
-	bc = tick_broadcast_device.evtdev;
+	/*
+	 * Clear the broadcast mask flag for the dead cpu, but do not
+	 * stop the broadcast device!
+	 */
 	cpu_clear(cpu, tick_broadcast_oneshot_mask);
 
-	if (tick_broadcast_device.mode == TICKDEV_MODE_ONESHOT) {
-		if (bc && cpus_empty(tick_broadcast_oneshot_mask))
-			clockevents_set_mode(bc, CLOCK_EVT_MODE_SHUTDOWN);
-	}
-
 	spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 
diff -puN kernel/time/tick-sched.c~cpu-hotplug-support-broken-in-2623-rc3 kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c~cpu-hotplug-support-broken-in-2623-rc3
+++ a/kernel/time/tick-sched.c
@@ -160,6 +160,18 @@ void tick_nohz_stop_sched_tick(void)
 	cpu = smp_processor_id();
 	ts = &per_cpu(tick_cpu_sched, cpu);
 
+	/*
+	 * If this cpu is offline and it is the one which updates
+	 *

Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-15 Thread Thomas Gleixner
Pavel,

On Fri, 2007-09-14 at 15:15 +0200, Thomas Gleixner wrote:
> > Venki sent me an initial patch, but it has issues with the notify
> > ordering. Find below my "cache the broadcast flags" version for testing.
> 
> Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
> a closer look.

I finally tracked it down. There were several ways to turn the box into
a brick. Sigh !

Can you please test the combo patch below ?

The details are available from the for-2.6.23 branch of my hrt git repo:

http://git.kernel.org/?p=linux/kernel/git/tglx/linux-2.6-hrt.git;a=shortlog;h=for-2.6.23

Thanks,

tglx

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/kernel/time/timekeeping.c	2007-09-15 11:43:03.0 +0200
@@ -217,6 +217,7 @@ static void change_clocksource(void)
 }
 #else
 static inline void change_clocksource(void) { }
+static inline s64 __get_nsec_offset(void) { return 0; }
 #endif
 
 /**
@@ -280,6 +281,8 @@ void __init timekeeping_init(void)
 static int timekeeping_suspended;
 /* time in seconds when suspend began */
 static unsigned long timekeeping_suspend_time;
+/* xtime offset when we went into suspend */
+static s64 timekeeping_suspend_nsecs;
 
 /**
  * timekeeping_resume - Resumes the generic timekeeping subsystem.
@@ -305,6 +308,8 @@ static int timekeeping_resume(struct sys
 		wall_to_monotonic.tv_sec -= sleep_length;
 		total_sleep_time += sleep_length;
 	}
+	/* Make sure that we have the correct xtime reference */
+	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
 	/* re-base the last cycle value */
 	clock->cycle_last = clocksource_read(clock);
 	clock->error = 0;
@@ -325,9 +330,12 @@ static int timekeeping_suspend(struct sy
 {
 	unsigned long flags;
 
+	timekeeping_suspend_time = read_persistent_clock();
+
 	write_seqlock_irqsave(&xtime_lock, flags);
+	/* Get the current xtime offset */
+	timekeeping_suspend_nsecs = __get_nsec_offset();
 	timekeeping_suspended = 1;
-	timekeeping_suspend_time = read_persistent_clock();
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
 	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
Index: linux-2.6/drivers/acpi/processor_core.c
===================================================================
--- linux-2.6.orig/drivers/acpi/processor_core.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/drivers/acpi/processor_core.c	2007-09-15 11:43:03.0 +0200
@@ -724,6 +724,25 @@ static void acpi_processor_notify(acpi_h
 	return;
 }
 
+static int acpi_cpu_soft_notify(struct notifier_block *nfb,
+		unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	struct acpi_processor *pr = processors[cpu];
+
+	if (action == CPU_ONLINE && pr) {
+		acpi_processor_ppc_has_changed(pr);
+		acpi_processor_cst_has_changed(pr);
+		acpi_processor_tstate_has_changed(pr);
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block acpi_cpu_notifier =
+{
+	.notifier_call = acpi_cpu_soft_notify,
+};
+
 static int acpi_processor_add(struct acpi_device *device)
 {
 	struct acpi_processor *pr = NULL;
@@ -987,6 +1006,7 @@ void acpi_processor_install_hotplug_noti
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	register_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 static
@@ -999,6 +1019,7 @@ void acpi_processor_uninstall_hotplug_no
 			    ACPI_UINT32_MAX,
 			    processor_walk_namespace_cb, &action, NULL);
 #endif
+	unregister_hotcpu_notifier(&acpi_cpu_notifier);
 }
 
 /*
Index: linux-2.6/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-broadcast.c	2007-09-15 11:42:09.0 +0200
+++ linux-2.6/kernel/time/tick-broadcast.c	2007-09-15 11:43:03.0 +0200
@@ -382,12 +382,23 @@ static int tick_broadcast_set_event(ktim
 
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
+	int cpu = smp_processor_id();
+
+	/*
+	 * If the CPU is marked for broadcast, enforce oneshot
+	 * broadcast mode. The jinxed VAIO does not resume otherwise.
+	 * No idea why it ends up in a lower C State during resume
+	 * without notifying the clock events layer.
+	 */
+	if (cpu_isset(cpu, tick_broadcast_mask))
+		cpu_set(cpu, tick_broadcast_oneshot_mask);
+
 	clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
 
 	if(!cpus_empty(tick_broadcast_oneshot_mask))
 		tick_broadcast_set_event(ktime_get(), 1);
 
-	return cpu_isset(smp_processor_id(), tick_broadcast_oneshot_mask);
+   

RE: cpu hotplug support broken in 2.6.23-rc3

2007-09-14 Thread Thomas Gleixner
On Fri, 2007-09-14 at 11:49 -0700, Pallipadi, Venkatesh wrote:
> >> 
> >> Is there a patch you want me to test? Or does Len have anything to
> >> play with?
> >
> >Venki sent me an initial patch, but it has issues with the notify
> >ordering. Find below my "cache the broadcast flags" version 
> >for testing.
> >
> 
> While writing that patch, I knew the solution could not be that simple :(.
> Does the patch work for the online/offline case at least?
> Will look at the Suspend/Resume ordering part in that case.

Yup, the online/offline part works and it helped me to decode the other
reason (/me needs a dark brown paperbag) why Pavel noticed that his box
turned into a brick. I'll send out a full series of fixups (including
your online/offline one) tomorrow morning. I want to give that some more
testing.

Vs. the resume reevaluation: I don't think it's an urgent problem. It's
only my VAIO which does not tell the kernel after resume that the power
supply source has changed. All my other boxen do that and we never had a
complaint about that from other folks.

tglx




RE: cpu hotplug support broken in 2.6.23-rc3

2007-09-14 Thread Pallipadi, Venkatesh
 

>-----Original Message-----
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of 
>Thomas Gleixner
>Sent: Friday, September 14, 2007 5:51 AM
>To: Pavel Machek
>Cc: Rafael J. Wysocki; Jeff Chua; [EMAIL PROTECTED]; 
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; kernel list; Len Brown
>Subject: Re: cpu hotplug support broken in 2.6.23-rc3
>
>Pavel,
>
>On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
>> > I have an as-yet untested fix, which preserves the broadcast state across
>> > the offline state, but Len is looking into it as well, whether we can
>> > just reevaluate the power states (and the broadcast flags) when a cpu
>> > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
>> > that.
>> 
>> Is there a patch you want me to test? Or does Len have anything to
>> play with?
>
>Venki sent me an initial patch, but it has issues with the notify
>ordering. Find below my "cache the broadcast flags" version for testing.
>

While writing that patch, I knew the solution could not be that simple :(.
Does the patch work for the online/offline case at least?
Will look at the Suspend/Resume ordering part in that case.

Thanks,
Venki


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-14 Thread Thomas Gleixner
On Fri, 2007-09-14 at 14:50 +0200, Thomas Gleixner wrote:
> Pavel,
> 
> On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
> > > I have an as-yet untested fix, which preserves the broadcast state across
> > > the offline state, but Len is looking into it as well, whether we can
> > > just reevaluate the power states (and the broadcast flags) when a cpu
> > > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> > > that.
> > 
> > Is there a patch you want me to test? Or does Len have anything to
> > play with?
> 
> Venki sent me an initial patch, but it has issues with the notify
> ordering. Find below my "cache the broadcast flags" version for testing.

Hmmpf, the flag is still cleared when the cpu goes offline. Need to take
a closer look.

tglx




Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-14 Thread Thomas Gleixner
Pavel,

On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
> > I have an as-yet untested fix, which preserves the broadcast state across
> > the offline state, but Len is looking into it as well, whether we can
> > just reevaluate the power states (and the broadcast flags) when a cpu
> > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> > that.
> 
> Is there a patch you want me to test? Or does Len have anything to
> play with?

Venki sent me an initial patch, but it has issues with the notify
ordering. Find below my "cache the broadcast flags" version for testing.

Thanks,

tglx

---
 kernel/time/tick-broadcast.c |   21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Index: linux-2.6/kernel/time/tick-broadcast.c
===
--- linux-2.6.orig/kernel/time/tick-broadcast.c	2007-09-14 13:22:29.0 +0200
+++ linux-2.6/kernel/time/tick-broadcast.c	2007-09-14 13:22:29.0 +0200
@@ -261,10 +261,25 @@ void tick_broadcast_on_off(unsigned long
 	int cpu = get_cpu();
 
 	if (!cpu_isset(*oncpu, cpu_online_map)) {
-		printk(KERN_ERR "tick-braodcast: ignoring broadcast for "
-		       "offline CPU #%d\n", *oncpu);
-	} else {
+		unsigned long flags;
+
+		spin_lock_irqsave(&tick_broadcast_lock, flags);
+		/*
+		 * We need to cache the broadcast flag for offline
+		 * CPUs. ACPI currently does not reevaluate the
+		 * broadcast flag when a CPU goes online again. Adding
+		 * a cpu notifier to ACPI is probably the correct
+		 * solution, but it is hard to get this correct due to
+		 * notify ordering problems. So caching the flag is
+		 * the safe solution for now.
+		 */
+		if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ON)
+			cpu_set(*oncpu, tick_broadcast_mask);
+		else
+			cpu_clear(*oncpu, tick_broadcast_mask);
 
+		spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+	} else {
 		if (cpu == *oncpu)
 			tick_do_broadcast_on_off(&reason);
 		else




Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-14 Thread Pavel Machek
Hi!

> > > What was the last known to work version ?
> > 
> > I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
> > timeframe... so I'm not sure if it ever worked for me.
> > 
> > I can confirm it is working in 2.6.23-rc5 with highres disabled, and
> > broken with highres enabled. NOHZ turns "waits for keypress during
> > unplug/replug" into "just plain hangs".
> 
> Ok, I can reproduce it and I tracked down what happens:
> 
> When the CPU goes offline, the clock event source for this CPU (lapic)
> is removed from the clock events framework. This also clears the
> information that the CPU is using C-States which stop the local APIC
> timer.
> 
> Now you put the CPU online again and the local APIC timer is used, but
> the C-State information is not evaluated again in ACPI. This means that
> the clock events code does not know that the APIC might stop. In the
> worst case this will happen and make the CPU wait for timer interrupts
> forever.
> 
> The problem only appears when you are on battery (c3/c4 available) or on
> those broken machines, where C2 is in reality C3 (e.g. akpm's VAIO)
> 
> I have an as-yet untested fix, which preserves the broadcast state across
> the offline state, but Len is looking into it as well, whether we can
> just reevaluate the power states (and the broadcast flags) when a cpu
> becomes online again. If Len can do that easily for 2.6.23, I'd prefer
> that.

Is there a patch you want me to test? Or does Len have anything to
play with?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-13 Thread Thomas Gleixner

On Tue, 2007-09-04 at 09:27 +0200, Pavel Machek wrote:
> > On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > > > Ok, so it gets weirder. I have now machine in "hung" state; other
> > > > consoles still work, but there are no timers -  sleep 1 hangs forever.
> > > > 
> > > > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> > > > 
> > > > So I indeed suspect difference-in-kconfig to trigger this, and will
> > > > try disabling noidlehz.
> > > 
> > > I would unset CONFIG_HIGH_RES_TIMERS for starters.
> > > 
> > > Well, I guess Thomas should know about that. ;-)
> > 
> > What was the last known to work version ?
> 
> I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
> timeframe... so I'm not sure if it ever worked for me.
> 
> I can confirm it is working in 2.6.23-rc5 with highres disabled, and
> broken with highres enabled. NOHZ turns "waits for keypress during
> unplug/replug" into "just plain hangs".

Ok, I can reproduce it and I tracked down what happens:

When the CPU goes offline, the clock event source for this CPU (lapic)
is removed from the clock events framework. This also clears the
information that the CPU is using C-States which stop the local APIC
timer.

Now you put the CPU online again and the local APIC timer is used, but
the C-State information is not evaluated again in ACPI. This means that
the clock events code does not know that the APIC might stop. In the
worst case this will happen and make the CPU wait for timer interrupts
forever.

The problem only appears when you are on battery (c3/c4 available) or on
those broken machines, where C2 is in reality C3 (e.g. akpm's VAIO)

I have an as-yet untested fix, which preserves the broadcast state across
the offline state, but Len is looking into it as well, whether we can
just reevaluate the power states (and the broadcast flags) when a cpu
becomes online again. If Len can do that easily for 2.6.23, I'd prefer
that.

tglx
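
The failure sequence described above can be followed in a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
struct broadcast_model and the model_*() functions are hypothetical names
invented for illustration, CPUs are bits in a 32-bit mask, and the
keep_flag_on_offline switch stands in for the idea of preserving the
broadcast state across the offline state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 32	/* toy limit: one bit per CPU */

/* Hypothetical model state; none of these names exist in the kernel. */
struct broadcast_model {
	uint32_t online;		/* CPUs currently online */
	uint32_t timer_stops;		/* CPUs whose local timer stops in deep C-states */
	uint32_t broadcast;		/* CPUs the timer code knows need a broadcast tick */
	bool keep_flag_on_offline;	/* true models "preserve the flag across offline" */
};

static uint32_t cpu_bit(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return UINT32_C(1) << cpu;
}

/* ACPI evaluates the C-states and tells the timer code the result. */
static void model_acpi_evaluate(struct broadcast_model *m, unsigned int cpu,
				bool timer_stops)
{
	if (timer_stops) {
		m->timer_stops |= cpu_bit(cpu);
		m->broadcast |= cpu_bit(cpu);
	} else {
		m->timer_stops &= ~cpu_bit(cpu);
		m->broadcast &= ~cpu_bit(cpu);
	}
}

/* Offlining removes the CPU's clock event device; the flag goes with it. */
static void model_cpu_offline(struct broadcast_model *m, unsigned int cpu)
{
	m->online &= ~cpu_bit(cpu);
	if (!m->keep_flag_on_offline)
		m->broadcast &= ~cpu_bit(cpu);
}

/* Onlining does not trigger model_acpi_evaluate() again in this model. */
static void model_cpu_online(struct broadcast_model *m, unsigned int cpu)
{
	m->online |= cpu_bit(cpu);
}

/* Online, timer can stop, but no broadcast flag: the CPU may wait forever. */
static bool model_timer_may_hang(const struct broadcast_model *m,
				 unsigned int cpu)
{
	uint32_t bit = cpu_bit(cpu);

	return (m->online & bit) && (m->timer_stops & bit) &&
	       !(m->broadcast & bit);
}
```

Bringing CPU 1 online, evaluating it with timer_stops = true, then taking it
offline and online again makes model_timer_may_hang() return true when
keep_flag_on_offline is false, and false when it is true.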




Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-04 Thread Pavel Machek
> On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > > Ok, so it gets weirder. I have now machine in "hung" state; other
> > > consoles still work, but there are no timers -  sleep 1 hangs forever.
> > > 
> > > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> > > 
> > > So I indeed suspect difference-in-kconfig to trigger this, and will
> > > try disabling noidlehz.
> > 
> > I would unset CONFIG_HIGH_RES_TIMERS for starters.
> > 
> > Well, I guess Thomas should know about that. ;-)
> 
> What was the last known to work version ?

I'm afraid I only turned on HIGH_RES_TIMERS in 2.6.23-rc1
timeframe... so I'm not sure if it ever worked for me.

I can confirm it is working in 2.6.23-rc5 with highres disabled, and
broken with highres enabled. NOHZ turns "waits for keypress during
unplug/replug" into "just plain hangs".
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Thomas Gleixner
On Mon, 2007-09-03 at 12:19 +0200, Rafael J. Wysocki wrote:
> > Ok, so it gets weirder. I have now machine in "hung" state; other
> > consoles still work, but there are no timers -  sleep 1 hangs forever.
> > 
> > sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> > 
> > So I indeed suspect difference-in-kconfig to trigger this, and will
> > try disabling noidlehz.
> 
> I would unset CONFIG_HIGH_RES_TIMERS for starters.
> 
> Well, I guess Thomas should know about that. ;-)

What was the last known to work version ?

tglx




Re: highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]

2007-09-03 Thread Jeff Chua
On 9/3/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

> It gets weirder. With "nohz=off" on commandline, I have to press any
> key (generate interrupt?) for echo 1  > online to finish. 2.6.23-rc5
> kernel... but hotplug/unplug works reliably now.
>
> With nohz=off highres=off I can unplug/replug cpus as much as I
> want... running in tight loop now.

Yes. CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS have to be unset or
suspend-to-disk would just hang, unless you type something on the
keyboard, and then you can suspend to disk. It seems interrupts are
not missing.

Thanks,
Jeff.


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Pavel Machek
On Wed 2007-08-29 13:38:27, Gautham R Shenoy wrote:
> Hi Pavel,
> On Mon, Aug 27, 2007 at 12:43:50PM +0200, Pavel Machek wrote:
> > Hi!
> > 
> > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > x60, i386 architecture).
> > 
> 
> That's strange. 
> 
> I've been running cpu offline/online tests with kern bench, 
> cpufreq-ondemand and a few rt-tasks running in the background
> and it has worked for me. 
> Something like 100 iterations without a problem. But these were on
> machines with 4-8 cpus.  So may be this could be something specific to
> the dual cpu machine.

Seems like it is specific to nohz/highrestimers. 

> Can you post the .config? I'll try to recreate it?

Will send privately.

> > Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> > file:
> > 
> 
> There is a list of maintainers in the Documentation/cpu-hotplug.txt, 
> which includes maintainers for different platforms as well.
> 
> It's a good idea to add that info to the MAINTAINERS file as well.

Yes, please.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]

2007-09-03 Thread Pavel Machek
Hi!

> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
> 
> Mine still survives with this ... with sleep 1 ...
> 
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> >/sys/devices/system/cpu/cpu1/online; sleep 1; done
> 
> and this as well ... without sleep ...
> 
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> >/sys/devices/system/cpu/cpu1/online; done
> 
> I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> cloud lkml. If anyone wants the config, please let me know. Is mime
> "attachment" acceptable now on lkml?

It gets weirder. With "nohz=off" on commandline, I have to press any
key (generate interrupt?) for echo 1  > online to finish. 2.6.23-rc5
kernel... but hotplug/unplug works reliably now.

With nohz=off highres=off I can unplug/replug cpus as much as I
want... running in tight loop now.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
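
The offline/online loop quoted above can also be driven from C, which makes
it easier to see which cycle fails. This is a minimal sketch under stated
assumptions: toggle_cpu_online() is a hypothetical helper name, it takes the
path of the "online" file as a parameter, and it only reports write errors.
It cannot detect a hang.

```c
#include <stdio.h>

/*
 * Write "0" and "1" alternately to a CPU's sysfs "online" file.
 * Returns 0 if every write succeeded, otherwise the 1-based number
 * of the first cycle that failed.
 */
static int toggle_cpu_online(const char *path, int cycles)
{
	for (int i = 0; i < cycles; i++) {
		FILE *f = fopen(path, "w");

		if (!f)
			return i + 1;

		/* i % 2 mirrors the shell loop: 0 = offline, 1 = online */
		int bad = fprintf(f, "%d\n", i % 2) < 0;

		/* a rejected write may only show up when the buffer is flushed */
		if (fclose(f) != 0 || bad)
			return i + 1;
	}
	return 0;
}
```

Run as root, toggle_cpu_online("/sys/devices/system/cpu/cpu1/online", 100)
performs the same 100 transitions as the shell loop.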


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Rafael J. Wysocki
On Monday, 3 September 2007 05:47, Pavel Machek wrote:
> Hi!
> 
> > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > so cycles at one point.
> > 
> > Mine still survives with this ... with sleep 1 ...
> > 
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> > >/sys/devices/system/cpu/cpu1/online; sleep 1; done
> > 
> > and this as well ... without sleep ...
> > 
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> > >/sys/devices/system/cpu/cpu1/online; done
> > 
> > I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> > cloud lkml. If anyone wants the config, please let me know. Is mime
> > "attachment" acceptable now on lkml?
> 
> Ok, so it gets weirder. I have now machine in "hung" state; other
> consoles still work, but there are no timers -  sleep 1 hangs forever.
> 
> sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
> 
> So I indeed suspect difference-in-kconfig to trigger this, and will
> try disabling noidlehz.

I would unset CONFIG_HIGH_RES_TIMERS for starters.

Well, I guess Thomas should know about that. ;-)

Greetings,
Rafael


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Pavel Machek
Hi!

> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
> 
> Mine still survives with this ... with sleep 1 ...
> 
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> >/sys/devices/system/cpu/cpu1/online; sleep 1; done
> 
> and this as well ... without sleep ...
> 
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2))
> >/sys/devices/system/cpu/cpu1/online; done
> 
> I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> cloud lkml. If anyone wants the config, please let me know. Is mime
> "attachment" acceptable now on lkml?

Ok, so it gets weirder. I have now machine in "hung" state; other
consoles still work, but there are no timers -  sleep 1 hangs forever.

sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.

So I indeed suspect difference-in-kconfig to trigger this, and will
try disabling noidlehz.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Pavel Machek
Hi!

  Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
  so cycles at one point.
 
 Mine still survives with this ... with sleep 1 ...
 
 # for((i=0; i100; i++)); do echo $i; echo $((i % 2))
 /sys/devices/system/cpu/cpu1/online; sleep 1; done
 
 and this as well ... without sleep ...
 
 # for((i=0; i100; i++)); do echo $i; echo $((i % 2))
 /sys/devices/system/cpu/cpu1/online; done
 
 I'm on reiserfs. gcc 3.4.5. Config sent to you seperately so as not to
 cloud lkml. If anyone wants the config, please let me know. Is mime
 attachment acceptable now on lkml?

Ok, so it gets weirder. I have now machine in hung state; other
consoles still work, but there are no timers -  sleep 1 hangs forever.

sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.

So I indeed suspect difference-in-kconfig to trigger this, and will
try disabling noidlehz.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]

2007-09-03 Thread Pavel Machek
Hi!

> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
>
> Mine still survives with this ... with sleep 1 ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done
>
> and this as well ... without sleep ...
>
> # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done
>
> I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> cloud lkml. If anyone wants the config, please let me know. Is mime
> "attachment" acceptable now on lkml?

It gets weirder. With nohz=off on commandline, I have to press any
key (generate interrupt?) for echo 1 > online to finish. 2.6.23-rc5
kernel... but hotplug/unplug works reliably now.

With nohz=off highres=off I can unplug/replug cpus as much as I
want... running in tight loop now.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
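
To confirm which of these boot parameters a running kernel was actually started with, the command line can be read back from /proc/cmdline. A minimal sketch: the helper name `has_boot_param` is made up for illustration, and its optional second argument exists only so the function can be tried on a sample string instead of the live system.

```shell
#!/bin/bash
# has_boot_param PARAM [CMDLINE]
# Succeeds if PARAM (for example "nohz=off") appears as a whole word in
# CMDLINE. CMDLINE defaults to the running kernel's /proc/cmdline.
# (Hypothetical helper, not an existing tool.)
has_boot_param() {
    local param="$1"
    local cmdline="${2-$(cat /proc/cmdline 2>/dev/null)}"
    # Pad with spaces so that only whole words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}

# Report the two parameters mentioned in this message.
for p in nohz=off highres=off; do
    if has_boot_param "$p"; then
        echo "$p: present"
    else
        echo "$p: absent"
    fi
done
```

Matching whole words keeps a parameter such as `nohz=off` from being confused with a longer string that merely contains it.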


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Rafael J. Wysocki
On Monday, 3 September 2007 05:47, Pavel Machek wrote:
> Hi!
>
> > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > so cycles at one point.
> >
> > Mine still survives with this ... with sleep 1 ...
> >
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done
> >
> > and this as well ... without sleep ...
> >
> > # for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done
> >
> > I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
> > cloud lkml. If anyone wants the config, please let me know. Is mime
> > "attachment" acceptable now on lkml?
>
> Ok, so it gets weirder. I have now machine in hung state; other
> consoles still work, but there are no timers -  sleep 1 hangs forever.
>
> sysrq-t shows kstopmachine hung in hrtimer_try_to_cancel.
>
> So I indeed suspect difference-in-kconfig to trigger this, and will
> try disabling noidlehz.

I would unset CONFIG_HIGH_RES_TIMERS for starters.

Well, I guess Thomas should know about that. ;-)

Greetings,
Rafael


Re: cpu hotplug support broken in 2.6.23-rc3

2007-09-03 Thread Pavel Machek
On Wed 2007-08-29 13:38:27, Gautham R Shenoy wrote:
> Hi Pavel,
> On Mon, Aug 27, 2007 at 12:43:50PM +0200, Pavel Machek wrote:
> > Hi!
> >
> > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > x60, i386 architecture).
> >
>
> That's strange.
>
> I've been running cpu offline/online tests with kernbench,
> cpufreq-ondemand and a few rt-tasks running in the background
> and it has worked for me.
> Something like 100 iterations without a problem. But these were on
> machines with 4-8 cpus.  So maybe this could be something specific to
> the dual cpu machine.

Seems like it is specific to nohz/highrestimers.

> Can you post the .config? I'll try to recreate it.

Will send privately.

> > Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> > file:
> >
>
> There is a list of maintainers in the Documentation/cpu-hotplug.txt,
> which includes maintainers for different platforms as well.
>
> It's a good idea to add that info to the MAINTAINERS file as well.

Yes, please.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: highres timers break cpu hotplug in 2.6.23-rc5 [was Re: cpu hotplug support broken in 2.6.23-rc3]

2007-09-03 Thread Jeff Chua
On 9/3/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

> It gets weirder. With nohz=off on commandline, I have to press any
> key (generate interrupt?) for echo 1 > online to finish. 2.6.23-rc5
> kernel... but hotplug/unplug works reliably now.

> With nohz=off highres=off I can unplug/replug cpus as much as I
> want... running in tight loop now.

Yes. CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS have to be unset or
suspend-to-disk would just hang, unless you type something on the
keyboard, and then you can suspend to disk. It seems interrupts are
not missing.

Thanks,
Jeff.
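
Whether a given kernel was built with these two options can be checked against its .config. A small sketch, assuming the config file is available as plain text: the helper name `config_enabled` and the default path `/boot/config-$(uname -r)` are illustrative assumptions (many distributions install the config there, but not all), so pass your own .config path as the first argument if it lives elsewhere.

```shell
#!/bin/bash
# config_enabled OPTION CONFIG_FILE
# Succeeds if OPTION is built in (OPTION=y) in the given kernel .config.
# (Hypothetical helper, not an existing tool.)
config_enabled() {
    grep -q "^$1=y\$" "$2"
}

# Assumed default location; override with the first argument.
config="${1:-/boot/config-$(uname -r)}"
if [ -r "$config" ]; then
    for opt in CONFIG_NO_HZ CONFIG_HIGH_RES_TIMERS; do
        if config_enabled "$opt" "$config"; then
            echo "$opt is set"
        else
            echo "$opt is not set"
        fi
    done
fi
```

An option that is switched off shows up in a .config as a comment line (`# CONFIG_NO_HZ is not set`), which the anchored pattern does not match.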


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-29 Thread Gautham R Shenoy
Hi Pavel,
On Mon, Aug 27, 2007 at 12:43:50PM +0200, Pavel Machek wrote:
> Hi!
> 
> Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> x60, i386 architecture).
> 

That's strange. 

I've been running cpu offline/online tests with kernbench,
cpufreq-ondemand and a few rt-tasks running in the background
and it has worked for me.
Something like 100 iterations without a problem. But these were on
machines with 4-8 cpus.  So maybe this could be something specific to
the dual cpu machine.

Can you post the .config? I'll try to recreate it.

It's really strange, since you mention that all it took was
an echo 1/0 into the sysfs file to break it.

> Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> file:
> 

There is a list of maintainers in the Documentation/cpu-hotplug.txt, 
which includes maintainers for different platforms as well.

It's a good idea to add that info to the MAINTAINERS file as well.

Thanks and Regards
gautham.
-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-28 Thread Jeff Chua
On 8/28/07, Pavel Machek <[EMAIL PROTECTED]> wrote:

> Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> so cycles at one point.

Mine still survives with this ... with sleep 1 ...

# for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; sleep 1; done

and this as well ... without sleep ...

# for((i=0; i<100; i++)); do echo $i; echo $((i % 2)) >/sys/devices/system/cpu/cpu1/online; done

I'm on reiserfs. gcc 3.4.5. Config sent to you separately so as not to
cloud lkml. If anyone wants the config, please let me know. Is mime
"attachment" acceptable now on lkml?

Thanks,
Jeff.
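
The loops above only write to the sysfs file. A slightly more defensive variant also reads the value back after every write and stops at the first mismatch, which makes a failed transition visible. This is a sketch, not a tested reproduction recipe: the function name `toggle_cpu` and its arguments are made up for illustration, and nothing is run unless a file and a cycle count are given on the command line.

```shell
#!/bin/bash
# toggle_cpu ONLINE_FILE CYCLES
# Alternately writes 0 and 1 to ONLINE_FILE (starting with 0) and checks
# that the same value reads back; returns 1 at the first failure.
# (Hypothetical helper, not an existing tool.)
toggle_cpu() {
    local file="$1"
    local cycles="$2"
    local i=0 want got
    while [ "$i" -lt "$cycles" ]; do
        want=$((i % 2))
        echo "cycle $i: writing $want"
        if ! echo "$want" > "$file"; then
            echo "cycle $i: write failed" >&2
            return 1
        fi
        got=$(cat "$file")
        if [ "$got" != "$want" ]; then
            echo "cycle $i: wrote $want, read back $got" >&2
            return 1
        fi
        i=$((i + 1))
    done
}

# Runs only when called explicitly, e.g. as root:
#   ./toggle.sh /sys/devices/system/cpu/cpu1/online 100
if [ $# -eq 2 ]; then
    toggle_cpu "$1" "$2"
fi
```

With an even cycle count the last write is a 1, so the CPU is left online. Note that this cannot detect a hard hang of the kind reported in this thread; if the write itself never returns, the loop simply stops at the last printed cycle.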


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-28 Thread Akinobu Mita
2007/8/28, Rafael J. Wysocki <[EMAIL PROTECTED]>:
> On Monday, 27 August 2007 23:58, Pavel Machek wrote:
> > On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> > > On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > > > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > > > Hi!
> > > > > > >
> > > > > > > Trying to do few onlines/offlines reliably hangs my machine 
> > > > > > > (thinkpad
> > > > > > > x60, i386 architecture).
> > > > >
> > > > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > > > and my system still survives.
> > > >
> > > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > > so cycles at one point.
> > > >
> > > > ...or maybe difference is in the .config, or maybe I broken something
> > > > in my kernel sources

I have been doing plenty of CPU offline/online tests these days and they work fine.
But there is no cpufreq driver which supports my machine, so my tests didn't
cover the cpu hotplug code in cpufreq.

If you have a cpufreq driver and it is built as a module, it is worth trying
the same test after unloading the cpufreq driver in order to narrow down the
problem area.


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-28 Thread Rafael J. Wysocki
On Monday, 27 August 2007 23:58, Pavel Machek wrote:
> On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> > On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > > Hi!
> > > > > >
> > > > > > Trying to do few onlines/offlines reliably hangs my machine 
> > > > > > (thinkpad
> > > > > > x60, i386 architecture).
> > > > 
> > > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > > and my system still survives.
> > > 
> > > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > > so cycles at one point.
> > > 
> > > ...or maybe difference is in the .config, or maybe I broken something
> > > in my kernel sources
> > 
> > Well, something seems to be wrong with the CPU hotplug, but it's insanely
> > difficult to reproduce on my boxes.
> > 
> > I bet on one of the notifiers blocking while waiting on a frozen task.
> 
> It happens reliably for me, with this script... and randomly, when I
> just echo 0/1 > online from commandline... so it should not be
> anything with the frozen tasks.

That suggests the CPU hotplug just deadlocks internally.

Can you put some printk's into _cpu_down() and see where exactly it hangs?

> echo test > /sys/power/disk
> echo disk > /sys/power/state
> 
> reliably hangs on resume in the attached script. It works ok with
> nosmp.

Which step hangs it?  Or is it at random?

Rafael


> #!/bin/bash
> killall klogd
> 
> echo -n "testing refrigerator (testproc)..."
> echo testproc > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
> 
> sleep 2
> echo -n "testing drivers (test)..."
> echo test > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
> 
> sleep 2
> echo -n "testing swsusp (reboot)..."
> echo reboot > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
> 
> sleep 2
> echo -n "testing s2ram..."
> s2ram
> echo "okay"
> 
> sleep 2
> echo -n "testing swsusp (shutdown)..."
> echo shutdown > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
> 
> sleep 2
> echo -n "testing swsusp (platform)..."
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> echo "okay"
> 
> sleep 2
> echo -n "testing s2ram..."
> s2ram
> echo "okay"
>  
> 

-- 
"Premature optimization is the root of all evil." - Donald Knuth


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Pavel Machek
On Mon 2007-08-27 23:59:31, Rafael J. Wysocki wrote:
> On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> > On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > > Hi!
> > > > >
> > > > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > > > x60, i386 architecture).
> > > 
> > > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > > and my system still survives.
> > 
> > Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> > so cycles at one point.
> > 
> > ...or maybe difference is in the .config, or maybe I broken something
> > in my kernel sources
> 
> Well, something seems to be wrong with the CPU hotplug, but it's insanely
> difficult to reproduce on my boxes.
> 
> I bet on one of the notifiers blocking while waiting on a frozen task.

It happens reliably for me, with this script... and randomly, when I
just echo 0/1 > online from commandline... so it should not be
anything with the frozen tasks.

echo test > /sys/power/disk
echo disk > /sys/power/state

reliably hangs on resume in the attached script. It works ok with
nosmp.

Pavel

#!/bin/bash
killall klogd

echo -n "testing refrigerator (testproc)..."
echo testproc > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing drivers (test)..."
echo test > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing swsusp (reboot)..."
echo reboot > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing s2ram..."
s2ram
echo "okay"

sleep 2
echo -n "testing swsusp (shutdown)..."
echo shutdown > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing swsusp (platform)..."
echo platform > /sys/power/disk
echo disk > /sys/power/state
echo "okay"

sleep 2
echo -n "testing s2ram..."
s2ram
echo "okay"
 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
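
To find out which step of the script above was in progress when the machine hung, each step can be recorded in a log file and flushed with sync before it starts. A rough sketch, not part of the original script: `run_step`, its arguments and the log path are made up for illustration; the mode names are the ones the script already writes to /sys/power/disk and may not exist on other kernels; the s2ram steps are left out.

```shell
#!/bin/bash
# run_step LOGFILE POWER_DIR MODE
# Records MODE in LOGFILE and flushes it to disk *before* starting the
# test, so the last line of the log names the step that was running if
# the machine hangs. POWER_DIR is normally /sys/power.
# (Hypothetical helper, not an existing tool.)
run_step() {
    local log="$1"
    local dir="$2"
    local mode="$3"
    echo "starting: $mode" >> "$log"
    sync
    echo "$mode" > "$dir/disk" || return 1
    echo disk > "$dir/state" || return 1
    echo "finished: $mode" >> "$log"
    sync
}

# Runs only when a log file is given, e.g. as root:
#   ./steps.sh /root/suspend-steps.log
if [ $# -eq 1 ]; then
    for mode in testproc test reboot shutdown platform; do
        run_step "$1" /sys/power "$mode" || break
        sleep 2
    done
fi
```

After a hang and reboot, a log that ends in a `starting:` line without the matching `finished:` line points at the step that did not complete.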


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Rafael J. Wysocki
On Monday, 27 August 2007 23:32, Pavel Machek wrote:
> On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> > On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > > x60, i386 architecture).
> > 
> > I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> > and my system still survives.
> 
> Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
> so cycles at one point.
> 
> ...or maybe difference is in the .config, or maybe I broken something
> in my kernel sources

Well, something seems to be wrong with the CPU hotplug, but it's insanely
difficult to reproduce on my boxes.

I bet on one of the notifiers blocking while waiting on a frozen task.

Greetings,
Rafael


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Pavel Machek
On Mon 2007-08-27 22:36:57, Jeff Chua wrote:
> On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > Hi!
> > >
> > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > x60, i386 architecture).
> 
> I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> and my system still survives.

Can you try 20-or-so tests? Mine hangs randomly, so it survived 4 or
so cycles at one point.

...or maybe the difference is in the .config, or maybe I broke something
in my kernel sources
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Michal Piotrowski
Hi,

On 27/08/07, Jeff Chua <[EMAIL PROTECTED]> wrote:
> On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > > Hi!
> > >
> > > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > > x60, i386 architecture).
>
> I just 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
> and my system still survives.

So maybe a diff between your config file and Pavel's will give an answer.

Any details about the software environment?

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/
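
A plain diff of two .config files tends to be noisy because of ordering and comment lines. One way to compare only the options that are actually set is sketched below; the function name `config_diff` is made up for illustration, and the two file names are whatever the two configs were saved as.

```shell
#!/bin/bash
# config_diff FILE_A FILE_B
# Prints the CONFIG_ options that differ between two kernel .config files.
# Lines starting with "<" come from FILE_A, lines starting with ">" from
# FILE_B. Returns 0 if differences were printed, 1 if the files agree.
# (Hypothetical helper, not an existing tool.)
config_diff() {
    local tmp_a tmp_b status
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    # Keep only options that are set, sorted so ordering does not matter.
    grep '^CONFIG_' "$1" | sort > "$tmp_a"
    grep '^CONFIG_' "$2" | sort > "$tmp_b"
    diff "$tmp_a" "$tmp_b" | grep '^[<>]'
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}

# Example: ./config-diff.sh config.jeff config.pavel
if [ $# -eq 2 ]; then
    config_diff "$1" "$2"
fi
```

Because unset options are stored as comments, an option that is enabled in only one of the two files appears on one side of the output only.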


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Jeff Chua
On 8/27/07, Pavel Machek <[EMAIL PROTECTED]> wrote:
> On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> > Hi!
> >
> > Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> > x60, i386 architecture).

I just ran 3 cycles of on-line/off-line on 2.6.23-rc3 on ThinkPad x60s,
and my system still survives.

Jeff.


Re: cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Pavel Machek
On Mon 2007-08-27 12:43:50, Pavel Machek wrote:
> Hi!
> 
> Trying to do few onlines/offlines reliably hangs my machine (thinkpad
> x60, i386 architecture).
> 
> Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
> file:
> 
> [EMAIL PROTECTED]:/data/l/linux$ grep CPU MAINTAINERS
> CPU FREQUENCY DRIVERS
> CPUID/MSR DRIVER
> CPUSETS
> i386 SETUP CODE / CPU ERRATA WORKAROUNDS
> SCx200 CPU SUPPORT

...plus it actually breaks suspend, and it is regression from 2.6.22.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


cpu hotplug support broken in 2.6.23-rc3

2007-08-27 Thread Pavel Machek
Hi!

Trying to do few onlines/offlines reliably hangs my machine (thinkpad
x60, i386 architecture).

Plus I guess it would be nice to add CPU HOTPLUG into MAINTAINERS
file:

[EMAIL PROTECTED]:/data/l/linux$ grep CPU MAINTAINERS
CPU FREQUENCY DRIVERS
CPUID/MSR DRIVER
CPUSETS
i386 SETUP CODE / CPU ERRATA WORKAROUNDS
SCx200 CPU SUPPORT
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html