Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-12 Thread Andreas Herrmann
On Fri, Nov 09, 2007 at 12:16:34PM -0800, Andrew Morton wrote:
> On Wed, 7 Nov 2007 23:15:38 +0900
> "Akinobu Mita" <[EMAIL PROTECTED]> wrote:
> 
> > > [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
> > > CPU_UP_PREPARE)
> > >
> > > Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
> > > ("cpu hotplug: mce: fix cpu hotplug error handling").
> > >
> > > For CPUs not brought up during boot (using maxcpus and additional_cpus
> > > parameters) we don't know whether mce is supported or not at 
> > > "CPU_UP_PREPARE"-time.
> > > Thus mce_cpu_callback should be called after the CPU is online.
> > 
> > Thank you for finding and fixing the problem.
> > I added two fixes to your patch:
> > 
> > - Avoid mce_remove_device() for the CPU that is not correctly initialized
> >   by mce_create_device() failure.
> > 
> > - make CPU_ONLINE callback always return NOTIFY_OK.
> >   Because CPU_ONLINE callback return value is always ignored.
> > 
> > > Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>
> > 
> > [EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
> > [EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]
> > 
> > Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>
> 
> Andreas, could you please review and preferably runtime-test this new
> version?

I'll test it today ...

> Assuming that all goes well could one of you please prepare a final patch
> with a complete changelog?

... and can send a final patch after successful testing.


Regards,

Andreas


-- 
Operating | AMD Saxony Limited Liability Company & Co. KG,
  System  | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 Research | Register Court Dresden: HRA 4896, General Partner authorized
  Center  | to represent: AMD Saxony LLC (Wilmington, Delaware, US)
  (OSRC)  | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-12 Thread Andreas Herrmann
On Fri, Nov 09, 2007 at 12:16:34PM -0800, Andrew Morton wrote:
 On Wed, 7 Nov 2007 23:15:38 +0900
 Akinobu Mita [EMAIL PROTECTED] wrote:
 
   [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
   CPU_UP_PREPARE)
  
   Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
   (cpu hotplug: mce: fix cpu hotplug error handling).
  
   For CPUs not brought up during boot (using maxcpus and additional_cpus
   parameters) we don't know whether mce is supported or not at 
   CPU_UP_PREPARE-time.
   Thus mce_cpu_callback should be called after the CPU is online.
  
  Thank you for finding and fixing the problem.
  I added two fixes to your patch:
  
  - Avoid mce_remove_device() for the CPU that is not correctly initialized
by mce_create_device() failure.
  
  - make CPU_ONLINE callback always return NOTIFY_OK.
Because CPU_ONLINE callback return value is always ignored.
  
   Signed-off-by: Andreas Herrmann [EMAIL PROTECTED]
  
  [EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
  [EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]
  
  Signed-off-by: Akinobu Mita [EMAIL PROTECTED]
 
 Andreas, could you please review and preferably runtime-test this new
 version?

I'll test it today ...

 Assuming that all goes well could one of you please prepare a final patch
 with a complete changelog?

... and can send a final patch after successful testing.


Regards,

Andreas


-- 
Operating | AMD Saxony Limited Liability Company  Co. KG,
  System  | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 Research | Register Court Dresden: HRA 4896, General Partner authorized
  Center  | to represent: AMD Saxony LLC (Wilmington, Delaware, US)
  (OSRC)  | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-09 Thread Andrew Morton
On Wed, 7 Nov 2007 23:15:38 +0900
"Akinobu Mita" <[EMAIL PROTECTED]> wrote:

> > [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
> > CPU_UP_PREPARE)
> >
> > Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
> > ("cpu hotplug: mce: fix cpu hotplug error handling").
> >
> > For CPUs not brought up during boot (using maxcpus and additional_cpus
> > parameters) we don't know whether mce is supported or not at 
> > "CPU_UP_PREPARE"-time.
> > Thus mce_cpu_callback should be called after the CPU is online.
> 
> Thank you for finding and fixing the problem.
> I added two fixes to your patch:
> 
> - Avoid mce_remove_device() for the CPU that is not correctly initialized
>   by mce_create_device() failure.
> 
> - make CPU_ONLINE callback always return NOTIFY_OK.
>   Because CPU_ONLINE callback return value is always ignored.
> 
> > Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>
> 
> [EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
> [EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]
> 
> Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

Andreas, could you please review and preferably runtime-test this new
version?

Assuming that all goes well could one of you please prepare a final patch
with a complete changelog?

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-09 Thread Andrew Morton
On Wed, 7 Nov 2007 23:15:38 +0900
Akinobu Mita [EMAIL PROTECTED] wrote:

  [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
  CPU_UP_PREPARE)
 
  Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
  (cpu hotplug: mce: fix cpu hotplug error handling).
 
  For CPUs not brought up during boot (using maxcpus and additional_cpus
  parameters) we don't know whether mce is supported or not at 
  CPU_UP_PREPARE-time.
  Thus mce_cpu_callback should be called after the CPU is online.
 
 Thank you for finding and fixing the problem.
 I added two fixes to your patch:
 
 - Avoid mce_remove_device() for the CPU that is not correctly initialized
   by mce_create_device() failure.
 
 - make CPU_ONLINE callback always return NOTIFY_OK.
   Because CPU_ONLINE callback return value is always ignored.
 
  Signed-off-by: Andreas Herrmann [EMAIL PROTECTED]
 
 [EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
 [EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]
 
 Signed-off-by: Akinobu Mita [EMAIL PROTECTED]

Andreas, could you please review and preferably runtime-test this new
version?

Assuming that all goes well could one of you please prepare a final patch
with a complete changelog?

Thanks.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-07 Thread Akinobu Mita
> [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
> CPU_UP_PREPARE)
>
> Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
> ("cpu hotplug: mce: fix cpu hotplug error handling").
>
> For CPUs not brought up during boot (using maxcpus and additional_cpus
> parameters) we don't know whether mce is supported or not at 
> "CPU_UP_PREPARE"-time.
> Thus mce_cpu_callback should be called after the CPU is online.

Thank you for finding and fixing the problem.
I added two fixes to your patch:

- Avoid mce_remove_device() for the CPU that is not correctly initialized
  by mce_create_device() failure.

- make CPU_ONLINE callback always return NOTIFY_OK.
  Because CPU_ONLINE callback return value is always ignored.

> Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

[EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
[EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]

Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

---
 arch/x86/kernel/cpu/mcheck/mce_64.c |   18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

Index: 2.6-git/arch/x86/kernel/cpu/mcheck/mce_64.c
===
--- 2.6-git.orig/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ 2.6-git/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -802,6 +802,8 @@ static struct sysdev_attribute *mce_attr
NULL
 };

+static cpumask_t mce_device_initialized = CPU_MASK_NONE;
+
 /* Per cpu sysdev init.  All of the cpus still share the same ctl bank */
 static __cpuinit int mce_create_device(unsigned int cpu)
 {
@@ -825,6 +827,7 @@ static __cpuinit int mce_create_device(u
if (err)
goto error;
}
+   cpu_set(cpu, mce_device_initialized);

return 0;
 error:
@@ -841,10 +844,14 @@ static void mce_remove_device(unsigned i
 {
int i;

+   if (!cpu_isset(cpu, mce_device_initialized))
+   return;
+
for (i = 0; mce_attributes[i]; i++)
sysdev_remove_file(_cpu(device_mce,cpu),
mce_attributes[i]);
sysdev_unregister(_cpu(device_mce,cpu));
+   cpu_clear(cpu, mce_device_initialized);
 }

 /* Get notified when a cpu comes on/off. Be hotplug friendly. */
@@ -852,21 +859,18 @@ static int
 mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
unsigned int cpu = (unsigned long)hcpu;
-   int err = 0;

switch (action) {
-   case CPU_UP_PREPARE:
-   case CPU_UP_PREPARE_FROZEN:
-   err = mce_create_device(cpu);
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
+   mce_create_device(cpu);
break;
-   case CPU_UP_CANCELED:
-   case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
case CPU_DEAD_FROZEN:
mce_remove_device(cpu);
break;
}
-   return err ? NOTIFY_BAD : NOTIFY_OK;
+   return NOTIFY_OK;
 }

 static struct notifier_block mce_cpu_notifier = {
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-07 Thread Andreas Herrmann
On Wed, Nov 07, 2007 at 03:35:43AM +0100, Andi Kleen wrote:
> On Wednesday 07 November 2007 02:12, Andreas Herrmann wrote:
> 
> > In cases where not all CPUs are brought up during
> > boot (e.g. using maxcpus and additional_cpus parameters)
> > mce_cpu_callback now returns NOTFIY_BAD because
> > for such CPUs cpu_data is not completely filled when
> > the notifier is called. Thus mce_create_device fails right
> > at its beginning:
> >
> > if (!mce_available(_data[cpu]))
> > return -EIO;
> >
> > As a quick fix I suggest to check boot_cpu_data for MCE.
> 
> I guess it would be better to just move the device creation
> to after the CPU has booted. AKA call mce_create_dev() on CPU_ONLINE
> instead.

Yes, and this was the old behaviour. The mentioned patch changed it -
("do mce_create_device in CPU_UP_PREPARE instead of CPU_ONLINE").

Thinking twice about the problem it seems obvious that this part
of the patch should just be reverted.

Attached is a new fix (diff against 2.6.24-rc2).


Regards,

Andreas

--
[PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
CPU_UP_PREPARE)

Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
("cpu hotplug: mce: fix cpu hotplug error handling").

For CPUs not brought up during boot (using maxcpus and additional_cpus
parameters) we don't know whether mce is supported or not at 
"CPU_UP_PREPARE"-time.
Thus mce_cpu_callback should be called after the CPU is online.


Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>
---
 arch/x86/kernel/cpu/mcheck/mce_64.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c 
b/arch/x86/kernel/cpu/mcheck/mce_64.c
index b9f802e..8e83070 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -855,12 +855,10 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned 
long action, void *hcpu)
int err = 0;
 
switch (action) {
-   case CPU_UP_PREPARE:
-   case CPU_UP_PREPARE_FROZEN:
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
err = mce_create_device(cpu);
break;
-   case CPU_UP_CANCELED:
-   case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
case CPU_DEAD_FROZEN:
mce_remove_device(cpu);
-- 
1.5.3.4




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-07 Thread Akinobu Mita
 [PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
 CPU_UP_PREPARE)

 Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
 (cpu hotplug: mce: fix cpu hotplug error handling).

 For CPUs not brought up during boot (using maxcpus and additional_cpus
 parameters) we don't know whether mce is supported or not at 
 CPU_UP_PREPARE-time.
 Thus mce_cpu_callback should be called after the CPU is online.

Thank you for finding and fixing the problem.
I added two fixes to your patch:

- Avoid mce_remove_device() for the CPU that is not correctly initialized
  by mce_create_device() failure.

- make CPU_ONLINE callback always return NOTIFY_OK.
  Because CPU_ONLINE callback return value is always ignored.

 Signed-off-by: Andreas Herrmann [EMAIL PROTECTED]

[EMAIL PROTECTED]: make CPU_ONLINE callback always return NOTIFY_OK]
[EMAIL PROTECTED]: avoid mce_remove_device() for not initialized device]

Signed-off-by: Akinobu Mita [EMAIL PROTECTED]

---
 arch/x86/kernel/cpu/mcheck/mce_64.c |   18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

Index: 2.6-git/arch/x86/kernel/cpu/mcheck/mce_64.c
===
--- 2.6-git.orig/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ 2.6-git/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -802,6 +802,8 @@ static struct sysdev_attribute *mce_attr
NULL
 };

+static cpumask_t mce_device_initialized = CPU_MASK_NONE;
+
 /* Per cpu sysdev init.  All of the cpus still share the same ctl bank */
 static __cpuinit int mce_create_device(unsigned int cpu)
 {
@@ -825,6 +827,7 @@ static __cpuinit int mce_create_device(u
if (err)
goto error;
}
+   cpu_set(cpu, mce_device_initialized);

return 0;
 error:
@@ -841,10 +844,14 @@ static void mce_remove_device(unsigned i
 {
int i;

+   if (!cpu_isset(cpu, mce_device_initialized))
+   return;
+
for (i = 0; mce_attributes[i]; i++)
sysdev_remove_file(per_cpu(device_mce,cpu),
mce_attributes[i]);
sysdev_unregister(per_cpu(device_mce,cpu));
+   cpu_clear(cpu, mce_device_initialized);
 }

 /* Get notified when a cpu comes on/off. Be hotplug friendly. */
@@ -852,21 +859,18 @@ static int
 mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
unsigned int cpu = (unsigned long)hcpu;
-   int err = 0;

switch (action) {
-   case CPU_UP_PREPARE:
-   case CPU_UP_PREPARE_FROZEN:
-   err = mce_create_device(cpu);
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
+   mce_create_device(cpu);
break;
-   case CPU_UP_CANCELED:
-   case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
case CPU_DEAD_FROZEN:
mce_remove_device(cpu);
break;
}
-   return err ? NOTIFY_BAD : NOTIFY_OK;
+   return NOTIFY_OK;
 }

 static struct notifier_block mce_cpu_notifier = {
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-07 Thread Andreas Herrmann
On Wed, Nov 07, 2007 at 03:35:43AM +0100, Andi Kleen wrote:
 On Wednesday 07 November 2007 02:12, Andreas Herrmann wrote:
 
  In cases where not all CPUs are brought up during
  boot (e.g. using maxcpus and additional_cpus parameters)
  mce_cpu_callback now returns NOTFIY_BAD because
  for such CPUs cpu_data is not completely filled when
  the notifier is called. Thus mce_create_device fails right
  at its beginning:
 
  if (!mce_available(cpu_data[cpu]))
  return -EIO;
 
  As a quick fix I suggest to check boot_cpu_data for MCE.
 
 I guess it would be better to just move the device creation
 to after the CPU has booted. AKA call mce_create_dev() on CPU_ONLINE
 instead.

Yes, and this was the old behaviour. The mentioned patch changed it -
(do mce_create_device in CPU_UP_PREPARE instead of CPU_ONLINE).

Thinking twice about the problem it seems obvious that this part
of the patch should just be reverted.

Attached is a new fix (diff against 2.6.24-rc2).


Regards,

Andreas

--
[PATCH] x86: fix cpu hotplug regression (don't call mce_create_device on 
CPU_UP_PREPARE)

Fix regression introduced with d435d862baca3e25e5eec236762a43251b1e7ffc
(cpu hotplug: mce: fix cpu hotplug error handling).

For CPUs not brought up during boot (using maxcpus and additional_cpus
parameters) we don't know whether mce is supported or not at 
CPU_UP_PREPARE-time.
Thus mce_cpu_callback should be called after the CPU is online.


Signed-off-by: Andreas Herrmann [EMAIL PROTECTED]
---
 arch/x86/kernel/cpu/mcheck/mce_64.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c 
b/arch/x86/kernel/cpu/mcheck/mce_64.c
index b9f802e..8e83070 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -855,12 +855,10 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned 
long action, void *hcpu)
int err = 0;
 
switch (action) {
-   case CPU_UP_PREPARE:
-   case CPU_UP_PREPARE_FROZEN:
+   case CPU_ONLINE:
+   case CPU_ONLINE_FROZEN:
err = mce_create_device(cpu);
break;
-   case CPU_UP_CANCELED:
-   case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
case CPU_DEAD_FROZEN:
mce_remove_device(cpu);
-- 
1.5.3.4




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-06 Thread Andi Kleen
On Wednesday 07 November 2007 02:12, Andreas Herrmann wrote:

> In cases where not all CPUs are brought up during
> boot (e.g. using maxcpus and additional_cpus parameters)
> mce_cpu_callback now returns NOTFIY_BAD because
> for such CPUs cpu_data is not completely filled when
> the notifier is called. Thus mce_create_device fails right
> at its beginning:
>
> if (!mce_available(_data[cpu]))
> return -EIO;
>
> As a quick fix I suggest to check boot_cpu_data for MCE.

I guess it would be better to just move the device creation
to after the CPU has booted. AKA call mce_create_dev() on CPU_ONLINE
instead.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86: fix cpu-hotplug regression

2007-11-06 Thread Andreas Herrmann
[PATCH] x86: fix cpu-hotplug regression

Commit d435d862baca3e25e5eec236762a43251b1e7ffc
("cpu hotplug: mce: fix cpu hotplug error handling")
changed the error handling in mce_cpu_callback.

In cases where not all CPUs are brought up during
boot (e.g. using maxcpus and additional_cpus parameters)
mce_cpu_callback now returns NOTFIY_BAD because
for such CPUs cpu_data is not completely filled when
the notifier is called. Thus mce_create_device fails right
at its beginning:

if (!mce_available(_data[cpu]))
return -EIO;

As a quick fix I suggest to check boot_cpu_data for MCE.

To reproduce this regression:

(1) boot with maxcpus=2 addtional_cpus=2 on a 4 CPU x86-64 system
(2) # echo 1 >/sys/devices/system/cpu/cpu2/online
  -bash: echo: write error: Invalid argument

dmesg shows:

_cpu_up: attempt to bring up CPU 2 failed

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>
---
 arch/x86/kernel/cpu/mcheck/mce_64.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c 
b/arch/x86/kernel/cpu/mcheck/mce_64.c
index b9f802e..5112a70 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -808,7 +808,7 @@ static __cpuinit int mce_create_device(unsigned int cpu)
int err;
int i;
 
-   if (!mce_available(_data(cpu)))
+   if (!mce_available(_cpu_data))
return -EIO;
 
memset(_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
-- 
1.5.3.4




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86: fix cpu-hotplug regression

2007-11-06 Thread Andreas Herrmann
[PATCH] x86: fix cpu-hotplug regression

Commit d435d862baca3e25e5eec236762a43251b1e7ffc
(cpu hotplug: mce: fix cpu hotplug error handling)
changed the error handling in mce_cpu_callback.

In cases where not all CPUs are brought up during
boot (e.g. using maxcpus and additional_cpus parameters)
mce_cpu_callback now returns NOTFIY_BAD because
for such CPUs cpu_data is not completely filled when
the notifier is called. Thus mce_create_device fails right
at its beginning:

if (!mce_available(cpu_data[cpu]))
return -EIO;

As a quick fix I suggest to check boot_cpu_data for MCE.

To reproduce this regression:

(1) boot with maxcpus=2 addtional_cpus=2 on a 4 CPU x86-64 system
(2) # echo 1 /sys/devices/system/cpu/cpu2/online
  -bash: echo: write error: Invalid argument

dmesg shows:

_cpu_up: attempt to bring up CPU 2 failed

Signed-off-by: Andreas Herrmann [EMAIL PROTECTED]
---
 arch/x86/kernel/cpu/mcheck/mce_64.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c 
b/arch/x86/kernel/cpu/mcheck/mce_64.c
index b9f802e..5112a70 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -808,7 +808,7 @@ static __cpuinit int mce_create_device(unsigned int cpu)
int err;
int i;
 
-   if (!mce_available(cpu_data(cpu)))
+   if (!mce_available(boot_cpu_data))
return -EIO;
 
memset(per_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
-- 
1.5.3.4




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix cpu-hotplug regression

2007-11-06 Thread Andi Kleen
On Wednesday 07 November 2007 02:12, Andreas Herrmann wrote:

 In cases where not all CPUs are brought up during
 boot (e.g. using maxcpus and additional_cpus parameters)
 mce_cpu_callback now returns NOTFIY_BAD because
 for such CPUs cpu_data is not completely filled when
 the notifier is called. Thus mce_create_device fails right
 at its beginning:

 if (!mce_available(cpu_data[cpu]))
 return -EIO;

 As a quick fix I suggest to check boot_cpu_data for MCE.

I guess it would be better to just move the device creation
to after the CPU has booted. AKA call mce_create_dev() on CPU_ONLINE
instead.

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/