This patch makes "stop machine" ignore isolated CPUs (if the config option is 
enabled).

It addresses exact same usecase explained in the previous workqueue isolation 
patch.
Where a user-space RT thread can prevent stop machine threads from running, 
which causes
the entire system to hang.

Stop machine is particularly bad when it comes to latencies because it halts 
every single
CPU and may take several milliseconds to complete. It's currently used for 
module insertion
and removal only.
As some folks pointed out in the previous discussions this patch is potentially 
unsafe
if applications running on the isolated CPUs use kernel services affected by 
the module
insertion and removal.
I've been running kernels with this patch on a wide range of the machines in 
production
environment were we routinely insert/remove modules with applications running 
on isolated
CPUs. Also I've recently done quite a bit of testing on life multi-core systems 
with
"stop machine" _completely_ disabled, and was not able to trigger any problems.
For more details please see this thread
        http://marc.info/?l=linux-kernel&m=120243837206248&w=2
That of course does not mean that the patch is totally safe but it does not 
seem to
cause any instability in real life.

This feature does not add any overhead when disabled. It's marked as 
experimental
due to potential issues mentioned above.

Signed-off-by: Max Krasnyansky <[EMAIL PROTECTED]>
---
 kernel/Kconfig.cpuisol |   15 +++++++++++++++
 kernel/stop_machine.c  |    8 +++++++-
 2 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/kernel/Kconfig.cpuisol b/kernel/Kconfig.cpuisol
index e681b02..24c1ef0 100644
--- a/kernel/Kconfig.cpuisol
+++ b/kernel/Kconfig.cpuisol
@@ -25,3 +25,18 @@ config CPUISOL_WORKQUEUE
          heavily rely on per cpu workqueues.
 
          Say 'Y' to enable workqueue isolation.  If unsure say 'N'.
+
+config CPUISOL_STOPMACHINE
+       bool "Do not halt isolated CPUs with Stop Machine (EXPERIMENTAL)"
+       depends on CPUISOL && STOP_MACHINE && EXPERIMENTAL
+       help
+         If this option is enabled kernel will not halt isolated CPUs
+         when Stop Machine is triggered. Stop Machine is currently only
+         used by the module insertion and removal.
+         Please note that at this point this feature is experimental. It is 
+         not known to really break anything but can potentially introduce
+         an instability due to race conditions in module removal logic.
+
+         Say 'Y' if support for dynamic module insertion and removal is
+         required for the system that uses isolated CPUs. 
+         If unsure say 'N'.
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 6f4e0e1..aa3af15 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -89,6 +89,12 @@ static void stopmachine_set_state(enum stopmachine_state 
state)
                cpu_relax();
 }
 
+#ifdef CONFIG_CPUISOL_STOPMACHINE
+#define cpu_unusable(cpu) cpu_isolated(cpu)
+#else
+#define cpu_unusable(cpu) (0)
+#endif
+
 static int stop_machine(void)
 {
        int i, ret = 0;
@@ -98,7 +104,7 @@ static int stop_machine(void)
        stopmachine_state = STOPMACHINE_WAIT;
 
        for_each_online_cpu(i) {
-               if (i == raw_smp_processor_id())
+               if (i == raw_smp_processor_id() || cpu_unusable(i))
                        continue;
                ret = kernel_thread(stopmachine, (void *)(long)i,CLONE_KERNEL);
                if (ret < 0)
-- 
1.5.4.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to