This is a note to let you know that I've just added the patch titled

    call_function_many: fix list delete vs add race

to the 2.6.38-stable tree which can be found at:
    
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     call_function_many-fix-list-delete-vs-add-race.patch
and it can be found in the queue-2.6.38 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <[email protected]> know about it.


From e6cd1e07a185d5f9b0aa75e020df02d3c1c44940 Mon Sep 17 00:00:00 2001
From: Milton Miller <[email protected]>
Date: Tue, 15 Mar 2011 13:27:16 -0600
Subject: call_function_many: fix list delete vs add race

From: Milton Miller <[email protected]>

commit e6cd1e07a185d5f9b0aa75e020df02d3c1c44940 upstream.

Peter pointed out there was nothing preventing the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.

Fix this by not setting refs until we have gotten the lock for the list.
Take advantage of the wmb in list_add_rcu to save an explicit additional
one.

I tried to force this race with a udelay before the lock & list_add and
by mixing all 64 online cpus with just 3 random cpus in the mask, but
was unsuccessful.  Still, inspection shows a valid race, and the fix is
an extension of the existing protection window in the current code.

Reported-by: Peter Zijlstra <[email protected]>
Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
 kernel/smp.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -491,14 +491,15 @@ void smp_call_function_many(const struct
        cpumask_clear_cpu(this_cpu, data->cpumask);
 
        /*
-        * To ensure the interrupt handler gets an complete view
-        * we order the cpumask and refs writes and order the read
-        * of them in the interrupt handler.  In addition we may
-        * only clear our own cpu bit from the mask.
+        * We reuse the call function data without waiting for any grace
+        * period after some other cpu removes it from the global queue.
+        * This means a cpu might find our data block as it is writen.
+        * The interrupt handler waits until it sees refs filled out
+        * while its cpu mask bit is set; here we may only clear our
+        * own cpu mask bit, and must wait to set refs until we are sure
+        * previous writes are complete and we have obtained the lock to
+        * add the element to the queue.
         */
-       smp_wmb();
-
-       atomic_set(&data->refs, cpumask_weight(data->cpumask));
 
        raw_spin_lock_irqsave(&call_function.lock, flags);
        /*
@@ -507,6 +508,11 @@ void smp_call_function_many(const struct
         * will not miss any other list entries:
         */
        list_add_rcu(&data->csd.list, &call_function.queue);
+       /*
+        * We rely on the wmb() in list_add_rcu to order the writes
+        * to func, data, and cpumask before this write to refs.
+        */
+       atomic_set(&data->refs, cpumask_weight(data->cpumask));
        raw_spin_unlock_irqrestore(&call_function.lock, flags);
 
        /*


Patches currently in stable-queue which might be from [email protected] are

queue-2.6.38/smp_call_function_many-handle-concurrent-clearing-of-mask.patch
queue-2.6.38/call_function_many-fix-list-delete-vs-add-race.patch
queue-2.6.38/call_function_many-add-missing-ordering.patch

_______________________________________________
stable mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/stable