On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote:
> 
> 
> > On May 9, 2020, at 12:12 PM, Paul E. McKenney <[email protected]> wrote:
> > 
> > Ah, and I forgot to ask.  Why "if (data_race(prev->next == node)" instead
> > of "if (data_race(prev->next) == node"?
> 
> I think the one you suggested is slightly better to point out the exact race. 
> Do you want me to resend or you could squash it instead?

The patch was still at the top of my stack, so I just amended it.  Here
is the updated version.

                                                        Thanx, Paul

------------------------------------------------------------------------

commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa
Author: Qian Cai <[email protected]>
Date:   Tue Feb 11 08:54:15 2020 -0500

    locking/osq_lock: Annotate a data race in osq_lock
    
    The prev->next pointer can be accessed concurrently as noticed by KCSAN:
    
     write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
      osq_lock+0x25f/0x350
      osq_wait_next at kernel/locking/osq_lock.c:79
      (inlined by) osq_lock at kernel/locking/osq_lock.c:185
      rwsem_optimistic_spin
      <snip>
    
     read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
      osq_lock+0x196/0x350
      osq_lock at kernel/locking/osq_lock.c:157
      rwsem_optimistic_spin
      <snip>
    
    Since the write only stores NULL to prev->next and the read tests if
    prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
    shattered, the code is still working correctly. Thus, mark it as an
    intentional data race using the data_race() macro.
    
    Signed-off-by: Qian Cai <[email protected]>
    Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 1f77349..1de006e 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
         */
 
        for (;;) {
-               if (prev->next == node &&
+               /*
+                * cpu_relax() below implies a compiler barrier which would
+                * prevent this comparison being optimized away.
+                */
+               if (data_race(prev->next) == node &&
                    cmpxchg(&prev->next, node, NULL) == node)
                        break;
 

Reply via email to