On 3/8/23 10:08, Paolo Bonzini wrote:
On 3/8/23 17:47, Richard Henderson wrote:
The case that I was imagining for smp_mb__before_rmw() is something like this:

     wake_me = true;
     smp_mb__before_rmw();
     if (qatomic_xchg(&can_sleep, true)) { ... }

where you really need a full barrier.

What is different about this that doesn't apply to the remove-head case we've been talking about?

For remove-head, nothing is going to change the BH_PENDING flag while the code runs.  This would be an okay transformation of the code, at both the compiler and the processor level:

   // first part of atomic_fetch_and
   old_flags = LDAXR of bh->flags

   // QSLIST_REMOVE_HEAD ends up between load and store of
   // atomic_fetch_and
   all the loads and stores for remove-head

   // second part of atomic_fetch_and
   new_flags = old_flags & ~(BH_PENDING|BH_SCHEDULED|BH_IDLE);
   (successful) STLXR of new_flags into bh->flags


Instead in the case above, I was thinking you would get a missed wakeup without the barriers.  Here is the full pseudocode:

   // this store can move below the load of can_sleep
   qatomic_set(&wake_me, true);
   if (qatomic_xchg(&can_sleep, true)) sleep;

   // this store can move below the load of wake_me
   qatomic_set(&can_sleep, false);
   if (qatomic_xchg(&wake_me, false)) wake them;

The buggy case is where the threads observe can_sleep = true, wake_me = false, i.e. the original value of the variables.

Oh, I see, buggy in a larger context.

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~

Reply via email to