On 5/15/23 16:24, Eelco Chaudron wrote:
> 
> 
> On 4 May 2023, at 0:55, Ilya Maximets wrote:
> 
>> On 3/27/23 13:25, Eelco Chaudron wrote:
>>> Make the read of the current seq->value atomic, i.e., not needing to
>>> acquire the global mutex when reading it. On 64-bit systems, this
>>> incurs no overhead, and it will avoid the mutex and potentially
>>> a system call.
>>>
>>> For incrementing the value followed by waking up the threads, we are
>>> still taking the mutex, so the current behavior is not changing. The
>>> seq_read() behavior is already defined as, "Returns seq's current
>>> sequence number (which could change immediately)". So the change
>>> should not impact the current behavior.
>>>
>>> Signed-off-by: Eelco Chaudron <[email protected]>
>>> ---
>>>  lib/ovs-rcu.c |    2 +-
>>>  lib/seq.c     |   37 ++++++++++++++++---------------------
>>>  lib/seq.h     |    1 -
>>>  3 files changed, 17 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
>>> index 946aa04d1..9e07d9bab 100644
>>> --- a/lib/ovs-rcu.c
>>> +++ b/lib/ovs-rcu.c
>>> @@ -170,7 +170,7 @@ ovsrcu_try_quiesce(void)
>>>      ovs_assert(!single_threaded());
>>>      perthread = ovsrcu_perthread_get();
>>>      if (!seq_try_lock()) {
>>> -        perthread->seqno = seq_read_protected(global_seqno);
>>> +        perthread->seqno = seq_read(global_seqno);
>>>          if (perthread->cbset) {
>>>              ovsrcu_flush_cbset__(perthread, true);
>>>          }
>>> diff --git a/lib/seq.c b/lib/seq.c
>>> index 99e5bf8bd..22646f3f8 100644
>>> --- a/lib/seq.c
>>> +++ b/lib/seq.c
>>> @@ -32,7 +32,7 @@ COVERAGE_DEFINE(seq_change);
>>>
>>>  /* A sequence number object. */
>>>  struct seq {
>>> -    uint64_t value OVS_GUARDED;
>>> +    atomic_uint64_t value;
>>>      struct hmap waiters OVS_GUARDED; /* Contains 'struct seq_waiter's. */
>>>  };
>>>
>>> @@ -72,6 +72,7 @@ static void seq_wake_waiters(struct seq *) OVS_REQUIRES(seq_mutex);
>>>  struct seq * OVS_EXCLUDED(seq_mutex)
>>>  seq_create(void)
>>>  {
>>> +    uint64_t seq_value;
>>>      struct seq *seq;
>>>
>>>      seq_init();
>>> @@ -81,7 +82,8 @@ seq_create(void)
>>>      COVERAGE_INC(seq_change);
>>>
>>>      ovs_mutex_lock(&seq_mutex);
>>> -    seq->value = seq_next++;
>>> +    seq_value = seq_next++;
>>> +    atomic_store_relaxed(&seq->value, seq_value);
>>>      hmap_init(&seq->waiters);
>>>      ovs_mutex_unlock(&seq_mutex);
>>>
>>> @@ -126,9 +128,11 @@ void
>>>  seq_change_protected(struct seq *seq)
>>>      OVS_REQUIRES(seq_mutex)
>>>  {
>>> +    uint64_t seq_value = seq_next++;
>>> +
>>>      COVERAGE_INC(seq_change);
>>>
>>> -    seq->value = seq_next++;
>>> +    atomic_store_relaxed(&seq->value, seq_value);
>>>      seq_wake_waiters(seq);
>>>  }
>>>
>>> @@ -143,18 +147,6 @@ seq_change(struct seq *seq)
>>>      ovs_mutex_unlock(&seq_mutex);
>>>  }
>>>
>>> -/* Returns 'seq''s current sequence number (which could change immediately).
>>> - *
>>> - * seq_read() and seq_wait() can be used together to yield a race-free wakeup
>>> - * when an object changes, even without an ability to lock the object.  See
>>> - * Usage in seq.h for details. */
>>> -uint64_t
>>> -seq_read_protected(const struct seq *seq)
>>> -    OVS_REQUIRES(seq_mutex)
>>> -{
>>> -    return seq->value;
>>> -}
>>> -
>>>  /* Returns 'seq''s current sequence number (which could change immediately).
>>>   *
>>>   * seq_read() and seq_wait() can be used together to yield a race-free wakeup
>>> @@ -162,14 +154,15 @@ seq_read_protected(const struct seq *seq)
>>>   * Usage in seq.h for details. */
>>>  uint64_t
>>>  seq_read(const struct seq *seq)
>>> -    OVS_EXCLUDED(seq_mutex)
>>>  {
>>>      uint64_t value;
>>>
>>> -    ovs_mutex_lock(&seq_mutex);
>>> -    value = seq_read_protected(seq);
>>> -    ovs_mutex_unlock(&seq_mutex);
>>> -
>>> +    /* We use atomic_read(), memory_order_seq_cst, to simulate the full
>>> +     * memory barrier you would get with locking and unlocking the global
>>> +     * mutex.
>>
>> Hi, Eelco.  I'm not sure this is actually correct.
>>
>> We're performing a memory_order_seq_cst operation on the value.
>>
>> Sequential consistency means just acquire-release semantics in our case,
>> since we're pairing with non-seq-cst operations.  And acquire-release
>> semantics works between acquire loads and release stores on the same
>> atomic.  But all the loads on the value we have with this change are
>> relaxed.  And the reader doesn't take the mutex.  Meaning there is no
>> acquire-release pair that can synchronize the memory between threads.
>> A memory_order_seq_cst read on its own can't really guarantee any
>> synchronization.
>>
>> Also, the documentation for the sequence number library explicitly says that
>> """
>>  seq_change() synchronizes with seq_read() and
>>  seq_wait() on the same variable in release-acquire fashion.  That
>>  is, all effects of the memory accesses performed by a thread prior
>>  to seq_change() are visible to the threads returning from
>>  seq_read() or seq_wait() observing that change.
>> """
>>
>> And I don't think that's the case with this change anymore.
> 
> It was more than a year ago that I did this patch, so I can't really
> remember my reasoning for doing it this way and thinking it was fine :)
> 
> I think it might have been the following:
> 
> 1) Before increasing the value in seq_change() there is a full memory
> barrier due to taking the lock.
> 2a) Now if we read the atomic seq in seq_read() and this new value was
> incremented in seq_change() above, the memory should be in sync.
> 2b) If we read it before we increase it, it's the old value, and it
> should not matter whether the read has happened or not.
> 
> Maybe I miss something obvious, if so let me know.

The problem here is not the value of the sequence number, but the
fact that the pair seq_change() + seq_read() is documented as an
acquire-release synchronization pair.  Meaning that these functions
have to provide memory synchronization on unrelated memory locations.
And we don't have an acquire-release pair of operations that would
ensure that, IIUC.  It might work with current implementations on
well-known architectures today, but I don't think any of these
operations is required to always perform a full memory barrier
regardless of memory location.

Might be fixed by using release semantics for stores though, I guess.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
