On 17 May 2023, at 12:18, Eelco Chaudron wrote:
> On 16 May 2023, at 21:48, Ilya Maximets wrote:
>
>> On 5/16/23 10:20, Eelco Chaudron wrote:
>>>
>>> On 15 May 2023, at 17:47, Ilya Maximets wrote:
>>>
>>>> On 5/15/23 16:24, Eelco Chaudron wrote:
>>>>>
>>>>> On 4 May 2023, at 0:55, Ilya Maximets wrote:
>>>>>
>>>>>> On 3/27/23 13:25, Eelco Chaudron wrote:
>>>>>>> Make the read of the current seq->value atomic, i.e., not needing to
>>>>>>> acquire the global mutex when reading it.  On 64-bit systems, this
>>>>>>> incurs no overhead, and it will avoid the mutex and potentially
>>>>>>> a system call.
>>>>>>>
>>>>>>> For incrementing the value followed by waking up the threads, we are
>>>>>>> still taking the mutex, so the current behavior is not changing.  The
>>>>>>> seq_read() behavior is already defined as, "Returns seq's current
>>>>>>> sequence number (which could change immediately)".  So the change
>>>>>>> should not impact the current behavior.
>>>>>>>
>>>>>>> Signed-off-by: Eelco Chaudron <[email protected]>
>>>>>>> ---
>>>>>>>  lib/ovs-rcu.c |  2 +-
>>>>>>>  lib/seq.c     | 37 ++++++++++++++++---------------------
>>>>>>>  lib/seq.h     |  1 -
>>>>>>>  3 files changed, 17 insertions(+), 23 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
>>>>>>> index 946aa04d1..9e07d9bab 100644
>>>>>>> --- a/lib/ovs-rcu.c
>>>>>>> +++ b/lib/ovs-rcu.c
>>>>>>> @@ -170,7 +170,7 @@ ovsrcu_try_quiesce(void)
>>>>>>>      ovs_assert(!single_threaded());
>>>>>>>      perthread = ovsrcu_perthread_get();
>>>>>>>      if (!seq_try_lock()) {
>>>>>>> -        perthread->seqno = seq_read_protected(global_seqno);
>>>>>>> +        perthread->seqno = seq_read(global_seqno);
>>>>>>>          if (perthread->cbset) {
>>>>>>>              ovsrcu_flush_cbset__(perthread, true);
>>>>>>>          }
>>>>>>> diff --git a/lib/seq.c b/lib/seq.c
>>>>>>> index 99e5bf8bd..22646f3f8 100644
>>>>>>> --- a/lib/seq.c
>>>>>>> +++ b/lib/seq.c
>>>>>>> @@ -32,7 +32,7 @@ COVERAGE_DEFINE(seq_change);
>>>>>>>
>>>>>>>  /* A sequence number object. */
>>>>>>>  struct seq {
>>>>>>> -    uint64_t value OVS_GUARDED;
>>>>>>> +    atomic_uint64_t value;
>>>>>>>      struct hmap waiters OVS_GUARDED; /* Contains 'struct seq_waiter's. */
>>>>>>>  };
>>>>>>>
>>>>>>> @@ -72,6 +72,7 @@ static void seq_wake_waiters(struct seq *) OVS_REQUIRES(seq_mutex);
>>>>>>>  struct seq * OVS_EXCLUDED(seq_mutex)
>>>>>>>  seq_create(void)
>>>>>>>  {
>>>>>>> +    uint64_t seq_value;
>>>>>>>      struct seq *seq;
>>>>>>>
>>>>>>>      seq_init();
>>>>>>> @@ -81,7 +82,8 @@ seq_create(void)
>>>>>>>      COVERAGE_INC(seq_change);
>>>>>>>
>>>>>>>      ovs_mutex_lock(&seq_mutex);
>>>>>>> -    seq->value = seq_next++;
>>>>>>> +    seq_value = seq_next++;
>>>>>>> +    atomic_store_relaxed(&seq->value, seq_value);
>>>>>>>      hmap_init(&seq->waiters);
>>>>>>>      ovs_mutex_unlock(&seq_mutex);
>>>>>>>
>>>>>>> @@ -126,9 +128,11 @@ void
>>>>>>>  seq_change_protected(struct seq *seq)
>>>>>>>      OVS_REQUIRES(seq_mutex)
>>>>>>>  {
>>>>>>> +    uint64_t seq_value = seq_next++;
>>>>>>> +
>>>>>>>      COVERAGE_INC(seq_change);
>>>>>>>
>>>>>>> -    seq->value = seq_next++;
>>>>>>> +    atomic_store_relaxed(&seq->value, seq_value);
>>>>>>>      seq_wake_waiters(seq);
>>>>>>>  }
>>>>>>>
>>>>>>> @@ -143,18 +147,6 @@ seq_change(struct seq *seq)
>>>>>>>      ovs_mutex_unlock(&seq_mutex);
>>>>>>>  }
>>>>>>>
>>>>>>> -/* Returns 'seq''s current sequence number (which could change immediately).
>>>>>>> - *
>>>>>>> - * seq_read() and seq_wait() can be used together to yield a race-free wakeup
>>>>>>> - * when an object changes, even without an ability to lock the object.  See
>>>>>>> - * Usage in seq.h for details. */
>>>>>>> -uint64_t
>>>>>>> -seq_read_protected(const struct seq *seq)
>>>>>>> -    OVS_REQUIRES(seq_mutex)
>>>>>>> -{
>>>>>>> -    return seq->value;
>>>>>>> -}
>>>>>>> -
>>>>>>>  /* Returns 'seq''s current sequence number (which could change immediately).
>>>>>>>   *
>>>>>>>   * seq_read() and seq_wait() can be used together to yield a race-free wakeup
>>>>>>> @@ -162,14 +154,15 @@ seq_read_protected(const struct seq *seq)
>>>>>>>   * Usage in seq.h for details. */
>>>>>>>  uint64_t
>>>>>>>  seq_read(const struct seq *seq)
>>>>>>> -    OVS_EXCLUDED(seq_mutex)
>>>>>>>  {
>>>>>>>      uint64_t value;
>>>>>>>
>>>>>>> -    ovs_mutex_lock(&seq_mutex);
>>>>>>> -    value = seq_read_protected(seq);
>>>>>>> -    ovs_mutex_unlock(&seq_mutex);
>>>>>>> -
>>>>>>> +    /* We use atomic_read(), memory_order_seq_cst, to simulate the full
>>>>>>> +     * memory barrier you would get with locking and unlocking the global
>>>>>>> +     * mutex.
>>>>>>
>>>>>> Hi, Eelco.  I'm not sure this is actually correct.
>>>>>>
>>>>>> We're performing a memory_order_seq_cst operation on the value.
>>>>>>
>>>>>> Sequential consistency means just acquire-release semantics in our case,
>>>>>> since we're pairing with non-seq-cst operations.  And acquire-release
>>>>>> semantics works between acquire loads and release stores on the same
>>>>>> atomic.  But all the loads on the value we have with this change are
>>>>>> relaxed.  And the reader doesn't take the mutex.  Meaning there is no
>>>>>> acquire-release pair that can synchronize the memory between threads.
>>>>>> A memory_order_seq_cst read on its own can't really guarantee any
>>>>>> synchronization.
>>>>>>
>>>>>> Also, the documentation for the sequence number library explicitly says
>>>>>> that:
>>>>>> """
>>>>>> seq_change() synchronizes with seq_read() and
>>>>>> seq_wait() on the same variable in release-acquire fashion.  That
>>>>>> is, all effects of the memory accesses performed by a thread prior
>>>>>> to seq_change() are visible to the threads returning from
>>>>>> seq_read() or seq_wait() observing that change.
>>>>>> """
>>>>>>
>>>>>> And I don't think that's the case with this change anymore.
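[Editor's note: the release-acquire contract quoted above can be reduced to a minimal standalone C11 sketch using plain stdatomic rather than the OVS atomic wrappers. The names `publish`, `read_seq_value`, `read_payload`, and `payload` are illustrative, not OVS API. The point is that a release store publishes all earlier writes, and an acquire load that observes the stored value also observes those writes; a relaxed load, or a lone seq_cst load paired with a relaxed store, creates no such synchronizes-with edge.]

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative stand-ins for 'struct seq': the atomic sequence number
 * plus some unrelated, non-atomic memory that the seq protocol is
 * supposed to make visible to readers. */
static _Atomic uint64_t seq_value;
static uint64_t payload;

/* Writer: modify unrelated memory, then publish a new sequence number.
 * The release store forbids reordering the payload write past it, so a
 * reader that observes 'next_seq' is guaranteed to see 'data' too. */
void
publish(uint64_t data, uint64_t next_seq)
{
    payload = data;
    atomic_store_explicit(&seq_value, next_seq, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above.  With a
 * relaxed load instead, no synchronizes-with edge would exist and the
 * subsequent payload read could observe stale memory. */
uint64_t
read_seq_value(void)
{
    return atomic_load_explicit(&seq_value, memory_order_acquire);
}

uint64_t
read_payload(void)
{
    return payload;
}
```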
>>>>>
>>>>> It was more than a year ago that I did this patch, so I can't really
>>>>> remember my reasoning for doing it this way and thinking it's fine :)
>>>>>
>>>>> I think it might have been the following:
>>>>>
>>>>> 1) Before increasing the value in seq_change() there is a full memory
>>>>>    barrier due to taking the lock.
>>>>> 2a) Now if we read the atomic seq in seq_read() and this new value was
>>>>>     incremented in the seq_change() above, the memory should be in sync.
>>>>> 2b) If we read it before we increase it, it's the old value and it should
>>>>>     not matter whether the read has happened or not.
>>>>>
>>>>> Maybe I missed something obvious; if so, let me know.
>>>>
>>>> The problem here is not the value of the sequence number, but the
>>>> fact that the pair seq_change() + seq_read() is documented as an
>>>> acquire-release synchronization pair.  Meaning that these functions
>>>> have to provide memory synchronization on unrelated memory locations.
>>>
>>> Yes, that's what I tried to explain, but it looks like I failed ;)  I think
>>> it's a bit less restricted, as it only applies to the sequence number
>>> returned by seq_read() compared to the new sequence number generated by
>>> seq_change().
>>>
>>>> And we don't have an acquire-release pair of operations that would
>>>> ensure that, IIUC.  It might work with current implementations for
>>>> well-known architectures today, but I don't think it's required for
>>>> any of these operations to always perform full memory barriers
>>>> regardless of memory location.
>>>
>>> Yes, you are right; I think the full memory barrier is not a requirement
>>> for mutex_lock(), only for mutex_unlock().
>>>
>>>> Might be fixed by using release semantics for stores though, I guess.
>>>
>>> So something like this could solve it:
>>>
>>> void
>>> seq_change_protected(struct seq *seq)
>>>     OVS_REQUIRES(seq_mutex)
>>> {
>>>     uint64_t seq_value = seq_next++;
>>>
>>>     COVERAGE_INC(seq_change);
>>>
>>>     atomic_thread_fence(memory_order_release);
>>>     atomic_store_relaxed(&seq->value, seq_value);
>>>     seq_wake_waiters(seq);
>>> }
>>>
>>> uint64_t
>>> seq_read(const struct seq *seq)
>>> {
>>>     uint64_t value;
>>>
>>>     /* Note that the odd CONST_CAST() is here to keep sparse happy. */
>>>     atomic_read_relaxed(&CONST_CAST(struct seq *, seq)->value, &value);
>>>     atomic_thread_fence(memory_order_acquire);
>>>     return value;
>>> }
>>>
>>> What do you think?
>>
>> Why not just explicit read/store with acquire-release?  I'm not sure why
>> we need to use a fence.  Could you elaborate?
>
> I misread the spec, and assumed this ordering applied only to the atomic
> variable itself.  Found a nice blog explaining this,
> https://preshing.com/20120913/acquire-and-release-semantics/; it also had my
> solution as an option, so at least I was on the right track :)
>
> So taking their example, the following should be enough (as you suggested :)
>
> void
> seq_change_protected(struct seq *seq)
>     OVS_REQUIRES(seq_mutex)
> {
>     uint64_t seq_value = seq_next++;
>
>     COVERAGE_INC(seq_change);
>
>     atomic_store_explicit(&seq->value, seq_value, memory_order_release);
>     seq_wake_waiters(seq);
> }
>
> uint64_t
> seq_read(const struct seq *seq)
> {
>     uint64_t value;
>
>     /* Note that the odd CONST_CAST() is here to keep sparse happy. */
>     atomic_read_explicit(&CONST_CAST(struct seq *, seq)->value, &value,
>                          memory_order_acquire);
>     return value;
> }

Any feedback on the above, so I can send a re-worked patch?

//Eelco

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
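[Editor's note: for completeness, the fence-based variant proposed earlier in the thread is also valid under the C11 memory model: a release fence sequenced before a relaxed store synchronizes with a relaxed load followed by an acquire fence. Below is a minimal standalone sketch in plain stdatomic; the names `publish_fenced`, `read_fenced`, and `fenced_payload` are illustrative, not OVS API.]

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t fenced_seq;
static uint64_t fenced_payload;     /* Plain, non-atomic memory. */

/* Fence-based writer: the release fence orders the payload write
 * before the relaxed store that follows it. */
void
publish_fenced(uint64_t data, uint64_t next_seq)
{
    fenced_payload = data;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&fenced_seq, next_seq, memory_order_relaxed);
}

/* Fence-based reader: the acquire fence after the relaxed load forms
 * a synchronizes-with edge with the writer's release fence, so the
 * payload written before the observed store is visible afterwards. */
uint64_t
read_fenced(void)
{
    uint64_t value = atomic_load_explicit(&fenced_seq,
                                          memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return value;
}

uint64_t
read_fenced_payload(void)
{
    return fenced_payload;
}
```

Both forms give the same guarantee for this pattern; the per-operation acquire/release version in the final snippet is often preferred because standalone fences impose ordering on all surrounding relaxed accesses, not just the one variable, and can therefore be more expensive.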
