On 10 May 2023, at 23:41, Ilya Maximets wrote:

> On 5/9/23 16:29, Eelco Chaudron wrote:
>> When doing performance testing with OVS v3.1 we ran into a deadlock
>> situation with the netdev_hmap_rwlock read/write lock. After some
>> debugging, it was discovered that the netdev_hmap_rwlock read lock
>> was taken recursively. And well in the folowing sequence of events:
>>
>>  netdev_ports_flow_get()
>>    It takes the read lock, while it walks all the ports
>>    in the port_to_netdev hmap and calls:
>>    - netdev_flow_get() which will call:
>>      - netdev_tc_flow_get() which will call:
>>        - netdev_ifindex_to_odp_port()
>>           This function also takes the same read lock to
>>           walk the ifindex_to_port hmap.
>>
>> In OVS a read/write lock does not support recursive readers. For details
>> see the comments in ovs-thread.h. If you do this, it will lock up,
>> mainly due to OVS setting the PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
>> attribute to the lock.
>>
>> The solution with this patch is to use two separate read/write
>> locks, with an order guarantee to avoid another potential deadlock.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2182541
>> Fixes: 9fe21a4fc12a ("netdev-offload: replace netdev_hmap_mutex to 
>> netdev_hmap_rwlock")
>> Reviewed-by: Simon Horman <[email protected]>
>> Signed-off-by: Eelco Chaudron <[email protected]>
>>
>> ---
>> v2: Renamed locks to be more meaningful.
>>
>>  lib/netdev-offload.c |   70 
>> +++++++++++++++++++++++++++-----------------------
>>  1 file changed, 38 insertions(+), 32 deletions(-)
>>
>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
>> index 4592262bd..293864c51 100644
>> --- a/lib/netdev-offload.c
>> +++ b/lib/netdev-offload.c
>> @@ -485,11 +485,13 @@ netdev_set_hw_info(struct netdev *netdev, int type, 
>> int val)
>>  }
>>
>>  /* Protects below port hashmaps. */
>> -static struct ovs_rwlock netdev_hmap_rwlock = OVS_RWLOCK_INITIALIZER;
>> +static struct ovs_rwlock ifindex_to_port_rwlock = OVS_RWLOCK_INITIALIZER;
>> +static struct ovs_rwlock port_to_netdev_rwlock \
>
> We shouldn't add line continuations to a normal C code.

Guess my brain was still stuck in Python mode ;) Anyway, thanks for fixing this 
and committing this all the way down to 2.17!

//Eelco

>
>> +    OVS_ACQ_BEFORE(ifindex_to_port_rwlock) = OVS_RWLOCK_INITIALIZER;
>
> Thanks for v3!  I removed the '\' above and applied the patch.
> Also backported down to 2.17.
>
> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to