On 22 Oct 2025, at 19:58, Ilya Maximets wrote:

> On 10/22/25 6:28 PM, Mike Pattrick wrote:
>> On Fri, Oct 17, 2025 at 1:15 PM Ilya Maximets <[email protected]> wrote:
>>
>>     In a few places the list sizes are checked without holding the lock.
>>     While this probably speeds up the processing a bit, it's technically
>>     incorrect as the access is not atomic.  Caught by thread-safety
>>     analyzer of clang 21, as it now properly supports GUARDED annotations
>>     on structure fields:
>>
>>       lib/ipf.c:1117:33:
>>         error: reading the value pointed to by 'frag_exp_list' requires
>>         holding any mutex [-Werror,-Wthread-safety-analysis]
>>        1117 |     if (ovs_list_is_empty(&ipf->frag_exp_list)) {
>>             |                                 ^
>>       lib/ipf.c:1154:33:
>>         error: reading the value pointed to by 'reassembled_pkt_list'
>>         requires holding any mutex [-Werror,-Wthread-safety-analysis]
>>        1154 |     if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>             |                                 ^
>>
>>     There is also a potential deadlock between the thread destroying the
>>     ipf and the clean thread as they are using the latch and the ipf lock
>>     in the opposite order.  The latch itself is thread-safe, so we should
>>     not be holding the lock while setting it and waiting for the clean
>>     thread that may be waiting for the lock.
>>
>>     Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
>>     Signed-off-by: Ilya Maximets <[email protected]>
>>     ---
>>      lib/ipf.c | 33 +++++++++++----------------------
>>      1 file changed, 11 insertions(+), 22 deletions(-)
>>
>>     diff --git a/lib/ipf.c b/lib/ipf.c
>>     index b16797312..6ce144bfb 100644
>>     --- a/lib/ipf.c
>>     +++ b/lib/ipf.c
>>     @@ -1071,12 +1071,9 @@ ipf_send_completed_frags(struct ipf *ipf, struct 
>> dp_packet_batch *pb,
>>                               long long now, bool v6, uint16_t zone,
>>                               odp_port_t in_port)
>>      {
>>     -    if (ovs_list_is_empty(&ipf->frag_complete_list)) {
>>     -        return;
>>     -    }
>>
>>
>> I worry about the performance impact with these changes, though I haven't 
>> tested it.
>> I could run a test if you haven't.
>
> If you can run some performance tests, that would be great.  I'm not sure how
> much impact this has, certainly some.  OTOH, fragment reassembly is not a
> particularly performant code path regardless.
>
>> Receiving a single fragmented packet on any port would result in all CT
>> threads fighting over a global lock.  Maybe there should be a new atomic
>> counter for nfrags_completed?
>
> That may be an option if the performance hit is significant, though it's a
> bit of a brute-force solution, i.e. not particularly elegant.
>
>> I was also thinking about a trylock; however, the risk with that is that
>> not all threads can handle all completed frags.
>
> It's a bit strange that we're completing packets on threads that potentially
> didn't receive them.  It would be better if only the threads that completed
> some packets on the current run handled them.  Those threads are taking the
> lock anyway.
>
>>  
>>
>>     +    struct ipf_list *ipf_list;
>>
>>          ovs_mutex_lock(&ipf->ipf_lock);
>>     -    struct ipf_list *ipf_list;
>>
>>          LIST_FOR_EACH_SAFE (ipf_list, list_node, &ipf->frag_complete_list) {
>>
>>     @@ -1114,14 +1111,11 @@ ipf_delete_expired_frags(struct ipf *ipf, long 
>> long now)
>>          };
>>
>>
>>     -    if (ovs_list_is_empty(&ipf->frag_exp_list)) {
>>     -        return;
>>     -    }
>>     -
>>     -    ovs_mutex_lock(&ipf->ipf_lock);
>>          struct ipf_list *ipf_list;
>>          size_t lists_removed = 0;
>>
>>     +    ovs_mutex_lock(&ipf->ipf_lock);
>>     +
>>          LIST_FOR_EACH_SAFE (ipf_list, list_node, &ipf->frag_exp_list) {
>>              if (now <= ipf_list->expiration ||
>>                  lists_removed >= IPF_FRAG_LIST_MAX_EXPIRED) {
>>     @@ -1151,12 +1145,9 @@ static void
>>      ipf_execute_reass_pkts(struct ipf *ipf, struct dp_packet_batch *pb,
>>                             ovs_be16 dl_type)
>>      {
>>     -    if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>     -        return;
>>     -    }
>>     +    struct reassembled_pkt *rp;
>>
>>          ovs_mutex_lock(&ipf->ipf_lock);
>>     -    struct reassembled_pkt *rp;
>>
>>          LIST_FOR_EACH_SAFE (rp, rp_list_node, &ipf->reassembled_pkt_list) {
>>              if (!rp->list->reass_execute_ctx &&
>>     @@ -1175,12 +1166,9 @@ static void
>>      ipf_post_execute_reass_pkts(struct ipf *ipf,
>>                                  struct dp_packet_batch *pb, bool v6)
>>      {
>>     -    if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>     -        return;
>>     -    }
>>     +    struct reassembled_pkt *rp;
>>
>>          ovs_mutex_lock(&ipf->ipf_lock);
>>     -    struct reassembled_pkt *rp;
>>
>>          LIST_FOR_EACH_SAFE (rp, rp_list_node, &ipf->reassembled_pkt_list) {
>>              const size_t pb_cnt = dp_packet_batch_size(pb);
>>     @@ -1308,11 +1296,11 @@ ipf_clean_thread_main(void *f)
>>
>>              long long now = time_msec();
>>
>>     +        ovs_mutex_lock(&ipf->ipf_lock);
>>     +
>>              if (!ovs_list_is_empty(&ipf->frag_exp_list) ||
>>                  !ovs_list_is_empty(&ipf->frag_complete_list)) {
>>
>>
>> Couldn't this conditional just be removed?
>
> Either way is fine, I guess.  We can remove the if statements here, it will
> just be a slightly larger change.  I can do that in v2 or on commit depending
> on the results of the performance discussion above.

Did a quick review, and functionally the patch seems correct to me. I guess if 
performance is an issue, we could create an ipf_peek_frag_list_empty() function 
with the thread-safety analysis disabled. However, the question remains: is it 
safe to assume the list is empty without holding the lock here?

Cheers,
Eelco

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
