Hi Ilya,

On 5/11/26 12:13 PM, Ilya Maximets wrote:
> On 5/7/26 10:11 PM, Dumitru Ceara wrote:
>> On 5/7/26 4:31 PM, Dumitru Ceara wrote:
>>> Hi all,
>>>
>>> It seems that the test added by this patch of mine sometimes fails in
>>> GitHub CI when ovs-vswitchd is stopped; the failure is due to:
>>>
>>> ./ovn.at:10193: check_logs "
>>>         $error
>>>         /connection failed (No such file or directory)/d
>>>         /has no network name*/d
>>>         /receive tunnel port not found*/d
>>>         /Failed to locate tunnel to reach main chassis/d
>>>         /Transaction causes multiple rows.*MAC_Binding/d
>>>         /Transaction causes multiple rows.*FDB/d
>>>     " $sbox
>>> --- /dev/null       2026-05-06 14:47:18.250105001 +0000
>>> +++ /workspace/ovn-tmp/tests/testsuite.dir/at-groups/155/stdout
>>> 2026-05-06 14:54:35.770811381 +0000
>>> @@ -0,0 +1,2 @@
>>> +2026-05-06T14:54:35.548Z|00471|ofproto_dpif_rid|ERR|recirc_id 4 left
>>> allocated when ofproto (br-int) is destructed
>>> +2026-05-06T14:54:35.548Z|00472|ofproto_dpif_rid|ERR|recirc_id 2 left
>>> allocated when ofproto (br-int) is destructed
>>>
>>> https://github.com/ovsrobot/ovn/actions/runs/25442425151/job/74637759422#step:12:5664
>>>
>>> I'm failing to reproduce the issue locally but I'll keep investigating.
>>>
>>
>> +Eelco as we were discussing offline about this earlier.
>>
>> So, if I understand correctly, it's just a race when ovs-vswitchd exits:
>>
>> ofproto-dpif.c:destruct() is called, it removes all OF rules and then
>> calls recirc_free_ofproto() which assumes there's no remaining recirc_id.
>>
>> However, close_dpif_backer() which in turn calls udpif_destroy() and
>> deletes all the ukeys (including recirc_ids they might have) is only
>> called at the end of destruct().
> 
> The destruction of recirculation ids is first happening in dpif_flush()
> while stopping the threads and running revalidator purge, before calling
> the destruct() function.  But there are two races with this:
> 
> 1. The flush will remove ukeys, but their actual destruction and the
>    freeing of recirculation node references is postponed, so if the RCU
>    thread is not fast enough, they will still be allocated at the time
>    of ofproto-dpif destruction.
> 
> 2. Flush restarts the threads, so technically new ukeys can be created
>    after the flush and before the destruction.  Though this should not
>    generally be happening in unit tests where all the traffic is controlled.
> 
>>
>> static void
>> destruct(struct ofproto *ofproto_, bool del)
>> {
>> ...
>>     ofproto->backer->need_revalidate = REV_RECONFIGURE;
>>     xlate_txn_start();
>>     xlate_remove_ofproto(ofproto);
>>     xlate_txn_commit();
>>
>>     hmap_remove(&all_ofproto_dpifs_by_name,
>>                 &ofproto->all_ofproto_dpifs_by_name_node);
>>
>>     OFPROTO_FOR_EACH_TABLE (table, &ofproto->up) {
>>         CLS_FOR_EACH (rule, up.cr, &table->cls) {
>>             ofproto_rule_delete(&ofproto->up, &rule->up);
>>         }
>>     }
>> ...
>>
>>     recirc_free_ofproto(ofproto, ofproto->up.name);
>> ...
>>     /* Wait for all the meter destroy work to finish. */
>>     ovsrcu_barrier();
>>     close_dpif_backer(ofproto->backer, del);
>> }
>>
>> Is this something that's worth addressing in OVS?
> 
> There should be an RCU barrier before the leak check (recirc_free_ofproto)
> to avoid the first race.  But we also need to check after the final close
> in order to avoid the second race.  Both can be solved by moving the check
> to the end of the function where we already have the barrier.  Second barrier
> after the backer closing doesn't seem to be necessary because udpif_destroy
> doesn't postpone the ukey destruction.  I can try and send a patch for this.
> There is also some dead code to be removed (rule->recirc_id seem to be 
> unused).
> 
>> I understand it's
>> benign because we're exiting anyway but I am a bit surprised we don't
>> hit it more often in tests (OVN tests too).
> 
> I wonder if there is something special about this test that make OVS starve
> for CPU resources and not being able to run RCU callbacks fast enough.  Maybe
> some OVN processes taking too much CPU for one reason or another?
> 

I don't think so.  I've ran it a lot of times locally and I see nothing
like that.  It's literally just sending a few packets.  I also see
nothing suggesting that in GitHub actions (the only place where this
test fails until now):

https://github.com/dceara/ovn/actions/runs/25502938530/job/74841281392#step:13:32

The above is a "focused" test run, executing just 3 unit tests.

>>
>> Alternatively, would it be an option to change OVN's OVN_CLEANUP_VSWITCH
>> macro to call revalidator/purge before stopping vswitchd?  I.e.:
>>

FTR, we seem to have hit this in the past too, the "AT_SETUP([IGMP
external querier])" also has:

OVN_CLEANUP([hv1
/left allocated/d
], [hv2
/left allocated/d
])
AT_CLEANUP
])

To ignore these logs.  Added by:
65f9f010b426 ("tests: Check unit tests logs for errors.")

https://github.com/ovn-org/ovn/commit/65f9f010b426

Thanks,
Dumitru

>> diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
>> index 39f03ba62b..b0b22e2fe6 100644
>> --- a/tests/ovn-macros.at
>> +++ b/tests/ovn-macros.at
>> @@ -223,6 +223,7 @@ m4_define([OVN_CLEANUP_VSWITCH],[
>>      echo
>>      echo "$1: clean up vswitch"
>>      as $1
>> +    ovs-appctl revalidator/purge
>>      OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>>      OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>  ])
>>
>> Thanks,
>> Dumitru
>>
>>> Regards,
>>> Dumitru
>>>
>>> On 5/6/26 1:49 PM, Dumitru Ceara wrote:
>>>> On 5/6/26 10:33 AM, Mairtin O'Loingsigh wrote:
>>>>> On Mon, May 04, 2026 at 09:05:40AM +0200, Dumitru Ceara wrote:
>>>>>> Hi Mairtin,
>>>>>>
>>>>>> Thanks for the review!
>>>>>>
>>>>>> On 4/29/26 10:28 AM, Mairtin O'Loingsigh wrote:
>>>>>>> On Fri, Apr 24, 2026 at 05:35:58PM +0200, Dumitru Ceara via dev wrote:
>>>>>>>> The ARP/ND responder stage (ls_in_arp_rsp) unconditionally
>>>>>>>> bypassed all traffic arriving from localnet ports via a
>>>>>>>> priority-100 "next;" flow.  This caused broadcast ARP/ND
>>>>>>>> requests from the physical network to be flooded to every
>>>>>>>> logical switch port instead of being handled by proxy
>>>>>>>> ARP/ND.  On switches with ~200+ ports the resulting
>>>>>>>> multicast replication exceeded the OVS 4K resubmit limit,
>>>>>>>> dropping the packets and breaking connectivity.
>>>>>>>>
>>>>>>>> Replace the bypass with a targeted mechanism:
>>>>>>>>
>>>>>>>>   - In ls_in_lookup_fdb, set flags.localnet = 1 for
>>>>>>>>     packets arriving from localnet ports (P50 fallback;
>>>>>>>>     the existing P100 FDB-learning flow already sets this
>>>>>>>>     flag when FDB learning is enabled).
>>>>>>>>
>>>>>>>>   - In the P50 ARP/ND reply flows, append the condition
>>>>>>>>     "((flags.localnet == 1 && is_chassis_resident(port))
>>>>>>>>      || flags.localnet == 0)" on switches that have
>>>>>>>>     localnet ports.
>>>>>>>>
>>>>>>>> This ensures that ARP/ND requests from localnet are only
>>>>>>>> answered on the chassis hosting the target VIF, preventing
>>>>>>>> both the flood and duplicate replies from multiple
>>>>>>>> hypervisors.  VIF-to-VIF proxy ARP/ND is unchanged because
>>>>>>>> flags.localnet is 0 for non-localnet-sourced traffic.
>>>>>>>>
>>>>>>>> Fixes: f763a3273b84 ("ovn: Avoid ARP responder for packets from 
>>>>>>>> localnet port")
>>>>>>>> Reported-at: https://redhat.atlassian.net/browse/FDP-3436
>>>>>>>> Assisted-by: Claude Opus 4.6, Claude Code
>>>>>>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>>>>>>> ---
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>  
>>>>>>>> +/* On switches with localnet ports, restrict ARP/ND replies for
>>>>>>>> + * localnet-sourced requests to the chassis hosting the target VIF
>>>>>>>> + * (preventing duplicate replies from every hypervisor).  Non-localnet
>>>>>>>> + * requests (VIF-to-VIF) are answered unconditionally as before. */
>>>>>>>> +static void
>>>>>>>> +build_lswitch_arp_nd_local_resp_match(struct ds *match,
>>>>>>>> +                                      const struct ovn_port *op)
>>>>>>>> +{
>>>>>>>> +    if (!ls_has_localnet_port(op->od)) {
>>>>>>>> +        return;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    ds_put_format(match,
>>>>>>>> +        " && ((flags.localnet == 1 && is_chassis_resident(%s))"
>>>>>>>> +            " || flags.localnet == 0)", op->json_key);
>>>>>>> nit: spacing
>>>>>>
>>>>>> I had actually done this on purpose to make it a bit more visible that "
>>>>>> || flags.localnet == 0" is part of the condition in parenthesis.  But I
>>>>>> have no strong preference in the end.  Please let me know if you still
>>>>>> would like me to change it.
>>>>>>
>>>>>>>> +}
>>>>>>>> +
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>
>>>>>>> LGTM. Just one small nit.
>>>>>>>
>>>>>>> Acked-by: Mairtin O'Loingsigh <[email protected]>
>>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Dumitru
>>>>>>
>>>>>
>>>>> This spacing does look more readable. No need to change.
>>>>>
>>>>
>>>> Hi Mairtin,
>>>>
>>>> Thanks for the confirmation!  Applied to main and all stable branches
>>>> down to 24.03.
>>>>
>>>> Regards,
>>>> Dumitru
>>>>
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to