Hi all,
It seems that the test added by this patch of mine sometimes fails in
GitHub CI when ovs-vswitchd is stopped; the failure is due to:
./ovn.at:10193: check_logs "
$error
/connection failed (No such file or directory)/d
/has no network name*/d
/receive tunnel port not found*/d
/Failed to locate tunnel to reach main chassis/d
/Transaction causes multiple rows.*MAC_Binding/d
/Transaction causes multiple rows.*FDB/d
" $sbox
--- /dev/null 2026-05-06 14:47:18.250105001 +0000
+++ /workspace/ovn-tmp/tests/testsuite.dir/at-groups/155/stdout 2026-05-06 14:54:35.770811381 +0000
@@ -0,0 +1,2 @@
+2026-05-06T14:54:35.548Z|00471|ofproto_dpif_rid|ERR|recirc_id 4 left allocated when ofproto (br-int) is destructed
+2026-05-06T14:54:35.548Z|00472|ofproto_dpif_rid|ERR|recirc_id 2 left allocated when ofproto (br-int) is destructed
https://github.com/ovsrobot/ovn/actions/runs/25442425151/job/74637759422#step:12:5664
I have not been able to reproduce the issue locally, but I'll keep investigating.
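For context, a minimal sketch of why these two lines trip the check. It assumes check_logs behaves roughly like "sed -n" with the listed delete expressions followed by printing any remaining WARN/ERR lines; the filter_logs helper and the recirc_id pattern below are hypothetical and only for illustration, not a proposed fix:

```shell
# Hypothetical stand-in for the log check: apply the caller's sed delete
# expressions, then print whatever WARN/ERR lines are left.
filter_logs() {
    # $1: sed delete expressions, one per line; stdin: log text
    sed -n "$1
/|WARN|/p
/|ERR|/p"
}

log='2026-05-06T14:54:35.548Z|00471|ofproto_dpif_rid|ERR|recirc_id 4 left allocated when ofproto (br-int) is destructed
2026-05-06T14:54:35.548Z|00472|ofproto_dpif_rid|ERR|recirc_id 2 left allocated when ofproto (br-int) is destructed'

# None of the existing exemptions match, so both ERR lines are reported.
printf '%s\n' "$log" |
    filter_logs '/connection failed (No such file or directory)/d'

# With a hypothetical extra exemption, nothing is reported.
printf '%s\n' "$log" |
    filter_logs '/recirc_id .* left allocated when ofproto/d'
```

Any line printed by the filter ends up in stdout, which the test expects to be empty; that is the "@@ -0,0 +1,2 @@" diff above.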
Regards,
Dumitru
On 5/6/26 1:49 PM, Dumitru Ceara wrote:
> On 5/6/26 10:33 AM, Mairtin O'Loingsigh wrote:
>> On Mon, May 04, 2026 at 09:05:40AM +0200, Dumitru Ceara wrote:
>>> Hi Mairtin,
>>>
>>> Thanks for the review!
>>>
>>> On 4/29/26 10:28 AM, Mairtin O'Loingsigh wrote:
>>>> On Fri, Apr 24, 2026 at 05:35:58PM +0200, Dumitru Ceara via dev wrote:
>>>>> The ARP/ND responder stage (ls_in_arp_rsp) unconditionally
>>>>> bypassed all traffic arriving from localnet ports via a
>>>>> priority-100 "next;" flow. This caused broadcast ARP/ND
>>>>> requests from the physical network to be flooded to every
>>>>> logical switch port instead of being handled by proxy
>>>>> ARP/ND. On switches with ~200+ ports the resulting
>>>>> multicast replication exceeded the OVS 4K resubmit limit,
>>>>> dropping the packets and breaking connectivity.
>>>>>
>>>>> Replace the bypass with a targeted mechanism:
>>>>>
>>>>> - In ls_in_lookup_fdb, set flags.localnet = 1 for
>>>>> packets arriving from localnet ports (P50 fallback;
>>>>> the existing P100 FDB-learning flow already sets this
>>>>> flag when FDB learning is enabled).
>>>>>
>>>>> - In the P50 ARP/ND reply flows, append the condition
>>>>> "((flags.localnet == 1 && is_chassis_resident(port))
>>>>> || flags.localnet == 0)" on switches that have
>>>>> localnet ports.
>>>>>
>>>>> This ensures that ARP/ND requests from localnet are only
>>>>> answered on the chassis hosting the target VIF, preventing
>>>>> both the flood and duplicate replies from multiple
>>>>> hypervisors. VIF-to-VIF proxy ARP/ND is unchanged because
>>>>> flags.localnet is 0 for non-localnet-sourced traffic.
>>>>>
>>>>> Fixes: f763a3273b84 ("ovn: Avoid ARP responder for packets from localnet port")
>>>>> Reported-at: https://redhat.atlassian.net/browse/FDP-3436
>>>>> Assisted-by: Claude Opus 4.6, Claude Code
>>>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>>>> ---
>>>
>>> [...]
>>>
>>>>>
>>>>> +/* On switches with localnet ports, restrict ARP/ND replies for
>>>>> + * localnet-sourced requests to the chassis hosting the target VIF
>>>>> + * (preventing duplicate replies from every hypervisor). Non-localnet
>>>>> + * requests (VIF-to-VIF) are answered unconditionally as before. */
>>>>> +static void
>>>>> +build_lswitch_arp_nd_local_resp_match(struct ds *match,
>>>>> + const struct ovn_port *op)
>>>>> +{
>>>>> + if (!ls_has_localnet_port(op->od)) {
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + ds_put_format(match,
>>>>> + " && ((flags.localnet == 1 && is_chassis_resident(%s))"
>>>>> + " || flags.localnet == 0)", op->json_key);
>>>> nit: spacing
>>>
>>> I had actually done this on purpose to make it a bit more visible that
>>> " || flags.localnet == 0" is part of the condition in parentheses. But I
>>> have no strong preference in the end. Please let me know if you would
>>> still like me to change it.
>>>
>>>>> +}
>>>>> +
>>>
>>> [...]
>>>
>>>>>
>>>> LGTM. Just one small nit.
>>>>
>>>> Acked-by: Mairtin O'Loingsigh <[email protected]>
>>>>
>>>
>>> Regards,
>>> Dumitru
>>>
>>
>> This spacing does look more readable. No need to change.
>>
>
> Hi Mairtin,
>
> Thanks for the confirmation! Applied to main and all stable branches
> down to 24.03.
>
> Regards,
> Dumitru
>