On 4/7/25 5:22 PM, Numan Siddique wrote:
> On Mon, Apr 7, 2025 at 9:15 AM Dumitru Ceara <[email protected]> wrote:
>>
>> On 4/5/25 6:38 AM, [email protected] wrote:
>>> From: Numan Siddique <[email protected]>
>>>
>>
>> Hi Numan,
>>
>> Thanks for the patch!
>>
>>> After the commit [1], OVN doesn't reply to the unicast ARP
>>> requests for the known IPs. This patch adds a global option
>>> "unicast_arp_responder" and if set to true, OVN will also
>>> respond to the unicast ARP requests. This gives the option
>>> for deployments which were relying on the older behavior.
>>>
>>
>> I wonder if the new unicast_arp_responder option should be a router (or
>> switch) option instead. Or maybe keep the global option but also add
>> per router or switch ways of tuning this. Some deployments might not
>> want this behavior globally.
>
> Thanks for the review. I think the option should be a switch one
> since the arp responder flows
> are added in the logical switch pipeline and arps are generally within
> the same subnet boundary.
> I'll address your comments and submit v2.
>
Hi Numan,
Ack, that is, if we need this knob. Please see below.
>>
>>> However the main reason to add this option is to address
>>> another issue. Consider the below topology where logical switches
>>> sw0 and sw1 are connected lr0. lr0 is also connected to a
>>> public switch which is not shown here.
>>>
>>> switch 8144b9c4-b38f-47db-a403-62283c8a5756 (sw0)
>>> port sw0-port2
>>> addresses: ["50:54:00:00:00:04 10.0.0.4"]
>>> port sw0-port1
>>> addresses: ["50:54:00:00:00:03 10.0.0.3"]
>>> port sw0-lr0
>>> type: router
>>> addresses: ["00:00:00:00:ff:01"]
>>> router-port: lr0-sw0
>>>
>>> switch 8f53c293-9486-49ff-9e62-98ea3408ee2e (sw1)
>>> port sw1-port2
>>> addresses: ["40:54:00:00:00:04 20.0.0.4"]
>>> port sw1-lr0
>>> type: router
>>> addresses: ["00:00:00:00:ff:02"]
>>> router-port: lr0-sw1
>>> port sw1-port1
>>> addresses: ["40:54:00:00:00:03 20.0.0.3"]
>>>
>>> ovn-nbctl lr-nat-list lr0
>>> TYPE EXTERNAL_IP LOGICAL_IP EXTERNAL_MAC
>>> LOGICAL_PORT
>>> dnat_and_snat 172.168.0.110 10.0.0.3 50:54:00:00:00:03
>>> sw0-port1
>>> dnat_and_snat 172.168.0.120 20.0.0.3 40:54:00:00:00:04
>>> sw1-port1
>>>
>>> Here the logical port's mac is also used as external mac in the NATs.
>>>
>>> ovn-northd adds the below logical flow in both the sw0 and sw1 pipeline
>>>
>>> table=28(ls_in_l2_lkup ), priority=75 ,
>>> match=(eth.src == {.., 50:54:00:00:00:03, 40:54:00:00:00:04, ..}
>>> && (arp.op == 1 || rarp.op == 3 || nd_ns)),
>>> action=(outport = "_MC_flood_l2"; output;)
>>>
>>> This lflow is added to broadcast the self originated GARP packets.
>>> Most likely to broadcast the GARPs generated by ovn-controller
>>> and where the router MAC is used as source address [2].
>>> Seems like this flow should have been added only on the logical switches
>>> connected to the router if the IPs are reachable from it.
>>>
>>
>> Just a note, the behavior before [2] was to always flood so I think this
>> is not necessarily a problem introduced by [2] (32f5ebb06226
>> ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.")).
>>
>
> The issue here is with the ARP unicast request packets generated by a
> logical port's VIF.
> If the logical port's mac is also used for its dnat_and_snat entry in
> the router, then
> any ARP request packet generated by a logical port VIF is flooded
> instead of sending it to
> the destination, which is wrong.
>
>
>> The problem [2] was trying to avoid was determining whether a packet is
>> GARP. There's no easy way to do that today without adding a per-OVN-IP
>> flow which might be prohibitive from performance perspective.
>>
>
> If I understand correctly [2] was added to flood the GARP packets
> generated by ovn-controller right ?
>
> If so,
> - can't we modify the flow added by [2] in the ls_in_l2_lkup stage
> to also match on "flags.from_ctrl".
I was thinking about that at some point but I wasn't sure if it would
work fine for VLAN-backed networks.
> Although we need to see if we preserve the flags when the packet
> is cloned to enter the switch pipelines.
We do preserve flags, don't we?
> - Since ovn-controller generates GARPs with broadcast eth
> destination, why can't we also match on "eth.dst ==
> ff:ff:ff:ff:ff:ff:ff" ?
>
That's a good point, we should probably do that indeed. Would that
avoid the need of the new knob?
> If we can use one or both the above matches, then probably it's better
> to fix this too. I'll submit another patch.
>
Looking forward to it!
Thanks,
Dumitru
>
>> Adding Ilya in CC because we had an offline discussion regarding the
>> possibility of adding an OVS extension to allow OVN to match on "packet
>> is GARP". That would definitely simplify OVN's life in this case.
>>
>>> When sw0-port1 sends a unicast ARP request to any IP in its subnet,
>>> instead of tunneling the packet to the destination chassis, the packet
>>> gets flooded in the L2 domain since the sw0-port1's mac matches this
>>> flow.
>>>
>>
>> We could change the per NAT ARP flooding flows and only flood if the ARP
>> target is a NAT public IP that's in the network of a connected LRP.
>> That means at most one flow per NAT which might be acceptable.
>>
>>> Until we fix this logical flow properly, this patch is sufficient to
>>> address the unicast ARP request issue. This causes a flood of
>>> ARP packets in the geneve network.
>>>
>>> [1] - c48ed1736a58("Do not reply on unicast arps for IPv4 targets.")
>>> [2] - 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever
>>> possible.")
>>>
>>> Signed-off-by: Numan Siddique <[email protected]>
>>> ---
>>> northd/en-global-config.c | 6 ++++++
>>> northd/northd.c | 26 +++++++++++++++++---------
>>> ovn-nb.xml | 9 +++++++++
>>> tests/ovn-northd.at | 19 +++++++++++++++++--
>>> 4 files changed, 49 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>>> index c103b137f6..d9b7dcb0a8 100644
>>> --- a/northd/en-global-config.c
>>> +++ b/northd/en-global-config.c
>>> @@ -269,6 +269,12 @@ global_config_nb_global_handler(struct engine_node
>>> *node, void *data)
>>> return false;
>>> }
>>>
>>> + /* Check if northd_internal_version has changed or not. */
>>> + if (config_out_of_sync(&nb->options, &config_data->nb_options,
>>> + "unicast_arp_responder", false)) {
>>> + return false;
>>> + }
>>> +
>>> if (check_nb_options_out_of_sync(nb, config_data, sampling_apps)) {
>>> config_data->tracked_data.nb_options_changed = true;
>>> }
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 880ec92ac2..78a7125ce1 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -98,6 +98,9 @@ static bool vxlan_mode;
>>>
>>> static bool vxlan_ic_mode;
>>>
>>> +/* Respond to unicast ARP requests for known ips or not. */
>>> +static bool unicast_arp_responder = false;
>>> +
>>> #define MAX_OVN_TAGS 4096
>>>
>>>
>>> @@ -9515,15 +9518,16 @@ build_lswitch_arp_nd_responder_known_ips(struct
>>> ovn_port *op,
>>> for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>>> for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
>>> ds_clear(match);
>>> - /* Do not reply on unicast ARPs, forward them to the target
>>> - * to have ability to monitor target liveness via unicast
>>> - * ARP requests.
>>> - */
>>> - ds_put_format(match,
>>> - "arp.tpa == %s && "
>>> - "arp.op == 1 && "
>>> - "eth.dst == ff:ff:ff:ff:ff:ff",
>>> - op->lsp_addrs[i].ipv4_addrs[j].addr_s);
>>> + ds_put_format(match, "arp.tpa == %s && arp.op == 1",
>>> + op->lsp_addrs[i].ipv4_addrs[j].addr_s);
>>> +
>>> + if (!unicast_arp_responder) {
>>> + /* Do not reply on unicast ARPs, forward them to the
>>> target
>>> + * to have ability to monitor target liveness via
>>> unicast
>>> + * ARP requests.
>>> + */
>>> + ds_put_cstr(match, " && eth.dst == ff:ff:ff:ff:ff:ff");
>>> + }
>>> ds_clear(actions);
>>> ds_put_format(actions,
>>> "eth.dst = eth.src; "
>>> @@ -18980,6 +18984,10 @@ ovnnb_db_run(struct northd_input *input_data,
>>> use_common_zone = smap_get_bool(input_data->nb_options,
>>> "use_common_zone",
>>> false);
>>>
>>> + unicast_arp_responder = smap_get_bool(input_data->nb_options,
>>> + "unicast_arp_responder",
>>> + false);
>>> +
>>> vxlan_mode = is_vxlan_mode(input_data->nb_options,
>>> input_data->sbrec_chassis_table);
>>>
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index bf1f1628bd..7f8021667c 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -429,6 +429,15 @@
>>> </p>
>>> </column>
>>>
>>> + <column name="options" key="unicast_arp_responder" type='{"type":
>>> "boolean"}'>
>>> + <p>
>>> + <code>ovn-northd</code> adds ARP responder flows for known IPs
>>> for
>>> + broadcast ARP request packets. If this option is set, it will
>>> also
>>> + respond to unicast ARP requests. By default this option is set
>>> to
>>> + <code>false</code>.
>>> + </p>
>>> + </column>
>>> +
>>
>> I guess we also need a NEWS file entry for this new feature.
>
> Ack.
>
> Thanks
> Numan
>
>>
>>> <group title="Options for configuring interconnection route
>>> advertisement">
>>> <p>
>>> These options control how routes are advertised between OVN
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index e3a65e7a44..f0322a311c 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -9869,7 +9869,7 @@ AT_CLEANUP
>>> ])
>>>
>>> OVN_FOR_EACH_NORTHD_NO_HV([
>>> -AT_SETUP([check options:disable_arp_nd_rsp for LSP])
>>> +AT_SETUP([ARP responder flows])
>>> ovn_start NORTHD_TYPE
>>> check ovn-nbctl ls-add S1
>>> check ovn-nbctl --wait=sb lsp-add S1 S1-vm1
>>> @@ -9886,7 +9886,22 @@ AT_CHECK([grep -e "ls_in_arp_rsp" S1flows |
>>> ovn_strip_lflows], [0], [dnl
>>> table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns_mcast &&
>>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na {
>>> eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10;
>>> nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output;
>>> };)
>>> ])
>>>
>>> -#Set the disable_arp_nd_rsp option and verify the flow
>>> +check ovn-nbctl set NB_Global . options:unicast_arp_responder=true
>>> +check ovn-nbctl --wait=sb sync
>>> +
>>> +ovn-sbctl dump-flows S1 > S1flows
>>> +AT_CAPTURE_FILE([S1flows])
>>> +
>>> +AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
>>> + table=??(ls_in_arp_rsp ), priority=0 , match=(1), action=(next;)
>>> + table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa ==
>>> 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), action=(next;)
>>> + table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns_mcast &&
>>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"),
>>> action=(next;)
>>> + table=??(ls_in_arp_rsp ), priority=50 , match=(arp.tpa ==
>>> 192.168.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src =
>>> 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha =
>>> 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport =
>>> inport; flags.loopback = 1; output;)
>>> + table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns_mcast &&
>>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na {
>>> eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10;
>>> nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output;
>>> };)
>>> +])
>>> +
>>> +
>>> +# Set the disable_arp_nd_rsp option and verify the flow
>>> check ovn-nbctl --wait=sb set logical_switch_port S1-vm1
>>> options:disable_arp_nd_rsp=true
>>>
>>> ovn-sbctl dump-flows S1 > S1flows
>>
>> Thanks,
>> Dumitru
>>
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev