On 4/7/25 5:22 PM, Numan Siddique wrote: > On Mon, Apr 7, 2025 at 9:15 AM Dumitru Ceara <dce...@redhat.com> wrote: >> >> On 4/5/25 6:38 AM, num...@ovn.org wrote: >>> From: Numan Siddique <num...@ovn.org> >>> >> >> Hi Numan, >> >> Thanks for the patch! >> >>> After the commit [1], OVN doesn't reply to the unicast ARP >>> requests for the known IPs. This patch adds a global option >>> "unicast_arp_responder" and if set to true, OVN will also >>> respond to the unicast ARP requests. This gives the option >>> for deployments which were relying on the older behavior. >>> >> >> I wonder if the new unicast_arp_responder option should be a router (or >> switch) option instead. Or maybe keep the global option but also add >> per router or switch ways of tuning this. Some deployments might not >> want this behavior globally. > > Thanks for the review. I think the option should be a switch one > since the arp responder flows > are added in the logical switch pipeline and arps are generally within > the same subnet boundary. > I'll address your comments and submit v2. >
Hi Numan, Ack, that is, if we need this knob. Please see below. >> >>> However the main reason to add this option is to address >>> another issue. Consider the below topology where logical switches >>> sw0 and sw1 are connected lr0. lr0 is also connected to a >>> public switch which is not shown here. >>> >>> switch 8144b9c4-b38f-47db-a403-62283c8a5756 (sw0) >>> port sw0-port2 >>> addresses: ["50:54:00:00:00:04 10.0.0.4"] >>> port sw0-port1 >>> addresses: ["50:54:00:00:00:03 10.0.0.3"] >>> port sw0-lr0 >>> type: router >>> addresses: ["00:00:00:00:ff:01"] >>> router-port: lr0-sw0 >>> >>> switch 8f53c293-9486-49ff-9e62-98ea3408ee2e (sw1) >>> port sw1-port2 >>> addresses: ["40:54:00:00:00:04 20.0.0.4"] >>> port sw1-lr0 >>> type: router >>> addresses: ["00:00:00:00:ff:02"] >>> router-port: lr0-sw1 >>> port sw1-port1 >>> addresses: ["40:54:00:00:00:03 20.0.0.3"] >>> >>> ovn-nbctl lr-nat-list lr0 >>> TYPE EXTERNAL_IP LOGICAL_IP EXTERNAL_MAC >>> LOGICAL_PORT >>> dnat_and_snat 172.168.0.110 10.0.0.3 50:54:00:00:00:03 >>> sw0-port1 >>> dnat_and_snat 172.168.0.120 20.0.0.3 40:54:00:00:00:04 >>> sw1-port1 >>> >>> Here the logical port's mac is also used as external mac in the NATs. >>> >>> ovn-northd adds the below logical flow in both the sw0 and sw1 pipeline >>> >>> table=28(ls_in_l2_lkup ), priority=75 , >>> match=(eth.src == {.., 50:54:00:00:00:03, 40:54:00:00:00:04, ..} >>> && (arp.op == 1 || rarp.op == 3 || nd_ns)), >>> action=(outport = "_MC_flood_l2"; output;) >>> >>> This lflow is added to broadcast the self originated GARP packets. >>> Most likely to broadcast the GARPs generated by ovn-controller >>> and where the router MAC is used as source address [2]. >>> Seems like this flow should have been added only on the logical switches >>> connected to the router if the IPs are reachable from it. >>> >> >> Just a note, the behavior before [2] was to always flood so I think this >> is not necessarily a problem introduced by [2] (32f5ebb06226 >> ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.")). >> > > The issue here is with the ARP unicast request packets generated by a > logical port's VIF. > If the logical port's mac is also used for its dnat_and_snat entry in > the router, then > any ARP request packet generated by a logical port VIF is flooded > instead of sending it to > the destination, which is wrong. > > >> The problem [2] was trying to avoid was determining whether a packet is >> GARP. There's no easy way to do that today without adding a per-OVN-IP >> flow which might be prohibitive from performance perspective. >> > > If I understand correctly [2] was added to flood the GARP packets > generated by ovn-controller right ? > > If so, > - can't we modify the flow added by [2] in the ls_in_l2_lkup stage > to also match on "flags.from_ctrl". I was thinking about that at some point but I wasn't sure if it would work fine for VLAN-backed networks. > Although we need to see if we preserve the flags when the packet > is cloned to enter the switch pipelines. We do preserve flags, don't we? > - Since ovn-controller generates GARPs with broadcast eth > destination, why can't we also match on "eth.dst == > ff:ff:ff:ff:ff:ff:ff" ? > That's a good point, we should probably do that indeed. Would that avoid the need of the new knob? > If we can use one or both the above matches, then probably it's better > to fix this too. I'll submit another patch. > Looking forward to it! Thanks, Dumitru > >> Adding Ilya in CC because we had an offline discussion regarding the >> possibility of adding an OVS extension to allow OVN to match on "packet >> is GARP". That would definitely simplify OVN's life in this case. >> >>> When sw0-port1 sends a unicast ARP request to any IP in its subnet, >>> instead of tunneling the packet to the destination chassis, the packet >>> gets flooded in the L2 domain since the sw0-port1's mac matches this >>> flow. >>> >> >> We could change the per NAT ARP flooding flows and only flood if the ARP >> target is a NAT public IP that's in the network of a connected LRP. >> That means at most one flow per NAT which might be acceptable. >> >>> Until we fix this logical flow properly, this patch is sufficient to >>> address the unicast ARP request issue. This causes a flood of >>> ARP packets in the geneve network. >>> >>> [1] - c48ed1736a58("Do not reply on unicast arps for IPv4 targets.") >>> [2] - 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever >>> possible.") >>> >>> Signed-off-by: Numan Siddique <num...@ovn.org> >>> --- >>> northd/en-global-config.c | 6 ++++++ >>> northd/northd.c | 26 +++++++++++++++++--------- >>> ovn-nb.xml | 9 +++++++++ >>> tests/ovn-northd.at | 19 +++++++++++++++++-- >>> 4 files changed, 49 insertions(+), 11 deletions(-) >>> >>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c >>> index c103b137f6..d9b7dcb0a8 100644 >>> --- a/northd/en-global-config.c >>> +++ b/northd/en-global-config.c >>> @@ -269,6 +269,12 @@ global_config_nb_global_handler(struct engine_node >>> *node, void *data) >>> return false; >>> } >>> >>> + /* Check if northd_internal_version has changed or not. */ >>> + if (config_out_of_sync(&nb->options, &config_data->nb_options, >>> + "unicast_arp_responder", false)) { >>> + return false; >>> + } >>> + >>> if (check_nb_options_out_of_sync(nb, config_data, sampling_apps)) { >>> config_data->tracked_data.nb_options_changed = true; >>> } >>> diff --git a/northd/northd.c b/northd/northd.c >>> index 880ec92ac2..78a7125ce1 100644 >>> --- a/northd/northd.c >>> +++ b/northd/northd.c >>> @@ -98,6 +98,9 @@ static bool vxlan_mode; >>> >>> static bool vxlan_ic_mode; >>> >>> +/* Respond to unicast ARP requests for known ips or not. */ >>> +static bool unicast_arp_responder = false; >>> + >>> #define MAX_OVN_TAGS 4096 >>> >>> >>> @@ -9515,15 +9518,16 @@ build_lswitch_arp_nd_responder_known_ips(struct >>> ovn_port *op, >>> for (size_t i = 0; i < op->n_lsp_addrs; i++) { >>> for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { >>> ds_clear(match); >>> - /* Do not reply on unicast ARPs, forward them to the target >>> - * to have ability to monitor target liveness via unicast >>> - * ARP requests. >>> - */ >>> - ds_put_format(match, >>> - "arp.tpa == %s && " >>> - "arp.op == 1 && " >>> - "eth.dst == ff:ff:ff:ff:ff:ff", >>> - op->lsp_addrs[i].ipv4_addrs[j].addr_s); >>> + ds_put_format(match, "arp.tpa == %s && arp.op == 1", >>> + op->lsp_addrs[i].ipv4_addrs[j].addr_s); >>> + >>> + if (!unicast_arp_responder) { >>> + /* Do not reply on unicast ARPs, forward them to the >>> target >>> + * to have ability to monitor target liveness via >>> unicast >>> + * ARP requests. >>> + */ >>> + ds_put_cstr(match, " && eth.dst == ff:ff:ff:ff:ff:ff"); >>> + } >>> ds_clear(actions); >>> ds_put_format(actions, >>> "eth.dst = eth.src; " >>> @@ -18980,6 +18984,10 @@ ovnnb_db_run(struct northd_input *input_data, >>> use_common_zone = smap_get_bool(input_data->nb_options, >>> "use_common_zone", >>> false); >>> >>> + unicast_arp_responder = smap_get_bool(input_data->nb_options, >>> + "unicast_arp_responder", >>> + false); >>> + >>> vxlan_mode = is_vxlan_mode(input_data->nb_options, >>> input_data->sbrec_chassis_table); >>> >>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>> index bf1f1628bd..7f8021667c 100644 >>> --- a/ovn-nb.xml >>> +++ b/ovn-nb.xml >>> @@ -429,6 +429,15 @@ >>> </p> >>> </column> >>> >>> + <column name="options" key="unicast_arp_responder" type='{"type": >>> "boolean"}'> >>> + <p> >>> + <code>ovn-northd</code> adds ARP responder flows for known IPs >>> for >>> + broadcast ARP request packets. If this option is set, it will >>> also >>> + respond to unicast ARP requests. By default this option is set >>> to >>> + <code>false</code>. >>> + </p> >>> + </column> >>> + >> >> I guess we also need a NEWS file entry for this new feature. > > Ack. > > Thanks > Numan > >> >>> <group title="Options for configuring interconnection route >>> advertisement"> >>> <p> >>> These options control how routes are advertised between OVN >>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at >>> index e3a65e7a44..f0322a311c 100644 >>> --- a/tests/ovn-northd.at >>> +++ b/tests/ovn-northd.at >>> @@ -9869,7 +9869,7 @@ AT_CLEANUP >>> ]) >>> >>> OVN_FOR_EACH_NORTHD_NO_HV([ >>> -AT_SETUP([check options:disable_arp_nd_rsp for LSP]) >>> +AT_SETUP([ARP responder flows]) >>> ovn_start NORTHD_TYPE >>> check ovn-nbctl ls-add S1 >>> check ovn-nbctl --wait=sb lsp-add S1 S1-vm1 >>> @@ -9886,7 +9886,22 @@ AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | >>> ovn_strip_lflows], [0], [dnl >>> table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns_mcast && >>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { >>> eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; >>> nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; >>> };) >>> ]) >>> >>> -#Set the disable_arp_nd_rsp option and verify the flow >>> +check ovn-nbctl set NB_Global . options:unicast_arp_responder=true >>> +check ovn-nbctl --wait=sb sync >>> + >>> +ovn-sbctl dump-flows S1 > S1flows >>> +AT_CAPTURE_FILE([S1flows]) >>> + >>> +AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl >>> + table=??(ls_in_arp_rsp ), priority=0 , match=(1), action=(next;) >>> + table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa == >>> 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), action=(next;) >>> + table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns_mcast && >>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), >>> action=(next;) >>> + table=??(ls_in_arp_rsp ), priority=50 , match=(arp.tpa == >>> 192.168.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = >>> 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = >>> 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport = >>> inport; flags.loopback = 1; output;) >>> + table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns_mcast && >>> ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { >>> eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; >>> nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; >>> };) >>> +]) >>> + >>> + >>> +# Set the disable_arp_nd_rsp option and verify the flow >>> check ovn-nbctl --wait=sb set logical_switch_port S1-vm1 >>> options:disable_arp_nd_rsp=true >>> >>> ovn-sbctl dump-flows S1 > S1flows >> >> Thanks, >> Dumitru >> >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev