On Mon, Apr 7, 2025 at 9:15 AM Dumitru Ceara <dce...@redhat.com> wrote:
>
> On 4/5/25 6:38 AM, num...@ovn.org wrote:
> > From: Numan Siddique <num...@ovn.org>
> >
>
> Hi Numan,
>
> Thanks for the patch!
>
> > After the commit [1], OVN doesn't reply to the unicast ARP
> > requests for the known IPs.  This patch adds a global option
> > "unicast_arp_responder" and if set to true, OVN will also
> > respond to the unicast ARP requests.  This gives the option
> > for deployments which were relying on the older behavior.
> >
>
> I wonder if the new unicast_arp_responder option should be a router (or
> switch) option instead.  Or maybe keep the global option but also add
> per router or switch ways of tuning this.  Some deployments might not
> want this behavior globally.

Thanks for the review.  I think the option should be a switch one
since the arp responder flows
are added in the logical switch pipeline and arps are generally within
the same subnet boundary.
I'll address your comments and submit v2.

>
> > However the main reason to add this option is to address
> > another issue.  Consider the below topology where logical switches
> > sw0 and sw1 are connected lr0. lr0 is also connected to a
> > public switch which is not shown here.
> >
> > switch 8144b9c4-b38f-47db-a403-62283c8a5756 (sw0)
> >     port sw0-port2
> >         addresses: ["50:54:00:00:00:04 10.0.0.4"]
> >     port sw0-port1
> >         addresses: ["50:54:00:00:00:03 10.0.0.3"]
> >     port sw0-lr0
> >         type: router
> >         addresses: ["00:00:00:00:ff:01"]
> >         router-port: lr0-sw0
> >
> > switch 8f53c293-9486-49ff-9e62-98ea3408ee2e (sw1)
> >     port sw1-port2
> >         addresses: ["40:54:00:00:00:04 20.0.0.4"]
> >     port sw1-lr0
> >         type: router
> >         addresses: ["00:00:00:00:ff:02"]
> >         router-port: lr0-sw1
> >     port sw1-port1
> >         addresses: ["40:54:00:00:00:03 20.0.0.3"]
> >
> > ovn-nbctl lr-nat-list lr0
> > TYPE            EXTERNAL_IP      LOGICAL_IP          EXTERNAL_MAC         
> > LOGICAL_PORT
> > dnat_and_snat   172.168.0.110    10.0.0.3            50:54:00:00:00:03    
> > sw0-port1
> > dnat_and_snat   172.168.0.120    20.0.0.3            40:54:00:00:00:04    
> > sw1-port1
> >
> > Here the logical port's mac is also used as external mac in the NATs.
> >
> > ovn-northd adds the below logical flow in both the sw0 and sw1 pipeline
> >
> > table=28(ls_in_l2_lkup      ), priority=75   ,
> >   match=(eth.src == {.., 50:54:00:00:00:03, 40:54:00:00:00:04, ..}
> >          && (arp.op == 1 || rarp.op == 3 || nd_ns)),
> >   action=(outport = "_MC_flood_l2"; output;)
> >
> > This lflow is added to broadcast the self originated GARP packets.
> > Most likely to broadcast the GARPs generated by ovn-controller
> > and where the router MAC is used as source address [2].
> > Seems like this flow should have been added only on the logical switches
> > connected to the router if the IPs are reachable from it.
> >
>
> Just a note, the behavior before [2] was to always flood so I think this
> is not necessarily a problem introduced by [2] (32f5ebb06226
> ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.")).
>

The issue here is with the ARP unicast request packets generated by a
logical port's VIF.
If the logical port's mac is also used for its dnat_and_snat entry in
the router,  then
any ARP request packet generated by a logical port VIF is flooded
instead of sending it to
the destination, which is wrong.


> The problem [2] was trying to avoid was determining whether a packet is
> GARP.  There's no easy way to do that today without adding a per-OVN-IP
> flow which might be prohibitive from performance perspective.
>

If I understand correctly [2] was added to flood the GARP packets
generated by ovn-controller right ?

If so,
   -  can't we modify the flow added by [2] in the ls_in_l2_lkup stage
to also match on "flags.from_ctrl".
      Although we need to see if we preserve the flags when the packet
is cloned to enter the switch pipelines.
   - Since ovn-controller generates  GARPs with broadcast eth
destination,  why can't we also match on "eth.dst ==
ff:ff:ff:ff:ff:ff:ff" ?

If we can use one or both the above matches, then probably it's better
to fix this too.  I'll submit another patch.


> Adding Ilya in CC because we had an offline discussion regarding the
> possibility of adding an OVS extension to allow OVN to match on "packet
> is GARP".  That would definitely simplify OVN's life in this case.
>
> > When sw0-port1 sends a unicast ARP request to any IP in its subnet,
> > instead of tunneling the packet to the destination chassis, the packet
> > gets flooded in the L2 domain since the sw0-port1's mac matches this
> > flow.
> >
>
> We could change the per NAT ARP flooding flows and only flood if the ARP
> target is a NAT public IP that's in the network of a connected LRP.
> That means at most one flow per NAT which might be acceptable.
>
> > Until we fix this logical flow properly, this patch is sufficient to
> > address the unicast ARP request issue.  This causes a flood of
> > ARP packets in the geneve network.
> >
> > [1] - c48ed1736a58("Do not reply on unicast arps for IPv4 targets.")
> > [2] - 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever 
> > possible.")
> >
> > Signed-off-by: Numan Siddique <num...@ovn.org>
> > ---
> >  northd/en-global-config.c |  6 ++++++
> >  northd/northd.c           | 26 +++++++++++++++++---------
> >  ovn-nb.xml                |  9 +++++++++
> >  tests/ovn-northd.at       | 19 +++++++++++++++++--
> >  4 files changed, 49 insertions(+), 11 deletions(-)
> >
> > diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> > index c103b137f6..d9b7dcb0a8 100644
> > --- a/northd/en-global-config.c
> > +++ b/northd/en-global-config.c
> > @@ -269,6 +269,12 @@ global_config_nb_global_handler(struct engine_node 
> > *node, void *data)
> >          return false;
> >      }
> >
> > +    /* Check if northd_internal_version has changed or not. */
> > +    if (config_out_of_sync(&nb->options, &config_data->nb_options,
> > +                           "unicast_arp_responder", false)) {
> > +        return false;
> > +    }
> > +
> >      if (check_nb_options_out_of_sync(nb, config_data, sampling_apps)) {
> >          config_data->tracked_data.nb_options_changed = true;
> >      }
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 880ec92ac2..78a7125ce1 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -98,6 +98,9 @@ static bool vxlan_mode;
> >
> >  static bool vxlan_ic_mode;
> >
> > +/* Respond to unicast ARP requests for known ips or not. */
> > +static bool unicast_arp_responder = false;
> > +
> >  #define MAX_OVN_TAGS 4096
> >
> >
> > @@ -9515,15 +9518,16 @@ build_lswitch_arp_nd_responder_known_ips(struct 
> > ovn_port *op,
> >          for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> >                  ds_clear(match);
> > -                /* Do not reply on unicast ARPs, forward them to the target
> > -                 * to have ability to monitor target liveness via unicast
> > -                 * ARP requests.
> > -                */
> > -                ds_put_format(match,
> > -                    "arp.tpa == %s && "
> > -                    "arp.op == 1 && "
> > -                    "eth.dst == ff:ff:ff:ff:ff:ff",
> > -                    op->lsp_addrs[i].ipv4_addrs[j].addr_s);
> > +                ds_put_format(match, "arp.tpa == %s && arp.op == 1",
> > +                              op->lsp_addrs[i].ipv4_addrs[j].addr_s);
> > +
> > +                if (!unicast_arp_responder) {
> > +                    /* Do not reply on unicast ARPs, forward them to the 
> > target
> > +                     * to have ability to monitor target liveness via 
> > unicast
> > +                     * ARP requests.
> > +                    */
> > +                   ds_put_cstr(match, " && eth.dst == ff:ff:ff:ff:ff:ff");
> > +                }
> >                  ds_clear(actions);
> >                  ds_put_format(actions,
> >                      "eth.dst = eth.src; "
> > @@ -18980,6 +18984,10 @@ ovnnb_db_run(struct northd_input *input_data,
> >      use_common_zone = smap_get_bool(input_data->nb_options, 
> > "use_common_zone",
> >                                      false);
> >
> > +    unicast_arp_responder = smap_get_bool(input_data->nb_options,
> > +                                          "unicast_arp_responder",
> > +                                          false);
> > +
> >      vxlan_mode = is_vxlan_mode(input_data->nb_options,
> >                                 input_data->sbrec_chassis_table);
> >
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index bf1f1628bd..7f8021667c 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -429,6 +429,15 @@
> >          </p>
> >        </column>
> >
> > +      <column name="options" key="unicast_arp_responder" type='{"type": 
> > "boolean"}'>
> > +        <p>
> > +          <code>ovn-northd</code> adds ARP responder flows for known IPs 
> > for
> > +          broadcast ARP request packets.  If this option is set, it will 
> > also
> > +          respond to unicast ARP requests.  By default this option is set 
> > to
> > +          <code>false</code>.
> > +        </p>
> > +      </column>
> > +
>
> I guess we also need a NEWS file entry for this new feature.

Ack.

Thanks
Numan

>
> >        <group title="Options for configuring interconnection route 
> > advertisement">
> >          <p>
> >            These options control how routes are advertised between OVN
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index e3a65e7a44..f0322a311c 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -9869,7 +9869,7 @@ AT_CLEANUP
> >  ])
> >
> >  OVN_FOR_EACH_NORTHD_NO_HV([
> > -AT_SETUP([check options:disable_arp_nd_rsp for LSP])
> > +AT_SETUP([ARP responder flows])
> >  ovn_start NORTHD_TYPE
> >  check ovn-nbctl ls-add S1
> >  check ovn-nbctl --wait=sb lsp-add S1 S1-vm1
> > @@ -9886,7 +9886,22 @@ AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | 
> > ovn_strip_lflows], [0], [dnl
> >    table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && 
> > ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { 
> > eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; 
> > nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; 
> > };)
> >  ])
> >
> > -#Set the disable_arp_nd_rsp option and verify the flow
> > +check ovn-nbctl set NB_Global . options:unicast_arp_responder=true
> > +check ovn-nbctl --wait=sb sync
> > +
> > +ovn-sbctl dump-flows S1 > S1flows
> > +AT_CAPTURE_FILE([S1flows])
> > +
> > +AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
> > +  table=??(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
> > +  table=??(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 
> > 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), action=(next;)
> > +  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns_mcast && 
> > ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), 
> > action=(next;)
> > +  table=??(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 
> > 192.168.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 
> > 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 
> > 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport = 
> > inport; flags.loopback = 1; output;)
> > +  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && 
> > ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { 
> > eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; 
> > nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; 
> > };)
> > +])
> > +
> > +
> > +# Set the disable_arp_nd_rsp option and verify the flow
> >  check ovn-nbctl --wait=sb set logical_switch_port S1-vm1 
> > options:disable_arp_nd_rsp=true
> >
> >  ovn-sbctl dump-flows S1 > S1flows
>
> Thanks,
> Dumitru
>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to