On Wed, Jun 04, 2025 at 02:49:15PM +0200, Ales Musil via dev wrote:
> On Mon, Jun 2, 2025 at 9:42 AM Xavier Simonart via dev <
> [email protected]> wrote:
> 
> > In HA, when bfd was going down then up for the higher priority gw,
> > new garp were not generated.
> >
> > Signed-off-by: Xavier Simonart <[email protected]>
> > ---
> >
> 
> Thank you Xavier,
> 
> there is one small nit below that I took care of during the merge.
> 
> 
> >  controller/pinctrl.c | 23 +++++++++--------------
> >  1 file changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> > index e10ce06cf..b5b2bb7b5 100644
> > --- a/controller/pinctrl.c
> > +++ b/controller/pinctrl.c
> > @@ -242,7 +242,6 @@ static void send_garp_rarp_prepare(
> >      const struct ovsrec_bridge *,
> >      const struct sbrec_chassis *,
> >      const struct hmap *local_datapaths,
> > -    const struct sset *active_tunnels,
> >      const struct ovsrec_open_vswitch_table *ovs_table)
> >      OVS_REQUIRES(pinctrl_mutex);
> >  static void send_garp_rarp_run(struct rconn *swconn,
> > @@ -4099,8 +4098,7 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >      send_garp_rarp_prepare(ovnsb_idl_txn, sbrec_port_binding_by_datapath,
> >                             sbrec_port_binding_by_name,
> >                             sbrec_mac_binding_by_lport_ip, ecmp_nh_table,
> > -                           br_int, chassis, local_datapaths,
> > active_tunnels,
> > -                           ovs_table);
> > +                           br_int, chassis, local_datapaths, ovs_table);
> >      prepare_ipv6_ras(local_active_ports_ras, sbrec_port_binding_by_name);
> >      prepare_ipv6_prefixd(ovnsb_idl_txn, sbrec_port_binding_by_name,
> >                           local_active_ports_ipv6_pd, chassis,
> > @@ -6598,16 +6596,18 @@ consider_nat_address(struct ovsdb_idl_index
> > *sbrec_port_binding_by_name,
> >                       const struct sbrec_port_binding *pb,
> >                       struct sset *nat_address_keys,
> >                       const struct sbrec_chassis *chassis,
> > -                     const struct sset *active_tunnels,
> >                       struct shash *nat_addresses)
> >  {
> >      struct lport_addresses *laddrs = xmalloc(sizeof *laddrs);
> >      char *lport = NULL;
> > -    if (!extract_addresses_with_port(nat_address, laddrs, &lport)
> > +    const struct sbrec_port_binding *cr_pb = NULL;
> > +    int rc = extract_addresses_with_port(nat_address, laddrs, &lport);
> >
> 
> nit: bool rc
> 
> 
> > +    if (lport) {
> > +        cr_pb = lport_lookup_by_name(sbrec_port_binding_by_name, lport);
> > +    }
> > +    if (!rc
> >          || (!lport && !strcmp(pb->type, "patch"))
> > -        || (lport && !lport_is_chassis_resident(
> > -                sbrec_port_binding_by_name, chassis,
> > -                active_tunnels, lport))) {
> > +        || (lport && (!cr_pb || (cr_pb->chassis != chassis)))) {

Hi Xavier, Hi Ales,

i have concerns over this change.

Let me first try to summarize my understanding of the problem this is
trying to solve, because i am not fully sure about that yet.

I assume the scenario is:
* You have 2 gateway chassis connected to the same localnet
* There is an LRP with a HA_Chassis_Group that utilizes these two
  gateways
* When a gateway claims the LRP it sends out garp's towards to localnet
  port, to let the fabric know where to send traffic to

The issue this patch should address now (if i got it right):
* Let us assume gw1 has a higher priority than gw2, so gw1 sent out garps in 
the past
* Now there is a network partition between gw1 and gw2
* Since the bfd session between the gateways will die, both gw1 and gw2
  will assume they should be active
* Since gw2 was previously not active it sends garps, at least parts of
  the fabric now forward traffic to gw2 instead of gw1
* The network partition heals and the bfd sessions come back up
* gw2 will stop handling the LRP
* gw1 will NOT send garps, since it never "lost" the LRP from its
  perspective
* the network fabric will partially direct packets to the wrong chassis

My understanding of the patch is now that instead of using the bfd
signaling for garps, we just use it to claim ports in the southbound. We
then generate garps based on these southbound ports. That ensures that
either gw2 can not send garps because it does not reach southbound. Or
if it claims the LRP gw1 will notice, claim it back and this will
trigger sending the garps again.

Now my point of concern is that this means failover of LRPs is bound to
the southbound being up. If southbound is not up and gw1 would die, gw2
can not claim the port and will therefor not send garps, basically
causing a traffic interruption until southbound is restored.

This seems to be not the intended behaviour however, otherwise the
compute chassis would not need to have bfd sessions to the gateway
chassis and could just rely on the southbound port_binding entry.

If all of the above is not some general misunderstanding and this
problem realy exists then i would have the following idea for an
solution:
* Revert to the previous behaviour using lport_is_chassis_resident
* Reset the garp timers if any bfd session that is relevant for the LRP
  changed the status from down->up

Noticing when bfd changed from down->up would be by the following
condition for all interfaces that point to chassis that are part of the
HA_Chassis_Group:
`bfd_status:forwarding==true && bfd_status:flap_count != previous_flap_count`

With this we have a southbound free way of signaling when another
chassis might have tried to claim the LRP from its local perspective.

While i have a new version for
https://patchwork.ozlabs.org/project/ovn/list/?series=459489 ready i
would not post it until this is clarified.

Thanks a lot,
Felix


> >          destroy_lport_addresses(laddrs);
> >          free(laddrs);
> >          free(lport);
> > @@ -6635,7 +6635,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index
> > *sbrec_port_binding_by_name,
> >                             struct sset *nat_address_keys,
> >                             struct sset *local_l3gw_ports,
> >                             const struct sbrec_chassis *chassis,
> > -                           const struct sset *active_tunnels,
> >                             struct shash *nat_addresses)
> >  {
> >      const char *gw_port;
> > @@ -6652,7 +6651,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index
> > *sbrec_port_binding_by_name,
> >                  consider_nat_address(sbrec_port_binding_by_name,
> >                                       pb->nat_addresses[i], pb,
> >                                       nat_address_keys, chassis,
> > -                                     active_tunnels,
> >                                       nat_addresses);
> >              }
> >          } else {
> > @@ -6664,7 +6662,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index
> > *sbrec_port_binding_by_name,
> >                  consider_nat_address(sbrec_port_binding_by_name,
> >                                       nat_addresses_options, pb,
> >                                       nat_address_keys, chassis,
> > -                                     active_tunnels,
> >                                       nat_addresses);
> >              }
> >          }
> > @@ -6773,7 +6770,6 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> > *ovnsb_idl_txn,
> >                         const struct ovsrec_bridge *br_int,
> >                         const struct sbrec_chassis *chassis,
> >                         const struct hmap *local_datapaths,
> > -                       const struct sset *active_tunnels,
> >                         const struct ovsrec_open_vswitch_table *ovs_table)
> >      OVS_REQUIRES(pinctrl_mutex)
> >  {
> > @@ -6809,8 +6805,7 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn
> > *ovnsb_idl_txn,
> >
> >      get_nat_addresses_and_keys(sbrec_port_binding_by_name,
> >                                 &nat_ip_keys, &local_l3gw_ports,
> > -                               chassis, active_tunnels,
> > -                               &nat_addresses);
> > +                               chassis, &nat_addresses);
> >
> >      /* For deleted ports and deleted nat ips, remove from
> >       * send_garp_rarp_data. */
> > --
> > 2.47.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> Regards,
> Ales
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to