On Wed, Jun 04, 2025 at 02:49:15PM +0200, Ales Musil via dev wrote:
> On Mon, Jun 2, 2025 at 9:42 AM Xavier Simonart via dev <
> [email protected]> wrote:
>
> > In HA, when bfd was going down then up for the higher priority gw,
> > new garp were not generated.
> >
> > Signed-off-by: Xavier Simonart <[email protected]>
> > ---
>
> Thank you Xavier,
> there is one small nit below that I took care of during the merge.
>
> >  controller/pinctrl.c | 23 +++++++++--------------
> >  1 file changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> > index e10ce06cf..b5b2bb7b5 100644
> > --- a/controller/pinctrl.c
> > +++ b/controller/pinctrl.c
> > @@ -242,7 +242,6 @@ static void send_garp_rarp_prepare(
> >                          const struct ovsrec_bridge *,
> >                          const struct sbrec_chassis *,
> >                          const struct hmap *local_datapaths,
> > -                        const struct sset *active_tunnels,
> >                          const struct ovsrec_open_vswitch_table *ovs_table)
> >      OVS_REQUIRES(pinctrl_mutex);
> >  static void send_garp_rarp_run(struct rconn *swconn,
> > @@ -4099,8 +4098,7 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >      send_garp_rarp_prepare(ovnsb_idl_txn, sbrec_port_binding_by_datapath,
> >                             sbrec_port_binding_by_name,
> >                             sbrec_mac_binding_by_lport_ip, ecmp_nh_table,
> > -                           br_int, chassis, local_datapaths, active_tunnels,
> > -                           ovs_table);
> > +                           br_int, chassis, local_datapaths, ovs_table);
> >      prepare_ipv6_ras(local_active_ports_ras, sbrec_port_binding_by_name);
> >      prepare_ipv6_prefixd(ovnsb_idl_txn, sbrec_port_binding_by_name,
> >                           local_active_ports_ipv6_pd, chassis,
> > @@ -6598,16 +6596,18 @@ consider_nat_address(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >                       const struct sbrec_port_binding *pb,
> >                       struct sset *nat_address_keys,
> >                       const struct sbrec_chassis *chassis,
> > -                     const struct sset *active_tunnels,
> >                       struct shash *nat_addresses)
> >  {
> >      struct lport_addresses *laddrs = xmalloc(sizeof *laddrs);
> >      char *lport = NULL;
> > -    if (!extract_addresses_with_port(nat_address, laddrs, &lport)
> > +    const struct sbrec_port_binding *cr_pb = NULL;
> > +    int rc = extract_addresses_with_port(nat_address, laddrs, &lport);
>
> nit: bool rc
>
> > +    if (lport) {
> > +        cr_pb = lport_lookup_by_name(sbrec_port_binding_by_name, lport);
> > +    }
> > +    if (!rc
> >          || (!lport && !strcmp(pb->type, "patch"))
> > -        || (lport && !lport_is_chassis_resident(
> > -                sbrec_port_binding_by_name, chassis,
> > -                active_tunnels, lport))) {
> > +        || (lport && (!cr_pb || (cr_pb->chassis != chassis)))) {
Hi Xavier, Hi Ales,

I have concerns about this change. Let me first try to summarize my understanding of the problem it is trying to solve, because I am not fully sure about it yet. I assume the scenario is:

* You have two gateway chassis connected to the same localnet.
* There is an LRP with an HA_Chassis_Group that uses these two gateways.
* When a gateway claims the LRP it sends out GARPs towards the localnet port, to let the fabric know where to send traffic.

The issue this patch should address (if I got it right):

* Let us assume gw1 has a higher priority than gw2, so gw1 sent out GARPs in the past.
* Now there is a network partition between gw1 and gw2.
* Since the BFD session between the gateways will die, both gw1 and gw2 will assume they should be active.
* Since gw2 was previously not active, it sends GARPs, and at least parts of the fabric now forward traffic to gw2 instead of gw1.
* The network partition heals and the BFD sessions come back up.
* gw2 will stop handling the LRP.
* gw1 will NOT send GARPs, since it never "lost" the LRP from its perspective.
* The network fabric will partially direct packets to the wrong chassis.

My understanding of the patch is that, instead of using the BFD signaling for GARPs, we only use it to claim ports in the southbound and then generate GARPs based on these southbound port bindings. That ensures that either gw2 cannot send GARPs because it does not reach the southbound, or, if it does claim the LRP, gw1 will notice, claim it back, and that will trigger sending the GARPs again.

Now my point of concern is that this ties failover of LRPs to the southbound being up. If the southbound is not up and gw1 dies, gw2 cannot claim the port and will therefore not send GARPs, basically causing a traffic interruption until the southbound is restored.
This does not seem to be the intended behaviour, however; otherwise the compute chassis would not need BFD sessions to the gateway chassis and could just rely on the southbound port_binding entry.

If all of the above is not some general misunderstanding and this problem really exists, then I would have the following idea for a solution:

* Revert to the previous behaviour using lport_is_chassis_resident.
* Reset the GARP timers if any BFD session relevant for the LRP changed its status from down to up.

Noticing that BFD changed from down to up would use the following condition, checked for all interfaces that point to chassis that are part of the HA_Chassis_Group: `bfd_status:forwarding==true && bfd_status:flap_count != previous_flap_count`

With this we have a southbound-free way of signaling that another chassis might have tried to claim the LRP from its local perspective.

While I have a new version of https://patchwork.ozlabs.org/project/ovn/list/?series=459489 ready, I will not post it until this is clarified.
Thanks a lot,
Felix

> >          destroy_lport_addresses(laddrs);
> >          free(laddrs);
> >          free(lport);
> > @@ -6635,7 +6635,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >                             struct sset *nat_address_keys,
> >                             struct sset *local_l3gw_ports,
> >                             const struct sbrec_chassis *chassis,
> > -                           const struct sset *active_tunnels,
> >                             struct shash *nat_addresses)
> >  {
> >      const char *gw_port;
> > @@ -6652,7 +6651,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >              consider_nat_address(sbrec_port_binding_by_name,
> >                                   pb->nat_addresses[i], pb,
> >                                   nat_address_keys, chassis,
> > -                                 active_tunnels,
> >                                   nat_addresses);
> >          }
> >      } else {
> > @@ -6664,7 +6662,6 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >              consider_nat_address(sbrec_port_binding_by_name,
> >                                   nat_addresses_options, pb,
> >                                   nat_address_keys, chassis,
> > -                                 active_tunnels,
> >                                   nat_addresses);
> >          }
> >      }
> > @@ -6773,7 +6770,6 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >                         const struct ovsrec_bridge *br_int,
> >                         const struct sbrec_chassis *chassis,
> >                         const struct hmap *local_datapaths,
> > -                       const struct sset *active_tunnels,
> >                         const struct ovsrec_open_vswitch_table *ovs_table)
> >      OVS_REQUIRES(pinctrl_mutex)
> >  {
> > @@ -6809,8 +6805,7 @@ send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >
> >      get_nat_addresses_and_keys(sbrec_port_binding_by_name,
> >                                 &nat_ip_keys, &local_l3gw_ports,
> > -                               chassis, active_tunnels,
> > -                               &nat_addresses);
> > +                               chassis, &nat_addresses);
> >
> >      /* For deleted ports and deleted nat ips, remove from
> >       * send_garp_rarp_data.
> >       */
> > --
> > 2.47.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> Regards,
> Ales
