On Thu, Jun 19, 2025 at 2:11 PM Dumitru Ceara <dce...@redhat.com> wrote:
> Hi Ales, > Thank you Dumitru, > > On 6/18/25 7:33 AM, Ales Musil wrote: > > The function behavior would be unexpected in cases when there is > > BFD flap between two gateway chassis. This would result in two > > chassis claiming that the port binding should be claimed by them. > > Nit: "claiming that .. should be claimed by them" sounds a bit weird to > me. Maybe "two chassis racing to claim the same port binding"? > that sounds better, thanks. > > > This leads to multiple problems like unsent gARPs. The gARP problem > > on its own was solved [0], but unfortunately reintroduced [1] in > > slightly different form. > > > > The conclusion from the investigation is that the function in > > the current form is easy to misuse. To prevent that use > > only SB state to determine which chassis has actually claimed > > the port binding. This is the desired behavior in cases where > > this function was used previously. > > > > The code that determines which chassis should claim port binding > > based on BFD status remains unchanged, thus the behavior during > > failover should be the same with minimizing the impact for BFD > > flapping cases. > > > > [0] 289ec19b01ad ("pinctrl: Fix missing garp.") > > [1] 05527bd6ccdb ("controller: Extract garp_rarp to engine node."). > > > > I was initially thinking of asking for a test but (as you pointed out > offline) we already have one, in the multinode test suite. It might be > worth pointing that out in the commit message too. > > > Signed-off-by: Ales Musil <amu...@redhat.com> > > --- > > controller/garp_rarp.c | 8 +++----- > > controller/lport.c | 19 +++++++------------ > > controller/lport.h | 3 --- > > controller/ovn-controller.c | 11 ++--------- > > controller/physical.c | 3 +-- > > controller/pinctrl.c | 22 ++++++++-------------- > > controller/pinctrl.h | 1 - > > controller/route.c | 5 +---- > > controller/route.h | 2 -- > > 9 files changed, 22 insertions(+), 52 deletions(-) > > > > diff --git a/controller/garp_rarp.c b/controller/garp_rarp.c > > index ef377e26b..551d8303f 100644 > > --- a/controller/garp_rarp.c > > +++ b/controller/garp_rarp.c > > @@ -17,6 +17,7 @@ > > #include <config.h> > > > > #include "controller/local_data.h" > > +#include "lport.h" > > #include "mac-binding-index.h" > > #include "openvswitch/hmap.h" > > #include "openvswitch/vlog.h" > > @@ -147,11 +148,7 @@ consider_nat_address(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > > { > > struct lport_addresses *laddrs = xmalloc(sizeof *laddrs); > > char *lport = NULL; > > - const struct sbrec_port_binding *cr_pb = NULL; > > bool rc = extract_addresses_with_port(nat_address, laddrs, &lport); > > - if (lport) { > > - cr_pb = lport_lookup_by_name(sbrec_port_binding_by_name, lport); > > - } > > if (!rc > > || (!lport && !strcmp(pb->type, "patch"))) { > > destroy_lport_addresses(laddrs); > > @@ -160,7 +157,8 @@ consider_nat_address(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > > return; > > } > > if (lport) { > > - if (!cr_pb || (cr_pb->chassis != chassis)) { > > + if (!lport_is_chassis_resident(sbrec_port_binding_by_name, > > + chassis, lport)) { > > sset_add(non_local_lports, lport); > > destroy_lport_addresses(laddrs); > > free(laddrs); > > diff --git a/controller/lport.c b/controller/lport.c > > index 92de375b5..1ea57a5ca 100644 > > --- a/controller/lport.c > > +++ b/controller/lport.c > > @@ -65,35 +65,30 @@ lport_lookup_by_key(struct ovsdb_idl_index > *sbrec_datapath_binding_by_key, > > > > bool > > lport_pb_is_chassis_resident(const struct sbrec_chassis *chassis, > > - const struct sset *active_tunnels, > > const struct sbrec_port_binding *pb) > > { > > if (!pb || !pb->chassis) { > > return false; > > } > > - if (strcmp(pb->type, "chassisredirect")) { > > - return pb->chassis == chassis; > > - } else { > > - return ha_chassis_group_is_active(pb->ha_chassis_group, > > - active_tunnels, chassis); > > - } > > + > > + /* Note we relay on SB to provide the information, this is needed > in case > > + * of flapping BFD when it's not detected by one side. */ > > Typo: s/relay/rely/ > > Nit: the comment makes sense when seen in the diff, together with the > removed lines, but in the resulting code it might make readers wonder > what it means. Maybe we should change it a bit, e.g.: > > Note: we rely on SB to provide information about who owns the port > binding. In particular, for chassisredirect ports, this avoids issues > when the underlying BFD state changes are only detected by some of the > chassis in the associated HA_Chassis_Group. > I have adjusted the comment. > > > + return pb->chassis == chassis; > > } > > > > The rest of the patch looks good to me. So, with the comment and commit > message fixed up: > > Acked-by: Dumitru Ceara <dce...@redhat.com> > > Regards, > Dumitru > > > I have addressed both comments, went ahead and merged this into main. Regards, Ales _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev