On Fri, Jun 12, 2026 at 8:28 AM Dumitru Ceara <[email protected]> wrote:

> On 6/12/26 12:15 PM, Matteo Perin wrote:
> > On Fri, 12 Jun 2026 at 10:07, Dumitru Ceara <[email protected]> wrote:
> >
> >> I have a few questions, not necessarily on the implementation but more
> >> on the support choice we're making here.
> >>
> >>> On Wed, Jun 10, 2026 at 9:40 AM Matteo Perin via dev
> >>> <[email protected]> wrote:
> >>>>
> >>>> By default OVN "fails open" for load balancer health checks: a backend
> >>>> whose service monitor status is not yet known (empty) is treated as
> >>>> available, so traffic is forwarded to it while the first health check
> >>>> probes are still pending. This avoids disrupting traffic when health
> >>>> checks are first enabled, but it also means traffic can be sent to
> >>
> >> I really wonder if this is a valid use case we should care about.  If
> >> I'm reading this correctly you're saying that the CMS would configure a
> >> load balancer (no health check config), then some traffic passes through
> >> the LB, then way later the CMS adds the health check config to the LB,
> >> right?
> >>
> >> I agree, traffic would get disrupted if we make "fail closed" the
> default.
> >>
> >> But OTOH, is this really something that will happen in practice?  The
> >> CMS either configures LBs without health check or configures them with
> >> health check from the beginning.
> >
> >
> > Hi Dumitru,
> >
>
> Hi Matteo,
>
> > I agree, I was not trying to make a point for the current default
> behavior
> > with
> > the CMS example, quite the opposite, actually.
> >
> > To give you the full picture, this feature request is coming from our
> > internal
> > LXD team.
> > In LXD, OVN load balancers are already supported but they are about to
> add
> > a LB pools concept. You can, of course, configure health checks on those
> > pools.
> > When adding instances (VMs and/or containers) to a pool, LXD configures
> the
> > OVN LB accordingly.
> > However, to stress this point, a user may want to add an instance to a
> pool
> > which does not yet have any workload running inside on the port the LB is
> > targeting and we do not want to forward traffic to these (yet).
> >
> > We only care about the "fail closed" scenario to avoid useless/unwanted
> > traffic.
> >
>
> Thanks for the additional context!
>
> >>> backends that have never been successfully probed during the initial
> >>>> monitoring grace period.
> >>>>
> >>
> >> To be honest, I consider this to be more of a bug in the existing health
> >> check implementation.  I can't really imagine a good reason for blindly
> >> sending traffic towards a backend when HC is enabled but we couldn't
> >> probe the status of the backend yet.
> >>
> >>>> Add a new "forward_unknown" key to the options column of the
> >>>> Load_Balancer_Health_Check table. When set to "false", a backend with
> >>>> an unknown (empty) service monitor status is treated as unavailable
> and
> >>>> excluded from the load balancer backends until its first successful
> >>>> probe. The default ("true") preserves the existing fail-open behavior,
> >>>> so this change is fully backward compatible and opt-in.
> >>>>
> >>>> No schema change is required as this reuses the existing options map.
> >>>>
> >>>> Signed-off-by: Matteo Perin <[email protected]>
> >>>> ---
> >>>> This patch is a proposal for a simple addition to the LB health check
> >>>> configuration.
> >>>>
> >>
> >> So, I guess, my question is: should we consider _not_ adding yet another
> >> knob and just fixing the behavior for all cases?
> >
> >
> > I considered this myself but did not go through with changing the default
> > because
> > I have not taken part in the discussion about the original implementation
> > (even
> > though I looked for references about this design choice, without finding
> > any) and
> > I did not want to preclude any (I agree, arguably not something really
> seen
> > in
> > practise) "on the fly" enablement of LB health checks in HA environments
> > where
> > you may not want to disrupt the traffic even in the initial discovery
> > period.
> >
> > That said, after reading your points, I would be in favor of making the
> > proposed
> > behavior the default, given the peculiarity of the above case.
> >
> > If you confirm that this is acceptable and the community opinion is
> > positive on
> > this, I will rework the patch to make the fail-closed behavior the
> default
> > one and
> > drop the added configuration option.
>
> I'm adding Numan to the discussion as he's the original author of the
> Health Check feature:
>
> https://github.com/ovn-org/ovn/commit/ba0d6ea


I'm sorry.  I'm not able to recall why I did it this way.
But I totally agree to NOT add another option and I agree with Dumitru.
I have not looked into this patch but I think it makes sense
for ovn-northd to set the state of the service monitor to offline when it
creates
the row in SB DB.

Numan



>
> >
> > Best Regards,
> > Matteo
> >
>
> Regards,
> Dumitru
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to