On Fri, Feb 6, 2026 at 3:31 PM Dumitru Ceara <[email protected]> wrote:

> On 2/6/26 3:20 PM, Erlon Rodrigues Cruz wrote:
> > Hi all,
> >
>
> Hi Erlon,



> > I want to make sure I’m correctly understanding the different points
> > raised so far, because I think some of the discussion is touching the
> > same issue from slightly different angles, and I may just be missing
> > an implicit context switch.
> >
> > Brian’s original question about whether we should bypass conntrack for
> > DHCPv4 (67/68) made me look again at the existing flows. As far as I can
> > tell, today DHCPv4 already bypasses conntrack for the multicast/broadcast
> > part of the exchange via the generic eth.mcast -> next rule, so at least
> > DHCPDISCOVER/DHCPOFFER never go through CT in the current pipeline.
> >
> > When Ales mentioned “allowing DHCP traffic to pass by default”, I
> > initially read that as baseline behavior, which confused me given the
> > multicast bypass above. On a second read, I think what was really
> > meant there is *allowing DHCP to pass through CT when ACL CT
> > translation is enabled*, which makes sense in that context. The later
> > comment about CT for DHCP traffic in general seems to be in the same
> > direction, but it still runs into the fact that multicast traffic is
> > currently excluded from CT altogether.
> >
> > That’s why my initial thought was to special-case DHCP when
> > acl_ct_translation is enabled. However, I agree with Dumitru’s point
> > that this doesn’t really address the broader issue: once translation
> > requires +ct.trk, *any* multicast UDP traffic will be incompatible
> > with the existing skip-CT multicast flows, not just DHCP.
> >
> > So my current understanding is:
> >
> >    - With translation enabled, multicast-bypass and CT-based
> >      translation are fundamentally at odds.
> >
> >    - From a correctness standpoint, the only consistent solution is
> >      to disable (or override) the multicast skip-conntrack flows when
> >      translation is configured, even if that has a multicast
> >      performance cost that needs to be documented.
>
> Correct, this is the only way forward for now.
>
> >
> > Before I go further in that direction, I’d appreciate confirmation that
> > this matches everyone else’s understanding, or if I’m still missing an
> > important piece of the puzzle.
> >
>
> Your understanding matches mine.  Looking forward to the next revision.


> Regards,
> Dumitru
>

Hi Erlon and Dumitru,
Sorry, I missed the email earlier.

But I also agree with both points and think this is the way to go.
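
For reference, a rough sketch of what I think the agreed direction
looks like in northd (untested fragment; "acl_ct_translation" is just a
placeholder for whatever the final knob ends up being named):

    /* Do not send multicast packets to conntrack -- unless CT-based
     * translation is enabled, in which case multicast must go through
     * CT so that the +ct.trk prerequisite added by the translation can
     * ever match. */
    if (!acl_ct_translation) {
        ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, "eth.mcast",
                      "next;", lflow_ref);
        ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, "eth.mcast",
                      "next;", lflow_ref);
    }

The multicast performance cost of losing the skip would then need to be
called out in the documentation, as discussed above.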

Regards,
Ales


> > Thanks,
> > Erlon
> >
> > On Wed, Jan 28, 2026 at 7:11 AM Dumitru Ceara <[email protected]> wrote:
> >
> >> On 1/27/26 11:13 AM, Dumitru Ceara wrote:
> >>> On 1/27/26 7:38 AM, Ales Musil via dev wrote:
> >>>> On Tue, Jan 27, 2026 at 12:37 AM Brian Haley <[email protected]>
> >> wrote:
> >>>>
> >>>>> HI Ales,
> >>>>>
> >>>
> >>> Hi Erlon, Brian, Ales,
> >>>
> >>>>> Thanks for finding the issue, just one comment for you and Erlon.
> >>>>>
> >>>>> On 1/26/26 3:24 PM, Erlon Rodrigues Cruz via dev wrote:
> >>>>>> Hey Ales,
> >>>>>
> >>>>> <snip>
> >>>>>
> >>>>>>> I was able to figure out why it's happening. I don't know if
> >>>>>>> we should keep the test around, but it can be reproduced even
> >>>>>>> with multicast UDP traffic, so it might be enough to adjust the
> >>>>>>> UDP test to also send multicast.  The following two logical
> >>>>>>> flows are to "blame":
> >>>>>>>
> >>>>>>>
> >>>>>> Well, I think the broader the coverage, the better. The
> >>>>>> advantage of having a DHCP client/server is that the traffic
> >>>>>> will contain all the nuances of real traffic, and we don't have
> >>>>>> to encode all protocol data. For example, the return traffic
> >>>>>> would be difficult to generate since it will not be multicast. I
> >>>>>> tried to reuse NETNS_START_DHCPD, but I encountered various
> >>>>>> permission errors and couldn't get it to work. Dnsmasq seems
> >>>>>> easier to use and doesn't rely on an external config file.
> >>>>>>
> >>>>>>
> >>>>>>>          /* Do not send multicast packets to conntrack. */
> >>>>>>>          ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
> >>>>> "eth.mcast",
> >>>>>>>                        "next;", lflow_ref);
> >>>>>>>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110,
> >>>>> "eth.mcast",
> >>>>>>>                        "next;", lflow_ref);
> >>>>>>>
> >>>>>>> We basically skip CT in the case of multicast, and because the
> >>>>>>> translation adds a new prerequisite of +ct.trk, we will never
> >>>>>>> match that flow. I'm afraid there isn't any easy/good solution.
> >>>>>>> If we disable those flows, we impose a performance penalty on
> >>>>>>> multicast traffic. If we say that's acceptable, we need to
> >>>>>>> state that clearly in the documentation too; it's in the same
> >>>>>>> ballpark as "broken" HW offload.
> >>>>>>>
> >>>>>>
> >>>>>> What if we discard it only for DHCP traffic? I got it passing with
> >>>>>> something like this:
> >>>>>>
> >>>>>>          if (acl_udp_ct_translation) {
> >>>>>>              ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 120,
> >>>>>>                            "eth.mcast && udp && "
> >>>>>>                            "(udp.src == 67 || udp.src == 68 || "
> >>>>>>                            "udp.dst == 67 || udp.dst == 68)",
> >>>>>>                            REGBIT_CONNTRACK_DEFRAG" = 1; next;",
> >>>>>>                            lflow_ref);
> >>>>>>          }
> >>>>>>
> >>>>>> This takes higher priority than the more general multicast rule, so
> >>>>>> performance should not be a problem if we apply it for DHCP.
> >>>>>
> >>>
> >>> Yeah, well, that is not good enough IMO: what if we are forwarding
> >>> other types of IP multicast (or broadcast) traffic, on other UDP
> >>> ports?
> >>>
> >>>>> The other thing I noticed when looking at the code you
> >>>>> highlighted was this section right above it:
> >>>>>
> >>>>>           /* Ingress and Egress Pre-ACL Table (Priority 110).
> >>>>>            *
> >>>>>            * Not to do conntrack on ND and ICMP destination
> >>>>>            * unreachable packets. */
> >>>>>           ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
> >>>>>                         "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
> >>>>>                         "(udp && udp.src == 546 && udp.dst == 547)",
> >>>>> "next;",
> >>>>>                         lflow_ref);
> >>>>>           ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110,
> >>>>>                         "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
> >>>>>                         "(udp && udp.src == 546 && udp.dst == 547)",
> >>>>> "next;",
> >>>>>                         lflow_ref);
> >>>>>
> >>>>> That seems to always allow DHCPv6 traffic with ports 546/547. I
> >>>>> realize Erlon's proposal would only apply if the new flag was
> >>>>> set, but is there a reason we shouldn't just bypass conntrack for
> >>>>> IPv4 with ports 67/68 similarly?
> >>>>>
> >>>>
> >>>> That's a good point; I think we can allow DHCP traffic to pass by
> >>>> default. But we still have a problem with multicast traffic in
> >>>> general, which might be broken by this change, so maybe we need to
> >>>> actually do both: allow DHCP traffic by default and disable that
> >>>> rule when the translation is configured.
> >>>>
> >>>
> >>> I would change DHCPv4 skip-conntrack semantics as a separate patch
> >>> though.  It doesn't have much to do with what this patch is fixing.
> >>>
> >>> For now it seems that the only way out is to skip these flows that
> >>> bypass conntrack for multicast destinations when the translation is
> >>> configured.
> >>>
> >>
> >> Given that Ales had comments on this version and that we now know
> >> we need to update the logical flows to not bypass conntrack when
> >> translation is enabled, I'm moving this patch to "Changes Requested"
> >> in patchwork.
> >>
> >> Looking forward to v3.
> >>
> >> Regards,
> >> Dumitru
> >>
> >>
> >
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev