Hi all, I want to make sure I’m correctly understanding the different points raised so far, because I think some of the discussion is touching the same issue from slightly different angles, and I may just be missing an implicit context switch.
Brian’s original question about whether we should bypass conntrack for DHCPv4 (67/68) made me look again at the existing flows. As far as I can tell, today DHCPv4 already bypasses conntrack for the multicast/broadcast part of the exchange via the generic eth.mcast -> next rule, so at least DHCPDISCOVER/DHCPOFFER never go through CT in the current pipeline. When Ales mentioned “allowing DHCP traffic to pass by default”, I initially read that as baseline behavior, which confused me given the multicast bypass above. On a second read, I think what was really meant there is *allowing DHCP to pass through CT when ACL CT translation is enabled*, which makes sense in that context. The later comment about CT for DHCP traffic in general seems to be in the same direction, but it still runs into the fact that multicast traffic is currently excluded from CT altogether. That’s why my initial thought was to special-case DHCP when acl_ct_translation is enabled. However, I agree with Dumitru’s point that this doesn’t really address the broader issue: once translation requires +ct.trk, *any* multicast UDP traffic will be incompatible with the existing skip-CT multicast flows, not just DHCP. So my current understanding is: - With translation enabled, multicast-bypass and CT-based translation are fundamentally at odds. - From a correctness standpoint, the only consistent solution is to disable (or override) the multicast skip-conntrack flows when translation is configured, even if that has a multicast performance cost that needs to be documented. Before I go further in that direction, I’d appreciate confirmation that this matches everyone else’s understanding, or if I’m still missing an important piece of the puzzle. Thanks, Erlon On Wed, Jan 28, 2026 at 7:11 AM Dumitru Ceara <[email protected]> wrote: > On 1/27/26 11:13 AM, Dumitru Ceara wrote: > > On 1/27/26 7:38 AM, Ales Musil via dev wrote: > >> On Tue, Jan 27, 2026 at 12:37 AM Brian Haley <[email protected]> > wrote: > >> > >>> HI Ales, > >>> > > > > Hi Erlon, Brian, Ales, > > > >>> Thanks for finding the issue, just one comment for you and Erlon. > >>> > >>> On 1/26/26 3:24 PM, Erlon Rodrigues Cruz via dev wrote: > >>>> Hey Ales, > >>> > >>> <snip> > >>> > >>>>> I was able to figure out why is it happening, dunno if we should keep > >>> the > >>>>> test around > >>>>> but it can be reproduced even with multicast UDP traffic so it might > be > >>>>> enough to adjust > >>>>> the UDP test to also send multicast. The following two logical flows > >>> are > >>>>> to "blame": > >>>>> > >>>>> > >>>> Well, I think the broader the coverage, the better. The advantage of > >>> having > >>>> a DHCP client/server is that the traffic will contain all the nuances > of > >>>> real traffic, and we don't have to encode all protocol data. For > example, > >>>> the return traffic should be difficult to generate since it will not > be > >>>> multicast. I tried to reuse NETNS_START_DHCPD, but I encountered > various > >>>> permission errors and couldn't get it to work. Dnsmasq seems easier to > >>> use > >>>> and doesn't rely on an external config file. > >>>> > >>>> > >>>>> /* Do not send multicast packets to conntrack. */ > >>>>> ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > >>> "eth.mcast", > >>>>> "next;", lflow_ref); > >>>>> ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > >>> "eth.mcast", > >>>>> "next;", lflow_ref); > >>>>> > >>>>> We basically skip CT in case of multicast, and because the > translation > >>>>> adds new > >>>>> prereq of +ct.trk we will never match that flow. I'm afraid that > there > >>>>> isn't any > >>>>> easy/good solution. If we disable those flows we are imposing > >>> performance > >>>>> penalty on multicast traffic. If we say that's acceptable we need to > >>>>> clearly > >>>>> state that too in the documentation it's in the same ballpark with > >>> "broken" > >>>>> HW offload. > >>>>> > >>>> > >>>> What if we discard it only for DHCP traffic? I got it passing with > >>>> something like this: > >>>> > >>>> if (acl_udp_ct_translation) { > >>>> ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 120, > >>>> "eth.mcast && udp && " > >>>> "(udp.src == 67 || udp.src == 68 || " > >>>> "udp.dst == 67 || udp.dst == 68)", > >>>> REGBIT_CONNTRACK_DEFRAG" = 1; next;", > >>>> lflow_ref); > >>>> } > >>>> > >>>> This takes higher priority than the more general multicast rule, so > >>>> performance should not be a problem if we apply it for DHCP. > >>> > > > > Yeah, well, that is not good enough IMO, what if we are forwarding other > > types of IP multicast (or broadcast) traffic, for other UDP ports? > > > >>> The other thing I noticed when looking at the code you highlighted was > >>> this section right above it: > >>> > >>> /* Ingress and Egress Pre-ACL Table (Priority 110). > >>> * > >>> * Not to do conntrack on ND and ICMP destination > >>> * unreachable packets. */ > >>> ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > >>> "nd || nd_rs || nd_ra || mldv1 || mldv2 || " > >>> "(udp && udp.src == 546 && udp.dst == 547)", > >>> "next;", > >>> lflow_ref); > >>> ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > >>> "nd || nd_rs || nd_ra || mldv1 || mldv2 || " > >>> "(udp && udp.src == 546 && udp.dst == 547)", > >>> "next;", > >>> lflow_ref); > >>> > >>> That seems to always allow DHCPv6 traffic with ports 546/547. I realize > >>> Erlon's proposal would only apply if the new flag was set, but is there > >>> a reason we shouldn't just bypass conntrack for IPv4 with ports 67/68 > >>> similarly? > >>> > >> > >> That's a good point, I think we can allow DHCP traffic to pass by > default. > >> But we still have a problem with multicast traffic in general. Which > might > >> be broken by this change, so maybe we need to actually do both. Allow > >> DHCP traffic by default and disable that rule when the translation is > >> configured. > >> > > > > I would change DHCPv4 skip-conntrack semantics as a separate patch > > though. It doesn't have much to do with what this patch is fixing. > > > > For now it seems that the only way out is to skip these flows that > > bypass conntrack for multicast destinations when the translation is > > configured. > > > > Given that Ales had comments on this version and that we know now that > we need update the logical flows to not bypass conntrack in case > translation is enabled, I'm moving this patch to "Changes Requested" in > patchwork. > > Looking forward to v3. > > Regards, > Dumitru > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
