On 2/6/26 3:20 PM, Erlon Rodrigues Cruz wrote:
> Hi all,
> 

Hi Erlon,

> I want to make sure I’m correctly understanding the different points raised
> so far, because I think some of the discussion is touching the same issue
> from slightly different angles, and I may just be missing an implicit
> context switch.
> 
> Brian’s original question about whether we should bypass conntrack for
> DHCPv4 (67/68) made me look again at the existing flows. As far as I can
> tell, today DHCPv4 already bypasses conntrack for the multicast/broadcast
> part of the exchange via the generic eth.mcast -> next rule, so at least
> DHCPDISCOVER/DHCPOFFER never go through CT in the current pipeline.
> 
> When Ales mentioned “allowing DHCP traffic to pass by default”, I initially
> read that as baseline behavior, which confused me given the multicast
> bypass above. On a second read, I think what was really meant there is
> *allowing
> DHCP to pass through CT when ACL CT translation is enabled*, which makes
> sense in that context. The later comment about CT for DHCP traffic in
> general seems to be in the same direction, but it still runs into the fact
> that multicast traffic is currently excluded from CT altogether.
> 
> That’s why my initial thought was to special-case DHCP when
> acl_ct_translation is enabled. However, I agree with Dumitru’s point that
> this doesn’t really address the broader issue: once translation requires
> +ct.trk, *any* multicast UDP traffic will be incompatible with the existing
> skip-CT multicast flows, not just DHCP.
> 
> So my current understanding is:
> 
>    -
> 
>    With translation enabled, multicast-bypass and CT-based translation are
>    fundamentally at odds.
>    -
> 
>    From a correctness standpoint, the only consistent solution is to
>    disable (or override) the multicast skip-conntrack flows when translation
>    is configured, even if that has a multicast performance cost that needs to
>    be documented.

Correct, this is the only way forward for now.

> 
> Before I go further in that direction, I’d appreciate confirmation that
> this matches everyone else’s understanding, or if I’m still missing an
> important piece of the puzzle.
> 

Your understanding matches mine.  Looking forward to the next revision.

Regards,
Dumitru

> Thanks,
> Erlon
> 
> On Wed, Jan 28, 2026 at 7:11 AM Dumitru Ceara <[email protected]> wrote:
> 
>> On 1/27/26 11:13 AM, Dumitru Ceara wrote:
>>> On 1/27/26 7:38 AM, Ales Musil via dev wrote:
>>>> On Tue, Jan 27, 2026 at 12:37 AM Brian Haley <[email protected]>
>> wrote:
>>>>
>>>>> HI Ales,
>>>>>
>>>
>>> Hi Erlon, Brian, Ales,
>>>
>>>>> Thanks for finding the issue, just one comment for you and Erlon.
>>>>>
>>>>> On 1/26/26 3:24 PM, Erlon Rodrigues Cruz via dev wrote:
>>>>>> Hey Ales,
>>>>>
>>>>> <snip>
>>>>>
>>>>>>> I was able to figure out why is it happening, dunno if we should keep
>>>>> the
>>>>>>> test around
>>>>>>> but it can be reproduced even with multicast UDP traffic so it might
>> be
>>>>>>> enough to adjust
>>>>>>> the UDP test to also send multicast.  The following two logical flows
>>>>> are
>>>>>>> to "blame":
>>>>>>>
>>>>>>>
>>>>>> Well, I think the broader the coverage, the better. The advantage of
>>>>> having
>>>>>> a DHCP client/server is that the traffic will contain all the nuances
>> of
>>>>>> real traffic, and we don't have to encode all protocol data. For
>> example,
>>>>>> the return traffic should be difficult to generate since it will not
>> be
>>>>>> multicast. I tried to reuse NETNS_START_DHCPD, but I encountered
>> various
>>>>>> permission errors and couldn't get it to work. Dnsmasq seems easier to
>>>>> use
>>>>>> and doesn't rely on an external config file.
>>>>>>
>>>>>>
>>>>>>>          /* Do not send multicast packets to conntrack. */
>>>>>>>          ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
>>>>> "eth.mcast",
>>>>>>>                        "next;", lflow_ref);
>>>>>>>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110,
>>>>> "eth.mcast",
>>>>>>>                        "next;", lflow_ref);
>>>>>>>
>>>>>>> We basically skip CT in case of multicast, and because the
>> translation
>>>>>>> adds new
>>>>>>> prereq of +ct.trk we will never match that flow. I'm afraid that
>> there
>>>>>>> isn't any
>>>>>>> easy/good solution. If we disable those flows we are imposing
>>>>> performance
>>>>>>> penalty on multicast traffic. If we say that's acceptable we need to
>>>>>>> clearly
>>>>>>> state that too in the documentation it's in the same ballpark with
>>>>> "broken"
>>>>>>> HW offload.
>>>>>>>
>>>>>>
>>>>>> What if we discard it only for DHCP traffic? I got it passing with
>>>>>> something like this:
>>>>>>
>>>>>>          if (acl_udp_ct_translation) {
>>>>>>              ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 120,
>>>>>>                            "eth.mcast && udp && "
>>>>>>                            "(udp.src == 67 || udp.src == 68 || "
>>>>>>                            "udp.dst == 67 || udp.dst == 68)",
>>>>>>                            REGBIT_CONNTRACK_DEFRAG" = 1; next;",
>>>>>>                            lflow_ref);
>>>>>>          }
>>>>>>
>>>>>> This takes higher priority than the more general multicast rule, so
>>>>>> performance should not be a problem if we apply it for DHCP.
>>>>>
>>>
>>> Yeah, well, that is not good enough IMO, what if we are forwarding other
>>> types of IP multicast (or broadcast) traffic, for other UDP ports?
>>>
>>>>> The other thing I noticed when looking at the code you highlighted was
>>>>> this section right above it:
>>>>>
>>>>>           /* Ingress and Egress Pre-ACL Table (Priority 110).
>>>>>            *
>>>>>            * Not to do conntrack on ND and ICMP destination
>>>>>            * unreachable packets. */
>>>>>           ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
>>>>>                         "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
>>>>>                         "(udp && udp.src == 546 && udp.dst == 547)",
>>>>> "next;",
>>>>>                         lflow_ref);
>>>>>           ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110,
>>>>>                         "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
>>>>>                         "(udp && udp.src == 546 && udp.dst == 547)",
>>>>> "next;",
>>>>>                         lflow_ref);
>>>>>
>>>>> That seems to always allow DHCPv6 traffic with ports 546/547. I realize
>>>>> Erlon's proposal would only apply if the new flag was set, but is there
>>>>> a reason we shouldn't just bypass conntrack for IPv4 with ports 67/68
>>>>> similarly?
>>>>>
>>>>
>>>> That's a good point, I think we can allow DHCP traffic to pass by
>> default.
>>>> But we still have a problem with multicast traffic in general. Which
>> might
>>>> be broken by this change, so maybe we need to actually do both. Allow
>>>> DHCP traffic by default and disable that rule when the translation is
>>>> configured.
>>>>
>>>
>>> I would change DHCPv4 skip-conntrack semantics as a separate patch
>>> though.  It doesn't have much to do with what this patch is fixing.
>>>
>>> For now it seems that the only way out is to skip these flows that
>>> bypass conntrack for multicast destinations when the translation is
>>> configured.
>>>
>>
>> Given that Ales had comments on this version and that we know now that
>> we need update the logical flows to not bypass conntrack in case
>> translation is enabled, I'm moving this patch to "Changes Requested" in
>> patchwork.
>>
>> Looking forward to v3.
>>
>> Regards,
>> Dumitru
>>
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to