On 4/11/25 12:40 PM, Ales Musil wrote: > On Thu, Apr 10, 2025 at 11:43 AM Dumitru Ceara <dce...@redhat.com> wrote: > >> Since commit f2e8130d0a42 ("northd: Explicitly handle SNAT for ICMP >> need frag."), if LRP.gw_mtu is set, ovn-northd adds a priority 160 >> logical flow in the lr_in_larger_pkts router pipeline that matches >> on "ct.trk && ct.rpl && ct.dnat" for egress traffic that's larger >> than the configured MTU (if REGBIT_PKT_LARGER] == 1): >> >> table=23(lr_in_larger_pkts ), priority=160, >> match=(inport == "lrp1" && outport == "lrp0" && ip4 && >> reg9[1] && reg9[0] == 0 && ct.trk && ct.rpl && ct.dnat), >> action=(icmp4_error {...};) >> >> The lr_in_larger_pkts logical pipeline stage (23) is translated by >> ovn-controller to OpenFlow table 31. >> >> table=31, priority=160, >> ct_state=+rpl+trk+dnat,ip,reg9=0x2/0x3,reg14=0x2,reg15=0x1,metadata=0x3 >> actions=controller(...) >> >> Due to the way ovs-vswitchd translates OF rules to datapath flows, >> all OpenFlow rules in table 31 that have a priority lower than 160 >> automatically get an implicit match on the inverse ct_state match >> that's generated for the priority 160 OF rule, that is: >> >> ct_state=-trk-rpl-dnat >> >> On specific NICs, such matches (on dnat/snat ct_state) are not >> offloadable to hardware. >> >> However, in OVN, logical switches and logical routers pipelines both >> get translated to the same set of OpenFlow tables. In theory that's >> fine because the logical datapath distinction happens thanks to >> additional metadata field matches (the metadata OF field is actually >> logical datapath id). But in this specific case it means that all >> logical switch traffic that hits OpenFlow table 31 will also generate >> a datapath flow that matches on ct_state=-trk-rpl-dnat, making it >> non-offloadable. >> >> E.g., in an OVN cluster with a logical switch that has no stateful >> config (no stateful ACLs or LBs) traffic hitting the following >> logical switch pipeline stage will also fail to be offloaded to >> hardware if there's a logical router with gw_mtu configured: >> >> table=23(ls_in_dhcp_options ), priority=0, match=(1), action=(next;) >> >> That is essentially all traffic forwarded on that logical switch. >> >> Change the way we match on large packets that have been DNATed and >> instead of directly matching on ct_state first save that into a >> register (using the new ct_state_save() logical action). >> >> That means that with the configuration mentioned above: >> - all routed traffic that is less than MTU will now be forwarded by >> a datapath flow matching on ct_state=-trk >> - all switched traffic will be forwarded by a datapath flow matching on >> ct_state=-trk >> >> In order to not disrupt upgrades on stable branches generate the new >> logical flows only if the ct_state_save() action is supported by all >> ovn-controllers. >> >> Reported-at: https://issues.redhat.com/browse/FDP-1271 >> Fixes: f2e8130d0a42 ("northd: Explicitly handle SNAT for ICMP need frag.") >> Signed-off-by: Dumitru Ceara <dce...@redhat.com> >> --- >> > > Hi Dumitru, > > thank you for the patch. I have one small comment down below. >
Hi Ales, Thanks for the review! >> + /* Additional match at egress on tracked and reply and dnat-ed >> traffic. */ >> + char *ct_match = features->ct_state_save >> + ? op->od->nbr->n_ports > 1 >> + ? xasprintf("%s && %s && %s", >> + reg_ct_state[CS_TRACKED], >> + reg_ct_state[CS_REPLY_DIR], >> + reg_ct_state[CS_DST_NAT]) >> + : NULL >> + : xstrdup("ct.trk && ct.rpl && ct.dnat"); >> > > This is a bit hard to read, I think we don't need the extra > port check. > > You're right, I changed this in v2 and dropped the micro-optimization. It was making the code hard to read indeed. [...] >> >> > With that addressed: > Acked-by: Ales Musil <amu...@redhat.com> > > Thanks, > Ales > It would be great if you could have another quick look at the new revision: https://patchwork.ozlabs.org/project/ovn/list/?series=452578&state=* Thanks, Dumitru _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev