On 11/11/24 04:39, Numan Siddique wrote: > On Fri, Nov 8, 2024 at 3:09 PM Dumitru Ceara <[email protected]> wrote: >> >> On 11/7/24 21:59, [email protected] wrote: >>> From: Numan Siddique <[email protected]> >>> >>> If we have a UDP load balancer - 10.0.0.10:80 = 10.0.0.3:8080, in order to >>> determine if the load balanced traffic needs to be hairpinned, the >>> vip - 10.0.0.10 and the vip port - 80 are stored in the registers before >>> the packet is load balanced using the below logical flow - >>> >>> table=6 (ls_in_pre_stateful ), priority=120 , >>> match=(reg0[2] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), >>> action=(reg1 = 10.0.0.10; reg2[0..15] = 80; ct_lb_mark;) >>> >>> These registers are used in the later stages to check if the load balanced >>> packet needs to be hairpinned or not. >>> >>> However, if the packet is fragmented we may not be able to match on the >>> L4 fields (tcp, udp or sctp dest port) and this breaks the hairpin >>> traffic. >>> >>> This patch addressed this issue by making use of ct_nw_dst/ct_ip6_dst and >>> ct_tp_dst conntrack fields to determine the hairpin load balanced >>> traffic. >>> >>> In order to not break hardware offload on certain smart nics, care is taken >>> to match on these fields only for fragmented packets. >>> >>> Note: Relying on conntrack to reassemble packets is not exactly correct, it >>> only accidentaly works with the kernel datapath. In our internal bug >>> tracking system we have this issue to track this incorrect assumption: >>> https://issues.redhat.com/browse/FDP-913 >>> >>> Reported-at: https://issues.redhat.com/browse/FDP-905 >>> Fixes: 1139b655c996 ("Don't blindly save original dst IP and Port to avoid >>> megaflow unwildcarding.") >>> CC: Han Zhou <[email protected]> >>> Suggested-by: Dumitru Ceara <[email protected]> >>> Signed-off-by: Numan Siddique <[email protected]> >>> --- >>> >>> v1 -> v2 >>> ------ >>> * Addressed review comments from Dumitru. >> >> Hi Numan, >> >> From my perspective this looks good. I did trigger a recheck of the >> ovn-kubernetes CI jobs (one was timing out but I think that's just an >> unrelated flake). >> >> I left a few small comments below which can be addressed at merge time. >> With them addressed: >> >> Acked-by: Dumitru Ceara <[email protected]> > > Thanks for the review and the comments. I addressed the below > comments and applied to main and backported to branch-24.09, 24.03 and > 23.09. > > Numan >
I had a closer look and I realized we can probably avoid using new registers for this case. I posted a patch: https://patchwork.ozlabs.org/project/ovn/list/?series=432242&state=* For the long term maybe we can find a generic way of doing this, e.g.: - push xxreg to the stack - resubmit into the "result table" where we store the result into xxreg (or part of it) - copy xxreg (or part of it) to the destination register - pop xxreg I didn't do that yet because it still has some limitations: xxreg and the destination register must not overlap. I couldn't think of a clean way of ensuring that all "result table" flows use the xxreg and not a different register. But maybe we can do all this as follow up. Regards, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
