On 11/11/24 04:39, Numan Siddique wrote:
> On Fri, Nov 8, 2024 at 3:09 PM Dumitru Ceara <[email protected]> wrote:
>>
>> On 11/7/24 21:59, [email protected] wrote:
>>> From: Numan Siddique <[email protected]>
>>>
>>> If we have a TCP load balancer - 10.0.0.10:80 = 10.0.0.3:8080 - then, in
>>> order to determine whether the load balanced traffic needs to be
>>> hairpinned, the VIP - 10.0.0.10 - and the VIP port - 80 - are stored in
>>> registers before the packet is load balanced, using the logical flow below:
>>>
>>> table=6 (ls_in_pre_stateful ), priority=120  ,
>>>   match=(reg0[2] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80),
>>>   action=(reg1 = 10.0.0.10; reg2[0..15] = 80; ct_lb_mark;)
>>>
>>> These registers are used in the later stages to check if the load balanced
>>> packet needs to be hairpinned or not.
>>>
>>> However, if the packet is fragmented we may not be able to match on the
>>> L4 fields (tcp, udp or sctp dest port) and this breaks the hairpin
>>> traffic.
>>>
>>> This patch addresses the issue by using the ct_nw_dst/ct_ip6_dst and
>>> ct_tp_dst conntrack fields to determine the hairpin load balanced
>>> traffic.
>>>
>>> In order not to break hardware offload on certain smart NICs, care is
>>> taken to match on these fields only for fragmented packets.
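>>>
>>> For example, the hairpin check for such packets could match on the
>>> conntrack original-direction tuple instead (sketch only; the restriction
>>> to fragmented packets is elided here):
>>>
>>>   match=(ct.trk && ip4 && ct_nw_dst == 10.0.0.10 && ct_tp_dst == 80)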
>>>
>>> Note: Relying on conntrack to reassemble packets is not exactly correct;
>>> it only accidentally works with the kernel datapath.  This incorrect
>>> assumption is tracked in our internal bug tracking system:
>>> https://issues.redhat.com/browse/FDP-913
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-905
>>> Fixes: 1139b655c996 ("Don't blindly save original dst IP and Port to avoid megaflow unwildcarding.")
>>> CC: Han Zhou <[email protected]>
>>> Suggested-by: Dumitru Ceara <[email protected]>
>>> Signed-off-by: Numan Siddique <[email protected]>
>>> ---
>>>
>>> v1 -> v2
>>> ------
>>>   * Addressed review comments from Dumitru.
>>
>> Hi Numan,
>>
>> From my perspective this looks good.  I did trigger a recheck of the
>> ovn-kubernetes CI jobs (one was timing out but I think that's just an
>> unrelated flake).
>>
>> I left a few small comments below which can be addressed at merge time.
>> With them addressed:
>>
>> Acked-by: Dumitru Ceara <[email protected]>
> 
> Thanks for the review and the comments.  I addressed the below
> comments and applied to main and backported to branch-24.09, 24.03 and
> 23.09.
> 
> Numan
> 

I had a closer look and I realized we can probably avoid using new
registers for this case.  I posted a patch:

https://patchwork.ozlabs.org/project/ovn/list/?series=432242&state=*

For the long term maybe we can find a generic way of doing this, e.g.:
- push xxreg to the stack
- resubmit into the "result table" where we store the result into xxreg
(or part of it)
- copy xxreg (or part of it) to the destination register
- pop xxreg

I didn't do that yet because it still has a limitation: the xxreg and
the destination register must not overlap.  I also couldn't think of a
clean way of ensuring that all "result table" flows use the xxreg and
not a different register.  But maybe we can do all this as a follow-up.
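
Very roughly, in ovs-ofctl syntax (table numbers, register choice, and
values invented purely for illustration), the sequence could look like:

  # Caller: save xxreg0, compute the result in table 42, copy it out,
  # then restore xxreg0.  Note the destination (reg4) must not overlap
  # xxreg0, which is exactly the limitation mentioned above.
  table=10 actions=push:NXM_NX_XXREG0[],resubmit(,42),
      move:NXM_NX_XXREG0[0..15]->NXM_NX_REG4[0..15],pop:NXM_NX_XXREG0[]

  # "Result table": every flow here stores its result into xxreg0.
  table=42 priority=100,ip,nw_dst=10.0.0.10
      actions=load:0x50->NXM_NX_XXREG0[0..15]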

Regards,
Dumitru
