On 5/19/25 12:20 PM, Q Kay via discuss wrote:
> Attached topology
>
> On Mon, May 19, 2025 at 17:19 Q Kay <tqkhang...@gmail.com> wrote:
>
>> Dear OVN Team,
>>
Hi Ice Bear,

>> I would like to report an issue observed with OVN networking related to
>> asymmetric routing. The problem occurs when using instances to transit
>> traffic between two routed logical switches, and appears to be caused by
>> OVN connection tracking, which I wish to bypass for stateless forwarding.
>>
>> Environment Information
>>
>> - OVN Version: 24.03.2 (same issue observed on 24.09).
>> - Port security disabled.
>>
>> Issue Description
>>
>> I have two instances, each with a loopback IP configured (5.5.5.5 on
>> Instance A and 6.6.6.6 on Instance B), deployed on different compute nodes
>> (Compute 1 and Compute 2, respectively). The instances are connected to
>> two different networks (10.10.10.0/24 and 10.10.20.0/24).
>> I have configured static routes on both instances as follows:
>>
>> - Instance A: route 6.6.6.6/32 via 10.10.10.218
>> - Instance B: route 5.5.5.5/32 via 10.10.20.41
>>
>> The topology is in the attached file.
>>
>> Expected Behavior
>> I should be able to communicate over ICMP between the two endpoint IPs
>> (5.5.5.5 and 6.6.6.6) along the routing path configured above:
>>
>> - On Instance A: ping 6.6.6.6 -I 5.5.5.5 (using 5.5.5.5 as source IP)
>>   => should succeed
>> - On Instance B: ping 5.5.5.5 -I 6.6.6.6 (using 6.6.6.6 as source IP)
>>   => should succeed
>>
>> Actual Behavior
>> When attempting to ping between these loopback IPs, I observe that traffic
>> only works in one direction:
>>
>> - On Instance A: ping 6.6.6.6 -I 5.5.5.5 (using 5.5.5.5 as source IP)
>>   => fails
>> - On Instance B: ping 5.5.5.5 -I 6.6.6.6 (using 6.6.6.6 as source IP)
>>   => succeeds
>>
>> Despite disabling port security and ensuring the necessary routes are
>> configured, the asymmetric routing scenario still fails in one direction
>> for ICMP, and in both directions for TCP. I have verified that packet
>> handling at the instance level is working correctly (confirmed with
>> tcpdump at the tap port).
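For readers reproducing this, the per-instance setup described above would look roughly as follows (a minimal sketch based only on the addresses in the report; run as root inside each instance):

```shell
# Sketch of the static route / loopback IP setup described in the report.

# On Instance A (attached to 10.10.10.0/24):
ip addr add 5.5.5.5/32 dev lo
ip route add 6.6.6.6/32 via 10.10.10.218

# On Instance B (attached to 10.10.20.0/24):
ip addr add 6.6.6.6/32 dev lo
ip route add 5.5.5.5/32 via 10.10.20.41
```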
>> I've tried moving both instances to a single compute node, but the same
>> issue still occurs.
>>
>> Troubleshooting Steps
>>
>> 1. Reversed routing direction:
>>
>> - On Instance A: route 6.6.6.6/32 via 10.10.10.78
>> - On Instance B: route 5.5.5.5/32 via 10.10.20.102
>>   => Result: Ping from A to B succeeds, from B to A fails (opposite of
>>   the initial results)
>>
>> 2. Used ovn-trace:
>>
>> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95 'inport ==
>> "319cd637-10fb-4b45-9708-d02beefd698a" && eth.src==fa:16:3e:ea:67:18 &&
>> eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 &&
>> ip.proto==1 && ip.ttl==64'
>>
>> *Output*:
>> ingress(dp="A", inport="319cd6")
>>  0. ls_in_check_port_sec: priority 50
>>     reg0[15] = check_in_port_sec(); next;
>>  2. ls_in_lookup_fdb: inport == "319cd6", priority 100
>>     reg0[11] = lookup_fdb(inport, eth.src); next;
>> 27. ls_in_l2_lkup: eth.dst == fa:16:3e:04:28:c7, priority 50
>>     outport = "869b33"; output;
>>
>> egress(dp="A", inport="319cd6", outport="869b33")
>>  9. ls_out_check_port_sec: priority 0
>>     reg0[15] = check_out_port_sec(); next;
>> 10. ls_out_apply_port_sec: priority 0
>>     output; /* output to "869b33" */
>>
>> 3. Examined recirculation to identify where the flow is being dropped.
>>
>> *For the successful ping flow: 5.5.5.5 -> 6.6.6.6*
>> *- On Compute 1 (containing the source instance):*
>>
>> 'recirc_id(0x3d71),in_port(28),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(src=4.0.0.0/252.0.0.0,dst=0.0.0.0/248.0.0.0,proto=1,tos=0/0x3,frag=no), packets:55, bytes:5390, used:0.205s, actions:ct(commit,zone=87,mark=0/0x1,nat(src)),set(tunnel(tun_id=0x6,dst=10.10.10.85,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x50006}),flags(df|csum|key))),9'
>>
>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no), packets:55, bytes:5390, used:0.205s, actions:ct(zone=87),recirc(0x3d71)'
>>
>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe), packets:55, bytes:5390, used:0.204s, actions:29'
>>
>> *- On Compute 2:*
>>
>> 'recirc_id(0),tunnel(tun_id=0x6,src=10.10.10.84,dst=10.10.10.85,geneve({class=0x102,type=0x80,len=4,0x50006/0x7fffffff}),flags(-df+csum+key)),in_port(10),eth(src=fa:16:3e:81:ed:92,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), packets:193, bytes:18914, used:0.009s, actions:ct(zone=53),recirc(0x1791e)'
>>
>> 'recirc_id(0x1791e),tunnel(tun_id=0x6,src=10.10.10.84,dst=10.10.10.85,geneve({}{}),flags(-df+csum+key)),in_port(10),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:81:ed:92,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:193, bytes:18914, used:0.009s, actions:ct(commit,zone=53,mark=0/0x1,nat(src)),23'
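As an aside for anyone repeating this analysis: flows that were dropped on a conntrack "invalid" match can be isolated from a saved datapath flow dump with a simple filter. A sketch, using an inlined one-line sample (on a live chassis the dump would come from `ovs-appctl dpctl/dump-flows`; the file path is illustrative):

```shell
# Save a datapath flow dump; here an illustrative one-line sample is
# inlined instead of capturing from a live node.
cat > /tmp/dpflows.txt <<'EOF'
recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:48, bytes:4704, used:0.940s, actions:drop
EOF

# Keep only flows that matched ct_state +inv and were dropped.
grep -E 'ct_state\([^)]*\+inv' /tmp/dpflows.txt | grep 'actions:drop'
```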
>> 'recirc_id(0),in_port(21),eth(src=fa:16:3e:ea:67:18,dst=fa:16:3e:04:28:c7),eth_type(0x0800),ipv4(src=6.6.6.6,dst=5.5.5.5,proto=1,tos=0/0x3,frag=no), packets:193, bytes:18914, used:0.008s, actions:set(tunnel(tun_id=0x2,dst=10.10.10.84,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb000a}),flags(df|csum|key))),10'
>>
>> *For the failed ping flow: 6.6.6.6 -> 5.5.5.5*
>> *- On Compute 2 (containing the source instance):*
>>
>> 'recirc_id(0),in_port(21),eth(src=fa:16:3e:ea:67:18,dst=fa:16:3e:04:28:c7),eth_type(0x0800),ipv4(src=6.6.6.6,dst=5.5.5.5,proto=1,tos=0/0x3,frag=no), packets:5, bytes:490, used:0.728s, actions:set(tunnel(tun_id=0x2,dst=10.10.10.84,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb000a}),flags(df|csum|key))),10'
>>
>> *- On Compute 1:*
>>
>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), packets:48, bytes:4704, used:0.940s, actions:29'
>>
>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no), packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)'
>>
>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:48, bytes:4704, used:0.940s, actions:drop'

Thanks for all the details!

>> Observations
>> I've noticed that packet handling at the compute nodes is not consistent.

Actually, I'd argue that it is.

>> My hypothesis is that the handling of ct_state flags is causing the return
>> traffic to be dropped. This may be because the outgoing and return
>> connections do not share the same logical switch datapath.

If the original and reply paths of a connection are not processed by the same logical switch, then that's exactly the problem; you're right.
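If it helps to confirm that the two directions are handled in different datapaths/zones, one way to inspect this on a live chassis is (illustrative commands; zone 87 is taken from the dumps above, and `ct-zone-list` availability depends on the OVN version):

```shell
# List the conntrack zones ovn-controller assigned to each logical port
# on this chassis (compare the zones hit by each direction).
ovn-appctl -t ovn-controller ct-zone-list

# Dump the conntrack entries for one of the zones seen in the flow dumps.
ovs-appctl dpctl/dump-conntrack zone=87
```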
>> The critical evidence is in the failed flow, where we see:
>>
>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:48, bytes:4704, used:0.940s, actions:drop'
>>
>> The packet is being marked as invalid (+inv) and subsequently dropped.

It's a bit weird, though, that this isn't +rpl traffic. Is this flow hit by the ICMP echo or by the ICMP echo-reply packet?

>> Impact
>> This unexplained packet drop significantly impacts my service when I use
>> instances for transit purposes in an OVN environment. Although I have
>> disabled port security in order to use stateless mode, the behavior is not
>> as expected.
>>
>> Request for Clarification
>> Based on the situation described above, I have the following questions:
>>
>> 1. Is the packet drop behavior described above consistent with OVN's
>> design?

If the original and reply directions of a session (in conntrack terms) are processed on different logical switches, then yes.

>> 2. If this is the expected behavior of OVN, please explain why packets
>> are being dropped.

OVN logical switches drop all traffic marked as ct_state=+trk+inv by default.

>> 3. If this is not the expected behavior, could you confirm whether this
>> is a bug that will be fixed in the future?

I'd say it's not a bug. However, if you want to change the default behavior you can use the NB_Global.options:use_ct_inv_match=false knob to allow +inv packets in the logical switch pipeline. There's one caveat, though: if you're using hardware offload, some NICs (e.g., NVIDIA CX-5/6) are not able to offload traffic that is forwarded based on a ct_state=+trk+inv match.

>> I can provide additional information as needed. Please let me know if you
>> require any further details.
>>
>> Thank you very much for your time and support. I greatly appreciate your
>> guidance in better understanding OVN's design here.
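For completeness, the knob mentioned above would be set along these lines (a sketch; run wherever ovn-nbctl can reach the OVN northbound DB):

```shell
# Disable the ct.inv match in logical switch pipelines (default is true),
# so packets conntrack marks as invalid are no longer dropped for it.
ovn-nbctl set NB_Global . options:use_ct_inv_match=false
```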
>> Best regards,
>> Ice Bear

Regards,
Dumitru

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss