On 5/22/25 9:05 AM, Q Kay wrote:
> Hi Dumitru,
>

Hi Ice Bear,

Please keep the ovs-discuss mailing list in CC.

> I am very willing to provide NB DB file for you (attached).
> I will provide more information about the ports for you to check.
>
> Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
> Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4
>
> Instance A:
> port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
> port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718
>
> Instance B:
> port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
> port 2: 7d39750a-29d6-40df-b42b-54a17efcc423
>

Thanks for the info. However, it's easier to investigate if you just
share the actual NB DB (json) file instead of the ovsdb-client dump.
It's probably located in a path similar to /etc/ovn/ovnnb_db.db.

Like that I could just load it in a sandbox and run ovn-nbctl commands
against it directly.

Regards,
Dumitru

>
> Best regards,
> Ice Bear
>
> On Wed, May 21, 2025 at 16:19, Dumitru Ceara <dce...@redhat.com> wrote:
>
>> On 5/21/25 5:16 AM, Q Kay wrote:
>>> Hi Dumitru,
>>
>> Hi Ice Bear,
>>
>> CC: ovs-discuss@openvswitch.org
>>
>>> Thanks for your answer. First, I will address some of your questions.
>>>
>>>>> The critical evidence is in the failed flow, where we see:
>>>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
>>>>> The packet is being marked as invalid (+inv) and subsequently dropped.
>>>>> It's a bit weird though that this isn't a +rpl traffic. Is this hit by
>>>>> the ICMP echo or by the ICMP echo-reply packet?
>>>
>>> This recirc hit by icmp echo reply packet.
>>>
>>
>> OK, that's good.
>>
>>> I understand what you mean. The outgoing and return traffic from
>>> different logical switches will be flagged as inv. If that's the case,
>>> it will work correctly with TCP (both are dropped). But for ICMP, I
>>> notice something a bit strange.
>>>
>>>>> My hypothesis is that the handling of ct_state flags is causing the
>>>>> return traffic to be dropped. This may be because the outgoing and
>>>>> return connections do not share the same logical_switch datapath.
>>>
>>> According to your reasoning, ICMP reply packets from a different logical
>>> switch than the request packets will be dropped. However, in practice,
>>> when I initiate an ICMP request from 6.6.6.6 to 5.5.5.5, the result I
>>> get is success (note that echo request and reply come from different
>>> logical switches regardless of whether they are initiated by 5.5.5.5 or
>>> 6.6.6.6). You can compare the two recirculation flows to see this
>>> oddity. You can take a look at the attached image for better
>>> visualization.
>>>
>>
>> OK. From the ovn-trace command you shared
>>
>>> 2. Using OVN trace:
>>> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95 'inport ==
>>> "319cd637-10fb-4b45-9708-d02beefd698a" && eth.src==fa:16:3e:ea:67:18 &&
>>> eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 &&
>>> ip.proto==1 && ip.ttl==64'
>>
>> I'm guessing the fa:16:3e:ea:67:18 MAC is the one owned by 6.6.6.6.
>>
>> Now, after filtering only the ICMP ECHO reply flows in your initial
>> datapath flow dump:
>>
>>> *For successful ping flow: 5.5.5.5 -> 6.6.6.6*
>>
>> Note: ICMP reply comes from 6.6.6.6 to 5.5.5.5 (B -> A).
>>
>>> *- On Compute 1 (containing source instance): *
>>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
>>> packets:55, bytes:5390, used:0.204s, actions:29'
>>
>> We see no conntrack fields in the match.
>> So, based on the diagram you shared,
>> I'm guessing there's no allow-related ACL or load balancer on logical
>> switch 2.
>>
>> But then for the failed ping flow:
>>
>>> *For failed ping flow: 6.6.6.6 -> 5.5.5.5*
>>
>> Note: ICMP reply comes from 5.5.5.5 to 6.6.6.6 (A -> B).
>>
>>> *- On Compute 1: *
>>
>> [...]
>>
>>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
>>> packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)'
>>>
>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
>>> packets:48, bytes:4704, used:0.940s, actions:drop'
>>
>> In this case we _do_ have conntrack fields in the match/actions.
>> Is it possible that logical switch 1 has allow-related ACLs or LBs?
>>
>> On the TCP side of things: it's kind of hard to tell what's going on
>> without having the complete configuration of your OVN deployment.
>>
>> NOTE: if an ACL is applied to a port group, that is equivalent to applying
>> the ACL to all logical switches that have ports in that port group.
>>
>>>>> I'd say it's not a bug. However, if you want to change the default
>>>>> behavior you can use the NB_Global.options:use_ct_inv_match=true knob
>>>>> to allow +inv packets in the logical switch pipeline.
>>>
>>> I tried setting the option use_ct_inv_match=. The result is just as you
>>> said, everything works successfully with both ICMP and TCP.
>>> Based on this experiment, I suspect there might be a small bug when OVN
>>> handles ICMP packets. Could you please let me know if my experiment and
>>> reasoning are correct?
>>>
>>
>> As said above, it really depends on the full configuration. Maybe we can
>> tell more if you can share the NB database? Or at least if you share the
>> ACLs applied on the two logical switches (or port groups).
>>
>>> Thanks for your support.
>>>
>>
>> No problem.
>>
>>> Best regards,
>>> Ice Bear
>>
>> Regards,
>> Dumitru
>>
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
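
The comparison made in the thread (which datapath flows carry conntrack fields, and which ones match +inv and drop) can also be done mechanically over a datapath flow dump. Below is a minimal Python sketch, assuming each flow is available as one string in the format quoted above. The helper names `parse_dp_flow` and `is_invalid_drop` are hypothetical and not part of OVS or OVN; the parsing is a plain regex over the text, not an official flow parser.

```python
import re


def parse_dp_flow(flow: str) -> dict:
    """Pull recirc_id, ct_state flags and actions out of one datapath flow string.

    Assumption: the flow is formatted like the dumps quoted in this thread.
    """
    recirc = re.search(r"recirc_id\((0x[0-9a-fA-F]+|\d+)\)", flow)
    ct = re.search(r"ct_state\(([^)]*)\)", flow)
    actions = re.search(r"actions:(\S+)", flow)
    # ct_state looks like "-new-est-rel-rpl+inv+trk": a sign followed by a flag name.
    flags = re.findall(r"([+-])([a-z]+)", ct.group(1)) if ct else []
    return {
        "recirc_id": int(recirc.group(1), 0) if recirc else None,
        "has_ct_match": ct is not None,
        "ct_set": {name for sign, name in flags if sign == "+"},
        "ct_clear": {name for sign, name in flags if sign == "-"},
        "actions": actions.group(1).rstrip("'") if actions else None,
    }


def is_invalid_drop(flow: str) -> bool:
    """True if the flow matches conntrack-invalid packets and drops them."""
    parsed = parse_dp_flow(flow)
    return "inv" in parsed["ct_set"] and parsed["actions"] == "drop"


# The two flows for the failed ping (6.6.6.6 -> 5.5.5.5), copied from the thread.
FAILED_FIRST_PASS = (
    "recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),"
    "eth_type(0x0800),ipv4(proto=1,frag=no), "
    "packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)"
)
FAILED_AFTER_CT = (
    "recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),"
    "ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), "
    "packets:48, bytes:4704, used:0.940s, actions:drop"
)

if __name__ == "__main__":
    for flow in (FAILED_FIRST_PASS, FAILED_AFTER_CT):
        parsed = parse_dp_flow(flow)
        print(parsed["recirc_id"], sorted(parsed["ct_set"]), parsed["actions"],
              "invalid-drop" if is_invalid_drop(flow) else "")
```

Running the same helper over the flow from the successful direction would report no ct_state match at all, which is the asymmetry between the two logical switches pointed out above.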