Hi Vladislav,

On 9/13/21 6:14 PM, Vladislav Odintsov wrote:
> Hi Numan,
>
> I’ve checked with OVS 2.16.0 and OVN master. The problem persists.
> Symptoms are the same.
>
> # grep ct_zero_snat /var/log/openvswitch/ovs-vswitchd.log
> 2021-09-13T16:10:01.792Z|00019|ofproto_dpif|INFO|system@ovs-system: Datapath
> supports ct_zero_snat

This shouldn't be related to the problem we fixed with ct_zero_snat.

>
> Regards,
> Vladislav Odintsov
>
>> On 13 Sep 2021, at 17:54, Numan Siddique <[email protected]> wrote:
>>
>> On Mon, Sep 13, 2021 at 8:10 AM Vladislav Odintsov <[email protected]
>> <mailto:[email protected]>> wrote:
>>>
>>> Hi,
>>>
>>> we’ve encountered a next problem with stateful ACLs.
>>>
>>> Suppose, we have one logical switch (ls1) and attached to it a VIF type
>>> logical ports (lsp1, lsp2).
>>> Each logical port has a linux VM besides it.
>>>
>>> Logical ports reside in port group (pg1) and two ACLs are created within
>>> this PG:
>>> to-lport outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0 allow-related
>>> from-lport outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0 allow-related
>>>
>>> When we have a high-connection rate service between VMs, the tcp
>>> source/dest ports may be reused before the connection is deleted from
>>> LSP’s-related conntrack zones on the host.
>>> Let’s use curl with passing --local-port argument to have each time same
>>> source port.
>>>
>>> Run it from VM to another VM (172.31.0.18 -> 172.31.0.17):
>>> curl --local-port 44444 http://172.31.0.17/
>>>
>>> Check connections in client’s and server’s vif zones (client - zone=20,
>>> server - zone=1):
>>> run while true script to check connections state per-second, while running
>>> new connection with same source/dest 5-tuple:
>>>
>>> while true; do date; grep -e 'zone=1 ' -e zone=20 /proc/net/nf_conntrack;
>>> sleep 0.2; done
>>>
>>> Right after we’ve successfully run curl, the connection is getting
>>> time-closed and next time-wait states:
>>>
>>> Mon Sep 13 14:34:39 MSK 2021
>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=1 use=2
>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=20 use=2
>>> Mon Sep 13 14:34:39 MSK 2021
>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=1 use=2
>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=20 use=2
>>>
>>> And it remains in time-wait state for nf_conntrack_time_wait_timeout (120
>>> seconds for centos 7).
>>>
>>> Everything is okay for now.
>>> While we have installed connections in TW state in zone 1 and 20, lets run
>>> this curl (source port 44444) again:
>>> 1st SYN packet is lost. It didn’t get to destination VM. In conntrack we
>>> have:
>>>
>>> Mon Sep 13 14:34:41 MSK 2021
>>> ipv4 2 tcp 6 118 TIME_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=1 use=2
>>>
>>> We see that TW connection was dropped in source vif’s zone (20).
>>>
>>> Next, after one second TCP sends retry and connection in destination
>>> (server’s) zone is dropped and a new connection is created in source zone
>>> (client’s):
>>>
>>> Mon Sep 13 14:34:41 MSK 2021
>>> ipv4 2 tcp 6 120 SYN_SENT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 [UNREPLIED] src=172.31.0.17 dst=172.31.0.18 sport=80
>>> dport=44444 mark=0 zone=20 use=2
>>>
>>> Server VM still didn’t get this SYN packet. It got dropped.
>>>
>>> Then, after 2 seconds TCP sends retry again and connection is working well:
>>>
>>> Mon Sep 13 14:34:44 MSK 2021
>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=1 use=2
>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=20 use=2
>>> Mon Sep 13 14:34:44 MSK 2021
>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=1 use=2
>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17
>>> sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444
>>> [ASSURED] mark=0 zone=20 use=2
>>>
>>> I guess, that it could happen:
>>> 1. Run curl with an empty conntrack zones. Everything is good, we’ve got
>>> http response, closed the connection. There’s one TW entry in client’s and
>>> one in server’s conntrack zones.
>>> 2. Run curl with same source port within nf_conntrack_time_wait_timeout
>>> seconds.
>>> 2.1. OVS gets packet from VM, sends it to client’s conntrack zone=20. It
>>> matches pre-existed conntrack entry in tw state from previous curl run. TW
>>> connection in conntrack is deleted. A copy of a packet is returned to OVS
>>> and recirculated packet has ct.inv (?) and !ct.trk states and got dropped
>>> (I’m NOT sure, it’s just an assumption!)
>>> 3. After one second client VM resends TCP SYN.
>>> 3.1. OVS gets packet, sends through client’s conntrack zone=20, a new
>>> connection is added, packet has ct.trk and ct.new states set. Packet goes
>>> to recirculation.
>>> 3.2. OVS sends packet to server’s conntrack zone=1. It matches pre-existed
>>> conntrack entry in tw state from previous run. Conntrack removes this
>>> entry. Packet is returned to OVS with ct.inv (?) and !ct.trk. Packet got
>>> dropped.
>>> 4. Client’s VM again sends TCP SYN after 2 more seconds left.
>>> 4.1 OVS gets packet from client’s VIF, sends to client’s conntrack zone=20,
>>> it matches pre-existed SYN_SENT conntrack entry state, packet is returned
>>> to OVS with ct.new, ct.trk flags set.
>>
>>
>>> 4.2 OVS sends packet to server’s conntrack zone=1. Conntrack table for
>>> zone=1 is empty, it adds new entry, returns packet to OVS with ct.trk and
>>> ct.new flags set.
>>> 4.3 OVS sends packet to server’s VIF, next traffic operates normally.
>>>
>>> So, with such behaviour connection establishment sometimes takes up to
>>> three seconds (2 TCP SYN retries) and makes troubles in overlay services.
>>> (Application timeouts and service outages).
>>>
>>> I’ve checked how conntrack works inside VMs with such traffic and it looks
>>> like if conntrack gets a packet within a TW connection it recreates a new
>>> conntrack entry. No tuning inside VMs was performed. As a server I used
>>> apache with default config from CentOS distribution.

I don't have a CentOS 7 at hand but I do have a RHEL 7
(3.10.0-1160.36.2.el7.x86_64) and I didn't manage to hit the issue you
reported here (using OVS and OVN upstream master).

The SYN matching the conntrack entry in state TIME_WAIT moves the entry
to NEW and seems to be forwarded just fine; the session afterwards goes
to ESTABLISHED.

Wed Sep 15 04:18:35 AM EDT 2021
conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
--
Wed Sep 15 04:18:36 AM EDT 2021
conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
--
Wed Sep 15 04:18:38 AM EDT 2021
conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
--

DP flows just after the second session is initiated also seem to confirm
that everything is fine:

# ovs-appctl dpctl/dump-flows | grep -oE "ct_state(.*),ct_label"
ct_state(+new-est-rel-rpl-inv+trk),ct_label
ct_state(-new+est-rel-rpl-inv+trk),ct_label
ct_state(-new+est-rel+rpl-inv+trk),ct_label
ct_state(+new-est-rel-rpl-inv+trk),ct_label
ct_state(-new+est-rel+rpl-inv+trk),ct_label
ct_state(-new+est-rel-rpl-inv+trk),ct_label

I also tried it out on a Fedora 34 with 5.13.14-200.fc34.x86_64, still
works fine.

What kernel and openvswitch module versions do you use?
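
In case it helps when comparing runs between our setups, below is a rough
sketch of a filter that reduces conntrack output to one "zone state" pair
per TCP entry for a given original-direction source port. The function
name ct_states is made up for this mail and is not part of OVS, OVN or
conntrack-tools. It assumes the field layout seen in the dumps in this
thread (both the /proc/net/nf_conntrack and the "conntrack -L" styles).

```shell
# ct_states PORT: read conntrack entries on stdin and print "zone=N STATE"
# for every TCP entry whose original-direction source port equals PORT.
# Hypothetical helper; assumes the state is the third field after "tcp"
# and that the first sport= field belongs to the original direction.
ct_states() {
    awk -v port="$1" '{
        state = ""; zone = "0"; sport = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "tcp" && state == "")
                state = $(i + 3)            # tcp <proto#> <timeout> <STATE>
            else if ($i ~ /^sport=/ && sport == "")
                sport = substr($i, 7)       # first sport= is the original tuple
            else if ($i ~ /^zone=/)
                zone = substr($i, 6)        # entries without zone= stay in zone 0
        }
        if (state != "" && sport == port)
            print "zone=" zone, state
    }'
}

# Example use on the hypervisor (not executed here):
#   while true; do date; ct_states 44444 < /proc/net/nf_conntrack; sleep 0.2; done
```

With the reproducer from the report, a SYN that is handled correctly should
show the entry in each zone leaving TIME_WAIT rather than disappearing.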

Regards,
Dumitru

>>>
>>> @Numan, @Han, @Mark, can you please take a look at this and give any
>>> suggestions/thoughts how this can be fixed.
>>> The problem is actual with OVS 2.13.4 and latest OVN master branch, however
>>> we’ve met it on 20.06.3 with same OVS and it’s very important for us.
>>
>> Hi Vladislav,
>>
>> From what I understand this commit should help your use case -
>> https://github.com/ovn-org/ovn/commit/58683a4271e6a885f2f2aea27f3df88e69a5c388
>>
>> Looks to me like there's a tuple collision. And you would need the
>> latest OVS (ovs 2.16) along with the latest OVN having the above
>> commit.
>>
>> @Dumitru Ceara please correct me If I'm wrong.
>>
>> Thanks
>> Numan
>>
>>>
>>> Thanks.
>>>
>>>
>>> Regards,
>>> Vladislav Odintsov

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
