Sorry, by OOT I meant the non-inbox kmod. I’ve tried using the inbox kernel module (from the kernel package) and the problem is resolved.
Regards,
Vladislav Odintsov

> On 16 Sep 2021, at 17:17, Vladislav Odintsov <[email protected]> wrote:
>
> Hi Dumitru,
>
> I’ve tried to exclude the OOT OVS kernel module.
> With OVN 20.06.3 + OVS 2.13.4 the problem is solved.
>
> Could you please try with the OOT kmod? To me it looks like a bug in the
> OOT OVS kernel module code.
>
> Thanks.
>
> Regards,
> Vladislav Odintsov
>
>> On 16 Sep 2021, at 11:02, Dumitru Ceara <[email protected]> wrote:
>>
>> On 9/16/21 2:50 AM, Vladislav Odintsov wrote:
>>> Hi Dumitru,
>>>
>>> thanks for your reply.
>>>
>>> Regards,
>>> Vladislav Odintsov
>>>
>>>> On 15 Sep 2021, at 11:24, Dumitru Ceara <[email protected]> wrote:
>>>>
>>>> Hi Vladislav,
>>>>
>>>> On 9/13/21 6:14 PM, Vladislav Odintsov wrote:
>>>>> Hi Numan,
>>>>>
>>>>> I’ve checked with OVS 2.16.0 and OVN master. The problem persists.
>>>>> Symptoms are the same.
>>>>>
>>>>> # grep ct_zero_snat /var/log/openvswitch/ovs-vswitchd.log
>>>>> 2021-09-13T16:10:01.792Z|00019|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_zero_snat
>>>>
>>>> This shouldn't be related to the problem we fixed with ct_zero_snat.
>>>>
>>>>> Regards,
>>>>> Vladislav Odintsov
>>>>>
>>>>>> On 13 Sep 2021, at 17:54, Numan Siddique <[email protected]> wrote:
>>>>>>
>>>>>> On Mon, Sep 13, 2021 at 8:10 AM Vladislav Odintsov <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> we’ve encountered the following problem with stateful ACLs.
>>>>>>>
>>>>>>> Suppose we have one logical switch (ls1) with two VIF-type logical
>>>>>>> ports (lsp1, lsp2) attached to it.
>>>>>>> Behind each logical port there is a Linux VM.
>>>>>>>
>>>>>>> The logical ports reside in a port group (pg1) and two ACLs are
>>>>>>> created within this PG:
>>>>>>> to-lport outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0 allow-related
>>>>>>> from-lport outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0 allow-related
>>>>>>>
>>>>>>> When we have a high connection-rate service between the VMs, the TCP
>>>>>>> source/destination ports may be reused before the connection is
>>>>>>> deleted from the LSPs’ conntrack zones on the host.
>>>>>>> Let’s use curl, passing the --local-port argument so that every run
>>>>>>> uses the same source port.
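[For reference, the topology and ACLs described above can be recreated with
something along these lines. This is a sketch only: the names ls1, lsp1/lsp2
and pg1 come from the description, the MAC addresses and ACL priority are
made-up placeholders, and the ACL matches are copied verbatim from the report.]

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 lsp1 -- lsp-set-addresses lsp1 "00:00:00:00:00:01 172.31.0.18"
ovn-nbctl lsp-add ls1 lsp2 -- lsp-set-addresses lsp2 "00:00:00:00:00:02 172.31.0.17"
ovn-nbctl pg-add pg1 lsp1 lsp2
ovn-nbctl --type=port-group acl-add pg1 to-lport 1001 \
    'outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0' allow-related
ovn-nbctl --type=port-group acl-add pg1 from-lport 1001 \
    'outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0' allow-related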
>>>>>>>
>>>>>>> Run it from one VM to the other (172.31.0.18 -> 172.31.0.17):
>>>>>>> curl --local-port 44444 http://172.31.0.17/
>>>>>>>
>>>>>>> Check the connections in the client’s and server’s VIF zones
>>>>>>> (client - zone=20, server - zone=1): run a "while true" loop that
>>>>>>> prints the connection state every 0.2 seconds, while making a new
>>>>>>> connection with the same source/destination 5-tuple:
>>>>>>>
>>>>>>> while true; do date; grep -e 'zone=1 ' -e zone=20 /proc/net/nf_conntrack; sleep 0.2; done
>>>>>>>
>>>>>>> Right after curl completes successfully, the connection goes to
>>>>>>> CLOSE_WAIT and then to TIME_WAIT:
>>>>>>>
>>>>>>> Mon Sep 13 14:34:39 MSK 2021
>>>>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>>>>> Mon Sep 13 14:34:39 MSK 2021
>>>>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>>>>>
>>>>>>> And it remains in TIME_WAIT state for nf_conntrack_time_wait_timeout
>>>>>>> (120 seconds on CentOS 7).
>>>>>>>
>>>>>>> Everything is okay so far.
>>>>>>> While the TIME_WAIT entries are still installed in zones 1 and 20,
>>>>>>> let’s run the same curl (source port 44444) again:
>>>>>>> the first SYN packet is lost. It didn’t reach the destination VM. In
>>>>>>> conntrack we have:
>>>>>>>
>>>>>>> Mon Sep 13 14:34:41 MSK 2021
>>>>>>> ipv4 2 tcp 6 118 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>>>>
>>>>>>> We see that the TIME_WAIT connection was dropped from the source
>>>>>>> VIF’s zone (20).
>>>>>>>
>>>>>>> Next, after one second, TCP retransmits the SYN; the connection in
>>>>>>> the destination (server’s) zone is dropped and a new connection is
>>>>>>> created in the source (client’s) zone:
>>>>>>>
>>>>>>> Mon Sep 13 14:34:41 MSK 2021
>>>>>>> ipv4 2 tcp 6 120 SYN_SENT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 [UNREPLIED] src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 mark=0 zone=20 use=2
>>>>>>>
>>>>>>> The server VM still didn’t get this SYN packet. It got dropped.
>>>>>>>
>>>>>>> Then, after 2 more seconds, TCP retransmits again and the connection
>>>>>>> works fine:
>>>>>>>
>>>>>>> Mon Sep 13 14:34:44 MSK 2021
>>>>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>>>>> Mon Sep 13 14:34:44 MSK 2021
>>>>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
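[The extra connection-setup latency can also be observed directly from curl's
timing output, mirroring the two runs above. This is a sketch, executed on the
client VM; the destination IP and local port are the ones from the report.]

# First request: leaves TIME_WAIT entries behind in zones 20 and 1.
curl -s -o /dev/null --local-port 44444 http://172.31.0.17/

# Second request, reusing the same 5-tuple while those entries still exist;
# time_connect is expected to show the 1-3 s penalty from the retransmitted SYNs.
curl -s -o /dev/null --local-port 44444 \
    -w 'TCP connect took %{time_connect}s\n' http://172.31.0.17/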
>>>>>>>
>>>>>>> I guess it happens like this:
>>>>>>> 1. Run curl with empty conntrack zones. Everything is good: we get
>>>>>>> the HTTP response and close the connection. There is one TIME_WAIT
>>>>>>> entry in the client’s conntrack zone and one in the server’s.
>>>>>>> 2. Run curl with the same source port within
>>>>>>> nf_conntrack_time_wait_timeout seconds.
>>>>>>> 2.1. OVS gets the packet from the VM and sends it to the client’s
>>>>>>> conntrack zone=20. It matches the pre-existing TIME_WAIT conntrack
>>>>>>> entry from the previous curl run. The TIME_WAIT connection is deleted
>>>>>>> from conntrack. A copy of the packet is returned to OVS, and the
>>>>>>> recirculated packet has the ct.inv (?) and !ct.trk states and gets
>>>>>>> dropped (I’m NOT sure, it’s just an assumption!).
>>>>>>> 3. After one second the client VM resends the TCP SYN.
>>>>>>> 3.1. OVS gets the packet and sends it through the client’s conntrack
>>>>>>> zone=20; a new connection is added and the packet has the ct.trk and
>>>>>>> ct.new states set. The packet goes to recirculation.
>>>>>>> 3.2. OVS sends the packet to the server’s conntrack zone=1. It
>>>>>>> matches the pre-existing TIME_WAIT conntrack entry from the previous
>>>>>>> run. Conntrack removes this entry. The packet is returned to OVS with
>>>>>>> ct.inv (?) and !ct.trk and gets dropped.
>>>>>>> 4. The client VM sends another TCP SYN after 2 more seconds.
>>>>>>> 4.1. OVS gets the packet from the client’s VIF and sends it to the
>>>>>>> client’s conntrack zone=20; it matches the pre-existing SYN_SENT
>>>>>>> conntrack entry and the packet is returned to OVS with the ct.new and
>>>>>>> ct.trk flags set.
>>>>>>> 4.2. OVS sends the packet to the server’s conntrack zone=1. The
>>>>>>> conntrack table for zone=1 is empty, so it adds a new entry and
>>>>>>> returns the packet to OVS with the ct.trk and ct.new flags set.
>>>>>>> 4.3. OVS sends the packet to the server’s VIF and subsequent traffic
>>>>>>> operates normally.
>>>>>>>
>>>>>>> So, with this behaviour, connection establishment sometimes takes up
>>>>>>> to three seconds (2 TCP SYN retransmissions), which causes trouble
>>>>>>> for overlay services (application timeouts and service outages).
>>>>>>>
>>>>>>> I’ve checked how conntrack behaves inside the VMs with such traffic,
>>>>>>> and it looks like when conntrack receives a packet matching a
>>>>>>> TIME_WAIT connection it simply recreates the conntrack entry. No
>>>>>>> tuning was performed inside the VMs. As the server I used apache
>>>>>>> with the default config from the CentOS distribution.
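[One way to test the assumption in steps 2.1 and 3.2 above is to watch, on the
hypervisor, whether datapath flows matching the +inv conntrack state are being
hit while the second curl is stalled. This is a sketch; these are standard OVS
commands, and the grep patterns are only illustrative filters for the zones
used in the report.]

# Megaflows that matched packets conntrack marked invalid; a packet counter
# increasing here during the stalled connect would support the theory that
# the retransmitted SYN is dropped on a +inv match.
watch -n 1 "ovs-appctl dpctl/dump-flows | grep '+inv'"

# Conntrack entries as seen through OVS, filtered to the two zones:
ovs-appctl dpctl/dump-conntrack | grep -E 'zone=(1|20)(,|$)'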
>>>>
>>>> I don't have a centos 7 at hand but I do have a rhel 7
>>>> (3.10.0-1160.36.2.el7.x86_64) and I didn't manage to hit the issue you
>>>> reported here (using OVS and OVN upstream master). The SYN matching the
>>>> conntrack entry in state TIME_WAIT moves the entry to NEW and seems to
>>>> be forwarded just fine; the session afterwards goes to ESTABLISHED.
>>>>
>>>> Wed Sep 15 04:18:35 AM EDT 2021
>>>> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
>>>> tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
>>>> tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
>>>> --
>>>> Wed Sep 15 04:18:36 AM EDT 2021
>>>> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
>>>> tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
>>>> tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
>>>> --
>>>> Wed Sep 15 04:18:38 AM EDT 2021
>>>> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
>>>> tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
>>>> tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
>>>> --
>>>>
>>>> DP flows just after the second session is initiated also seem to
>>>> confirm that everything is fine:
>>>>
>>>> # ovs-appctl dpctl/dump-flows | grep -oE "ct_state(.*),ct_label"
>>>> ct_state(+new-est-rel-rpl-inv+trk),ct_label
>>>> ct_state(-new+est-rel-rpl-inv+trk),ct_label
>>>> ct_state(-new+est-rel+rpl-inv+trk),ct_label
>>>> ct_state(+new-est-rel-rpl-inv+trk),ct_label
>>>> ct_state(-new+est-rel+rpl-inv+trk),ct_label
>>>> ct_state(-new+est-rel-rpl-inv+trk),ct_label
>>>>
>>>> I also tried it out on a Fedora 34 with 5.13.14-200.fc34.x86_64; it
>>>> still works fine there.
>>>>
>>>> What kernel and openvswitch module versions do you use?
>>>>
>>> On my box there is CentOS 7.5 with kernel 3.10.0-862.14.4.el7 and the
>>> OOT kernel module.
>>> I’ve tested two versions; the problem was hit with both:
>>> openvswitch-kmod-2.13.4-1.el7_5.x86_64
>>> openvswitch-kmod-2.16.0-1.el7_5.x86_64
>>>
>>> Do you think the problem could be related to the kernel (conntrack) and
>>> the kernel must be upgraded here?
>>> Or maybe I should try master OVS, as you did?
>>
>> I just tried with OVS v2.13.4, OVN master and it all worked fine (both
>> on Fedora 34 and rhel 7). I don't think the problem is in user space.
>>
>> Regards,
>> Dumitru
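[Since the difference appears to come down to which openvswitch kernel module
is loaded, it may be worth double-checking that on the affected host. This is
a sketch; typically the in-tree module lives under kernel/net/openvswitch/ in
the module path, while the OOT kmod package installs elsewhere (e.g. extra/ or
updates/) and reports the OVS release as its version.]

# Which openvswitch module file would be loaded, and what does it report?
modinfo openvswitch | grep -E '^(filename|version|vermagic)'

# The running datapath also logs its capabilities at startup:
grep 'Datapath supports' /var/log/openvswitch/ovs-vswitchd.log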
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
