Hi Dumitru, thanks for your reply.
Regards,
Vladislav Odintsov

> On 15 Sep 2021, at 11:24, Dumitru Ceara <[email protected]> wrote:
>
> Hi Vladislav,
>
> On 9/13/21 6:14 PM, Vladislav Odintsov wrote:
>> Hi Numan,
>>
>> I’ve checked with OVS 2.16.0 and OVN master. The problem persists.
>> Symptoms are the same.
>>
>> # grep ct_zero_snat /var/log/openvswitch/ovs-vswitchd.log
>> 2021-09-13T16:10:01.792Z|00019|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_zero_snat
>
> This shouldn't be related to the problem we fixed with ct_zero_snat.
>
>>
>> Regards,
>> Vladislav Odintsov
>>
>>> On 13 Sep 2021, at 17:54, Numan Siddique <[email protected]> wrote:
>>>
>>> On Mon, Sep 13, 2021 at 8:10 AM Vladislav Odintsov <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> we’ve run into another problem with stateful ACLs.
>>>>
>>>> Suppose we have one logical switch (ls1) with two VIF-type logical ports attached to it (lsp1, lsp2).
>>>> Behind each logical port there is a Linux VM.
>>>>
>>>> The logical ports reside in a port group (pg1) and two ACLs are created for this PG:
>>>> to-lport outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0 allow-related
>>>> from-lport outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0 allow-related
>>>>
>>>> When there is a high connection rate between the VMs, a TCP source/destination port pair may be reused before the old connection is removed from the LSPs' conntrack zones on the host.
>>>> Let’s use curl with the --local-port argument so that every run uses the same source port.
>>>>
>>>> Run it from one VM to the other (172.31.0.18 -> 172.31.0.17):
>>>> curl --local-port 44444 http://172.31.0.17/
>>>>
>>>> Check the connections in the client’s and server’s VIF zones (client - zone=20, server - zone=1):
>>>> run a while-true loop that prints the connection state a few times per second, while starting a new connection with the same source/dest 5-tuple:
>>>>
>>>> while true; do date; grep -e 'zone=1 ' -e zone=20 /proc/net/nf_conntrack; sleep 0.2; done
>>>>
>>>> Right after curl completes successfully, the connection goes through CLOSE_WAIT and then TIME_WAIT:
>>>>
>>>> Mon Sep 13 14:34:39 MSK 2021
>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>> Mon Sep 13 14:34:39 MSK 2021
>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>>
>>>> And it stays in TIME_WAIT for nf_conntrack_time_wait_timeout (120 seconds on CentOS 7).
>>>>
>>>> Everything is okay so far.
>>>> While the TIME_WAIT entries are still installed in zones 1 and 20, let’s run the same curl (source port 44444) again:
>>>> the first SYN packet is lost; it never reaches the destination VM. In conntrack we have:
>>>>
>>>> Mon Sep 13 14:34:41 MSK 2021
>>>> ipv4 2 tcp 6 118 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>>
>>>> We see that the TIME_WAIT entry was dropped in the source VIF’s zone (20).
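(Side note, in case someone wants to reproduce this: the setup described above boils down to roughly the following ovn-nbctl calls. The ls1/lsp1/lsp2/pg1 names and the ACL matches are copied verbatim from the description; the ACL priority and the MAC address placeholders are made up here and have to be adjusted for a real deployment.)

  ovn-nbctl ls-add ls1
  ovn-nbctl lsp-add ls1 lsp1
  ovn-nbctl lsp-set-addresses lsp1 "<lsp1-mac> 172.31.0.18"
  ovn-nbctl lsp-add ls1 lsp2
  ovn-nbctl lsp-set-addresses lsp2 "<lsp2-mac> 172.31.0.17"
  ovn-nbctl pg-add pg1 lsp1 lsp2
  ovn-nbctl acl-add pg1 to-lport 1001 'outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0' allow-related
  ovn-nbctl acl-add pg1 from-lport 1001 'outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0' allow-related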
>>>>
>>>> Next, after one second TCP retransmits the SYN; the entry in the destination (server’s) zone is dropped and a new connection is created in the source (client’s) zone:
>>>>
>>>> Mon Sep 13 14:34:41 MSK 2021
>>>> ipv4 2 tcp 6 120 SYN_SENT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 [UNREPLIED] src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 mark=0 zone=20 use=2
>>>>
>>>> The server VM still didn’t receive this SYN packet. It got dropped.
>>>>
>>>> Then, after 2 more seconds, TCP retransmits again and the connection works fine:
>>>>
>>>> Mon Sep 13 14:34:44 MSK 2021
>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>> ipv4 2 tcp 6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>> Mon Sep 13 14:34:44 MSK 2021
>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=1 use=2
>>>> ipv4 2 tcp 6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 sport=44444 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 dport=44444 [ASSURED] mark=0 zone=20 use=2
>>>>
>>>> My guess at what happens:
>>>> 1. Run curl with empty conntrack zones. Everything is good: we get the HTTP response and close the connection. There is one TIME_WAIT entry in the client’s and one in the server’s conntrack zone.
>>>> 2. Run curl with the same source port within nf_conntrack_time_wait_timeout seconds.
>>>> 2.1. OVS gets the packet from the VM and sends it to the client’s conntrack zone=20. It matches the pre-existing TIME_WAIT entry from the previous curl run, and that entry is deleted. A copy of the packet is returned to OVS, and the recirculated packet has ct.inv (?) and !ct.trk set and gets dropped (I’m NOT sure, it’s just an assumption!).
>>>> 3. After one second the client VM resends the TCP SYN.
>>>> 3.1. OVS gets the packet and sends it through the client’s conntrack zone=20; a new connection is added and the packet comes back with ct.trk and ct.new set. The packet goes to recirculation.
>>>> 3.2. OVS sends the packet to the server’s conntrack zone=1. It matches the pre-existing TIME_WAIT entry from the previous run, and conntrack removes that entry. The packet is returned to OVS with ct.inv (?) and !ct.trk and gets dropped.
>>>> 4. The client VM sends the TCP SYN again after 2 more seconds.
>>>> 4.1. OVS gets the packet from the client’s VIF and sends it to the client’s conntrack zone=20; it matches the pre-existing SYN_SENT entry, and the packet is returned to OVS with ct.new and ct.trk set.
>>>> 4.2. OVS sends the packet to the server’s conntrack zone=1. The conntrack table for zone=1 is empty, so a new entry is added and the packet is returned to OVS with ct.trk and ct.new set.
>>>> 4.3. OVS sends the packet to the server’s VIF, and subsequent traffic operates normally.
>>>>
>>>> So, with this behaviour connection establishment sometimes takes up to three seconds (2 TCP SYN retries), which causes trouble for overlay services (application timeouts and service outages).
>>>>
>>>> I’ve also checked how conntrack behaves inside the VMs with such traffic: it looks like when conntrack gets a packet that matches a TIME_WAIT connection, it simply recreates the entry. No tuning was done inside the VMs. As the server I used Apache with the default config from the CentOS distribution.
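(To add how I tried to confirm the "dropped as invalid" assumption in steps 2.1 and 3.2: while the second curl with the same source port is in flight, I grep the datapath flows for the conntrack state they matched. This is only a rough check; the second command is essentially the same one you run below.)

  ovs-appctl dpctl/dump-flows | grep -E 'ct_state\([^)]*\+inv'
  ovs-appctl dpctl/dump-flows | grep -oE 'ct_state\(.*\),ct_label'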
>
> I don't have a CentOS 7 at hand but I do have a RHEL 7 (3.10.0-1160.36.2.el7.x86_64) and I didn't manage to hit the issue you reported here (using OVS and OVN upstream master). The SYN matching the conntrack entry in state TIME_WAIT moves the entry to NEW and seems to be forwarded just fine; the session afterwards goes to ESTABLISHED.
>
> Wed Sep 15 04:18:35 AM EDT 2021
> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
> tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
> tcp 6 431930 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
> --
> Wed Sep 15 04:18:36 AM EDT 2021
> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
> tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
> tcp 6 119 TIME_WAIT src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
> --
> Wed Sep 15 04:18:38 AM EDT 2021
> conntrack v1.4.5 (conntrack-tools): 7 flow entries have been shown.
> tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=6 use=1
> tcp 6 431999 ESTABLISHED src=42.42.42.2 dst=42.42.42.3 sport=4141 dport=4242 src=42.42.42.3 dst=42.42.42.2 sport=4242 dport=4141 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=3 use=1
> --
>
> DP flows just after the second session is initiated also seem to confirm that everything is fine:
>
> # ovs-appctl dpctl/dump-flows | grep -oE "ct_state(.*),ct_label"
> ct_state(+new-est-rel-rpl-inv+trk),ct_label
> ct_state(-new+est-rel-rpl-inv+trk),ct_label
> ct_state(-new+est-rel+rpl-inv+trk),ct_label
> ct_state(+new-est-rel-rpl-inv+trk),ct_label
> ct_state(-new+est-rel+rpl-inv+trk),ct_label
> ct_state(-new+est-rel-rpl-inv+trk),ct_label
>
> I also tried it out on a Fedora 34 with 5.13.14-200.fc34.x86_64, still works fine.
>
> What kernel and openvswitch module versions do you use?

On my box there is CentOS 7.5 with kernel 3.10.0-862.14.4.el7 and the out-of-tree (OOT) kernel module. I’ve tested two versions, and the problem was hit with both:

openvswitch-kmod-2.13.4-1.el7_5.x86_64
openvswitch-kmod-2.16.0-1.el7_5.x86_64

Do you think the problem could be related to the kernel (conntrack), so the kernel must be upgraded here? Or maybe I should try master OVS, as you did?
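(For completeness, this is roughly how I collect the versions above; these are standard commands, nothing specific to this setup, and the openvswitch-kmod package name is just how the OOT module is packaged in my case.)

  uname -r
  rpm -q openvswitch-kmod
  modinfo openvswitch | grep -E '^(filename|version|vermagic)'
  ovs-vswitchd --version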
> Regards,
> Dumitru
>
>>>>
>>>> @Numan, @Han, @Mark, can you please take a look at this and share any suggestions/thoughts on how it can be fixed?
>>>> The problem is present with OVS 2.13.4 and the latest OVN master branch; we originally hit it on 20.06.3 with the same OVS, and it’s very important for us.
>>>
>>> Hi Vladislav,
>>>
>>> From what I understand this commit should help your use case -
>>> https://github.com/ovn-org/ovn/commit/58683a4271e6a885f2f2aea27f3df88e69a5c388
>>>
>>> Looks to me like there's a tuple collision. And you would need the latest OVS (OVS 2.16) along with the latest OVN having the above commit.
>>>
>>> @Dumitru Ceara please correct me if I'm wrong.
>>>
>>> Thanks
>>> Numan
>>>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> Regards,
>>>> Vladislav Odintsov

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
