Paolo Valerio, Jun 05, 2022 at 19:37:
> Just a note that may be useful.
> After some tests, I noticed that establishing e.g. two TCP connections,
> and leaving the first one idle after 3whs, once the second connection
> expires (after moving to TIME_WAIT as a result of termination), the
> second doesn't get evicted until any event gets scheduled for the first.
>
> ovs-appctl dpctl/dump-conntrack -s
> tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=9090,dport=8080),reply=(src=10.1.1.2,dst=10.1.1.1,sport=8080,dport=9090),zone=1,timeout=84576,protoinfo=(state=ESTABLISHED)
> tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=9091,dport=8080),reply=(src=10.1.1.2,dst=10.1.1.1,sport=8080,dport=9091),zone=1,timeout=0,protoinfo=(state=TIME_WAIT)
>
> This may be somewhat related to your results as during the
> test, the number of connections may reach the limit so apparently reducing
> the performances.
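As a side note, such stale entries can be inspected and cleared by hand; a minimal sketch (zone=1 matches the dump above, and flush-conntrack with no argument clears the whole table):

```shell
# Show current entries with per-connection status, as in the dump above.
ovs-appctl dpctl/dump-conntrack -s

# Flush only zone 1 so connections in other zones are left untouched.
ovs-appctl dpctl/flush-conntrack zone=1
```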
Indeed, there was an issue in my test procedure. Due to the way T-Rex
generates connections, it is easy to fill the conntrack table after a few
iterations, making the test results inconsistent.

Also, the flows I had configured were not correct: there was an extraneous
action=NORMAL flow at the end. When the conntrack table is full and a new
packet cannot be tracked, it is marked as +trk+inv and not dropped. This
behaviour is specific to the userspace datapath. The Linux kernel datapath
seems to drop the packet when it cannot be added to connection tracking.
Gaëtan's series (v4) seems less resilient to the conntrack table being
full, especially when there is more than one PMD core.

I have changed the t-rex script to allow running arbitrary commands in
between traffic iterations. This is leveraged to flush the conntrack table
and run each iteration in the same conditions.

https://github.com/cisco-system-traffic-generator/trex-core/blob/v2.98/scripts/cps_ndr.py

To avoid filling the conntrack table, the max size was increased to 50M.
The DUT configuration can be summarized as follows:

ovs-vsctl set open_vswitch . other_config:dpdk-init=true
ovs-vsctl set open_vswitch . other_config:pmd-cpu-mask="0x15554"
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 pf0 -- set Interface pf0 type=dpdk \
    options:dpdk-devargs=0000:3b:00.0 options:n_rxq=4 options:n_rxq_desc=4096
ovs-vsctl add-port br0 pf1 -- set Interface pf1 type=dpdk \
    options:dpdk-devargs=0000:3b:00.1 options:n_rxq=4 options:n_rxq_desc=4096
ovs-appctl dpctl/ct-set-maxconns 50000000
ovs-ofctl add-flow br0 "table=0,priority=10,ip,ct_state=-trk,actions=ct(table=0)"
ovs-ofctl add-flow br0 "table=0,priority=10,ip,ct_state=+trk+new,actions=ct(commit),NORMAL"
ovs-ofctl add-flow br0 "table=0,priority=10,ip,ct_state=+trk+est,actions=NORMAL"
ovs-ofctl add-flow br0 "table=0,priority=0,actions=drop"

Short Lived Connections
-----------------------

./cps_ndr.py --sample-time 10 --max-iterations 8 --error-threshold 0.01 \
    --reset-command "ssh $dut ovs-appctl dpctl/flush-conntrack" \
    --udp-percent 1 --num-messages 1 --message-size 20 --server-wait 0 \
    --min-cps 10k --max-cps 600k

============== =============== ============== ============== ===========
Series         Num. Flows      CPS            PPS            BPS
============== =============== ============== ============== ===========
Baseline       40.1K           79.3K/s        556Kp/s        347Mb/s
Gaetan v1      60.5K           121K/s         837Kp/s        522Mb/s
Gaetan v4      61.4K           122K/s         850Kp/s        530Mb/s
Paolo          377K            756K/s         5.3Mp/s        3.3Gb/s
============== =============== ============== ============== ===========

Even after fixing the test procedure, Paolo's series still performs a lot
better with short lived connections.

Long Lived Connections
----------------------

./cps_ndr.py --sample-time 30 --max-iterations 8 --error-threshold 0.01 \
    --reset-command "ssh $dut ovs-appctl dpctl/flush-conntrack" \
    --udp-percent 1 --num-messages 500 --message-size 20 --server-wait 50 \
    --min-cps 100 --max-cps 10k

============== =============== ============== ============== ===========
Series         Num. Flows      CPS            PPS            BPS
============== =============== ============== ============== ===========
Baseline       17.4K           504/s          633Kp/s        422Mb/s
Gaetan v1      80.4K           3.1K/s         4.6Mp/s        3.0Gb/s
Gaetan v4      139K            5.4K/s         8.2Mp/s        5.4Gb/s
Paolo          132K            5.2K/s         7.7Mp/s        5.2Gb/s
============== =============== ============== ============== ===========

Thanks to Paolo for his help on this second round of tests.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev