On Fri, Jun 21, 2019, 11:47 AM Han Zhou <[email protected]> wrote:
>
> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <[email protected]> wrote:
> >
> > Thanks a lot Han for the answer!
> >
> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou <[email protected]> wrote:
> > >
> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara <[email protected]> wrote:
> > > >
> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez <[email protected]> wrote:
> > > > >
> > > > > Hi Han, all,
> > > > >
> > > > > Lucas, Numan and I have been doing some 'scale' testing of OpenStack using OVN and wanted to present some results and issues that we've found with the Incremental Processing feature in ovn-controller. Below is the scenario that we executed:
> > > > >
> > > > > * 7 baremetal nodes: 3 controllers (running ovn-northd/ovsdb-servers in A/P with pacemaker) + 4 compute nodes, on OVS 2.10.
> > > > > * The test consists of:
> > > > >   - Create an OpenStack network (OVN LS), subnet and router
> > > > >   - Attach the subnet to the router and set the gateway to the external network
> > > > >   - Create an OpenStack port and apply a Security Group (ACLs to allow UDP, SSH and ICMP)
> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by attaching it to a network namespace (see the sketch below)
> > > > >   - Wait for the port to be ACTIVE in Neutron ('up == True' in NB)
> > > > >   - Wait until the test can ping the port
> > > > > * Run browbeat/rally with 16 simultaneous processes to execute the test above 150 times.
> > > > > * When all 150 'fake VMs' are created, browbeat deletes all the OpenStack/OVN resources.
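For reference, the "bind the port by attaching it to a network namespace" step above is usually emulated by plugging an OVS internal port into br-int with its iface-id set to the Neutron port UUID, so that ovn-controller claims it like a real VIF. A rough sketch (the interface/namespace name, MAC, address and port UUID below are made up):

    # On the chosen compute node
    ovs-vsctl add-port br-int fakevm1 -- set Interface fakevm1 type=internal \
        external_ids:iface-id=<neutron-port-uuid>
    ip netns add fakevm1
    ip link set fakevm1 netns fakevm1
    ip netns exec fakevm1 ip link set fakevm1 address fa:16:3e:11:22:33
    ip netns exec fakevm1 ip addr add 10.0.0.10/24 dev fakevm1
    ip netns exec fakevm1 ip link set fakevm1 up

Once ovn-controller sees the iface-id it should claim the port (the chassis column of the SB Port_Binding gets set), which in turn is what lets Neutron report the port as ACTIVE.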
> > > > > We first tried with OVS/OVN 2.10 and pulled some results which showed 100% success, but ovn-controller is quite loaded (as expected) on all the nodes, especially during the deletion phase:
> > > > >
> > > > > - Compute node: https://imgur.com/a/tzxfrIR
> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/8ffKKYF
> > > > >
> > > > > After conducting the tests above, we replaced ovn-controller on all 7 nodes with the one from the current master branch (actually from last week). We also replaced ovn-northd and the ovsdb-servers, but ovs-vswitchd has been left untouched (still on 2.10). The expected result was lower ovn-controller CPU usage and better times thanks to the recently introduced Incremental Processing feature. However, the results don't look very good:
> > > > >
> > > > > - Compute node: https://imgur.com/a/wuq87F1
> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/99kiyDp
> > > > >
> > > > > One thing that we can tell from the ovs-vswitchd CPU consumption is that it is much lower in the Incremental Processing (IP) case, which apparently doesn't make much sense. This led us to think that perhaps ovn-controller was not installing the necessary flows in the switch, and we confirmed this hypothesis by looking into the dataplane results: out of the 150 VMs, 10% of them were unreachable via ping when using ovn-controller from master.
> > > > >
> > > > > @Han, others, do you have any ideas as to what could be happening here? We'll be able to use this setup for a few more days, so let me know if you want us to pull some other data/traces, ...
> > > > >
> > > > > Some other interesting things: on each of the compute nodes (with an almost evenly distributed number of logical ports bound to them), the maximum number of OpenFlow flows in br-int is ~90K (by the end of the test, right before deleting the resources).
> > > > >
> > > > > It looks like with the IP version, ovn-controller leaks some memory: https://imgur.com/a/trQrhWd
> > > > > While with OVS 2.10 it remains pretty flat during the test: https://imgur.com/a/KCkIT4O
> > > >
> > > > Hi Daniel, Han,
> > > >
> > > > I just sent a small patch for the ovn-controller memory leak: https://patchwork.ozlabs.org/patch/1113758/
> > > >
> > > > At least on my setup this is what valgrind was pointing at.
> > > >
> > > > Cheers,
> > > > Dumitru
> > > >
> > > > > Looking forward to hearing back :)
> > > > > Daniel
> > > > >
> > > > > PS. Sorry for my previous email, I sent it by mistake without the subject.
> > >
> > > Thanks Daniel for the testing and reporting, and thanks Dumitru for fixing the memory leak.
> > >
> > > Currently ovn-controller incremental processing only handles the SB changes below incrementally:
> > > - logical_flow
> > > - port_binding (for regular VIF bindings NOT on the current chassis)
> > > - mc_group
> > > - address_set
> > > - port_group
> > > - mac_binding
> > >
> > > So, in the test scenario you described, since each iteration creates a network (SB datapath changes) and router ports (port_binding changes for non-VIF ports), incremental processing would not help much, because most steps in your test trigger a recompute. It would help if you created more fake VMs in each iteration, e.g. 10 VMs or more on each LS. Secondly, when a VIF port binding happens on the current chassis, ovn-controller still does a recompute, and because you have only 4 compute nodes, 1/4 of the bindings will still cause a recompute on the binding chassis even for a regular VIF port. With more compute nodes you would see incremental processing being more effective.
> >
> > Got it, it makes sense (although then, worst case, it should be at least as good as what we had before and not worse; but it can also be because we're mixing versions here: 2.10 vs master).
> >
> > > However, what really worries me is the 10% of VMs being unreachable. I have one point of confusion about the test steps. The last step you described was "Wait until the test can ping the port". So if the VM is not pingable, the test won't continue?
> >
> > Sorry, I should've explained it better. We wait for 2 minutes for the port to respond to pings; if it's not reachable then we continue with the next port (16 rally processes are running simultaneously, so the rest of the processes may be doing stuff at the same time).
> >
> > > To debug the problem, the first thing is to identify what flows are missing for the VMs that are unreachable. Could you do ovs-appctl ofproto/trace for the ICMP flow of any VM with ping failure? And then, please enable debug logging for ovn-controller with "ovs-appctl -t ovn-controller vlog/set file:dbg". There may be too many logs, so please enable it only for as short a time as it takes to reproduce a VM with ping failure. If the last step "wait until the test can ping the port" is there, then it should be able to detect the first occurrence if the VM is not reachable in e.g. 30 sec.
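To make the trace suggestion above concrete, on the hypervisor hosting an unreachable fake VM the commands could look roughly like this (the interface name, MACs and IPs are placeholders to be filled in from the failing port):

    # Find the OpenFlow port number of the fake VM's interface on br-int
    ovs-vsctl get Interface fakevm1 ofport

    # Trace an ICMP echo request from the VM towards its gateway
    ovs-appctl ofproto/trace br-int \
        'in_port=<ofport>,icmp,dl_src=<vm-mac>,dl_dst=<gw-mac>,nw_src=<vm-ip>,nw_dst=<gw-ip>'

    # Turn ovn-controller debug logging on, reproduce, then turn it back down
    ovs-appctl -t ovn-controller vlog/set file:dbg
    ovs-appctl -t ovn-controller vlog/set file:info

Comparing the trace output for a reachable and an unreachable VM should show in which table the flows diverge or go missing.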
> >
> > We'll need to hack a bit here but let's see :)
> >
> > > In ovn-scale-test we didn't have a data plane test, but this problem was not seen in our live environment either, at a far larger scale. The major differences between your test and our environment are:
> > > - We are running an older version, so there might be some rebase/refactor problem that caused this. To eliminate this, I'd suggest trying a branch I created for 2.10 (https://github.com/hzhou8/ovs/tree/ip12_rebase_on_2.10), which matches the baseline of your test, which is also 2.10. It would also eliminate any compatibility problem, if there is one, between the OVN master branch and the OVS 2.10 you mentioned is used in the test.
> > > - We don't use Security Groups (I guess the ~90k OVS flows you mentioned were mainly introduced by the Security Group, if all ports were put in the same group). Incremental processing is expected to be correct for security groups, and to handle them incrementally thanks to the address_set and port_group incremental processing. However, since the testing relied only on the regression tests, I am not 100% sure the coverage was sufficient. So could you try disabling Security Groups to rule out that problem?
> >
> > Ok, will try to repeat the tests without the SGs.
> >
> > > Thanks,
> > > Han
> >
> > Thanks once again!
> > Daniel
>
> Hi Daniel,
>
> Any updates? Do you still see 10% of the VMs unreachable?
>
> Thanks,
> Han
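One way to act on the suggestion above of ruling out Security Groups, without changing the rally job itself, would be to strip port security from one of the affected ports and compare the flow footprint (the port ID is a placeholder; the flags are from the standard OpenStack client):

    # Detach the security group and disable port security on a test port
    openstack port set --no-security-group --disable-port-security <port-id>

    # Compare the OpenFlow footprint on the compute node before/after
    ovs-ofctl -O OpenFlow13 dump-flows br-int | wc -l

If the unreachable ports become pingable with port security off, the missing flows are most likely in the ACL/conntrack stages rather than in basic L2/L3 forwarding.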
Hi Han,

As such there is no datapath impact. After increasing the ping wait timeout value from 120 seconds to 180 seconds it is 100% now. But the time taken to program the flows is much higher than with OVN master without the IP patches. Here is some data: http://paste.openstack.org/show/753224/ . I am still investigating it and will update my findings in some time. Please see the times for the vm.wait_for_ping action.

Thanks
Numan

***** OVN Master with No IP patches

+----------------------------------------------------------------------------------------------------------------------------------+
|                                                       Response Times (sec)                                                        |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| Action                        | Min (sec) | Median (sec) | 90%ile (sec) | 95%ile (sec) | Max (sec) | Avg (sec) | Success | Count |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| neutron.create_router         | 4.092     | 8.584        | 45.556       | 51.174       | 69.556    | 18.541    | 100.0%  | 150   |
| neutron.create_network        | 0.53      | 2.21         | 8.724        | 9.696        | 17.238    | 3.774     | 100.0%  | 150   |
| neutron.create_subnet         | 0.561     | 2.554        | 12.159       | 13.399       | 17.784    | 4.893     | 100.0%  | 150   |
| neutron.add_interface_router  | 3.946     | 8.342        | 59.856       | 70.381       | 84.243    | 23.663    | 100.0%  | 150   |
| neutron.create_floating_ip    | 2.296     | 7.12         | 34.13        | 39.495       | 56.925    | 13.441    | 100.0%  | 150   |
| -> neutron.list_networks      | 0.122     | 1.639        | 4.078        | 4.48         | 5.915     | 1.762     | 100.0%  | 150   |
| neutron.create_port           | 1.281     | 4.527        | 32.941       | 37.49        | 47.53     | 11.482    | 100.0%  | 150   |
| neutron._associate_fip        | 0.739     | 2.934        | 14.862       | 17.891       | 22.46     | 5.6       | 100.0%  | 150   |
| neutron._wait_for_port_active | 0.06      | 3.229        | 6.642        | 17.374       | 44.244    | 4.771     | 100.0%  | 150   |
| vm.wait_for_ping              | 0.008     | 4.023        | 13.056       | 15.066       | 63.83     | 5.605     | 100.0%  | 150   |
| total                         | 35.488    | 89.948       | 140.028      | 151.065      | 186.065   | 92.585    | 100.0%  | 150   |
| -> duration                   | 35.488    | 89.948       | 140.028      | 151.065      | 186.065   | 92.585    | 100.0%  | 150   |
| -> idle_duration              | 0.0       | 0.0          | 0.0          | 0.0          | 0.0       | 0.0       | 100.0%  | 150   |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
***** OVN master with IP patches

+----------------------------------------------------------------------------------------------------------------------------------+
|                                                       Response Times (sec)                                                        |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| Action                        | Min (sec) | Median (sec) | 90%ile (sec) | 95%ile (sec) | Max (sec) | Avg (sec) | Success | Count |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| neutron.create_router         | 2.209     | 7.052        | 39.981       | 43.599       | 70.568    | 13.703    | 100.0%  | 150   |
| neutron.create_network        | 0.26      | 1.606        | 6.723        | 8.062        | 12.443    | 2.671     | 100.0%  | 150   |
| neutron.create_subnet         | 0.388     | 2.009        | 9.205        | 11.574       | 19.874    | 3.621     | 100.0%  | 150   |
| neutron.add_interface_router  | 2.523     | 6.964        | 54.998       | 62.473       | 90.751    | 17.793    | 100.0%  | 150   |
| neutron.create_floating_ip    | 1.927     | 6.104        | 28.42        | 34.572       | 46.617    | 11.169    | 100.0%  | 150   |
| -> neutron.list_networks      | 0.11      | 1.051        | 2.854        | 3.765        | 4.564     | 1.318     | 100.0%  | 150   |
| neutron.create_port           | 1.051     | 3.585        | 27.96        | 31.856       | 56.297    | 8.227     | 100.0%  | 150   |
| neutron._associate_fip        | 0.753     | 2.414        | 13.392       | 15.23        | 19.643    | 4.363     | 100.0%  | 150   |
| neutron._wait_for_port_active | 0.04      | 3.976        | 7.673        | 11.712       | 19.408    | 4.623     | 100.0%  | 150   |
| vm.wait_for_ping              | 0.013     | 28.122       | 110.513      | 127.63       | 148.637   | 42.417    | 100.0%  | 150   |
| total                         | 23.746    | 107.644      | 159.715      | 172.618      | 191.713   | 109.398   | 100.0%  | 150   |
| -> duration                   | 23.746    | 107.644      | 159.715      | 172.618      | 191.713   | 109.398   | 100.0%  | 150   |
| -> idle_duration              | 0.0       | 0.0          | 0.0          | 0.0          | 0.0       | 0.0       | 100.0%  | 150   |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
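The vm.wait_for_ping numbers above are essentially the time rally spends waiting until the fake VM first answers a ping. Outside rally, roughly the same measurement can be taken per port with something like the following (the namespace name and floating IP are placeholders), which makes it easier to correlate a slow port with the ovn-controller logs on the chassis that claimed it:

    start=$(date +%s)
    # Poll until the fake VM's floating IP answers one echo request
    until ip netns exec fakevm1 ping -c 1 -W 1 203.0.113.10 >/dev/null 2>&1; do
        sleep 1
    done
    echo "data plane ready after $(( $(date +%s) - start ))s"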
_______________________________________________ discuss mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
