On Mon, Jul 22, 2019 at 12:35 PM Daniel Alvarez Sanchez <[email protected]> wrote:
> Neat! Thanks folks :)
> I'll try to get an OSP setup where we can patch this and re-run the same tests as last time to confirm, but it looks promising.
>
> On Fri, Jul 19, 2019 at 11:12 PM Han Zhou <[email protected]> wrote:
> > On Fri, Jul 19, 2019 at 12:37 PM Numan Siddique <[email protected]> wrote:
> >> On Fri, Jul 19, 2019 at 6:19 PM Numan Siddique <[email protected]> wrote:
> >>> On Fri, Jul 19, 2019 at 6:28 AM Han Zhou <[email protected]> wrote:
> >>>> On Tue, Jul 9, 2019 at 12:13 AM Numan Siddique <[email protected]> wrote:
> >>>> > On Tue, Jul 9, 2019 at 12:25 PM Daniel Alvarez Sanchez <[email protected]> wrote:
> >>>> >>
> >>>> >> Thanks Numan for running these tests outside OpenStack!
> >>>> >>
> >>>> >> On Tue, Jul 9, 2019 at 7:50 AM Numan Siddique <[email protected]> wrote:
> >>>> >> > On Tue, Jul 9, 2019 at 11:05 AM Han Zhou <[email protected]> wrote:
> >>>> >> >> On Fri, Jun 21, 2019 at 12:31 AM Han Zhou <[email protected]> wrote:
> >>>> >> >> > On Thu, Jun 20, 2019 at 11:42 PM Numan Siddique <[email protected]> wrote:
> >>>> >> >> > > On Fri, Jun 21, 2019, 11:47 AM Han Zhou <[email protected]> wrote:
> >>>> >> >> > >> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <[email protected]> wrote:
> >>>> >> >> > >> >
> >>>> >> >> > >> > Thanks a lot Han for the answer!
> >>>> >> >> > >> >
> >>>> >> >> > >> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou <[email protected]> wrote:
> >>>> >> >> > >> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara <[email protected]> wrote:
> >>>> >> >> > >> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez <[email protected]> wrote:
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > Hi Han, all,
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > Lucas, Numan and I have been doing some 'scale' testing of OpenStack using OVN and wanted to present some results and issues that we've found with the Incremental Processing feature in ovn-controller. Below is the scenario that we executed:
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > * 7 baremetal nodes setup: 3 controllers (running ovn-northd/ovsdb-servers in A/P with pacemaker) + 4 compute nodes. OVS 2.10.
> >>>> >> >> > >> > > > > * The test consists of:
> >>>> >> >> > >> > > > >   - Create an OpenStack network (OVN LS), subnet and router
> >>>> >> >> > >> > > > >   - Attach the subnet to the router and set the gateway to the external network
> >>>> >> >> > >> > > > >   - Create an OpenStack port and apply a Security Group (ACLs to allow UDP, SSH and ICMP)
> >>>> >> >> > >> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by attaching it to a network namespace
> >>>> >> >> > >> > > > >   - Wait for the port to be ACTIVE in Neutron ('up == True' in NB)
> >>>> >> >> > >> > > > >   - Wait until the test can ping the port
> >>>> >> >> > >> > > > > * Run browbeat/rally with 16 simultaneous processes to execute the test above 150 times.
> >>>> >> >> > >> > > > > * When all 150 'fake VMs' are created, browbeat deletes all the OpenStack/OVN resources.
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > We first tried with OVS/OVN 2.10 and pulled some results which showed 100% success, but ovn-controller is quite loaded (as expected) in all the nodes, especially during the deletion phase:
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > - Compute node: https://imgur.com/a/tzxfrIR
> >>>> >> >> > >> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/8ffKKYF
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > After conducting the tests above, we replaced ovn-controller on all 7 nodes with the one from the current master branch (actually from last week). We also replaced ovn-northd and the ovsdb-servers, but ovs-vswitchd has been left untouched (still on 2.10). The expected results were less ovn-controller CPU usage and also better times due to the Incremental Processing feature introduced recently. However, the results don't look very good:
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > - Compute node: https://imgur.com/a/wuq87F1
> >>>> >> >> > >> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/99kiyDp
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > One thing that we can tell from the ovs-vswitchd CPU consumption is that it's much lower in the Incremental Processing (IP) case, which apparently doesn't make much sense. This led us to think that perhaps ovn-controller was not installing the necessary flows in the switch, and we confirmed this hypothesis by looking into the dataplane results. Out of the 150 VMs, 10% of them were unreachable via ping when using ovn-controller from master.
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > @Han, others, do you have any ideas as to what could be happening here? We'll be able to use this setup for a few more days, so let me know if you want us to pull some other data/traces, ...
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > Some other interesting things: on each of the compute nodes (with an almost evenly distributed number of logical ports bound to them), the maximum number of logical flows in br-int is ~90K (by the end of the test, right before deleting the resources).
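For readers less familiar with this kind of setup, the "fake VM" binding step in the scenario above usually amounts to a namespace plus an OVS internal port on br-int whose iface-id matches the OVN logical port. A minimal sketch, with all names, the MAC and the IP made up for illustration (the real test drives this through Neutron):

    # Minimal sketch of binding a "fake VM" to an existing OVN logical port.
    LPORT=port-0001          # hypothetical: the Neutron port UUID / OVN lport name
    ip netns add vm1
    ovs-vsctl add-port br-int vm1 -- set Interface vm1 type=internal \
        external_ids:iface-id=$LPORT
    ip link set vm1 netns vm1
    ip netns exec vm1 ip link set vm1 address 50:54:00:00:00:01   # must match the lport MAC
    ip netns exec vm1 ip addr add 10.0.0.10/24 dev vm1
    ip netns exec vm1 ip link set vm1 up
    # Once ovn-controller sees the iface-id it claims the Port_Binding, and the
    # logical port goes 'up == True' in the NB DB, which is what the test waits for.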
> >>>> >> >> > >> > > > > It looks like with the IP version, ovn-controller leaks some memory: https://imgur.com/a/trQrhWd
> >>>> >> >> > >> > > > > While with OVS 2.10 it remains pretty flat during the test: https://imgur.com/a/KCkIT4O
> >>>> >> >> > >> > > >
> >>>> >> >> > >> > > > Hi Daniel, Han,
> >>>> >> >> > >> > > >
> >>>> >> >> > >> > > > I just sent a small patch for the ovn-controller memory leak: https://patchwork.ozlabs.org/patch/1113758/
> >>>> >> >> > >> > > > At least on my setup this is what valgrind was pointing at.
> >>>> >> >> > >> > > >
> >>>> >> >> > >> > > > Cheers,
> >>>> >> >> > >> > > > Dumitru
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > Looking forward to hearing back :)
> >>>> >> >> > >> > > > > Daniel
> >>>> >> >> > >> > > > >
> >>>> >> >> > >> > > > > PS. Sorry for my previous email, I sent it by mistake without the subject.
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > Thanks Daniel for the testing and reporting, and thanks Dumitru for fixing the memory leak.
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > Currently ovn-controller incremental processing only handles the below SB changes incrementally:
> >>>> >> >> > >> > > - logical_flow
> >>>> >> >> > >> > > - port_binding (for regular VIF bindings NOT on the current chassis)
> >>>> >> >> > >> > > - mc_group
> >>>> >> >> > >> > > - address_set
> >>>> >> >> > >> > > - port_group
> >>>> >> >> > >> > > - mac_binding
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > So, in the test scenario you described, since each iteration creates a network (SB datapath changes) and router ports (port_binding changes for non-VIF ports), incremental processing would not help much, because most steps in your test should trigger a recompute. It would help if you created more fake VMs in each iteration, e.g. 10 VMs or more on each LS. Secondly, when a VIF port binding happens on the current chassis, ovn-controller will still recompute, and because you have only 4 compute nodes, 1/4 of the compute nodes will still recompute even when binding a regular VIF port. With more compute nodes you would see incremental processing being more effective.
> >>>> >> >> > >> >
> >>>> >> >> > >> > Got it, it makes sense (although then, in the worst case, it should be at least what we had before and not worse, but it can also be because we're mixing versions here: 2.10 vs master).
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > However, what really worries me is the 10% of VMs being unreachable. I have one confusion here on the test steps. The last step you described was "Wait until the test can ping the port". So if the VM is not pingable the test won't continue?
> >>>> >> >> > >> >
> >>>> >> >> > >> > Sorry, I should've explained it better. We wait for 2 minutes for the port to respond to pings; if it's not reachable then we continue with the next port (16 rally processes are running simultaneously, so the rest of the processes may be doing stuff at the same time).
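As an aside, the kind of leak hunt Dumitru mentions above is typically done by running ovn-controller under valgrind for a short test window. A rough sketch, assuming a lab node where the daemon can be restarted by hand (paths and the service name are illustrative):

    systemctl stop ovn-controller
    valgrind --leak-check=full --show-leak-kinds=definite \
        --log-file=/tmp/ovn-controller.valgrind \
        ovn-controller unix:/var/run/openvswitch/db.sock &
    # ... drive a few test iterations, then ask the daemon to exit cleanly so
    # valgrind writes its leak summary:
    ovs-appctl -t ovn-controller exit
    grep "definitely lost" /tmp/ovn-controller.valgrind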
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > To debug the problem, the first thing is to identify which flows are missing for the VMs that are unreachable. Could you do ovs-appctl ofproto/trace for the ICMP flow of any VM with a ping failure? And then, please enable debug logging for ovn-controller with ovs-appctl -t ovn-controller vlog/set file:dbg. There may be too many logs, so please enable it only for as short a time as is needed to reproduce a VM with a ping failure. If the last step "wait until the test can ping the port" is there, then it should be able to detect the first occurrence if the VM is not reachable in e.g. 30 sec.
> >>>> >> >> > >> >
> >>>> >> >> > >> > We'll need to hack a bit here but let's see :)
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > In ovn-scale-test we didn't have a data plane test, but this problem was not seen in our live environment either, with a far larger scale. The major differences between your test and our environment are:
> >>>> >> >> > >> > > - We are running an older version, so there might be some rebase/refactor problem that caused this. To eliminate this, I'd suggest trying a branch I created for 2.10 (https://github.com/hzhou8/ovs/tree/ip12_rebase_on_2.10), which matches the base test you did, which is also 2.10. It may also eliminate any compatibility problem, if there is one, between the OVN master branch and the OVS 2.10 you mentioned is used in the test.
> >>>> >> >> > >> > > - We don't use Security Groups (I guess the ~90k OVS flows you mentioned were mainly introduced by the Security Group use, if all ports were put in the same group). Incremental processing is expected to be correct for security groups, and to handle them incrementally because of the address_set and port_group incremental processing. However, since the testing only relied on the regression tests, I am not 100% sure the test coverage was sufficient. So could you try disabling Security Groups to rule out that problem?
> >>>> >> >> > >> >
> >>>> >> >> > >> > Ok, will try to repeat the tests without the SGs.
> >>>> >> >> > >> > >
> >>>> >> >> > >> > > Thanks,
> >>>> >> >> > >> > > Han
> >>>> >> >> > >> >
> >>>> >> >> > >> > Thanks once again!
> >>>> >> >> > >> > Daniel
> >>>> >> >> > >>
> >>>> >> >> > >> Hi Daniel,
> >>>> >> >> > >>
> >>>> >> >> > >> Any updates? Do you still see the 10% of VMs unreachable?
> >>>> >> >> > >>
> >>>> >> >> > >> Thanks,
> >>>> >> >> > >> Han
> >>>> >> >> > >
> >>>> >> >> > > Hi Han,
> >>>> >> >> > >
> >>>> >> >> > > As such there is no datapath impact. After increasing the ping wait timeout value from 120 seconds to 180 seconds it's 100% now.
> >>>> >> >> > >
> >>>> >> >> > > But the time taken to program the flows is too long when compared to OVN master without the IP patches. Here is some data - http://paste.openstack.org/show/753224/ . I am still investigating it. I will update my findings in some time.
> >>>> >> >> > >
> >>>> >> >> > > Please see the times for the action - vm.wait_for_ping
> >>>> >> >> >
> >>>> >> >> > Thanks Numan for the investigation and update. Glad to hear there is no correctness issue, but sorry for the slowness in your test scenario. I expect that the operations in your test trigger recomputing, and the worst case should be similar performance as without I-P. It is weird that it turned out so much slower in your test. There can be some extra overhead when it tries to do incremental processing and then falls back to full recompute, but it shouldn't cause that big a difference. It might be that for some reason the main loop iteration is triggered more times than necessary. I'd suggest comparing the coverage counter "lflow_run" between the tests, and also checking a perf report to see if the hotspot is somewhere else. (Sorry that I can't provide full-time help now since I am still on vacation, but I will try to be useful if things are blocked.)
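The debugging knobs suggested above map roughly onto the following commands; the trace fields are placeholders that would have to be filled in from the actual unreachable fake VM:

    # Trace how br-int would handle the VM's ICMP traffic (in_port, MACs and IPs are made up):
    ovs-appctl ofproto/trace br-int \
        'in_port=42,icmp,dl_src=50:54:00:00:00:01,dl_dst=50:54:00:00:00:02,nw_src=10.0.0.10,nw_dst=10.0.0.11'
    # Turn ovn-controller file logging up to debug, and back down afterwards:
    ovs-appctl -t ovn-controller vlog/set file:dbg
    ovs-appctl -t ovn-controller vlog/set file:info
    # Compare how often full logical-flow recomputes happened between runs:
    ovs-appctl -t ovn-controller coverage/show | grep lflow_run
    # Look for unexpected hotspots in ovn-controller itself:
    perf top -p $(pidof ovn-controller)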
> >>>> >> >>
> >>>> >> >> Hi Numan/Daniel, do you have any new findings on why I-P got worse results in your test? The extremely long latency (2 - 3 min) shown in your report reminds me of a similar problem I reported before: https://mail.openvswitch.org/pipermail/ovs-dev/2018-April/346321.html
> >>>> >> >>
> >>>> >> >> The root cause of that problem was still not clear. In that report, the extremely long latency (7 min) was observed without I-P and it didn't happen with I-P. If it is the same problem, then I suspect it is not related to I-P or non-I-P, but to some issue with ovsdb monitor condition changes. To confirm whether it is the same problem, could you:
> >>>> >> >> 1. pause the test when the scale is big enough (e.g. when the test is almost completed), and then
> >>>> >> >> 2. enable ovn-controller debug logging, and then
> >>>> >> >> 3. run one more iteration of the test, and see if the time was spent waiting for the SB DB update notification.
> >>>> >> >>
> >>>> >> >> Please ignore my speculation above if you already found the root cause, and it would be great if you could share it :)
> >>>> >> >
> >>>> >> > Thanks for sharing this Han.
> >>>> >> >
> >>>> >> > I do not have any new findings. Yesterday I ran ovn-scale-test comparing OVN with IP vs without IP (using the master branch). The test creates a new logical switch, adds it to a router, adds a few ACLs, and creates 2 logical ports and pings between them. I am using a physical deployment which creates actual namespaces instead of sandboxes.
> >>>> >> >
> >>>> >> > The results don't show any huge difference between the two.
> >>>> >>
> >>>> >> 2300 vs 2900 seconds total time, or 44 vs 56 seconds for the 95%ile? It is not negligible IMHO. It's a >25% penalty with the IP. Maybe I missed something from the results?
> >>>> >
> >>>> > Initially I ran with ovn-nbctl running commands as one batch (i.e. combining commands with "--"). The results were very similar.
> >>>> > Like this one:
> >>>> >
> >>>> > *******
> >>>> >
> >>>> > With non IP - ovn-nbctl NO daemon mode
> >>>> >
> >>>> > Response Times (sec):
> >>>> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> >>>> > | ovn_network.create_routers            | 0.288 | 0.429  | 5.454  | 5.538  | 20.531 | 1.523  | 100.0%  | 1000  |
> >>>> > | ovn.create_lswitch                    | 0.046 | 0.139  | 0.202  | 5.084  | 10.259 | 0.441  | 100.0%  | 1000  |
> >>>> > | ovn_network.connect_network_to_router | 0.164 | 0.411  | 5.307  | 5.491  | 15.636 | 1.128  | 100.0%  | 1000  |
> >>>> > | ovn.create_lport                      | 0.11  | 0.272  | 0.478  | 5.284  | 15.496 | 0.835  | 100.0%  | 1000  |
> >>>> > | ovn_network.bind_port                 | 1.302 | 2.367  | 2.834  | 3.24   | 12.409 | 2.527  | 100.0%  | 1000  |
> >>>> > | ovn_network.wait_port_up              | 0.0   | 0.001  | 0.001  | 0.001  | 0.002  | 0.001  | 100.0%  | 1000  |
> >>>> > | ovn_network.ping_ports                | 0.04  | 10.24  | 10.397 | 10.449 | 10.82  | 6.767  | 100.0%  | 1000  |
> >>>> > | total                                 | 2.219 | 13.903 | 23.068 | 24.538 | 49.437 | 13.222 | 100.0%  | 1000  |
> >>>> >
> >>>> > With IP - ovn-nbctl NO daemon mode
> >>>> > concurrency - 10
> >>>> >
> >>>> > Response Times (sec):
> >>>> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> >>>> > | ovn_network.create_routers            | 0.274 | 0.402  | 0.493  | 0.51   | 0.584  | 0.408  | 100.0%  | 1000  |
> >>>> > | ovn.create_lswitch                    | 0.064 | 0.137  | 0.213  | 0.244  | 0.33   | 0.146  | 100.0%  | 1000  |
> >>>> > | ovn_network.connect_network_to_router | 0.203 | 0.395  | 0.677  | 0.766  | 0.912  | 0.427  | 100.0%  | 1000  |
> >>>> > | ovn.create_lport                      | 0.13  | 0.261  | 0.437  | 0.497  | 0.604  | 0.283  | 100.0%  | 1000  |
> >>>> > | ovn_network.bind_port                 | 1.307 | 2.374  | 2.816  | 2.904  | 3.401  | 2.325  | 100.0%  | 1000  |
> >>>> > | ovn_network.wait_port_up              | 0.0   | 0.001  | 0.001  | 0.001  | 0.002  | 0.001  | 100.0%  | 1000  |
> >>>> > | ovn_network.ping_ports                | 0.028 | 10.237 | 10.422 | 10.474 | 11.281 | 6.453  | 100.0%  | 1000  |
> >>>> > | total                                 | 2.251 | 13.631 | 14.822 | 15.008 | 15.901 | 10.044 | 100.0%  | 1000  |
> >>>> >
> >>>> > *****************
> >>>> >
> >>>> > The results I shared in the previous email were with ACLs added and ovn-nbctl batch mode disabled.
> >>>> >
> >>>> > I agree with you. Let me do a few more runs to be sure that the results are consistent.
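For context on the modes being compared in these tables: "batch mode" chains several operations with "--" into a single ovn-nbctl invocation (one NB transaction), while daemon mode keeps one long-lived ovn-nbctl server holding the NB DB connection. Roughly, with illustrative switch and port names (see ovn-nbctl(8) for the daemon-mode details):

    # Batch mode: one invocation, one NB transaction.
    ovn-nbctl ls-add sw0 \
        -- lsp-add sw0 sw0-p1 \
        -- lsp-set-addresses sw0-p1 "50:54:00:00:00:01 10.0.0.11"

    # Daemon mode: later invocations talk to a background ovn-nbctl over its
    # control socket instead of opening a fresh NB DB connection each time.
    export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
    ovn-nbctl ls-add sw1
    ovn-nbctl lsp-add sw1 sw1-p1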
> >>>> >
> >>>> > Thanks
> >>>> > Numan
> >>>> >
> >>>> >> > I will test OVN 2.9 vs 2.11 master along with what you have suggested above and see if there are any problems related to ovsdb monitor condition changes.
> >>>> >> >
> >>>> >> > Thanks
> >>>> >> > Numan
> >>>> >> >
> >>>> >> > Below are the results.
> >>>> >> >
> >>>> >> > With IP master - nbctl daemon mode - No batch mode
> >>>> >> > concurrency - 10
> >>>> >> >
> >>>> >> > Response Times (sec):
> >>>> >> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> >>>> >> > | ovn_network.create_routers            | 0.269 | 0.661  | 10.426 | 15.422 | 37.259 | 3.721  | 100.0%  | 1000  |
> >>>> >> > | ovn.create_lswitch                    | 0.313 | 0.45   | 12.107 | 15.373 | 30.405 | 4.185  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.connect_network_to_router | 0.163 | 0.255  | 10.121 | 10.64  | 20.475 | 2.655  | 100.0%  | 1000  |
> >>>> >> > | ovn.create_lport                      | 0.351 | 0.514  | 12.255 | 15.511 | 34.74  | 4.621  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.bind_port                 | 1.362 | 2.447  | 7.34   | 7.651  | 17.651 | 3.146  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.wait_port_up              | 0.086 | 2.734  | 5.272  | 7.827  | 22.717 | 2.957  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.ping_ports                | 0.038 | 10.196 | 20.285 | 20.39  | 40.74  | 7.52   | 100.0%  | 1000  |
> >>>> >> > | total                                 | 2.862 | 27.267 | 49.956 | 56.39  | 90.884 | 28.808 | 100.0%  | 1000  |
> >>>> >> > Load duration: 2950.4133141
> >>>> >> > Full duration: 2951.58845997 seconds
> >>>> >> >
> >>>> >> > ***********
> >>>> >> >
> >>>> >> > With non IP - nbctl daemon mode - ACLs - No batch mode
> >>>> >> > concurrency - 10
> >>>> >> >
> >>>> >> > Response Times (sec):
> >>>> >> > | action                                | min   | median | 90%ile | 95%ile | max    | avg    | success | count |
> >>>> >> > | ovn_network.create_routers            | 0.267 | 0.421  | 10.395 | 10.735 | 25.501 | 3.09   | 100.0%  | 1000  |
> >>>> >> > | ovn.create_lswitch                    | 0.314 | 0.408  | 10.331 | 10.483 | 25.357 | 3.049  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.connect_network_to_router | 0.153 | 0.249  | 6.552  | 10.268 | 20.545 | 2.236  | 100.0%  | 1000  |
> >>>> >> > | ovn.create_lport                      | 0.344 | 0.49   | 10.566 | 15.428 | 25.542 | 3.906  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.bind_port                 | 1.372 | 2.409  | 7.437  | 7.665  | 17.518 | 3.192  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.wait_port_up              | 0.086 | 1.323  | 5.157  | 7.769  | 20.166 | 2.291  | 100.0%  | 1000  |
> >>>> >> > | ovn_network.ping_ports                | 0.034 | 2.077  | 10.347 | 10.427 | 20.307 | 5.123  | 100.0%  | 1000  |
> >>>> >> > | total                                 | 3.109 | 21.26  | 39.245 | 44.495 | 70.197 | 22.889 | 100.0%  | 1000  |
> >>>> >> > Load duration: 2328.11378407
> >>>> >> > Full duration: 2334.43504095 seconds
> >>>>
> >>>> Hi Numan/Daniel,
> >>>>
> >>>> I spent some time investigating this problem you reported. Thanks Numan for the offline help sharing the details.
> >>>>
> >>>> Although I still didn't reproduce the slowness in my current single-node testing env with almost the same steps and ACLs shared by Numan, I think I may have figured out a highly probable cause of what you have seen.
> >>>>
> >>>> Here is my theory: there is a difference between the I-P and non-I-P versions in the main loop. The non-I-P version checks ofctrl_can_put() before doing any flow computation (which was introduced to solve a serious performance problem when there are many OVS flows on a single node, see [1]). When working out the I-P version, I found this may not be the best approach, since there can be new incremental changes coming in and we want to process them in the current iteration incrementally, so that we don't need to fall back to a recompute in the next iteration. So this logic was changed so that we always prioritize computing new changes and keeping the desired flow table up to date, while the in-flight messages to ovs-vswitchd may still be pending for an older version of the desired state. In the end the final desired state will be synced again to ovs-vswitchd. If there are new changes that trigger a recompute again, the recompute (which is always slow) will slow down ofctrl_run(), which keeps sending the old pending messages to ovs-vswitchd from the same main thread. (But it won't cause the original performance problem any more, because the incremental processing engine will not recompute when there is no input change.)
> >>>>
> >>>> However, when the test scenario triggers recompute frequently, each single change may take longer to be enforced in OVS because of this new approach: the later recompute iterations slow down the installation of the OVS flows computed earlier. In your test you used a parallelism of 10, which means at any point there might be new changes from one client, such as creating a new router, that trigger recomputing, which can block the OVS flow installation triggered earlier for another client. So overall you will see a much bigger latency for each individual test iteration.
> >>>>
> >>>> This can also explain why I didn't reproduce the problem in my single-client single-node environment, since each iteration is serialized.
> >>>>
> >>>> [1] https://github.com/openvswitch/ovs/commit/74c760c8fe99d554b94423d49d13d5ca3dea0d9e
> >>>>
> >>>> To prove this theory, could you help with two tests reusing your environment? Thanks a lot!
> >>>
> >>> Thanks Han. I will try these and come back to you with the results.
> >>>
> >>> Numan
> >>>
> >>>> 1. Instead of a parallelism of 10, try 1, to make sure the test is serialized. I'd expect the result to be similar with vs. without I-P.
> >>>>
> >>>> 2. Try the patch below on the I-P version you are testing, to see if the problem is gone.
> >>>> ----8><--------------------------------------------><8---------------
> >>>> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> >>>> index 043abd6..0fcaa72 100644
> >>>> --- a/ovn/controller/ofctrl.c
> >>>> +++ b/ovn/controller/ofctrl.c
> >>>> @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired,
> >>>>   * in the correct state and not backlogged with existing flow_mods.  (Our
> >>>>   * criteria for being backlogged appear very conservative, but the socket
> >>>>   * between ovn-controller and OVS provides some buffering.) */
> >>>> -static bool
> >>>> +bool
> >>>>  ofctrl_can_put(void)
> >>>>  {
> >>>>      if (state != S_UPDATE_FLOWS
> >>>> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
> >>>> index ed8918a..2b21c11 100644
> >>>> --- a/ovn/controller/ofctrl.h
> >>>> +++ b/ovn/controller/ofctrl.h
> >>>> @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *,
> >>>>                  const struct sbrec_meter_table *,
> >>>>                  int64_t nb_cfg,
> >>>>                  bool flow_changed);
> >>>> +bool ofctrl_can_put(void);
> >>>>  void ofctrl_wait(void);
> >>>>  void ofctrl_destroy(void);
> >>>>  int64_t ofctrl_get_cur_cfg(void);
> >>>> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> >>>> index c4883aa..c85c6fa 100644
> >>>> --- a/ovn/controller/ovn-controller.c
> >>>> +++ b/ovn/controller/ovn-controller.c
> >>>> @@ -1954,7 +1954,7 @@ main(int argc, char *argv[])
> >>>>
> >>>>          stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME,
> >>>>                          time_msec());
> >>>> -        if (ovnsb_idl_txn) {
> >>>> +        if (ovnsb_idl_txn && ofctrl_can_put()) {
> >>>>              engine_run(&en_flow_output, ++engine_run_id);
> >>>>          }
> >>>>          stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> >>
> >> Hi Han,
> >>
> >> So far I could do just one run after applying your above suggested patch with the I-P version, and the results look promising. It seems to me the problem is gone.
> >>
> >> | action                 | min   | median | 90%ile | 95%ile | max    | avg  | success | count |
> >> | ovn_network.ping_ports | 0.037 | 10.236 | 10.392 | 10.462 | 20.455 | 7.15 | 100.0%  | 1000  |
> >> | ovn_network.ping_ports | 0.036 | 10.255 | 10.448 | 11.323 | 20.791 | 7.83 | 100.0%  | 1000  |
> >>
> >> The first row represents non-IP and the 2nd row represents IP + your suggested patch.
> >> The values are comparable and a lot better compared to without your patch.
> >>
> >> On Monday I will do more runs to be sure that the data is consistent and get back to you.
> >>
> >> If the results are consistent, I would try to run the tests which Daniel and Lucas ran on an OpenStack deployment.

Hi Han,

I got some test results. I deployed devstack with OVN, configured browbeat and patched it to include Daniel's test case - https://github.com/danalsan/browbeat/commit/0ff72da52ddf17aa9f7269f191eebd890899bdad

Ran the tests 100 times with a concurrency of 25.
The setup has 3 nodes - 1 controller and 2 compute nodes. The fake namespace VMs are created on the compute nodes and the controller node acts as the gateway node.

Below are the results.

| Ping         | Non IP | Master (IP) | Master (IP) with Han's fix |
| Min (sec)    | 0.023  | 0.017       | 0.022                      |
| Median (sec) | 0.029  | 7.097       | 0.029                      |
| 90%ile (sec) | 2.254  | 47.625      | 2.047                      |
| 95%ile (sec) | 4.065  | 55.26       | 4.052                      |
| Max (sec)    | 6.088  | 66.987      | 6.075                      |
| Avg (sec)    | 0.877  | 17.732      | 0.599                      |

Your patch is definitely fixing the issue.

Non IP - commit ffbe41dbcb4882aafdf80d86afa1906b2a00199e + a62128adc303d49901509a02f7e894d0c699e5bb
Master (IP) - commit f627cf1dd922bb644b6480bfbda67a9460cb2947
Master (IP) with Han's fix - f627cf1dd922bb644b6480bfbda67a9460cb2947 + the above fix from Han

Thanks
Numan

> >> Thanks
> >> Numan
> >
> > Glad to see the test result improved! Thanks a lot, and looking forward to more data. Once it is finally confirmed, we can discuss whether this should be submitted as a formal patch, considering real-world scenarios.
