On Mon, Jun 24, 2019 at 1:51 PM aginwala <[email protected]> wrote:

> Hi:
>
> As per the IRC meeting discussion, some nice findings were already
> discussed by Numan (thanks for sharing the details). Changing
> external_ids for a claimed port, e.g.
>
>     ovn-nbctl set logical_switch_port sw0-port1 external_ids:foo=bar
>
> triggers re-computation on the local compute node. I do see the same
> behavior. Numan is proposing a patch to skip computation for the
> external_ids column of an already claimed port in the port_binding
> table, because of "runtime_data, can't handle change for input
> SB_port_binding, fall back to recompute"
> (https://github.com/openvswitch/ovs/blob/master/ovn/lib/inc-proc-eng.h#L77).
> However, in the test code that Daniel posted [1], I don't see
> external_ids in the port_binding table being set explicitly for the
> port when the Interface table is set, which is what could trigger extra
> re-computation in the current test scenario.
>
ovn-northd just copies the external_ids of a logical switch port to the
external_ids of the port binding, and networking-ovn makes use of
external_ids a lot.

>
> Also, "ovs-vsctl add-br test" will trigger re-computation on the local
> compute node, and yes, I can see the same. Since we don't have any
> handlers for the Ports and Interfaces tables similar to the
> port_binding and other handlers at
> https://github.com/openvswitch/ovs/blob/master/ovn/controller/ovn-controller.c#L1769,
> adding a new bridge also causes re-computation on the local compute
> node. I am not sure it is required immediately because, as per the
> patch shared by Daniel [1], I don't see any new test bridges getting
> created apart from br-int, so there won't be much impact. Or maybe I
> missed that they are also creating test bridges during testing. Of
> course, any new ovs-vsctl command for attaching/detaching a VIF will
> surely trigger a recompute on br-int as and when a VIF (VM) gets
> added/deleted, to program the flows on the local compute node.
>

It would impact how the CMS creates the OVS port. Suppose I do something
like the following:

---
ovs-vsctl add-port br-int foo
ovs-vsctl set interface foo type=internal
ovs-vsctl set Interface foo external_ids:iface-id=foo-id
---

If ovn-controller gets 3 updates from ovsdb-server, this would result in
3 recomputations. However, if I do

ovs-vsctl add-port br-int foo -- set interface foo type=internal -- set interface foo external_ids:iface-id=foo-id

this could result in only 1 recomputation.

I think ovn-controller should handle the local ovsdb changes for
1. external_ids of the Open_vSwitch table
2. an OVS interface's external_ids:iface-id being updated

We should try to ignore any other changes to the local ovsdb.
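To make the counting above concrete, here is a toy Python model. It is only a sketch and not ovn-controller code: the names Change, count_recomputes and RELEVANT are made up for illustration, and it assumes ovn-controller sees exactly one update per ovs-vsctl invocation (one OVSDB transaction), which the real main loop does not guarantee.

```python
# Toy model only: NOT ovn-controller code. All names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    table: str   # OVSDB table touched, e.g. "Port" or "Interface"
    column: str  # column (or map key) touched, e.g. "external_ids:iface-id"


def count_recomputes(transactions, relevant=None):
    """Count full recomputes, assuming one update notification per transaction.

    relevant: optional set of (table, column) pairs the controller reacts to.
    With None, every update forces a recompute, which is the behaviour the
    thread describes for Port/Interface changes today.
    """
    recomputes = 0
    for txn in transactions:
        if relevant is None or any((c.table, c.column) in relevant for c in txn):
            recomputes += 1
    return recomputes


add_port = Change("Port", "name")
set_type = Change("Interface", "type")
set_iface_id = Change("Interface", "external_ids:iface-id")

# Three separate ovs-vsctl invocations: three transactions.
separate = [[add_port], [set_type], [set_iface_id]]
# One invocation with the commands joined by "--": one transaction.
batched = [[add_port, set_type, set_iface_id]]

# The two kinds of local change proposed above as the only ones to handle.
RELEVANT = {
    ("Open_vSwitch", "external_ids"),
    ("Interface", "external_ids:iface-id"),
}

print(count_recomputes(separate))            # 3
print(count_recomputes(batched))             # 1
print(count_recomputes(separate, RELEVANT))  # 1
```

Under these assumptions, batching on the CMS side and filtering on the ovn-controller side both bring the three-step port creation down to a single recompute, and with filtering an unrelated change such as "set interface foo type=internal" on its own would cost none.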
> I didn't get a chance to verify whether, when a chassisredirect port is
> claimed on a gateway chassis, it triggers computation on all computes
> registered with the SB as per the code at
> https://github.com/openvswitch/ovs/blob/master/ovn/controller/binding.c#L722,
> which also raises the further optimization for the chassisredirect flow
> that Numan is suggesting.
>
> 1. https://github.com/danalsan/browbeat/commit/0ff72da52ddf17aa9f7269f191eebd890899bdad
>

I submitted the patches just now to address some of the issues:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=115737

I also ran the test with these patches, but they did not bring any
improvement. Although the patches I submitted avoid recomputation in
some of the scenarios, I think I still need to dig further to see what's
causing the performance impact compared with the code without the IP
patches.

Thanks
Numan

On Fri, Jun 21, 2019 at 12:32 AM Han Zhou <[email protected]> wrote:
>
>> On Thu, Jun 20, 2019 at 11:42 PM Numan Siddique <[email protected]> wrote:
>> >
>> > On Fri, Jun 21, 2019, 11:47 AM Han Zhou <[email protected]> wrote:
>> >>
>> >> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <[email protected]> wrote:
>> >> >
>> >> > Thanks a lot Han for the answer!
>> >> >
>> >> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou <[email protected]> wrote:
>> >> > >
>> >> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara <[email protected]> wrote:
>> >> > > >
>> >> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez
>> >> > > > <[email protected]> wrote:
>> >> > > > >
>> >> > > > > Hi Han, all,
>> >> > > > >
>> >> > > > > Lucas, Numan and I have been doing some 'scale' testing of OpenStack
>> >> > > > > using OVN and wanted to present some results and issues that we've
>> >> > > > > found with the Incremental Processing feature in ovn-controller.
>> >> > > > > Below is the scenario that we executed:
>> >> > > > >
>> >> > > > > * 7 baremetal nodes setup: 3 controllers (running
>> >> > > > >   ovn-northd/ovsdb-servers in A/P with pacemaker) + 4 compute
>> >> > > > >   nodes. OVS 2.10.
>> >> > > > > * The test consists of:
>> >> > > > >   - Create an OpenStack network (OVN LS), subnet and router
>> >> > > > >   - Attach the subnet to the router and set the gateway to the
>> >> > > > >     external network
>> >> > > > >   - Create an OpenStack port and apply a Security Group (ACLs to
>> >> > > > >     allow UDP, SSH and ICMP)
>> >> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by
>> >> > > > >     attaching it to a network namespace
>> >> > > > >   - Wait for the port to be ACTIVE in Neutron ('up == True' in NB)
>> >> > > > >   - Wait until the test can ping the port
>> >> > > > > * Running browbeat/rally with 16 simultaneous processes to execute
>> >> > > > >   the test above 150 times.
>> >> > > > > * When all the 150 'fake VMs' are created, browbeat will delete all
>> >> > > > >   the OpenStack/OVN resources.
>> >> > > > >
>> >> > > > > We first tried with OVS/OVN 2.10 and pulled some results which showed
>> >> > > > > 100% success, but ovn-controller is quite loaded (as expected) on all
>> >> > > > > the nodes, especially during the deletion phase:
>> >> > > > >
>> >> > > > > - Compute node: https://imgur.com/a/tzxfrIR
>> >> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/8ffKKYF
>> >> > > > >
>> >> > > > > After conducting the tests above, we replaced ovn-controller on all 7
>> >> > > > > nodes with the one from the current master branch (actually from last
>> >> > > > > week). We also replaced ovn-northd and ovsdb-servers, but
>> >> > > > > ovs-vswitchd has been left untouched (still on 2.10). The expected
>> >> > > > > results were to get less ovn-controller CPU usage and also better
>> >> > > > > times due to the Incremental Processing feature introduced recently.
>> >> > > > > However, the results don't look very good:
>> >> > > > >
>> >> > > > > - Compute node: https://imgur.com/a/wuq87F1
>> >> > > > > - Controller node (ovn-northd and ovsdb-servers): https://imgur.com/a/99kiyDp
>> >> > > > >
>> >> > > > > One thing that we can tell from the ovs-vswitchd CPU consumption is
>> >> > > > > that it's much lower in the Incremental Processing (IP) case, which
>> >> > > > > apparently doesn't make much sense. This led us to think that perhaps
>> >> > > > > ovn-controller was not installing the necessary flows in the switch,
>> >> > > > > and we confirmed this hypothesis by looking into the dataplane
>> >> > > > > results. Out of the 150 VMs, 10% of them were unreachable via ping
>> >> > > > > when using ovn-controller from master.
>> >> > > > >
>> >> > > > > @Han, others, do you have any ideas as to what could be happening
>> >> > > > > here? We'll be able to use this setup for a few more days, so let me
>> >> > > > > know if you want us to pull some other data/traces, ...
>> >> > > > >
>> >> > > > > Some other interesting things:
>> >> > > > > On each of the compute nodes (with an almost evenly distributed
>> >> > > > > number of logical ports bound to them), the max amount of logical
>> >> > > > > flows in br-int is ~90K (by the end of the test, right before deleting
>> >> > > > > the resources).
>> >> > > > >
>> >> > > > > It looks like with the IP version, ovn-controller leaks some memory:
>> >> > > > > https://imgur.com/a/trQrhWd
>> >> > > > > While with OVS 2.10, it remains pretty flat during the test:
>> >> > > > > https://imgur.com/a/KCkIT4O
>> >> > > >
>> >> > > > Hi Daniel, Han,
>> >> > > >
>> >> > > > I just sent a small patch for the ovn-controller memory leak:
>> >> > > > https://patchwork.ozlabs.org/patch/1113758/
>> >> > > >
>> >> > > > At least on my setup this is what valgrind was pointing at.
>> >> > > >
>> >> > > > Cheers,
>> >> > > > Dumitru
>> >> > > >
>> >> > > > >
>> >> > > > > Looking forward to hearing back :)
>> >> > > > > Daniel
>> >> > > > >
>> >> > > > > PS. Sorry for my previous email, I sent it by mistake without the subject
>> >> > > > > _______________________________________________
>> >> > > > > discuss mailing list
>> >> > > > > [email protected]
>> >> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>> >> > >
>> >> > > Thanks Daniel for the testing and reporting, and thanks Dumitru for
>> >> > > fixing the memory leak.
>> >> > >
>> >> > > Currently ovn-controller incremental processing only handles the
>> >> > > following SB changes incrementally:
>> >> > > - logical_flow
>> >> > > - port_binding (for regular VIF binding NOT on current chassis)
>> >> > > - mc_group
>> >> > > - address_set
>> >> > > - port_group
>> >> > > - mac_binding
>> >> > >
>> >> > > So, in the test scenario you described, since each iteration creates
>> >> > > a network (SB datapath changes) and router ports (port_binding
>> >> > > changes for non-VIF ports), the incremental processing would not
>> >> > > help much, because most steps in your test should trigger a
>> >> > > recompute. It would help if you created more fake VMs in each
>> >> > > iteration, e.g. 10 VMs or more on each LS. Secondly, when a VIF
>> >> > > port binding happens on the current chassis, ovn-controller will
>> >> > > still recompute, and because you have only 4 compute nodes, 1/4 of
>> >> > > the compute nodes will still recompute even when binding a regular
>> >> > > VIF port. With more compute nodes you would see incremental
>> >> > > processing be more effective.
>> >> >
>> >> > Got it, it makes sense (although then, in the worst case, it should be
>> >> > at least what we had before and not worse, but it can also be because
>> >> > we're mixing versions here: 2.10 vs master).
>> >> > >
>> >> > > However, what really worries me is the 10% of VMs being unreachable.
>> >> > > I have one point of confusion about the test steps.
>> >> > > The last step you described was: "Wait until the test can ping the
>> >> > > port". So if the VM is not pingable the test won't continue?
>> >> >
>> >> > Sorry, I should've explained it better. We wait for 2 minutes for the
>> >> > port to respond to pings; if it's not reachable then we continue with
>> >> > the next port (16 rally processes are running simultaneously, so the
>> >> > rest of the processes may be doing stuff at the same time).
>> >> >
>> >> > >
>> >> > > To debug the problem, the first thing is to identify which flows are
>> >> > > missing for the VMs that are unreachable. Could you do ovs-appctl
>> >> > > ofproto/trace for the ICMP flow of any VM with ping failure? And
>> >> > > then, please enable debug logging for ovn-controller with
>> >> > > "ovs-appctl -t ovn-controller vlog/set file:dbg". There may be too
>> >> > > many logs, so please enable it only for as short a time as it takes
>> >> > > to reproduce a VM with ping failure. If the last step "wait until
>> >> > > the test can ping the port" is there, then it should be able to
>> >> > > detect the first occurrence if the VM is not reachable in e.g. 30 sec.
>> >> >
>> >> > We'll need to hack a bit here but let's see :)
>> >> > >
>> >> > > In the ovn-scale-test we didn't have a data plane test, but this
>> >> > > problem was not seen in our live environment either, at a far larger
>> >> > > scale. The major differences between your test and our environment
>> >> > > are:
>> >> > > - We are running an older version. So a rebase/refactor problem
>> >> > >   might have caused this. To eliminate this, I'd suggest trying a
>> >> > >   branch I created for 2.10
>> >> > >   (https://github.com/hzhou8/ovs/tree/ip12_rebase_on_2.10), which
>> >> > >   matches the base test you did, which is also 2.10. It may also
>> >> > >   eliminate any compatibility problem between the OVN master branch
>> >> > >   and OVS 2.10, which you mentioned is used in the test.
>> >> > > - We don't use Security Groups (I guess the ~90k OVS flows you
>> >> > >   mentioned were mainly introduced by the Security Group use, if all
>> >> > >   ports were put in the same group).
>> >> > >   The incremental processing is expected to be correct for security
>> >> > >   groups, and to handle them incrementally because of the
>> >> > >   address_set and port_group incremental processing. However, since
>> >> > >   the testing only relied on the regression tests, I am not 100%
>> >> > >   sure the test coverage was sufficient. So could you try disabling
>> >> > >   Security Groups to rule out the problem?
>> >> >
>> >> > Ok, will try to repeat the tests without the SGs.
>> >> > >
>> >> > > Thanks,
>> >> > > Han
>> >> >
>> >> > Thanks once again!
>> >> > Daniel
>> >>
>> >> Hi Daniel,
>> >>
>> >> Any updates? Do you still see the 10% of VMs unreachable?
>> >>
>> >> Thanks,
>> >> Han
>> >
>> > Hi Han,
>> >
>> > As such there is no datapath impact. After increasing the ping wait
>> > timeout value from 120 seconds to 180 seconds it's 100% now.
>> >
>> > But the time taken to program the flows is far too long compared to
>> > OVN master without the IP patches.
>> > Here is some data: http://paste.openstack.org/show/753224/ . I am
>> > still investigating it. I will update my findings in some time.
>> >
>> > Please see the times for the action vm.wait_for_ping
>> >
>>
>> Thanks Numan for the investigation and update. Glad to hear there is no
>> correctness issue, but sorry for the slowness in your test scenario. I
>> expect that the operations in your test trigger recomputing, and the
>> worst case should be similar performance to that without I-P. It is
>> weird that it turned out so much slower in your test. There can be some
>> extra overhead when it tries to do incremental processing and then falls
>> back to a full recompute, but it shouldn't cause that big a difference.
>> It might be that for some reason the main loop iteration is triggered
>> more times than necessary. I'd suggest comparing the coverage counter
>> "lflow_run" between the tests, and also checking the perf report to see
>> if the hotspot is somewhere else.
>> (Sorry that I can't provide full-time help now since I am still on
>> vacation, but I will try to be useful if things are blocked)
>> _______________________________________________
>> discuss mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>
