On Fri, Jul 16, 2021 at 2:53 PM Han Zhou <[email protected]> wrote:
>
> On Fri, Jul 16, 2021 at 7:15 AM Numan Siddique <[email protected]> wrote:
> >
> > On Mon, Jul 12, 2021 at 11:09 AM Mark Michelson <[email protected]>
> wrote:
> > >
> > > Full disclosure: I have not looked at all the details in this patch,
> > > since it is quite large. However, I felt I should comment on the idea.
> >
> > Hi Mark,
> >
> > Thanks for the review and the comments.
> >
> > >
> > > The memory savings in ovn-northd and the southbound database are quite
> > > nice. It's to be expected since the southbound database has fewer
> > > logical flows, and ovn-northd doesn't have to calculate logical flows. I
> > > think it's telling that the scale tests didn't show any noticeable
> > > improvement in performance.
> > >
> > > I think one thing that could help would be to just skip logical flow
> > > creation altogether in ovn-controller. It doesn't make much sense to
> > > take the source data, translate it into logical flow syntax, just to run
> > > it through the expression parser and translate it into OpenFlow. You
> > > could presumably create OpenFlow matches and actions directly, saving
> > > lots of processing. Of course, this takes this
> > > semi-backwards-incompatible change and makes it wholly
> > > backwards-incompatible :)
> > >
> > >
> >
> > I thought about skipping logical flow generation altogether.  But it has
> its own
> > challenges
> >   -  We will break backward compatibility and many tools - ovn-sbctl,
> > ovn-trace/ovn-detrace
> >   -  And it is a lot of effort.
> >
> > If breaking those tools is Ok,  I guess we can explore this option.
> > So far ovsdb-server
> > has been a bottleneck in the scale environments and the goal of this
> > approach was
> > to relieve that pressure from ovsdb-servers.  Since Ilya's relay
> > feature is merged we
> > can definitely test that feature and see if it solves all the scale
> > concerns.  If it
> > solves,  then I think we can drop the idea of removing the logical
> > flows from the
> > Southbound db.  If not,  I think we may have to revisit this idea.  In
> > my free cycles
> > I'd like to experiment a bit and see if we can generate OF flows
> > directly in ovn-controller.
> >
> > I think it is still good to discuss in this thread if generating OF
> > flows directly can be a potential
> > re-architecture or not.
> >
> > @Han - You mentioned in the irc meeting that you have some comments
> > too.  It would be
> > great if you can share your thoughts.
>
> Sorry for my slow response. My comments regarding this RFC patch are
> primarily the ones we discussed offline after the irc meeting. In fact most
> of them are already covered by Mark's comments in his reply. Let me put it
> here with some additional points:
>
> 1) I agree with Mark that this patch achieves its goal of reducing the size
> of SB DB, without obvious e2e performance gains which is also expected.
> While the SB DB reduction is significant, the benefit might not justify
> this big architecture change. And I agree with you that let's see how the
> OVSDB relay approach helps for the SB DB scale, and revisit this patch if
> necessary.
>
> 2) I also mentioned the same idea a while ago for removing the logical flow
> phase. If we remove logical flows from SB DB, it is more natural and
> efficient to not generate them at all. However, I also thought about that
> again recently and I feel not so sure now if it is the right way to
> proceed. The open issues are:
>     - First of all, as you mentioned, this is an even bigger change,
> probably close to rewriting all the components, and the tools. (We might
> look at directly using DDlog to achieve that, and hopefully some code in
> the northd-ddlog can be reused.) Most importantly, it took a great amount
> of effort for OVN to stabilize (well, to some extent) for production, so
> I'd expect such a big change at this level, if implemented, would require a
> really long period of time, probably very painful, to stabilize again.
>     - We lose the benefit of logical-physical separation, which could make
> trouble-shooting much more difficult. Nevertheless, the debuggability may
> be achieved by other ways, and nothing prevents us from implementing tools
> that generate human friendly logical flows for debugging purposes only, if
> it is helpful.
>     - It seems ovn-controller should connect to NB DB directly, but at the
> same time maintain SB DB for port-bindings and chassis information. (This
> is not a problem, but something to think about)
>     - Although it avoids the cost of logical flow parsing, it doesn't
> completely avoid the cost of parsing "match" strings, because the "match"
> fields in ACL (and also Policy Routing, QoS) are still flexible strings
> specified by users. If these objects are too many, specifically, if ACLs
> contribute to a significant part of today's logical flow handling, then
> even if logical flows are removed, parsing ACL matches could still be a
> bottleneck, at least for some use cases. So we may need to lower our
> expectations for the e2e performance gains after all the great efforts.
>
> So I am still open to the idea of rearchitecting to remove logical flows.
> It would be great to have a POC to evaluate the e2e performance gains,
> which itself can take a really big effort, at the risk of being completely
> discarded. Since there is an ongoing effort of rewriting ovn-controller
> using DDlog, which I believe is a big effort, too, perhaps this can be
> considered as an option together. @Ben Pfaff <[email protected]>
>
> 3) Regarding the potential performance gains of this patch by using
> incremental processing in ovn-controller, I don't think it is a strong
> argument, because I believe our goal for ovn-northd is still making
> ovn-northd-ddlog mature and replace the C version, so when comparing e2e
> performance we should consider ovn-northd-ddlog which inherently does
> incremental processing, instead of comparing with the current C version of
> ovn-northd. (In case the direction of using ovn-northd-ddlog is somehow
> changed, hopefully not, we need to raise it clearly in the community and
> discuss it seriously.)
>
> 4) It seems we (the community) put more focus on scalability again recently
> and a lot of improvements have been achieved, and more ongoing. Although
> none of them are architectural changes, they turned out to be effective (or
> with great potential). We could re-evaluate the longer term plan after we
> complete these non-architectural improvements.

Hi Han,

Thanks a lot for your comments.  I agree with your comments.  Let's
see how it goes and if we still need to revisit this RFC patch.

Numan


>
> Thanks,
> Han
>
> >
> > Few comments below
> >
> > >
> > > I'm concerned about a few things in the patch as presented.
> > >
> > > First, moving logical flow generation from ovn-northd to ovn-controller
> > > seems like it's just shifting the work from one place to another. And in
> > > the case where datapaths have ports bound on many nodes, it means a lot
> > > of the same work is being done on multiple ovn-controller instances. I-P
> > > in ovn-controller could reduce the amount of work ovn-controller is
> > > performing per iteration, but it also introduces error potential.
> >
> > It definitely adds some load to ovn-controllers. But I don't think it
> > is significant.
> > There are many logical flows which are generic (which datapath groups
> > has already
> > leveraged) and the proposed RFC patch generates those at start up.
> >
> > I do agree that there is potential for errors and adding new features
> > could be a bit
> > challenging.  But it should not be so hard.
> >
> >
> > >
> > > Second, is there a danger in moving logical flows out of the southbound
> > > database? You modified `ovn-sbctl lflow-list` to do some magic to list
> > > the logical flows, but are there any CMS's that access the Logical_Flow
> > > table directly in the SB DB for any reason?
> >
> > I don't think CMS should rely on the logical_flow table.  I'm not
> > aware of any CMS
> > doing it.  Even if any CMS is doing it, I think it probably shouldn't.
> >
> > >
> > > Third, does ovn-trace still work?
> >
> > In the proposed patch, ovn-trace doesn't work but it should be
> straightforward
> > to make it work.  In this proposed patch, a new command is added in
> ovn-sbctl
> > called - "ctrl-lflow-list"  and it calls the same functions to
> > generate logical flows
> > as ovn-controller.  ovn-trace can do the same.
> >
> > >
> > > Overall, I'm fine with the idea of re-architecting things in OVN, but I
> > > think it requires a bit more of a plan than this. A change like this is
> > > doing its best to hide that anything has changed, when behind the scenes
> > > things have changed a lot. I think a true rearchitecting will require
> > > user-facing changes, too. If we still used major-minor versioning for
> > > OVN, this would be the sort of thing that would result in a major
> > > version bump.
> >
> > I agree.
> >
> > Thanks
> > Numan
> >
> > >
> > >
> > > On 6/25/21 7:31 PM, [email protected] wrote:
> > > > From: Numan Siddique <[email protected]>
> > > >
> > > > This is a an RFC patch to move the logical flow generation from
> > > > ovn-northd to ovn-controller.
> > > >
> > > > This patch doesn't move all the generation to ovn-controller.
> > > > ovn-northd still does the flow generation for
> > > >    - ACLs/Port groups.
> > > >    - DHCP options
> > > >    - Multicast groups
> > > >    - And few others.
> > > >
> > > > Other than ACLs and Port groups it would be possible to move
> > > > flow generation for the above mentioned things.  But before doing
> > > > all that, it is worth to evaluate if the proposed RFC makes sense.
> > > >
> > > > The main motivation for this RFC effort is to
> > > >    - Address scale issues seen.  For large scale deployments
> > > >      ovn-northd takes lot of CPU for the computation (ovn-northd-ddlog
> > > >      should help here) and memory and so does Southbound
> ovsdb-servers.
> > > >
> > > >    - Having a very huge southbound DB and logical flows affects the
> > > >      raft consenses and it requires increasing the raft election
> timers.
> > > >
> > > >    - Logical flows contributes majorly to the overall south bound DB.
> > > >
> > > > This RFC demonstrates that it is possible for each ovn-controller
> > > > to generate logical flows.
> > > >
> > > > These are some of the findings with my general and scale testing.
> > > >
> > > > Below are the test findings with a huge pre-existing Northbound
> database with
> > > > datapath groups enabled with a size of 13 M
> > > >    - Southbound DB size is:
> > > >           * with ovn-northd-master - 35 M
> > > >           * with ovn-northd-proposed-rfc - 12M
> > > >
> > > >    - Number of logical flows:
> > > >           * with ovn-northd-master - 78581
> > > >           * with ovn-northd-proposed-rfc - 7933
> > > >
> > > >    - RSS Memory consumption of
> > > >          * ovn-northd-master -       441368 KiB
> > > >          * ovn-northd-proposed-rfc - 115540 KiB
> > > >
> > > >          * ovn-controller-master -       1267716 KiB
> > > >          * ovn-controller-proposed-rfc - 915876 KiB
> > > >
> > > >          * SB ovsdb-server-with-ovn-master -       612296 KiB
> > > >          * SB ovsdb-server-with-proposed-rfc-ovn - 134680 KiB
> > > >
> > > > With the scale testing of 500 fake multinode nodes and each node
> > > > creating having a few port bindings claimed,  the end result is
> > > > almost similar.  No signifcant improvements seen with the proposed
> > > > RFC patch.  The results are identical.
> > > >
> > > > I think more scale testing needs to be done to determine if
> > > > the CPU usage and memory usage reduction in the ovn-northd and
> > > > ovsdb-servers will have a major impact or not.  Testing with
> > > > a real Kubernetes/Openstack deployments would help.
> > > >
> > > > Few Observations
> > > >    -  It is possible to move the flow generation to each
> ovn-controller.
> > > >
> > > >    -  Each ovn-controller only generates the logical flows if required
> > > >       i.e if the datapath is in the 'local_datapaths'.
> > > >
> > > >    -  This RFC patch do complicate ovn-controller code which has
> already
> > > >       many complicated bits.
> > > >
> > > >    -  I was expecting the scale test results to improve and the
> > > >       end-to-end time of a pod/VM creation would be quicker. But that
> is
> > > >       not the case, which is a disappointment.
> > > >
> > > > Submitting this RFC patch to get a feed back and have conversation
> > > > if it is worth the effort.
> > > >
> >
> > > >
> > >
> > >
> > > _______________________________________________
> > > dev mailing list
> > > [email protected]
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to