ONE-VR approach in ACS 5.0. It is time to plan for a major release and break some things...
On Wed, Feb 7, 2018 at 7:17 AM, Paul Angus <paul.an...@shapeblue.com> wrote:

> It seems sensible to me to have ONE VR, and I like the idea that all
> VRs are 'redundant-ready', again supporting the ONE-VR approach.
>
> The question I have is:
>
> - how do we handle the transition - does it need ACS 5.0?
> The API and the UI separate the VR and the VPC, so what is the most
> logical presentation of the proposed solution to the users/operators?
>
> Kind regards,
>
> Paul Angus
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> @shapeblue
>
> -----Original Message-----
> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com]
> Sent: 07 February 2018 08:58
> To: dev <dev@cloudstack.apache.org>
> Subject: Re: [DISCUSS] VR upgrade downtime reduction
>
> Reading all the reactions I am getting wary of all the possible solutions
> that we have.
> We do have a fragile VR and Remi's way seems the only one to stabilise it.
> It also answers the question of which of my two tactics we should follow.
> Wido's objection may be valid, but services that are not started are not
> crashing and thus should not hinder him.
> As for Wei's changes, I think the most important one is in the PR I ported
> forward to master, using his older commit. I mentioned it in
>
> [1] https://github.com/apache/cloudstack/pull/2435
> I am looking forward to any of your PRs as well, Wei.
>
> Making all VRs redundant is a bit of a hack, and the biggest risk in it is
> making sure that only one will get started.
>
> There is one point I'd like consensus on: we have only one system
> template and we are well served by letting it have only one form as VR. Do
> we agree on that?
>
> comments, flames, questions, regards,
>
>
> On Tue, Feb 6, 2018 at 9:04 PM, Wei ZHOU <ustcweiz...@gmail.com> wrote:
>
> > Hi Remi,
> >
> > Actually in our fork, there are more changes than restartnetwork and
> > restart vpc, similar to your changes.
> > (1) edit networks from offerings with single VR to offerings with RVR,
> > will hack VR (set new guest IP, start keepalived and conntrackd,
> > blablabla)
> > (2) restart vpc from single VR to RVR. Similar changes will be made.
> > The downtime is around 5s. However, these changes are based on 4.7.1, and
> > we are not sure if they still work in 4.11.
> >
> > We have lots of changes; we will port the changes to 4.11 LTS and
> > create PRs in the next months.
> >
> > -Wei
> >
> >
> > 2018-02-06 14:47 GMT+01:00 Remi Bergsma <rberg...@schubergphilis.com>:
> >
> > > Hi Daan,
> > >
> > > In my opinion the biggest issue is the fact that there are a lot of
> > > different code paths: VPC versus non-VPC, VPC versus redundant-VPC, etc.
> > > That's why you cannot simply switch from a single VPC to a redundant
> > > VPC, for example.
> > >
> > > For SBP, we mitigated that in Cosmic by converting all non-VPCs to a
> > > VPC with a single tier and made sure all features are supported.
> > > Next we merged the single and redundant VPC code paths. The idea here
> > > is that redundancy or not should only be a difference in the number of
> > > routers. Code should be the same. A single router is also "master" but
> > > there just is no "backup".
> > >
> > > That simplifies things A LOT, as keepalived is now the master of the
> > > whole thing. No more assigning IP addresses in Python, but leave that
> > > to keepalived instead. Lots of code deleted. Easier to maintain, way
> > > more stable. We just released Cosmic 6 that has this feature and are
> > > now rolling it out in production. Looking good so far. This change
> > > unlocks a lot of possibilities, like live upgrading from a single VPC
> > > to a redundant one (and back). In the end, if the redundant VPC is rock
> > > solid, you most likely don't even want single VPCs any more. But that
> > > will come.
> > >
> > > As I said, we're rolling this out as we speak.
> > > In a few weeks when everything is upgraded I can share what we
> > > learned and how well it works.
> > > CloudStack could use a similar approach.
> > >
> > > Kind Regards,
> > > Remi
> > >
> > >
> > > On 05/02/2018, 16:44, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
> > >
> > > Hi devs,
> > >
> > > I have recently (re-)submitted two PRs, one by Wei [1] and one by
> > > Remi [2], that reduce downtime for redundant routers and redundant
> > > VPCs respectively.
> > > (please review those)
> > > Now from customers we hear that they also want to reduce downtime for
> > > regular VRs, so as we discussed this we came to two possible
> > > solutions, one of which we want to implement:
> > >
> > > 1. start and configure a new router before destroying the old one,
> > > and then as a last-minute action stop the old one.
> > > 2. make all routers start up redundancy services, but for regular
> > > routers start only one until an upgrade is required, at which time a
> > > new, second router can be started before killing the old one.
> > >
> > > Obviously both solutions have their merits, so I want to have your
> > > input to make the broadest supported implementation.
> > > -1 means there will be an overlap or a small delay and interruption
> > > of service.
> > > +1 It can be argued, "they got what they paid for".
> > > -2 means an overhead in memory usage by the router from the extra
> > > services running on it.
> > > +2 the number of router varieties will be further reduced.
> > >
> > > -1&-2 We have to deal with potentially large upgrade steps from way
> > > before the cloudstack era even and might be stuck with 1 because of
> > > that, needing to hack around it. Any dealing with older VRs, pre 4.5
> > > and especially pre 4.0, will be hard.
> > >
> > > I am not cross posting, though this might be one of these occasions
> > > where it is appropriate to include users@.
> > > Just my puristic inhibitions.
> > >
> > > Of course I have preferences but can you share your thoughts, please?
> > >
> > > And don't forget to review Wei's [1] and Remi's [2] work please.
> > >
> > > [1] https://github.com/apache/cloudstack/pull/2435
> > > [2] https://github.com/apache/cloudstack/pull/2436
> > >
> > > --
> > > Daan
> > >
> >
> --
> Daan
>
--
Rafael Weingärtner
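
For readers following the thread, here is a minimal sketch of the ONE-VR idea as Remi describes it: every router is redundant-ready, a single router is simply a "master" with no "backup", and an upgrade adds a second router before the old one is removed. This is a toy model only, not CloudStack or Cosmic code. The names `Router`, `Network` and `rolling_upgrade` are hypothetical, and the explicit `fail_over` call stands in for the election that keepalived would perform in a real deployment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Router:
    # Hypothetical model of a VR; not a CloudStack class.
    name: str
    version: str
    state: str = "BACKUP"  # every router is redundant-ready


@dataclass
class Network:
    # Redundancy is only a difference in the number of routers.
    routers: List[Router] = field(default_factory=list)

    def master(self) -> Router:
        masters = [r for r in self.routers if r.state == "MASTER"]
        if len(masters) != 1:
            raise RuntimeError(f"expected exactly one MASTER, found {len(masters)}")
        return masters[0]

    def add_router(self, router: Router) -> None:
        # The first router becomes MASTER; later ones join as BACKUP.
        router.state = "MASTER" if not self.routers else "BACKUP"
        self.routers.append(router)

    def fail_over(self, new_master: Router) -> None:
        # Stand-in for the keepalived election (assumption of this sketch).
        self.master().state = "BACKUP"
        new_master.state = "MASTER"

    def remove_router(self, router: Router) -> None:
        if router.state == "MASTER":
            raise RuntimeError("refusing to remove the MASTER router")
        self.routers.remove(router)


def rolling_upgrade(network: Network, new_version: str) -> None:
    """Start a second router, hand over, then retire the old one."""
    old = network.master()
    new = Router(name=f"{old.name}-upgraded", version=new_version)
    network.add_router(new)     # network is temporarily redundant
    network.fail_over(new)      # short switch-over instead of a full outage
    network.remove_router(old)  # back to a single MASTER with no BACKUP


net = Network()
net.add_router(Router("r-1", "4.7.1"))
rolling_upgrade(net, "4.11")
print(net.master())
```

The same `Network` type covers both the single and the redundant case, which is the point of merging the code paths: going from one router to two (and back) is just `add_router` and `remove_router`.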