On Wed, May 24, 2017 at 7:39 PM, Russell Bryant <[email protected]> wrote:
> On Wed, May 24, 2017 at 7:19 AM, Miguel Angel Ajo Pelayo
> <[email protected]> wrote:
> > I wanted to share a small status update:
> >
> > Anil and I have been working on this [1] and we expect to post
> > some preliminary patches before the end of the week.
> >
> > I can confirm that the BFD + bundle(active_backup) strategy
> > works well from the hypervisors' point of view. With 1 sec BFD
> > pings we get a ~2.8s failover time.
> >
> > So far we have only focused on case "2", for the distributed
> > routers where we specify a set of hosts to act as chassis.
> >
> > """
> > ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >     -- set Logical_Router_Port alice \
> >     options:redirect-chassis=gw1:10,gw2:20,gw3:30
> > """
>
> Thanks for the update!  Sounds like great progress.
>
> > We wonder if there's any value at all in exploring support on "1"
> > the old way of pinning a logical router to a chassis.
>
> You mean only specifying a single chassis here?  Does it add a lot of
> complexity to only support a single gateway?

I don't know yet, I need to look at that specific case.

> If not, it definitely
> seems worth keeping.  Supporting simpler setups is a good thing.

Ignorant question: doesn't setting the chassis option on the Logical
Router make the router "less distributed", i.e. make the E/W traffic
flow through that specific chassis? This is basically why I didn't pay
attention to it, but if that's not the case, of course it's worth
working on.
>
> > If anybody wants to give it a try you can use [2] to quickly deploy
> > 2 gw hosts + 2 "hv" hosts, + 1 service host (accessible through an
> > 'external' network via gw1 and gw2) (see ascii diagram [3] and [4]
> > details)
> >
> >
> > Then you can ping the external service from a port in hv1 with:
> >
> > $ vagrant ssh hv1 -c "sudo ip netns exec vm1 ping 10.0.0.111"
> >
> > or the vm3 via floating IP with:
> >
> > $ vagrant ssh svc1 -c "ping 10.0.0.16"
> >
> > you can trigger a failover anytime by doing:
> >
> > $ vagrant ssh gw1 -c "sudo ifdown eth1"
> >
> > and a failback, by:
> >
> > $ vagrant ssh gw1 -c "sudo ifup eth1"
> >
> >
> > We are currently working on:
> >
> > 1) Addressing the monitoring of the inter-gateway bfd, to make sure
> >    that non-master routers will drop any packet (external/internal)
> >    or any ARP request.
> > 2) Same as 1 but for playing gARPs when a router is in a new chassis.
> > 3) Documentation changes.
> > 4) Tests
> >
> > And we have some questions:
> >
> > About preemption (see failover/failback example above), we have
> > several options:
> > a) we stick to having preemptive failbacks (if a gateway chassis
> >    comes back online, the routers which were scheduled there will
> >    come back)
> > b) not preemptive: (when a chassis goes down all logical router
> >    ports with redirect chassis will be recalculated). or
> > c) we make it configurable.
> >
> > My intuition says that with very low failover times "a" could be a
> > reasonable thing for most cases, since your load stays balanced when
> > your gateway chassis comes back. But I'm not an operator, how could
> > we gather feedback on this area?
>
> Good question and good point about wanting to ensure load remains
> balanced when a chassis comes back.
>
> With (a), I'd be worried about the case where a chassis is in more of
> a half-dead (zombie?) state.

right, good point.

> We don't want failover bouncing back and
> forth because we keep thinking a chassis is going up and down.  Any
> thoughts on how to mitigate this?

We could make the preemptive behaviour a little more complex by adding
a recovery time; during that recovery time the failover would be
non-preemptive.

For the non-preemptive case, I was considering that northd could see the
new chassis binding to the redirect port and reorder the priorities
in the SBDB and NBDB.

> >
> > Best regards,
> > Miguel Ángel Ajo
> >
> > [1] https://github.com/mangelajo/ovs/commits/l3ha
> > [2] https://github.com/mangelajo/vagrants/tree/master/ovn-l3-ha
> > [3] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/Vagrantfile#L16
> > [4] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/gw1.sh#L67
> >
> > On Fri, Apr 7, 2017 at 9:14 AM, Miguel Angel Ajo Pelayo
> > <[email protected]> wrote:
> >
> >> Updating what I wrote yesterday (I hope I won't make people's
> >> eyes hurt today) after a talk on IRC (thank you Mickey Spiegel
> >> and Gurucharan Shetty):
> >>
> >> I propose having:
> >>
> >> 1) chassis on NB/Logical_Router accept multiple chassis, to cover
> >>    HA on the centralized gateway case for DNAT/SNAT.
> >>
> >>    ovn-nbctl create Logical_Router name=edge1 \
> >>        options:chassis=gw1:10,gw2:20,gw3:30
> >>
> >>    Or multiple chassis without priorities:
> >>
> >>    ovn-nbctl create Logical_Router name=edge1 \
> >>        options:chassis=gw1,gw2,gw3
> >>
> >>    and in this case we let ovn decide -and rewrite the option-
> >>    to balance priorities between gateways to spread the load.
> >>
> >> 2) redirect-chassis on NB/Logical_Router_Port to accept multiple
> >>    chassis to cover HA for centralized SNAT on distributed routers.
> >>
> >>    ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>        -- set Logical_Router_Port alice \
> >>        options:redirect-chassis=gw1:10,gw2:20,gw3:30
> >>
> >>    (or again, without priorities)
> >>
> >>    ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>        -- set Logical_Router_Port alice \
> >>        options:redirect-chassis=gw1,gw2,gw3
> >>
> >>
> >> These logical model changes allow for Active/Active L3 when we have
> >> that implemented, for example by assigning the same priorities.
> >>
> >> Alternatively in such case we could add another option
> >> for case (1): ha-chassis-mode=active_standby/active_active,
> >> and ha-redirect-mode=active_standby/active_active for case (2).
> >>
> >> For the dataplane implementation I propose following what [1] defines
> >> for Active/Standby per-router implementation, with BFD monitoring for
> >> tunnel endpoints, where the location of the master router is
> >> independently calculated at every chassis, making the solution
> >> independent of the controller connection via SB database.
> >>
> >> There are, to start with, a few gaps that we still need to define
> >> properly:
> >>
> >> 1) I'd like to see reporting of the master gateway somehow through
> >>    SB db [up to the NB db?], in a way that the administrator can
> >>    inspect the system and see what its current state is.
> >>
> >> 2) While hypervisors will direct traffic to the calculated
> >>    master router via the bundle action with the active_backup
> >>    algorithm, I believe we can't have anything in OpenFlow to drop
> >>    packets in the standby routers based on the inter-gateway link
> >>    matrix status.
> >>
> >> 3) Other related changes in the SouthBound DB.
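
To make the "independently calculated at every chassis" idea above a bit
more concrete, here is a minimal Python sketch. It is not taken from any
posted patch. It assumes that a larger number means a more preferred
chassis (the thread does not say which direction wins) and that each
chassis already knows which BFD sessions are up. The names
parse_redirect_chassis and pick_master are hypothetical.

```python
def parse_redirect_chassis(value):
    """Parse 'gw1:10,gw2:20' or 'gw1,gw2' into [(name, priority or None)]."""
    entries = []
    for item in value.split(","):
        name, sep, prio = item.strip().partition(":")
        # A missing ':<priority>' suffix is recorded as None.
        entries.append((name, int(prio) if sep else None))
    return entries


def pick_master(entries, bfd_up):
    """Return the preferred chassis whose BFD session is up, else None.

    Assumption (not stated in the thread): larger priority wins.
    Entries without a priority count as 0; ties keep list order.
    """
    alive = [(name, prio) for name, prio in entries if bfd_up.get(name, False)]
    if not alive:
        return None
    return max(alive, key=lambda e: e[1] if e[1] is not None else 0)[0]


entries = parse_redirect_chassis("gw1:10,gw2:20,gw3:30")
# Every chassis runs the same calculation on its own BFD view,
# so no controller round trip is needed to agree on the master.
print(pick_master(entries, {"gw1": True, "gw2": True, "gw3": True}))
print(pick_master(entries, {"gw1": True, "gw2": True, "gw3": False}))
```

Because the function only depends on the configured list and the local
BFD state, two chassis with the same BFD view reach the same answer.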
> >>
> >> Best regards,
> >> Miguel Ángel Ajo
> >>
> >> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
> >>
> >>
> >> On Thu, Apr 6, 2017 at 12:13 PM, Miguel Angel Ajo Pelayo
> >> <[email protected]> wrote:
> >>
> >>> Hello everybody,
> >>>
> >>> First I'd like to say hello, because I'll be able to spend more
> >>> time working with this community, and I'm sure it will be an
> >>> enjoyable journey from what I've seen (silently watching) during
> >>> the last few years.
> >>>
> >>> I'm planning to start work (together with Anil) on the L3 High
> >>> availability area of OVN. We've been reading [1], and it seems
> >>> quite reasonable.
> >>>
> >>> We're wondering whether to fast forward and skip the naive
> >>> active/backup implementation in favor of the Active/Standby (per
> >>> router) based on bfd + bundle(active_backup) output actions, since
> >>> the proposal of having ovn-northd monitoring the gateways seems a
> >>> bit unnatural, and the difference in effort (naive vs
> >>> active/standby) is probably not very big (warning: I tend to be
> >>> optimistic).
> >>>
> >>> I spent a couple of days looking at how L3 works now, and, very
> >>> naively, I would propose having the redirect-chassis option of
> >>> Logical_Router_Ports accept multiple chassis with priorities.
> >>>
> >>> For example:
> >>>
> >>> ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>>     -- set Logical_Router_Port alice \
> >>>     options:redirect-chassis=gw1:10,gw2:20,gw3:30
> >>>
> >>> Or multiple chassis without priorities:
> >>>
> >>> ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>>     -- set Logical_Router_Port alice \
> >>>     options:redirect-chassis=gw1,gw2,gw3
> >>>
> >>> (and in this case we let ovn decide -and maybe rewrite the
> >>> option- how to balance priorities between gateways to spread the
> >>> load)
> >>>
> >>>
> >>> We may want to have another field in the Logical_Router_Port, to
> >>> let us know which one(s) is(are) the active gateway(s)
> >>>
> >>>
> >>> This logical model would also allow for Active/Active L3 when we
> >>> have that implemented, for example by assigning the same
> >>> priorities.
> >>>
> >>>
> >>> Alternatively we could have two options:
> >>> * ha-redirect-chassis=<chassis>:<priority>[ .. :<chassis2>:<priority2>]
> >>> * ha-redirect-mode=active_standby/active_active
> >>>
> >>>
> >>> Best regards,
> >>> Miguel Ángel Ajo
> >>>
> >>> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
> >>>
> >>>
> >>
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> --
> Russell Bryant
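
For the variant quoted above where redirect-chassis is given without
priorities and OVN is left to balance them, one possible approach is to
rotate the chassis list per router port, so that each gateway ends up
preferred for roughly the same number of ports. This is only a sketch
under assumptions: assign_priorities is a hypothetical helper, the step
of 10 just mirrors the examples in this thread, and "larger number is
preferred" is assumed rather than stated anywhere above.

```python
def assign_priorities(ports, chassis, step=10):
    """Return {port: 'gwA:30,gwB:20,...'} with preferences rotated per port.

    Assumption: a larger number means a more preferred chassis.
    """
    n = len(chassis)
    result = {}
    # Sort the port names so the assignment is deterministic.
    for i, port in enumerate(sorted(ports)):
        shift = i % n
        rotated = chassis[shift:] + chassis[:shift]
        # The first chassis in the rotated list gets the largest priority.
        result[port] = ",".join(
            f"{name}:{(n - rank) * step}" for rank, name in enumerate(rotated)
        )
    return result


for port, option in assign_priorities(
    ["alice", "bob", "carol"], ["gw1", "gw2", "gw3"]
).items():
    print(port, "options:redirect-chassis=" + option)
```

With three ports and three gateways, each gateway is the preferred one
for exactly one port, and each port still has two fallbacks.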
