Re: [ovs-dev] OVN L3-HA request for feedback

Miguel Angel Ajo Pelayo Thu, 25 May 2017 02:13:02 -0700

On Wed, May 24, 2017 at 7:39 PM, Russell Bryant <[email protected]> wrote:


> On Wed, May 24, 2017 at 7:19 AM, Miguel Angel Ajo Pelayo
> <[email protected]> wrote:
> > I wanted to share a small status update:
> >
> > Anil and I have been working on this [1] and we expect to post
> > some preliminary patches before the end of the week.
> >
> > I can confirm that the BFD + bundle(active_backup) strategy
> > works well from the hypervisors' point of view. With 1 sec BFD
> > pings we get a ~2.8s failover time.
> >
> > So far we have only focused on case "2", for the distributed
> > routers, where we specify a set of hosts to act as chassis.
> >
> > """
> >      ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >                   -- set Logical_Router_Port alice \
> >                   options:redirect-chassis=gw1:10,gw2:20,gw3:30
> > """
> >
>
> Thanks for the update!  Sounds like great progress.
>
> > We wonder if there's any value at all in exploring support on "1"
> > the old way of pinning a logical router to a chassis.
>
> You mean only specifying a single chassis here?  Does it add a lot of
> complexity to only support a single gateway?


I don't know yet, I need to look at that specific case.


> If not, it definitely
> seems worth keeping.  Supporting simpler setups is a good thing.
>

Ignorant question: doesn't setting the chassis option on the Logical
Router make the router "less distributed", i.e. make the E/W traffic
flow through that specific chassis? This is basically why I didn't pay
attention to it, but if that's not the case then, of course, it's worth
working on it.



>
> > If anybody wants to give it a try you can use [2] to quickly deploy
> > 2 gw hosts + 2 "hv" hosts + 1 service host (accessible through an
> > 'external' network via gw1 and gw2); see the ASCII diagram [3] and
> > [4] for details.
> >
> >
> > Then you can ping the external service from a port in hv1 with:
> >
> >     $ vagrant ssh hv1 -c "sudo ip netns exec vm1 ping 10.0.0.111"
> >
> > or vm3 via its floating IP with:
> >
> >     $ vagrant ssh svc1 -c "ping 10.0.0.16"
> >
> > you can trigger a failover anytime by doing:
> >
> >     $ vagrant ssh gw1 -c "sudo ifdown eth1"
> >
> > and a failback, by:
> >
> >      $ vagrant ssh gw1 -c "sudo ifup eth1"
> >
> >
> >
> > We are currently working on:
> >
> > 1) Addressing the monitoring of the inter-gateway BFD, to make sure
> >    that non-master routers will drop any packet (external/internal)
> >    or any ARP request.
> > 2) Same as 1, but for sending gARPs when a router is in a new
> >    chassis.
> > 3) Documentation changes.
> > 4) Tests
> >
> > And we have some questions:
> >
> >     About preemption (see failover/failback example above), we have
> > several options:
> >    a) we stick to preemptive failbacks (if a gateway chassis comes
> >       back online, the routers which were scheduled there will come
> >       back),
> >    b) non-preemptive (when a chassis goes down, all logical router
> >       ports with redirect chassis will be recalculated), or
> >    c) we make it configurable.
> >
> > My intuition says that with very low failover times "a" could be a
> > reasonable thing for most cases, since your load stays balanced when
> > your gateway chassis comes back.  But I'm not an operator; how could
> > we gather feedback on this area?
>
> Good question and good point about wanting to ensure load remains
> balanced when a chassis comes back.
>
> With (a), I'd be worried about the case where a chassis is in more of
> a half-dead (zombie?) state.


right, good point.


> We don't want failover bouncing back and
> forth because we keep thinking a chassis is going up and down.  Any
> thoughts on how to mitigate this?
>

We could make the preemptive behaviour a little bit more complex by
adding a recovery time; during that recovery time the failover would be
non-preemptive.
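
To make the recovery-time idea a bit more concrete, here is a rough
Python sketch of the selection logic only. It is not taken from the
patches in [1]; the names (Gateway, pick_active, recovery_time) are
hypothetical, and it assumes that a higher priority number means "more
preferred", which we have not settled yet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    name: str
    priority: int          # assumption: higher number = more preferred
    up: bool = True
    up_since: float = 0.0  # time at which the gateway was last seen coming up


def pick_active(gateways: List[Gateway], current: Optional[str],
                now: float, recovery_time: float) -> Optional[str]:
    """Return the name of the gateway that should be active at time `now`."""
    alive = [g for g in gateways if g.up]
    if not alive:
        return None

    cur = next((g for g in alive if g.name == current), None)
    if cur is None:
        # No live active gateway: fail over immediately to the best one alive.
        return max(alive, key=lambda g: g.priority).name

    # The active gateway is healthy: only let gateways that have been up
    # for at least `recovery_time` preempt it.
    stable = [g for g in alive if now - g.up_since >= recovery_time]
    candidates = stable if cur in stable else stable + [cur]
    return max(candidates, key=lambda g: g.priority).name
```

With a recovery time of 0 this degenerates to plain preemptive
failback; with a very large one it behaves as non-preemptive, so a
single knob could cover options a), b) and c).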

For non-preemptive, I was considering that northd could see the new
chassis binding to the redirect port and reorder the priorities in SBDB
and NBDB.
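
As a sketch of what that reordering could look like on the option
string itself (again hypothetical: the helper names are made up, it
assumes explicit priorities in the gw1:10,gw2:20,gw3:30 form, and it
assumes a higher number is preferred):

```python
from typing import Dict


def parse_chassis_option(value: str) -> Dict[str, int]:
    """Turn 'gw1:10,gw2:20' into {'gw1': 10, 'gw2': 20}."""
    priorities = {}
    for item in value.split(","):
        name, _, prio = item.strip().partition(":")
        priorities[name] = int(prio)
    return priorities


def promote(priorities: Dict[str, int], bound: str) -> Dict[str, int]:
    """Give `bound` the top priority, keeping the others' relative order.

    The same set of priority values is reused, just redistributed.
    """
    values = sorted(priorities.values(), reverse=True)
    others = sorted((n for n in priorities if n != bound),
                    key=lambda n: priorities[n], reverse=True)
    return dict(zip([bound] + others, values))


def format_chassis_option(priorities: Dict[str, int]) -> str:
    """Serialize back to the 'name:priority,...' form, sorted by name."""
    return ",".join(f"{n}:{p}" for n, p in sorted(priorities.items()))
```

For example, if gw1 ends up bound to the redirect port after gw3 and
gw2 failed, promoting gw1 in "gw1:10,gw2:20,gw3:30" would give
"gw1:30,gw2:10,gw3:20", so gw1 stays active when the others return.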


>
> >
> > Best regards,
> > Miguel Ángel Ajo
> >
> > [1] https://github.com/mangelajo/ovs/commits/l3ha
> > [2] https://github.com/mangelajo/vagrants/tree/master/ovn-l3-ha
> > [3] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/Vagrantfile#L16
> > [4] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/gw1.sh#L67
> >
> > On Fri, Apr 7, 2017 at 9:14 AM, Miguel Angel Ajo Pelayo <[email protected]> wrote:
> >
> >> Updating what I wrote yesterday (I hope I won't make people's
> >> eyes hurt today) after a talk on IRC (thank you Mickey Spiegel
> >> and Gurucharan Shetty):
> >>
> >> I propose having:
> >>
> >>    1) chassis on NB/Logical_Router accept multiple chassis, to cover
> >> HA on the centralized gateway case for DNAT/SNAT.
> >>
> >>            ovn-nbctl create Logical_Router name=edge1 \
> >>                      options:chassis=gw1:10,gw2:20,gw3:30
> >>
> >>         Or multiple chassis without priorities:
> >>
> >>            ovn-nbctl create Logical_Router name=edge1 \
> >>                      options:chassis=gw1,gw2,gw3
> >>
> >>         and in this case we let ovn decide -and rewrite the option-
> >>         to balance priorities between gateways to spread the load.
> >>
> >>    2) redirect-chassis on NB/Logical_Router_Port to accept multiple
> >> chassis to cover HA for centralized SNAT on distributed routers.
> >>
> >>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>                   -- set Logical_Router_Port alice \
> >>                   options:redirect-chassis=gw1:10,gw2:20,gw3:30
> >>
> >>         (or again, without priorities)
> >>
> >>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>                   -- set Logical_Router_Port alice \
> >>                   options:redirect-chassis=gw1,gw2,gw3
> >>
> >>
> >> These logical model changes allow for Active/Active L3 when we have
> >> that implemented, for example by assigning the same priorities.
> >>
> >> Alternatively in such case we could add another option
> >> for case (1):  ha-chassis-mode=active_standby/active_active,
> >> and ha-redirect-mode=active_standby/active_active for case (2).
> >>
> >> For the dataplane implementation I propose following what [1] defines
> >> for Active/Standby per-router implementation, with BFD monitoring for
> >> tunnel endpoints, where the location of the master router is
> >> independently calculated at every chassis,  making the solution
> >> independent of the controller connection via SB database.
> >>
> >> There are, to start with, a few gaps that we still need to define
> >> properly:
> >>
> >> 1) I'd like to see reporting of the master gateway somehow through
> >> SB db [up to the NB db?], in a way that the administrator can inspect
> >> the system and see what its current state is.
> >>
> >> 2) While hypervisors will direct traffic to the calculated
> >> master router via the bundle action with the active_backup algorithm,
> >> I believe we can't have anything in OpenFlow to drop packets in the
> >> standby routers based on the inter-gateway link matrix status.
> >>
> >> 3) Other related changes in the SouthBound DB.
> >>
> >> Best regards,
> >> Miguel Ángel Ajo
> >>
> >> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
> >>
> >>
> >> On Thu, Apr 6, 2017 at 12:13 PM, Miguel Angel Ajo Pelayo <[email protected]> wrote:
> >>
> >>> Hello everybody,
> >>>
> >>>      First I'd like to say hello, because I'll be able to spend more
> >>> time working with this community, and I'm sure it will be an
> >>> enjoyable journey from what I've seen (silently watching) during the
> >>> last few years.
> >>>
> >>>      I'm planning to start work (together with Anil) on the L3 High
> >>> availability area of OVN. We've been reading [1], and it seems quite
> >>> reasonable.
> >>>
> >>>      We're wondering whether to fast-forward and skip the naive
> >>> active/backup implementation in favor of the Active/Standby (per
> >>> router) one based on bfd + bundle(active_backup) output actions,
> >>> since the proposal of having ovn-northd monitor the gateways seems
> >>> a bit unnatural, and the difference in effort (naive vs
> >>> active/standby) is probably not very big (warning: I tend to be
> >>> optimistic).
> >>>
> >>>      I spent a couple of days looking at how L3 works now, and, very
> >>> naively, I would propose having the redirect-chassis option of
> >>> Logical_Router_Ports accept multiple chassis with priorities.
> >>>
> >>>     For example:
> >>>
> >>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>>              -- set Logical_Router_Port alice \
> >>>              options:redirect-chassis=gw1:10,gw2:20,gw3:30
> >>>
> >>> Or multiple chassis without priorities:
> >>>
> >>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
> >>>              -- set Logical_Router_Port alice \
> >>>              options:redirect-chassis=gw1,gw2,gw3
> >>>
> >>>         (and in this case we let OVN decide, and maybe rewrite the
> >>> option, how to balance priorities between gateways to spread the
> >>> load)
> >>>
> >>>
> >>>         We may want to have another field in the Logical_Router_Port
> >>> to let us know which one(s) is (are) the active gateway(s).
> >>>
> >>>
> >>>        This logical model would also allow for Active/Active L3 when
> >>> we have that implemented, for example by assigning the same
> >>> priorities.
> >>>
> >>>
> >>>        Alternatively we could have two options:
> >>>           * ha-redirect-chassis=<chassis>:<priority>[ .. :<chassis2>:<priority2>]
> >>>           * ha-redirect-mode=active_standby/active_active
> >>>
> >>>
> >>> Best regards,
> >>> Miguel Ángel Ajo
> >>>
> >>> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
> >>>
> >>>
> >>
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
>
> --
> Russell Bryant
>
