On Tue, Mar 5, 2019 at 2:00 PM Han Zhou <[email protected]> wrote:

> On Thu, Feb 28, 2019 at 12:59 AM Numan Siddique <[email protected]>
> wrote:
> >
> >
> >
> > On Thu, Feb 28, 2019 at 1:50 AM Han Zhou <[email protected]> wrote:
> >>
> >> On Wed, Feb 27, 2019 at 2:45 AM <[email protected]> wrote:
> >> >
> >> > From: Numan Siddique <[email protected]>
> >> >
> >> > This patch adds the tables - 'HA_Chassis_Group' and 'HA_Chassis' in
> >> > both OVN Northbound and Southbound DBs to support generic HA Chassis
> >> > groups in OVN. CMS can create a group of HA chassis with the
> priorities
> >> > assigned to each chassis in the group. An HA chassis group can be
> associated to
> >> > a distributed logical router port. An upcoming patch will make
> >> > use of it while supporting  'external'* logical ports.
> >> >
> >> > HA chassis group is similar to the existing gateway chassis support in
> >> > OVN which is used by the distributed gateway router ports.
> >> > This patch tries to abstract this so that, the HA chassis support
> >> > can be leveraged by not just distributed gateway router ports.
> >> >
> >> > If a logical router port has a set of gateway chassis associated to
> >> > it, ovn-northd will create HA chassis group in Southbound
> >> > DB and add these gateway chassis to this group. ovn-northd would
> still create
> >> > gateway chassis in Southbound DB as ovn-controller still doesn't
> support
> >> > using the HA chassis group.
> >> >
> >> > Next patch in the series will add the support in ovn-controller to
> >> > make use of HA chassis group instead of gateway chassis. The patch
> following
> >> > that will delete creation of gateway chassis in Southbound DB.
> >> >
> >> > HA_Chasss_Group table in Southbound DB has a column - 'ref_chassis'.
> >> > This column is used to store the list of chassis which references the
> >> > HA chassis group. This information will be used by ovn-controller in
> an
> >> > upcoming patch to establish BFD sessions with the required chassis.
> >> >
> >> > Suppose if there is an HA chassis group - 'hagrp1' in the Southbound
> >> > DB and it has HA chasiss list - ha1, ha2 and ha3 and this HA chassis
> >> > group is used by a distributed logical router port, then ovn-northd
> >> > will update the 'ref_chassis' with the list of chassis which has
> claimed
> >> > all the logical switch ports which are connected to the logical router
> >> > which has this distributed logical router port.
> >> >
> >>
> >> I haven't finished the detailed review, but a concern for the design
> >> here. It seems the benefit of the *group* is to be able to create it
> >> once and reused by different logical router ports (or external ports).
> >> However, the best practice (at least ours) is to distribute logical
> >> router ports evenly on all available gateway chassises, which means
> >> from different lrp's point of view, the priority of each gateway
> >> chassis is different. E.g. there are 2 GWs as a HA group (gw1, gw2),
> >> lrp1 uses gw1 as active (gw1 - priority 2, gw2 - priority 1) and lrp2
> >> uses gw2 as active (gw1 - priority 1, gw2 - priority 2). I understand
> >> that the use case for external ports may be different, since as we
> >> discussed in your earlier patch reviews that it is better to have only
> >> one dedicated node to host all external ports to avoid MAC flapping
> >> problem, so the HA group does simplify the external port HA support
> >> from CMS point of view. However for gateway scenario, to use the HA
> >> group while balancing the gateway load, we will have to create
> >> multiple HA groups just to assign different priority combinations,
> >> which becomes much more complex and inconvenient comparing with
> >> current gateway_chassis implementation (without group). Do you have
> >> any thoughts on this?
> >
> >
> > Yes. I thought about this.
> > With the existing gateway_chassis approach, CMS would do something like
> >
> > ****
> > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw1 20
> > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw2 30
> >
> > ovn-nbctl lrp-set-gateway-chassis lr0-p2 gw1 30
> > ovn-nbctl lrp-set-gateway-chassis lr0-p2 gw2 20
> > *****
> >
> > With the proposed approach it would be something like
> >
> > *****
> > ovn-nbctl  ha-chassis-group-add hagrp1
> > ovn-nbctl ha-chassis-group-add-chassis hagrp1 gw1 20
> > ovn-nbctl ha-chassis-group-add-chassis hagrp1 gw2 30
> > ovn-nbctl set logical_router_port lr0-p1 ha_chassis_group=hagrp1
> >
> > ovn-nbctl ha-chassis-group-add hagrp2
> > ovn-nbctl ha-chassis-group-add-chassis hagrp2 gw1 30
> > ovn-nbctl ha-chassis-group-add-chassis hagrp2 gw2 20
> > ovn-nbctl set logical_router_port lr0-p2 ha_chassis_group=hagrp2
> > *****
> >
> > I agree that from CMS perspective, it requires a bit more of work.
> > Is that a concern ?
> >
> > The proposed solution is completely backward compatible. CMS can still
> create
> > gateway_chassis. ovn-northd internally creates ha_chassis_group in
> Southbound DB corresponding
> > to the set of gateway_chassis associated to a logical rotuter port.
> >
> > If CMS does the below
> > ****
> > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw1 20
> > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw2 30
> > ****
> >
> > ovn-northd will create one ha_chassis_group in SB DB with the name
> "lr0-p1" and add 2 ha chassis to this group
> > and associate this group to Port_Binding.Ha_Chassis_Group.
> >
> > We have few options here
> >   1. Allow CMS to use either of the approaches but deprecate the usage
> of gateway_chassis in NB DB (which is the proposed solution)
> >   2. Allow CMS to use either of the approaches. So we will not deprecate
> the usage of gateway_chassis in NB DB.
> >   3. Don't use ha_chassis_group in NB DB for distributed logical router
> ports. Internally ovn-northd will create ha_chassis_group in SB DB. So that
> >       HA implementation in ovn-controller is simple and generic.
> >
> > Regardless of any approach we take, I think it would be better  in SB DB
> to use HA_Chassis_Group for both logical router ports and logical switch
> ports
> > of type 'external'.
> >
> > My preference would be take (2).
> >
> > Please let me know your thoughts on this.
> >
> > Thanks
> > Numan
> >
>
> Numan, I prefer option (2), too.
>
> I am still reviewing the code. It is a big change :) I think here is a
> major finding worth posting before I finish the full review.
> This change introduces ref_chassis concept, which intends to figure
> out the clients relying on the service provided by a HA group, so that
> BFD sessions will be enabled between the clients and servers only.
> However, in build_ha_chassis_group_ref_chassis(), it only checks
> directly related datapath, i.e. logical switch ports directly
> connected to logical routers. It will prevent a chassis with a logical
> switch port in-directly connects to a logical router to enable BFD
> with the gateway that hosts the logical router port. E.g. lsp1(on
> Chassis1) - ls1 -  lr1 - ls2 - lr2 - lrp2 (on HA-GRP1). In this
> example, Chassis1 should be in the ref_chassis of HA-GRP1.
>

Thanks Han for pointing this out. I didn't even think of this use case. The
existing
test cases passed. So I thought I haven't broken anything.

I have addressed this in v3 and posted it here -
https://patchwork.ozlabs.org/project/openvswitch/list/?series=95504

Thanks for the review.

Numan
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to