On Thu, Feb 28, 2019 at 12:59 AM Numan Siddique <[email protected]> wrote: > > > > On Thu, Feb 28, 2019 at 1:50 AM Han Zhou <[email protected]> wrote: >> >> On Wed, Feb 27, 2019 at 2:45 AM <[email protected]> wrote: >> > >> > From: Numan Siddique <[email protected]> >> > >> > This patch adds the tables - 'HA_Chassis_Group' and 'HA_Chassis' in >> > both OVN Northbound and Southbound DBs to support generic HA Chassis >> > groups in OVN. CMS can create a group of HA chassis with the priorities >> > assigned to each chassis in the group. An HA chassis group can be >> > associated to >> > a distributed logical router port. An upcoming patch will make >> > use of it while supporting 'external'* logical ports. >> > >> > HA chassis group is similar to the existing gateway chassis support in >> > OVN which is used by the distributed gateway router ports. >> > This patch tries to abstract this so that, the HA chassis support >> > can be leveraged by not just distributed gateway router ports. >> > >> > If a logical router port has a set of gateway chassis associated to >> > it, ovn-northd will create HA chassis group in Southbound >> > DB and add these gateway chassis to this group. ovn-northd would still >> > create >> > gateway chassis in Southbound DB as ovn-controller still doesn't support >> > using the HA chassis group. >> > >> > Next patch in the series will add the support in ovn-controller to >> > make use of HA chassis group instead of gateway chassis. The patch >> > following >> > that will delete creation of gateway chassis in Southbound DB. >> > >> > HA_Chasss_Group table in Southbound DB has a column - 'ref_chassis'. >> > This column is used to store the list of chassis which references the >> > HA chassis group. This information will be used by ovn-controller in an >> > upcoming patch to establish BFD sessions with the required chassis. >> > >> > Suppose if there is an HA chassis group - 'hagrp1' in the Southbound >> > DB and it has HA chasiss list - ha1, ha2 and ha3 and this HA chassis >> > group is used by a distributed logical router port, then ovn-northd >> > will update the 'ref_chassis' with the list of chassis which has claimed >> > all the logical switch ports which are connected to the logical router >> > which has this distributed logical router port. >> > >> >> I haven't finished the detailed review, but a concern for the design >> here. It seems the benefit of the *group* is to be able to create it >> once and reused by different logical router ports (or external ports). >> However, the best practice (at least ours) is to distribute logical >> router ports evenly on all available gateway chassises, which means >> from different lrp's point of view, the priority of each gateway >> chassis is different. E.g. there are 2 GWs as a HA group (gw1, gw2), >> lrp1 uses gw1 as active (gw1 - priority 2, gw2 - priority 1) and lrp2 >> uses gw2 as active (gw1 - priority 1, gw2 - priority 2). I understand >> that the use case for external ports may be different, since as we >> discussed in your earlier patch reviews that it is better to have only >> one dedicated node to host all external ports to avoid MAC flapping >> problem, so the HA group does simplify the external port HA support >> from CMS point of view. However for gateway scenario, to use the HA >> group while balancing the gateway load, we will have to create >> multiple HA groups just to assign different priority combinations, >> which becomes much more complex and inconvenient comparing with >> current gateway_chassis implementation (without group). Do you have >> any thoughts on this? > > > Yes. I thought about this. > With the existing gateway_chassis approach, CMS would do something like > > **** > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw1 20 > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw2 30 > > ovn-nbctl lrp-set-gateway-chassis lr0-p2 gw1 30 > ovn-nbctl lrp-set-gateway-chassis lr0-p2 gw2 20 > ***** > > With the proposed approach it would be something like > > ***** > ovn-nbctl ha-chassis-group-add hagrp1 > ovn-nbctl ha-chassis-group-add-chassis hagrp1 gw1 20 > ovn-nbctl ha-chassis-group-add-chassis hagrp1 gw2 30 > ovn-nbctl set logical_router_port lr0-p1 ha_chassis_group=hagrp1 > > ovn-nbctl ha-chassis-group-add hagrp2 > ovn-nbctl ha-chassis-group-add-chassis hagrp2 gw1 30 > ovn-nbctl ha-chassis-group-add-chassis hagrp2 gw2 20 > ovn-nbctl set logical_router_port lr0-p2 ha_chassis_group=hagrp2 > ***** > > I agree that from CMS perspective, it requires a bit more of work. > Is that a concern ? > > The proposed solution is completely backward compatible. CMS can still create > gateway_chassis. ovn-northd internally creates ha_chassis_group in Southbound > DB corresponding > to the set of gateway_chassis associated to a logical rotuter port. > > If CMS does the below > **** > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw1 20 > ovn-nbctl lrp-set-gateway-chassis lr0-p1 gw2 30 > **** > > ovn-northd will create one ha_chassis_group in SB DB with the name "lr0-p1" > and add 2 ha chassis to this group > and associate this group to Port_Binding.Ha_Chassis_Group. > > We have few options here > 1. Allow CMS to use either of the approaches but deprecate the usage of > gateway_chassis in NB DB (which is the proposed solution) > 2. Allow CMS to use either of the approaches. So we will not deprecate the > usage of gateway_chassis in NB DB. > 3. Don't use ha_chassis_group in NB DB for distributed logical router > ports. Internally ovn-northd will create ha_chassis_group in SB DB. So that > HA implementation in ovn-controller is simple and generic. > > Regardless of any approach we take, I think it would be better in SB DB to > use HA_Chassis_Group for both logical router ports and logical switch ports > of type 'external'. > > My preference would be take (2). > > Please let me know your thoughts on this. > > Thanks > Numan >
Numan, I prefer option (2), too. I am still reviewing the code. It is a big change :) I think here is a major finding worth posting before I finish the full review. This change introduces ref_chassis concept, which intends to figure out the clients relying on the service provided by a HA group, so that BFD sessions will be enabled between the clients and servers only. However, in build_ha_chassis_group_ref_chassis(), it only checks directly related datapath, i.e. logical switch ports directly connected to logical routers. It will prevent a chassis with a logical switch port in-directly connects to a logical router to enable BFD with the gateway that hosts the logical router port. E.g. lsp1(on Chassis1) - ls1 - lr1 - ls2 - lr2 - lrp2 (on HA-GRP1). In this example, Chassis1 should be in the ref_chassis of HA-GRP1. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
