On Thu, Jan 24, 2019 at 10:56 AM Numan Siddique <[email protected]> wrote: > > > > On Thu, Jan 24, 2019 at 9:20 PM Han Zhou <[email protected]> wrote: >> >> On Mon, Jan 21, 2019 at 7:06 AM Miguel Angel Ajo Pelayo >> <[email protected]> wrote: >> > >> > >> > >> > On Mon, Jan 21, 2019 at 4:02 PM Numan Siddique <[email protected]> wrote: >> >> >> >> >> >> Hi Han, >> >> >> >> I have addressed your comments. But before posting the patch I wanted to >> >> get an opinion >> >> on the HA support for these external ports. >> >> >> >> The proposed patch doesn't support HA. If the requested chassis goes down >> >> for some reason, >> >> it is expected that the CMS would detect it and change the requested-chassis >> >> option to another >> >> suitable chassis. >> >> >> >> The OpenStack OVN folks think this would be too much for the CMS to >> >> handle and that it would >> >> complicate the code in networking-ovn, which I agree with. >> >> >> > >> > Not only the complexity part. If we implement this from the CMS, then >> > every CMS using OVN >> > will need to replicate that behaviour. >> > >> > That's in my opinion a good reason why it's better to handle HA within OVN >> > itself. >> > >> >> >> >> I am thinking to add the HA support along the lines of gateway chassis >> >> support, and I want to >> >> submit this patch after adding the HA support. I think this would be >> >> better as we won't add >> >> more options in OVN (first requested-chassis for external ports and then >> >> later HA chassis support). >> >> Thoughts? >> >> I thought it would be easier to support outside of OVN, combined with >> chassis life-cycle management, but I didn't go deeper into any CMS >> implementation. I agree it is better to handle HA in OVN than >> implementing it in every CMS. But I am also worrying about the >> complexity in OVN itself. Could you describe briefly how you would >> support it in OVN? For example, how to detect if a chassis failed? 
It >> is different from gateway chassis because the major use case of >> external ports is for bridged networks (vlan/flat), so I think the BFD >> mechanism for tunnel health monitoring may not be a good fit here. > > > At present, as you know, ovn-controllers do establish geneve tunnels > even if there are only logical switches representing bridged networks. > So I feel we can leverage that unless we have a better mechanism > for health monitoring. > > Do you have any thoughts/ideas on other possibilities? > > I am also trying to make the HA support more generic and a bit simpler > (hopefully :)) > so that it can be used either for "external" ports or for the redirect-chassis > router ports. > One example of why BFD monitoring may not be a good fit for external port HA: in a purely bridged environment there is a potential optimization to not create the tunnel mesh at all, but this BFD dependency makes that harder. However, it doesn't seem to be a strong blocker, since the optimization could selectively create tunnels.
If we have to implement HA in OVN, I don't have any better idea than BFD for now. It may work well in practice. We'd better hear opinions from more people. It may be worth abstracting the data plane monitoring mechanism and implementing new mechanisms in the future, but that doesn't need to be in this version of the code if BFD is the only mechanism for now anyway. Outside of OVN, I believe all chassis management systems should have their own ways of monitoring. Would it be sufficient just to provide an interface from OVN (and even the CMS) so that failures detected by external systems can be used to trigger the failover? In addition, the current priority-based gateway chassis HA mechanism has the split-brain problem upon network partitioning. This may be an independent topic ;) > > Thanks > Numan > >> Thanks, >> Han >> >> >> >> Thanks >> >> Numan >> >> >> >> >> >> On Sat, Jan 19, 2019 at 12:42 AM Numan Siddique <[email protected]> >> >> wrote: >> >>> >> >>> >> >>> >> >>> On Sat, Jan 19, 2019, 12:32 AM Han Zhou <[email protected]> wrote: >> >>>> >> >>>> On Fri, Jan 18, 2019 at 10:16 AM Numan Siddique <[email protected]> >> >>>> wrote: >> >>>> > >> >>>> > >> >>>> > >> >>>> > On Fri, Jan 18, 2019 at 2:11 AM Han Zhou <[email protected]> wrote: >> >>>> >> >> >>>> >> On Thu, Jan 17, 2019 at 11:32 AM Han Zhou <[email protected]> wrote: >> >>>> >> > >> >>>> >> > On Thu, Jan 17, 2019 at 11:25 AM Numan Siddique >> >>>> >> > <[email protected]> wrote: >> >>>> >> > > >> >>>> >> > > >> >>>> >> > > >> >>>> >> > > On Fri, Jan 18, 2019 at 12:21 AM Han Zhou <[email protected]> >> >>>> >> > > wrote: >> >>>> >> > >> >> >>>> >> > >> Hi Numan, >> >>>> >> > >> >> >>>> >> > >> With v5 the new test case "external logical port" fails. >> >>>> >> > >> And please see more comments inlined. 
>> >>>> >> > >> >> >>>> >> > >> On Tue, Jan 15, 2019 at 12:09 PM <[email protected]> wrote: >> >>>> >> > >> > >> >>>> >> > >> > From: Numan Siddique <[email protected]> >> >>>> >> > >> > >> >>>> >> > >> > In the case of OpenStack + OVN, when the VMs are booted on >> >>>> >> > >> > hypervisors supporting SR-IOV nics, there are no OVS ports >> >>>> >> > >> > for these VMs. When these VMs sends DHCPv4, DHPCv6 or IPv6 >> >>>> >> > >> > Router Solicitation requests, the local ovn-controller >> >>>> >> > >> > cannot reply to these packets. OpenStack Neutron dhcp agent >> >>>> >> > >> > service needs to be run to serve these requests. >> >>>> >> > >> > >> >>>> >> > >> > With the new logical port type - 'external', OVN itself can >> >>>> >> > >> > handle these requests avoiding the need to deploy any >> >>>> >> > >> > external services like neutron dhcp agent. >> >>>> >> > >> > >> >>>> >> > >> > To make use of this feature, CMS has to >> >>>> >> > >> > - create a logical port for such VMs >> >>>> >> > >> > - set the type to 'external' >> >>>> >> > >> > - set requested-chassis="<chassis-name>" in the options >> >>>> >> > >> > column. >> >>>> >> > >> > - create a localnet port for the logical switch >> >>>> >> > >> > - configure the ovn-bridge-mappings option in the OVS db. >> >>>> >> > >> > >> >>>> >> > >> > When the ovn-controller running in that 'chassis', detects >> >>>> >> > >> > the Port_Binding row, it adds the necessary DHCPv4/v6 OF >> >>>> >> > >> > flows. Since the packet enters the logical switch pipeline >> >>>> >> > >> > via the localnet port, the inport register (reg14) is set >> >>>> >> > >> > to the tunnel key of localnet port in the match conditions. >> >>>> >> > >> > >> >>>> >> > >> > In case the chassis goes down for some reason, it is the >> >>>> >> > >> > responsibility of CMS to change the 'requested-chassis' >> >>>> >> > >> > option to some other active chassis, so that it can serve >> >>>> >> > >> > these requests. 
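[Editorial note: as a concrete illustration of the CMS steps the quoted commit message lists, a minimal setup could look like the commands below. This is a sketch, not part of the patch; the switch, port, chassis, network, and bridge names (sw0, sw0-ext1, hv1, physnet1, br-ex) are hypothetical.]

```sh
# Create the logical switch and an 'external' port for the SR-IOV VM,
# requesting that chassis hv1 handle its DHCP/RA traffic.
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-ext1
ovn-nbctl lsp-set-type sw0-ext1 external
ovn-nbctl lsp-set-addresses sw0-ext1 "f0:00:00:00:00:01 10.0.0.10"
ovn-nbctl lsp-set-options sw0-ext1 requested-chassis=hv1

# Create the localnet port through which the external port's traffic
# enters the logical switch pipeline.
ovn-nbctl lsp-add sw0 ln-public
ovn-nbctl lsp-set-type ln-public localnet
ovn-nbctl lsp-set-options ln-public network_name=physnet1

# On each chassis, map the physical network name to a local bridge.
ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-ex
```

On failover, per the commit message, the CMS would point options:requested-chassis at another active chassis.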
>> >>>> >> > >> > >> >>>> >> > >> > When the VM with the external port, sends an ARP request for >> >>>> >> > >> > the router ips, only the chassis which has claimed the port, >> >>>> >> > >> > will reply to the ARP requests. Rest of the chassis on >> >>>> >> > >> > receiving these packets drop them in the ingress switch >> >>>> >> > >> > datapath stage - S_SWITCH_IN_EXTERNAL_PORT which is just >> >>>> >> > >> > before S_SWITCH_IN_L2_LKUP. >> >>>> >> > >> > >> >>>> >> > >> > This would guarantee that only the chassis which has claimed >> >>>> >> > >> > the external ports will run the router datapath pipeline. >> >>>> >> > >> > >> >>>> >> > >> > Signed-off-by: Numan Siddique <[email protected]> >> >>>> >> > >> > --- >> >>>> >> > >> > >> >>>> >> > >> > v4 -> v5 >> >>>> >> > >> > ------ >> >>>> >> > >> > * Addressed review comments from Han Zhou. >> >>>> >> > >> > >> >>>> >> > >> > v3 -> v4 >> >>>> >> > >> > ------ >> >>>> >> > >> > * Updated the documention as per Han Zhou's suggestion. >> >>>> >> > >> > >> >>>> >> > >> > v2 -> v3 >> >>>> >> > >> > ------- >> >>>> >> > >> > * Rebased >> >>>> >> > >> > >> >>>> >> > >> > ovn/controller/binding.c | 12 + >> >>>> >> > >> > ovn/controller/lflow.c | 41 ++- >> >>>> >> > >> > ovn/controller/lflow.h | 2 + >> >>>> >> > >> > ovn/controller/lport.c | 26 ++ >> >>>> >> > >> > ovn/controller/lport.h | 5 + >> >>>> >> > >> > ovn/controller/ovn-controller.c | 6 + >> >>>> >> > >> > ovn/lib/ovn-util.c | 1 + >> >>>> >> > >> > ovn/northd/ovn-northd.8.xml | 37 ++- >> >>>> >> > >> > ovn/northd/ovn-northd.c | 85 ++++- >> >>>> >> > >> > ovn/ovn-architecture.7.xml | 78 +++++ >> >>>> >> > >> > ovn/ovn-nb.xml | 47 +++ >> >>>> >> > >> > tests/ovn.at | 530 >> >>>> >> > >> > +++++++++++++++++++++++++++++++- >> >>>> >> > >> > 12 files changed, 848 insertions(+), 22 deletions(-) >> >>>> >> > >> > >> >>>> >> > >> > diff --git a/ovn/controller/binding.c >> >>>> >> > >> > b/ovn/controller/binding.c >> >>>> >> > >> > index 021ecddcf..64e605b92 100644 >> 
>>>> >> > >> > --- a/ovn/controller/binding.c >> >>>> >> > >> > +++ b/ovn/controller/binding.c >> >>>> >> > >> > @@ -471,6 +471,18 @@ consider_local_datapath(struct >> >>>> >> > >> > ovsdb_idl_txn *ovnsb_idl_txn, >> >>>> >> > >> > * for them. */ >> >>>> >> > >> > sset_add(local_lports, binding_rec->logical_port); >> >>>> >> > >> > our_chassis = false; >> >>>> >> > >> > + } else if (!strcmp(binding_rec->type, "external")) { >> >>>> >> > >> > + const char *chassis_id = >> >>>> >> > >> > smap_get(&binding_rec->options, >> >>>> >> > >> > + >> >>>> >> > >> > "requested-chassis"); >> >>>> >> > >> > + our_chassis = chassis_id && ( >> >>>> >> > >> > + !strcmp(chassis_id, chassis_rec->name) || >> >>>> >> > >> > + !strcmp(chassis_id, chassis_rec->hostname)); >> >>>> >> > >> > + if (our_chassis) { >> >>>> >> > >> > + add_local_datapath(sbrec_datapath_binding_by_key, >> >>>> >> > >> > + >> >>>> >> > >> > sbrec_port_binding_by_datapath, >> >>>> >> > >> > + sbrec_port_binding_by_name, >> >>>> >> > >> > + binding_rec->datapath, true, >> >>>> >> > >> > local_datapaths); >> >>>> >> > >> > + } >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > if (our_chassis >> >>>> >> > >> > diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c >> >>>> >> > >> > index 8db81927e..98e8ed3b9 100644 >> >>>> >> > >> > --- a/ovn/controller/lflow.c >> >>>> >> > >> > +++ b/ovn/controller/lflow.c >> >>>> >> > >> > @@ -52,7 +52,10 @@ lflow_init(void) >> >>>> >> > >> > struct lookup_port_aux { >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath; >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_name; >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type; >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key; >> >>>> >> > >> > const struct sbrec_datapath_binding *dp; >> >>>> >> > >> > + const struct sbrec_chassis *chassis; >> >>>> >> > >> > }; >> >>>> >> > >> > >> >>>> >> > >> > struct condition_aux { 
>> >>>> >> > >> > @@ -66,6 +69,8 @@ static void consider_logical_flow( >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_chassis_by_name, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_name, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> >>>> >> > >> > const struct sbrec_logical_flow *, >> >>>> >> > >> > const struct hmap *local_datapaths, >> >>>> >> > >> > const struct sbrec_chassis *, >> >>>> >> > >> > @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const >> >>>> >> > >> > char *port_name, unsigned int *portp) >> >>>> >> > >> > const struct sbrec_port_binding *pb >> >>>> >> > >> > = >> >>>> >> > >> > lport_lookup_by_name(aux->sbrec_port_binding_by_name, >> >>>> >> > >> > port_name); >> >>>> >> > >> > if (pb && pb->datapath == aux->dp) { >> >>>> >> > >> > - *portp = pb->tunnel_key; >> >>>> >> > >> > - return true; >> >>>> >> > >> > + if (strcmp(pb->type, "external")) { >> >>>> >> > >> > + *portp = pb->tunnel_key; >> >>>> >> > >> > + return true; >> >>>> >> > >> > + } >> >>>> >> > >> > + const char *chassis_id = smap_get(&pb->options, >> >>>> >> > >> > + >> >>>> >> > >> > "requested-chassis"); >> >>>> >> > >> > + if (chassis_id && (!strcmp(chassis_id, >> >>>> >> > >> > aux->chassis->name) || >> >>>> >> > >> > + !strcmp(chassis_id, >> >>>> >> > >> > aux->chassis->hostname))) { >> >>>> >> > >> > + const struct sbrec_port_binding *localnet_pb >> >>>> >> > >> > + = >> >>>> >> > >> > lport_lookup_by_type(aux->sbrec_datapath_binding_by_key, >> >>>> >> > >> > + >> >>>> >> > >> > aux->sbrec_port_binding_by_type, >> >>>> >> > >> > + aux->dp->tunnel_key, >> >>>> >> > >> > "localnet"); >> >>>> >> > >> > + if (localnet_pb) { >> >>>> >> > >> > + *portp = localnet_pb->tunnel_key; >> >>>> >> > >> > + return true; >> >>>> >> > >> > + } >> >>>> >> > >> > + } >> 
>>>> >> > >> > + return false; >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > const struct sbrec_multicast_group *mg = >> >>>> >> > >> > mcgroup_lookup_by_dp_name( >> >>>> >> > >> > @@ -144,6 +165,8 @@ add_logical_flows( >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_chassis_by_name, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_name, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> >>>> >> > >> > const struct sbrec_dhcp_options_table >> >>>> >> > >> > *dhcp_options_table, >> >>>> >> > >> > const struct sbrec_dhcpv6_options_table >> >>>> >> > >> > *dhcpv6_options_table, >> >>>> >> > >> > const struct sbrec_logical_flow_table >> >>>> >> > >> > *logical_flow_table, >> >>>> >> > >> > @@ -183,6 +206,8 @@ add_logical_flows( >> >>>> >> > >> > consider_logical_flow(sbrec_chassis_by_name, >> >>>> >> > >> > >> >>>> >> > >> > sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > sbrec_port_binding_by_name, >> >>>> >> > >> > + sbrec_port_binding_by_type, >> >>>> >> > >> > + sbrec_datapath_binding_by_key, >> >>>> >> > >> > lflow, local_datapaths, >> >>>> >> > >> > chassis, &dhcp_opts, >> >>>> >> > >> > &dhcpv6_opts, &nd_ra_opts, >> >>>> >> > >> > addr_sets, port_groups, >> >>>> >> > >> > active_tunnels, >> >>>> >> > >> > @@ -200,6 +225,8 @@ consider_logical_flow( >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_chassis_by_name, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_name, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> >>>> >> > >> > const struct sbrec_logical_flow *lflow, >> >>>> >> > >> > const struct hmap *local_datapaths, 
>> >>>> >> > >> > const struct sbrec_chassis *chassis, >> >>>> >> > >> > @@ -292,7 +319,10 @@ consider_logical_flow( >> >>>> >> > >> > .sbrec_multicast_group_by_name_datapath >> >>>> >> > >> > = sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > .sbrec_port_binding_by_name = >> >>>> >> > >> > sbrec_port_binding_by_name, >> >>>> >> > >> > - .dp = lflow->logical_datapath >> >>>> >> > >> > + .sbrec_port_binding_by_type = >> >>>> >> > >> > sbrec_port_binding_by_type, >> >>>> >> > >> > + .sbrec_datapath_binding_by_key = >> >>>> >> > >> > sbrec_datapath_binding_by_key, >> >>>> >> > >> > + .dp = lflow->logical_datapath, >> >>>> >> > >> > + .chassis = chassis >> >>>> >> > >> > }; >> >>>> >> > >> > struct condition_aux cond_aux = { >> >>>> >> > >> > .sbrec_chassis_by_name = sbrec_chassis_by_name, >> >>>> >> > >> > @@ -463,6 +493,8 @@ void >> >>>> >> > >> > lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_name, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type, >> >>>> >> > >> > + struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_datapath_binding_by_key, >> >>>> >> > >> > const struct sbrec_dhcp_options_table >> >>>> >> > >> > *dhcp_options_table, >> >>>> >> > >> > const struct sbrec_dhcpv6_options_table >> >>>> >> > >> > *dhcpv6_options_table, >> >>>> >> > >> > const struct sbrec_logical_flow_table >> >>>> >> > >> > *logical_flow_table, >> >>>> >> > >> > @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_chassis_by_name, >> >>>> >> > >> > >> >>>> >> > >> > add_logical_flows(sbrec_chassis_by_name, >> >>>> >> > >> > sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > - sbrec_port_binding_by_name, >> >>>> >> > >> > dhcp_options_table, >> >>>> >> > >> > + sbrec_port_binding_by_name, >> >>>> >> > >> > sbrec_port_binding_by_type, >> >>>> >> > >> > + 
sbrec_datapath_binding_by_key, >> >>>> >> > >> > dhcp_options_table, >> >>>> >> > >> > dhcpv6_options_table, >> >>>> >> > >> > logical_flow_table, >> >>>> >> > >> > local_datapaths, chassis, addr_sets, >> >>>> >> > >> > port_groups, >> >>>> >> > >> > active_tunnels, local_lport_ids, >> >>>> >> > >> > flow_table, group_table, >> >>>> >> > >> > diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h >> >>>> >> > >> > index d19338140..b2911e0eb 100644 >> >>>> >> > >> > --- a/ovn/controller/lflow.h >> >>>> >> > >> > +++ b/ovn/controller/lflow.h >> >>>> >> > >> > @@ -68,6 +68,8 @@ void lflow_init(void); >> >>>> >> > >> > void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_port_binding_by_name, >> >>>> >> > >> > + struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_port_binding_by_type, >> >>>> >> > >> > + struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_datapath_binding_by_key, >> >>>> >> > >> > const struct sbrec_dhcp_options_table *, >> >>>> >> > >> > const struct sbrec_dhcpv6_options_table *, >> >>>> >> > >> > const struct sbrec_logical_flow_table *, >> >>>> >> > >> > diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c >> >>>> >> > >> > index cc5c5fbb2..9c827d9b0 100644 >> >>>> >> > >> > --- a/ovn/controller/lport.c >> >>>> >> > >> > +++ b/ovn/controller/lport.c >> >>>> >> > >> > @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_datapath_binding_by_key, >> >>>> >> > >> > return retval; >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > +const struct sbrec_port_binding * >> >>>> >> > >> > +lport_lookup_by_type(struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_datapath_binding_by_key, >> >>>> >> > >> > + struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_port_binding_by_type, >> >>>> >> > >> > + uint64_t dp_key, const char *port_type) >> >>>> >> > >> 
> +{ >> >>>> >> > >> > + /* Lookup datapath corresponding to dp_key. */ >> >>>> >> > >> > + const struct sbrec_datapath_binding *db = >> >>>> >> > >> > datapath_lookup_by_key( >> >>>> >> > >> > + sbrec_datapath_binding_by_key, dp_key); >> >>>> >> > >> > + if (!db) { >> >>>> >> > >> > + return NULL; >> >>>> >> > >> > + } >> >>>> >> > >> > + >> >>>> >> > >> > + /* Build key for an indexed lookup. */ >> >>>> >> > >> > + struct sbrec_port_binding *pb = >> >>>> >> > >> > sbrec_port_binding_index_init_row( >> >>>> >> > >> > + sbrec_port_binding_by_type); >> >>>> >> > >> > + sbrec_port_binding_index_set_datapath(pb, db); >> >>>> >> > >> > + sbrec_port_binding_index_set_type(pb, port_type); >> >>>> >> > >> > + >> >>>> >> > >> > + const struct sbrec_port_binding *retval = >> >>>> >> > >> > sbrec_port_binding_index_find( >> >>>> >> > >> > + sbrec_port_binding_by_type, pb); >> >>>> >> > >> > + >> >>>> >> > >> > + sbrec_port_binding_index_destroy_row(pb); >> >>>> >> > >> > + >> >>>> >> > >> > + return retval; >> >>>> >> > >> > +} >> >>>> >> > >> > + >> >>>> >> > >> > const struct sbrec_datapath_binding * >> >>>> >> > >> > datapath_lookup_by_key(struct ovsdb_idl_index >> >>>> >> > >> > *sbrec_datapath_binding_by_key, >> >>>> >> > >> > uint64_t dp_key) >> >>>> >> > >> > diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h >> >>>> >> > >> > index 7dcd5bee0..2d49792f6 100644 >> >>>> >> > >> > --- a/ovn/controller/lport.h >> >>>> >> > >> > +++ b/ovn/controller/lport.h >> >>>> >> > >> > @@ -42,6 +42,11 @@ const struct sbrec_port_binding >> >>>> >> > >> > *lport_lookup_by_key( >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_key, >> >>>> >> > >> > uint64_t dp_key, uint64_t port_key); >> >>>> >> > >> > >> >>>> >> > >> > +const struct sbrec_port_binding *lport_lookup_by_type( >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type, >> >>>> >> > >> > + uint64_t dp_key, const 
char *port_type); >> >>>> >> > >> > + >> >>>> >> > >> > const struct sbrec_datapath_binding *datapath_lookup_by_key( >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> >>>> >> > >> > uint64_t dp_key); >> >>>> >> > >> > >> >>>> >> > >> > diff --git a/ovn/controller/ovn-controller.c >> >>>> >> > >> > b/ovn/controller/ovn-controller.c >> >>>> >> > >> > index 4e9a5865f..5aab9142f 100644 >> >>>> >> > >> > --- a/ovn/controller/ovn-controller.c >> >>>> >> > >> > +++ b/ovn/controller/ovn-controller.c >> >>>> >> > >> > @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl >> >>>> >> > >> > *ovnsb_idl, >> >>>> >> > >> > * ports that have a Gateway_Chassis that point's to our >> >>>> >> > >> > own >> >>>> >> > >> > * chassis */ >> >>>> >> > >> > sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, >> >>>> >> > >> > "chassisredirect"); >> >>>> >> > >> > + sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, >> >>>> >> > >> > "external"); >> >>>> >> > >> > if (chassis) { >> >>>> >> > >> > /* This should be mostly redundant with the other >> >>>> >> > >> > clauses for port >> >>>> >> > >> > * bindings, but it allows us to catch any ports >> >>>> >> > >> > that are assigned to >> >>>> >> > >> > @@ -616,6 +617,9 @@ main(int argc, char *argv[]) >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_port_binding_by_datapath >> >>>> >> > >> > = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, >> >>>> >> > >> > >> >>>> >> > >> > &sbrec_port_binding_col_datapath); >> >>>> >> > >> > + struct ovsdb_idl_index *sbrec_port_binding_by_type >> >>>> >> > >> > + = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, >> >>>> >> > >> > + >> >>>> >> > >> > &sbrec_port_binding_col_type); >> >>>> >> > >> >> >>>> >> > >> This index is used with two columns: datapath_binding and type, >> >>>> >> > >> so it >> >>>> >> > >> should be created with both columns using create2. 
>> >>>> >> > >> >> >>>> >> > >> > struct ovsdb_idl_index *sbrec_datapath_binding_by_key >> >>>> >> > >> > = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, >> >>>> >> > >> > >> >>>> >> > >> > &sbrec_datapath_binding_col_tunnel_key); >> >>>> >> > >> > @@ -743,6 +747,8 @@ main(int argc, char *argv[]) >> >>>> >> > >> > sbrec_chassis_by_name, >> >>>> >> > >> > >> >>>> >> > >> > sbrec_multicast_group_by_name_datapath, >> >>>> >> > >> > sbrec_port_binding_by_name, >> >>>> >> > >> > + sbrec_port_binding_by_type, >> >>>> >> > >> > + sbrec_datapath_binding_by_key, >> >>>> >> > >> > >> >>>> >> > >> > sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl), >> >>>> >> > >> > >> >>>> >> > >> > sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl), >> >>>> >> > >> > >> >>>> >> > >> > sbrec_logical_flow_table_get(ovnsb_idl_loop.idl), >> >>>> >> > >> > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c >> >>>> >> > >> > index aa03919bb..a9d4b8736 100644 >> >>>> >> > >> > --- a/ovn/lib/ovn-util.c >> >>>> >> > >> > +++ b/ovn/lib/ovn-util.c >> >>>> >> > >> > @@ -319,6 +319,7 @@ static const char *OVN_NB_LSP_TYPES[] = { >> >>>> >> > >> > "localport", >> >>>> >> > >> > "router", >> >>>> >> > >> > "vtep", >> >>>> >> > >> > + "external", >> >>>> >> > >> > }; >> >>>> >> > >> > >> >>>> >> > >> > bool >> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.8.xml >> >>>> >> > >> > b/ovn/northd/ovn-northd.8.xml >> >>>> >> > >> > index 392a5efc9..c8883d60d 100644 >> >>>> >> > >> > --- a/ovn/northd/ovn-northd.8.xml >> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.8.xml >> >>>> >> > >> > @@ -626,7 +626,8 @@ nd_na_router { >> >>>> >> > >> > <p> >> >>>> >> > >> > This table adds the DHCPv4 options to a DHCPv4 packet >> >>>> >> > >> > from the >> >>>> >> > >> > logical ports configured with IPv4 address(es) and >> >>>> >> > >> > DHCPv4 options, >> >>>> >> > >> > - and similarly for DHCPv6 options. >> >>>> >> > >> > + and similarly for DHCPv6 options. 
This table also adds >> >>>> >> > >> > flows for the >> >>>> >> > >> > + logical ports of type <code>external</code>. >> >>>> >> > >> > </p> >> >>>> >> > >> > >> >>>> >> > >> > <ul> >> >>>> >> > >> > @@ -827,7 +828,39 @@ output; >> >>>> >> > >> > </li> >> >>>> >> > >> > </ul> >> >>>> >> > >> > >> >>>> >> > >> > - <h3>Ingress Table 16 Destination Lookup</h3> >> >>>> >> > >> > + <h3>Ingress table 16 External ports</h3> >> >>>> >> > >> > + >> >>>> >> > >> > + <p> >> >>>> >> > >> > + Traffic from the <code>external</code> logical ports >> >>>> >> > >> > enter the ingress >> >>>> >> > >> > + datapath pipeline via the <code>localnet</code> port. >> >>>> >> > >> > This table adds the >> >>>> >> > >> > + below logical flows to handle the traffic from these >> >>>> >> > >> > ports. >> >>>> >> > >> > + </p> >> >>>> >> > >> > + >> >>>> >> > >> > + <ul> >> >>>> >> > >> > + <li> >> >>>> >> > >> > + <p> >> >>>> >> > >> > + A priority-100 flow is added for each >> >>>> >> > >> > <code>external</code> logical >> >>>> >> > >> > + port which doesn't reside on a chassis to drop the >> >>>> >> > >> > ARP/IPv6 NS >> >>>> >> > >> > + request to the router IP(s) (of the logical >> >>>> >> > >> > switch) which matches >> >>>> >> > >> > + on the <code>inport</code> of the >> >>>> >> > >> > <code>external</code> logical port >> >>>> >> > >> > + and the valid <code>eth.src</code> address(es) of >> >>>> >> > >> > the >> >>>> >> > >> > + <code>external</code> logical port. >> >>>> >> > >> > + </p> >> >>>> >> > >> > + >> >>>> >> > >> > + <p> >> >>>> >> > >> > + This flow guarantees that the ARP/NS request to >> >>>> >> > >> > the router IP >> >>>> >> > >> > + address from the external ports is responded by >> >>>> >> > >> > only the chassis >> >>>> >> > >> > + which has claimed these external ports. All the >> >>>> >> > >> > other chassis, >> >>>> >> > >> > + drops these packets. 
>> >>>> >> > >> > + </p> >> >>>> >> > >> > + </li> >> >>>> >> > >> > + >> >>>> >> > >> > + <li> >> >>>> >> > >> > + A priority-0 flow that matches all packets to >> >>>> >> > >> > advances to table 17. >> >>>> >> > >> > + </li> >> >>>> >> > >> > + </ul> >> >>>> >> > >> > + >> >>>> >> > >> > + <h3>Ingress Table 17 Destination Lookup</h3> >> >>>> >> > >> > >> >>>> >> > >> > <p> >> >>>> >> > >> > This table implements switching behavior. It contains >> >>>> >> > >> > these logical >> >>>> >> > >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c >> >>>> >> > >> > index 3fd8a8757..87208c6c1 100644 >> >>>> >> > >> > --- a/ovn/northd/ovn-northd.c >> >>>> >> > >> > +++ b/ovn/northd/ovn-northd.c >> >>>> >> > >> > @@ -119,7 +119,8 @@ enum ovn_stage { >> >>>> >> > >> > PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 13, >> >>>> >> > >> > "ls_in_dhcp_response") \ >> >>>> >> > >> > PIPELINE_STAGE(SWITCH, IN, DNS_LOOKUP, 14, >> >>>> >> > >> > "ls_in_dns_lookup") \ >> >>>> >> > >> > PIPELINE_STAGE(SWITCH, IN, DNS_RESPONSE, 15, >> >>>> >> > >> > "ls_in_dns_response") \ >> >>>> >> > >> > - PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 16, >> >>>> >> > >> > "ls_in_l2_lkup") \ >> >>>> >> > >> > + PIPELINE_STAGE(SWITCH, IN, EXTERNAL_PORT, 16, >> >>>> >> > >> > "ls_in_external_port") \ >> >>>> >> > >> > + PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 17, >> >>>> >> > >> > "ls_in_l2_lkup") \ >> >>>> >> > >> > >> >>>> >> > >> > \ >> >>>> >> > >> > /* Logical switch egress stages. 
*/ >> >>>> >> > >> > \ >> >>>> >> > >> > PIPELINE_STAGE(SWITCH, OUT, PRE_LB, 0, >> >>>> >> > >> > "ls_out_pre_lb") \ >> >>>> >> > >> > @@ -2942,6 +2943,12 @@ lsp_is_up(const struct >> >>>> >> > >> > nbrec_logical_switch_port *lsp) >> >>>> >> > >> > return !lsp->up || *lsp->up; >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > +static bool >> >>>> >> > >> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp) >> >>>> >> > >> > +{ >> >>>> >> > >> > + return !strcmp(nbsp->type, "external"); >> >>>> >> > >> > +} >> >>>> >> > >> > + >> >>>> >> > >> > static bool >> >>>> >> > >> > build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip, >> >>>> >> > >> > struct ds *options_action, struct ds >> >>>> >> > >> > *response_action, >> >>>> >> > >> > @@ -4185,7 +4192,7 @@ build_lswitch_flows(struct hmap >> >>>> >> > >> > *datapaths, struct hmap *ports, >> >>>> >> > >> > * - port type is localport >> >>>> >> > >> > */ >> >>>> >> > >> > if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, >> >>>> >> > >> > "router") && >> >>>> >> > >> > - strcmp(op->nbsp->type, "localport")) { >> >>>> >> > >> > + strcmp(op->nbsp->type, "localport") && >> >>>> >> > >> > lsp_is_external(op->nbsp)) { >> >>>> >> > >> >> >>>> >> > >> Sorry that I missed this in last review. The && condition has >> >>>> >> > >> problem. >> >>>> >> > >> It will cause ARP responder flows added for all lports that are >> >>>> >> > >> not >> >>>> >> > >> external. I think it should be || here. >> >>>> >> > > >> >>>> >> > > >> >>>> >> > > Agree. To make it easier to read, I will add a new "if" with >> >>>> >> > > continue - below this one for >> >>>> >> > > external port types. 
>> >>>> >> > > >> >>>> >> > > >> >>>> >> > >> >> >>>> >> > >> >> >>>> >> > >> > continue; >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > @@ -4297,6 +4304,13 @@ build_lswitch_flows(struct hmap >> >>>> >> > >> > *datapaths, struct hmap *ports, >> >>>> >> > >> > continue; >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > + bool is_external = lsp_is_external(op->nbsp); >> >>>> >> > >> > + if (is_external && !op->od->localnet_port) { >> >>>> >> > >> > + /* If it's an external port and there is no >> >>>> >> > >> > localnet port >> >>>> >> > >> > + * ignore it. */ >> >>>> >> > >> > + continue; >> >>>> >> > >> > + } >> >>>> >> > >> > + >> >>>> >> > >> > for (size_t i = 0; i < op->n_lsp_addrs; i++) { >> >>>> >> > >> > for (size_t j = 0; j < >> >>>> >> > >> > op->lsp_addrs[i].n_ipv4_addrs; j++) { >> >>>> >> > >> > struct ds options_action = >> >>>> >> > >> > DS_EMPTY_INITIALIZER; >> >>>> >> > >> > @@ -4309,8 +4323,8 @@ build_lswitch_flows(struct hmap >> >>>> >> > >> > *datapaths, struct hmap *ports, >> >>>> >> > >> > ds_put_format( >> >>>> >> > >> > &match, "inport == %s && eth.src == >> >>>> >> > >> > %s && " >> >>>> >> > >> > "ip4.src == 0.0.0.0 && ip4.dst == >> >>>> >> > >> > 255.255.255.255 && " >> >>>> >> > >> > - "udp.src == 68 && udp.dst == 67", >> >>>> >> > >> > op->json_key, >> >>>> >> > >> > - op->lsp_addrs[i].ea_s); >> >>>> >> > >> > + "udp.src == 68 && udp.dst == 67", >> >>>> >> > >> > + op->json_key, op->lsp_addrs[i].ea_s); >> >>>> >> > >> >> >>>> >> > >> No change here? >> >>>> >> > > >> >>>> >> > > >> >>>> >> > > I think it's unwanted and unrelated change. I will correct it. 
>> >>>> >> > >> >> >>>> >> > >> > >> >>>> >> > >> > ovn_lflow_add(lflows, op->od, >> >>>> >> > >> > S_SWITCH_IN_DHCP_OPTIONS, >> >>>> >> > >> > 100, ds_cstr(&match), >> >>>> >> > >> > @@ -4415,7 +4429,9 @@ build_lswitch_flows(struct hmap >> >>>> >> > >> > *datapaths, struct hmap *ports, >> >>>> >> > >> > /* Ingress table 12 and 13: DHCP options and response, >> >>>> >> > >> > by default goto >> >>>> >> > >> > * next. (priority 0). >> >>>> >> > >> > * Ingress table 14 and 15: DNS lookup and response, by >> >>>> >> > >> > default goto next. >> >>>> >> > >> > - * (priority 0).*/ >> >>>> >> > >> > + * (priority 0). >> >>>> >> > >> > + * Ingress table 16 - External port handling, by default >> >>>> >> > >> > goto next. >> >>>> >> > >> > + * (priority 0). */ >> >>>> >> > >> > >> >>>> >> > >> > HMAP_FOR_EACH (od, key_node, datapaths) { >> >>>> >> > >> > if (!od->nbs) { >> >>>> >> > >> > @@ -4426,9 +4442,58 @@ build_lswitch_flows(struct hmap >> >>>> >> > >> > *datapaths, struct hmap *ports, >> >>>> >> > >> > ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, >> >>>> >> > >> > 0, "1", "next;"); >> >>>> >> > >> > ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, >> >>>> >> > >> > "1", "next;"); >> >>>> >> > >> > ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, >> >>>> >> > >> > 0, "1", "next;"); >> >>>> >> > >> > + ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, >> >>>> >> > >> > 0, "1", "next;"); >> >>>> >> > >> > } >> >>>> >> > >> > >> >>>> >> > >> > - /* Ingress table 16: Destination lookup, broadcast and >> >>>> >> > >> > multicast handling >> >>>> >> > >> > + HMAP_FOR_EACH (op, key_node, ports) { >> >>>> >> > >> > + if (!op->nbsp || !lsp_is_external(op->nbsp)) { >> >>>> >> > >> > + continue; >> >>>> >> > >> > + } >> >>>> >> > >> > + >> >>>> >> > >> > + /* Table 16: External port. Drop ARP request for >> >>>> >> > >> > router ips from >> >>>> >> > >> > + * external ports on chassis not binding those >> >>>> >> > >> > ports. 
>> >>>> >> > >> > +         * This makes the router pipeline to be run only on the chassis
>> >>>> >> > >> > +         * binding the external ports. */
>> >>>> >> > >> > +
>> >>>> >> > >> > +        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
>> >>>> >> > >> > +            for (size_t j = 0; j < op->od->n_router_ports; j++) {
>> >>>> >> > >> > +                struct ovn_port *rp = op->od->router_ports[j];
>> >>>> >> > >> > +                for (size_t k = 0; k < rp->n_lsp_addrs; k++) {
>> >>>> >> > >> > +                    for (size_t l = 0; l < rp->lsp_addrs[k].n_ipv4_addrs;
>> >>>> >> > >> > +                         l++) {
>> >>>> >> > >> > +                        ds_clear(&match);
>> >>>> >> > >> > +                        ds_put_cstr(&match, "ip4");
>> >>>> >> > >> > +                        ds_put_format(
>> >>>> >> > >> > +                            &match, "inport == %s && eth.src == %s"
>> >>>> >> > >> > +                            " && !is_chassis_resident(%s)"
>> >>>> >> > >> > +                            " && arp.tpa == %s && arp.op == 1",
>> >>>> >> > >> > +                            op->json_key, op->lsp_addrs[i].ea_s, op->json_key,
>> >>>> >> > >>
>> >>>> >> > >> I believe the inport should match the localnet port's json_key here,
>> >>>> >> > >> since it is coming from a localnet port.
>> >>>> >> > >
>> >>>> >> > > Both would work. If you see the code in lflow.c in this patch - it will get
>> >>>> >> > > the tunnel key of the localnet port if the port_binding type is "external".
>> >>>> >> > >
>> >>>> >> > > That's how even the DHCP requests are handled. ovn-controller will translate
>> >>>> >> > > the logical flows with action "put_dhcp_opts" only on the chassis claiming
>> >>>> >> > > the external ports.
>> >>>> >> >
>> >>>> >> > Oh, yes you are right. Actually I read that part in v4 and it somehow
>> >>>> >> > slipped my mind. Thanks for explaining.
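[Editor's note] The substitution described above — ovn-controller resolving a reference to an "external" port into the localnet port's tunnel key when installing flows — can be illustrated with a minimal sketch. This is not the actual lflow.c code; the struct and function names here are hypothetical simplifications of the southbound Port_Binding data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a port binding.  Real OVN keeps this
 * state in the southbound Port_Binding table. */
struct port_binding {
    const char *logical_port;
    const char *type;          /* "", "localnet", "external", ... */
    uint32_t tunnel_key;
};

/* Sketch of the conversion discussed above: when a logical flow
 * references an "external" port as inport, resolve it to the datapath's
 * localnet port key, because the packets actually arrive via the
 * localnet port. */
static uint32_t
resolve_inport_key(const struct port_binding *pb,
                   const struct port_binding *localnet_pb)
{
    if (!strcmp(pb->type, "external") && localnet_pb) {
        return localnet_pb->tunnel_key;
    }
    return pb->tunnel_key;
}
```

As the thread notes, this widening is exactly what makes the ovn-controller approach risky for ACLs: every reference to the external port becomes a reference to the shared localnet port.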
>> >>>> >>
>> >>>> >> I thought about it a second time, and I'd suggest doing the conversion here
>> >>>> >> in northd instead of ovn-controller, for two reasons:
>> >>>> >>
>> >>>> >> 1. In ovn-controller there is no extra context, so it just blindly
>> >>>> >> translates all references to an external logical port into the localnet
>> >>>> >> port key. This could lead to unexpected behavior. For example, if someone
>> >>>> >> uses the external logical port in an ACL match condition, the match
>> >>>> >> condition would then apply to all packets to/from the localnet port, which
>> >>>> >> is definitely unwanted. (At the same time, it would be better to document
>> >>>> >> that features like port security and ACLs should not be used for external
>> >>>> >> logical ports.)
>> >>>> >>
>> >>>> >
>> >>>> > That's not how it works in the present patch. Let's say you have 2 chassis,
>> >>>> > hv1 and hv2, an external port sw0-ext1, and a localnet port "ln-public".
>> >>>> > Suppose the requested-chassis is set to hv1; then all the logical flows
>> >>>> > with the match "inport == sw0-ext1" will be converted to OF flows only on
>> >>>> > hv1, as this port is bound by hv1 and the function 'lookup_port_cb()' would
>> >>>> > return true only on hv1. On hv2, lookup_port_cb() would return false.
>> >>>>
>> >>>> Yes, this is well understood.
>> >>>>
>> >>>> >
>> >>>> > If we want to do the conversion in ovn-northd.c, the match condition would
>> >>>> > have to be "inport == "ln-public" && is_chassis_resident(sw0-ext1) && ...."
>> >>>> > instead of the present one, "inport == sw0-ext1 && ...".
>> >>>>
>> >>>> Yes, this is what I would suggest (see reason below).
>> >>>>
>> >>>> >
>> >>>> > And the ACL match condition would not be an issue because of the above
>> >>>> > mentioned reason, i.e. the ACL flows will be applied only on the chassis
>> >>>> > binding the external port.
>> >>>>
>> >>>> Here is the concern.
For example, chassis A has regular port sw0-lsp1
>> >>>> bound. Chassis A is also set as the requested-chassis for external port
>> >>>> sw0-ext1. And now the user adds an ACL: to-lport, 1001, 'outport ==
>> >>>> "sw0-ext1"', drop. This would get translated to something like:
>> >>>> to-lport, 1001, 'outport == "ln-public"', drop. Wouldn't this impact
>> >>>> the traffic from sw0-lsp1 to the bridged network? Even if it doesn't,
>> >>>> thanks to some subtle detail of the current implementation, I would say
>> >>>> it is risky and could lead to problems under certain conditions, because
>> >>>> the conversion in ovn-controller widens the original intent, whereas
>> >>>> doing it in northd only for specific lflows would ensure it has impact
>> >>>> only on the intended use cases.
>> >>>
>> >>> Thanks for the detailed explanation. I agree. It's clear to me now. I
>> >>> will update accordingly in v6.
>> >>>
>> >>> Regards
>> >>> Numan
>> >>>
>> >>>>
>> >>>> >
>> >>>> > The test case added checks that the OF flows are applied only on the
>> >>>> > bound chassis.
>> >>>> >
>> >>>> > I think it is better to do it in ovn-controller instead of ovn-northd.
>> >>>> > Please let me know if you still have any concerns.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >> 2. A less important reason is that it is better to do it at an earlier
>> >>>> >> stage than a later one. northd handles common processing. This part of
>> >>>> >> the logic is common to all chassis, so it would be better if we
>> >>>> >> explicitly handled it in northd instead of letting every chassis process
>> >>>> >> it. And the change in northd would likely be simpler than in
>> >>>> >> ovn-controller.
>> >>>>
>> >>>> This is a less critical problem, but I think it is worth consideration,
>> >>>> too. With the current logic, although the conversion takes effect only
>> >>>> if "is_chassis_resident()" is true, the code logic and processing still
>> >>>> have to happen on every chassis.
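[Editor's note] The northd-side alternative discussed above can be sketched as a small match builder. This is an illustration, not the eventual v6 code: it builds the suggested match shape "inport == ln-public && is_chassis_resident(sw0-ext1) && ...", where the packet really enters via the localnet port and is_chassis_resident() restricts the flow to the chassis that has claimed the external port. Port names are the example ones from the discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a northd-style match for an external-port flow: match on the
 * localnet port as inport, and gate the flow on the chassis where the
 * external port is resident.  "rest" is the remainder of the original
 * match (e.g. the ARP or DHCP conditions). */
static int
build_external_port_match(char *buf, size_t len,
                          const char *localnet_port,
                          const char *external_port,
                          const char *rest)
{
    return snprintf(buf, len,
                    "inport == \"%s\" && is_chassis_resident(\"%s\") && %s",
                    localnet_port, external_port, rest);
}
```

Because the is_chassis_resident() gate is written into the lflow itself, only the flows northd intends to redirect are widened to the localnet port, while an ACL on 'outport == "sw0-ext1"' keeps its original, narrow meaning.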
>> >>>>
>> >>>> >> Thanks,
>> >>>> >> Han
>> >
>> >
>> >
>> > --
>> > Miguel Ángel Ajo
>> > OSP / Networking DFG, OVN Squad Engineering

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
