On Fri, Jan 26, 2024 at 8:05 PM Han Zhou <[email protected]> wrote: > > > On Thu, Jan 25, 2024 at 10:54 PM Ales Musil <[email protected]> wrote: > > > > > > > > On Fri, Jan 26, 2024 at 4:07 AM Han Zhou <[email protected]> wrote: > >> > >> > >> > >> On Tue, Jan 23, 2024 at 5:29 AM Ales Musil <[email protected]> wrote: > >> > > >> > > >> > > >> > On Wed, Jan 17, 2024 at 6:48 AM Han Zhou <[email protected]> wrote: > >> >> > >> >> Commit dd527a283cd8 partially supported multiple encap IPs. It > supported > >> >> remote encap IP selection based on the destination VIF's encap_ip > >> >> configuration. This patch adds the support for selecting local encap > IP > >> >> based on the source VIF's encap_ip configuration. > >> >> > >> >> Co-authored-by: Lei Huang <[email protected]> > >> >> Signed-off-by: Lei Huang <[email protected]> > >> >> Signed-off-by: Han Zhou <[email protected]> > >> >> --- > >> > > >> > > >> > Hi Han and Lei, > >> > > >> > thank you for the patch, I have a couple of comments/questions down > below. > >> > >> > >> Thanks Ales. > >> > >> > > >> > > >> >> NEWS | 3 + > >> >> controller/chassis.c | 2 +- > >> >> controller/local_data.c | 2 +- > >> >> controller/local_data.h | 2 +- > >> >> controller/ovn-controller.8.xml | 30 ++++++- > >> >> controller/ovn-controller.c | 49 ++++++++++++ > >> >> controller/physical.c | 134 > ++++++++++++++++++++++---------- > >> >> controller/physical.h | 2 + > >> >> include/ovn/logical-fields.h | 4 +- > >> >> ovn-architecture.7.xml | 18 ++++- > >> >> tests/ovn.at | 51 +++++++++++- > >> >> 11 files changed, 243 insertions(+), 54 deletions(-) > >> >> > >> >> diff --git a/NEWS b/NEWS > >> >> index 5f267b4c64cc..5a3eed608617 100644 > >> >> --- a/NEWS > >> >> +++ b/NEWS > >> >> @@ -14,6 +14,9 @@ Post v23.09.0 > >> >> - ovn-northd-ddlog has been removed. > >> >> - A new LSP option "enable_router_port_acl" has been added to > enable > >> >> conntrack for the router port whose peer is l3dgw_port if set > it true. 
> >> >> + - Support selecting encapsulation IP based on the > source/destination VIF's > >> >> + settting. See ovn-controller(8) 'external_ids:ovn-encap-ip' for > more > >> >> + details. > >> >> > >> >> OVN v23.09.0 - 15 Sep 2023 > >> >> -------------------------- > >> >> diff --git a/controller/chassis.c b/controller/chassis.c > >> >> index a6f13ccc42d5..55f2beb37674 100644 > >> >> --- a/controller/chassis.c > >> >> +++ b/controller/chassis.c > >> >> @@ -61,7 +61,7 @@ struct ovs_chassis_cfg { > >> >> > >> >> /* Set of encap types parsed from the 'ovn-encap-type' > external-id. */ > >> >> struct sset encap_type_set; > >> >> - /* Set of encap IPs parsed from the 'ovn-encap-type' > external-id. */ > >> >> + /* Set of encap IPs parsed from the 'ovn-encap-ip' external-id. > */ > >> >> struct sset encap_ip_set; > >> >> /* Interface type list formatted in the OVN-SB Chassis required > format. */ > >> >> struct ds iface_types; > >> >> diff --git a/controller/local_data.c b/controller/local_data.c > >> >> index a9092783958f..8606414f8728 100644 > >> >> --- a/controller/local_data.c > >> >> +++ b/controller/local_data.c > >> >> @@ -514,7 +514,7 @@ chassis_tunnels_destroy(struct hmap > *chassis_tunnels) > >> >> */ > >> >> struct chassis_tunnel * > >> >> chassis_tunnel_find(const struct hmap *chassis_tunnels, const char > *chassis_id, > >> >> - char *remote_encap_ip, char *local_encap_ip) > >> >> + char *remote_encap_ip, const char > *local_encap_ip) > >> > > >> > > >> > nit: Unrelated change. > >> > >> > >> Ack > > Hi Ales, sorry that I just realized this change is related, which is > because of the const char * array introduced in this patch that stores the > parsed encap_ips, and it makes sense to use const char * since we should > never expect this string to be modified in the function. 
> > >> > >> > > >> > > >> >> { > >> >> /* > >> >> * If the specific encap_ip is given, look for the chassisid_ip > entry, > >> >> diff --git a/controller/local_data.h b/controller/local_data.h > >> >> index bab95bcc3824..ca3905bd20e6 100644 > >> >> --- a/controller/local_data.h > >> >> +++ b/controller/local_data.h > >> >> @@ -150,7 +150,7 @@ bool local_nonvif_data_handle_ovs_iface_changes( > >> >> struct chassis_tunnel *chassis_tunnel_find(const struct hmap > *chassis_tunnels, > >> >> const char *chassis_id, > >> >> char *remote_encap_ip, > >> >> - char *local_encap_ip); > >> >> + const char > *local_encap_ip); > >> > > >> > > >> > Same as above. > >> > >> > >> Ack > >> > >> > > >> > > >> >> > >> >> bool get_chassis_tunnel_ofport(const struct hmap *chassis_tunnels, > >> >> const char *chassis_name, > >> >> diff --git a/controller/ovn-controller.8.xml > b/controller/ovn-controller.8.xml > >> >> index efa65e3fd927..5ebef048d721 100644 > >> >> --- a/controller/ovn-controller.8.xml > >> >> +++ b/controller/ovn-controller.8.xml > >> >> @@ -176,10 +176,32 @@ > >> >> > >> >> <dt><code>external_ids:ovn-encap-ip</code></dt> > >> >> <dd> > >> >> - The IP address that a chassis should use to connect to this > node > >> >> - using encapsulation types specified by > >> >> - <code>external_ids:ovn-encap-type</code>. Multiple > encapsulation IPs > >> >> - may be specified with a comma-separated list. > >> >> + <p> > >> >> + The IP address that a chassis should use to connect to > this node > >> >> + using encapsulation types specified by > >> >> + <code>external_ids:ovn-encap-type</code>. Multiple > encapsulation IPs > >> >> + may be specified with a comma-separated list. > >> >> + </p> > >> >> + <p> > >> >> + In scenarios where multiple encapsulation IPs are > present, distinct > >> >> + tunnels are established for each remote chassis. 
These > tunnels are > >> >> + differentiated by setting unique > <code>options:local_ip</code> and > >> >> + <code>options:remote_ip</code> values in the tunnel > interface. When > >> >> + transmitting a packet to a remote chassis, the selection > of local_ip > >> >> + is guided by the > <code>Interface:external_ids:encap-ip</code> from > >> >> + the local OVSDB, corresponding to the VIF originating the > packet, if > >> >> + specified. The > <code>Interface:external_ids:encap-ip</code> setting > >> >> + of the VIF is also populated to the > <code>Port_Binding</code> > >> >> + table in the OVN SB database via the <code>encap</code> > column. > >> >> + Consequently, when a remote chassis needs to send a > packet to a > >> >> + port-binding associated with this VIF, it utilizes the > tunnel with > >> >> + the appropriate <code>options:remote_ip</code> that > matches the > >> >> + <code>ip</code> in <code>Port_Binding:encap</code>. This > mechanism > >> >> + is particularly beneficial for chassis with multiple > physical > >> >> + interfaces designated for tunneling, where each interface > is > >> >> + optimized for handling specific traffic associated with > particular > >> >> + VIFs. 
> >> >> + </p> > >> >> </dd> > >> >> > >> >> <dt><code>external_ids:ovn-encap-df_default</code></dt> > >> >> diff --git a/controller/ovn-controller.c > b/controller/ovn-controller.c > >> >> index 856e5e270822..4ab57fe970fe 100644 > >> >> --- a/controller/ovn-controller.c > >> >> +++ b/controller/ovn-controller.c > >> >> @@ -4521,6 +4521,31 @@ struct ed_type_pflow_output { > >> >> struct physical_debug debug; > >> >> }; > >> >> > >> >> +static void > >> >> +parse_encap_ips(const struct ovsrec_open_vswitch_table *ovs_table, > >> >> + size_t *n_encap_ips, const char * **encap_ips) > >> >> +{ > >> >> + const struct ovsrec_open_vswitch *cfg = > >> >> + ovsrec_open_vswitch_table_first(ovs_table); > >> >> + const char *chassis_id = get_ovs_chassis_id(ovs_table); > >> >> + const char *encap_ips_str = > >> >> + get_chassis_external_id_value(&cfg->external_ids, > chassis_id, > >> >> + "ovn-encap-ip", NULL); > >> >> + struct sset encap_ip_set; > >> >> + sset_init(&encap_ip_set); > >> >> + sset_from_delimited_string(&encap_ip_set, encap_ips_str, ","); > >> >> + > >> >> + /* Sort the ips so that their index is deterministic. */ > >> >> + *encap_ips = sset_sort(&encap_ip_set); > >> >> + > >> >> + /* Copy the strings so that we can destroy the sset. */ > >> >> + for (size_t i = 0; (*encap_ips)[i]; i++) { > >> >> + (*encap_ips)[i] = xstrdup((*encap_ips)[i]); > >> >> + } > >> >> + *n_encap_ips = sset_count(&encap_ip_set); > >> >> + sset_destroy(&encap_ip_set); > >> >> +} > >> >> + > >> > > >> > > >> > I wonder, could we store this array or maybe preferably svec in > "en_non_vif_data" I-P node? This way it would be handled in the node and we > could avoid the destroy calls after any pflow processing WDYT? > >> > >> > >> Yes we could do the same in en_non_vif_data, but I think it is not > really necessary and it seems more straightforward just parsing it here, > because: > >> 1. We don't need I-P for this ovn-encap-ip configuration change, so we > don't have to persist this data. 
> >> 2. It should be very rare to have more than 5 (or even 3) encap IPs per > node, so the list should be very small and the cost of this parsing (and > sset construct/destroy) is negligible. > >> Using svec is also a valid option, but the sset (and array) is used > here just to reuse the sset_from_delimited_string and sset_sort for > convenience. I noticed that the sset_init() call can in fact be removed > because sset_from_delimited_string already does that. I can remove it. > >> Does this sound reasonable to you? > > > > > > It makes sense, the main thing it would help with is the need to destroy > the ctx in kinda unexpected places which might be slightly error prone. > However, it being functionally the same it's fine either way. > > > > Thanks Ales. So I will add the below small change. Would you give your Ack > if it looks good to you? > ----------------------- > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index 4ab57fe970fe..6873c02288c6 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -4532,7 +4532,6 @@ parse_encap_ips(const struct > ovsrec_open_vswitch_table *ovs_table, > get_chassis_external_id_value(&cfg->external_ids, chassis_id, > "ovn-encap-ip", NULL); > struct sset encap_ip_set; > - sset_init(&encap_ip_set); > sset_from_delimited_string(&encap_ip_set, encap_ips_str, ","); > > /* Sort the ips so that their index is deterministic. */ > @@ -4615,8 +4614,6 @@ static void init_physical_ctx(struct engine_node > *node, > p_ctx->patch_ofports = &non_vif_data->patch_ofports; > p_ctx->chassis_tunnels = &non_vif_data->chassis_tunnels; > > - > - > struct controller_engine_ctx *ctrl_ctx = > engine_get_context()->client_ctx; > p_ctx->if_mgr = ctrl_ctx->if_mgr; > ------------------------- > > Thanks, > Han >
Yeah the diff looks ok. Acked-by: Ales Musil <[email protected]> Thanks, Ales > > > Thanks, > > Ales > > > >> > >> > >> Thanks, > >> Han > >> > >> > > >> >> > >> >> static void init_physical_ctx(struct engine_node *node, > >> >> struct ed_type_runtime_data *rt_data, > >> >> struct ed_type_non_vif_data > *non_vif_data, > >> >> @@ -4572,6 +4597,7 @@ static void init_physical_ctx(struct > engine_node *node, > >> >> engine_get_input_data("ct_zones", node); > >> >> struct simap *ct_zones = &ct_zones_data->current; > >> >> > >> >> + parse_encap_ips(ovs_table, &p_ctx->n_encap_ips, > &p_ctx->encap_ips); > >> >> p_ctx->sbrec_port_binding_by_name = sbrec_port_binding_by_name; > >> >> p_ctx->sbrec_port_binding_by_datapath = > sbrec_port_binding_by_datapath; > >> >> p_ctx->port_binding_table = port_binding_table; > >> >> @@ -4589,12 +4615,23 @@ static void init_physical_ctx(struct > engine_node *node, > >> >> p_ctx->patch_ofports = &non_vif_data->patch_ofports; > >> >> p_ctx->chassis_tunnels = &non_vif_data->chassis_tunnels; > >> >> > >> >> + > >> >> + > >> >> > >> > > >> > nit: Unrelated change. 
> >> > >> > >> Ack > >> > > >> > > >> >> > >> >> struct controller_engine_ctx *ctrl_ctx = > engine_get_context()->client_ctx; > >> >> p_ctx->if_mgr = ctrl_ctx->if_mgr; > >> >> > >> >> pflow_output_get_debug(node, &p_ctx->debug); > >> >> } > >> >> > >> >> +static void > >> >> +destroy_physical_ctx(struct physical_ctx *p_ctx) > >> >> +{ > >> >> + for (size_t i = 0; i < p_ctx->n_encap_ips; i++) { > >> >> + free((char *)(p_ctx->encap_ips[i])); > >> >> + } > >> >> + free(p_ctx->encap_ips); > >> >> +} > >> >> + > >> >> static void * > >> >> en_pflow_output_init(struct engine_node *node OVS_UNUSED, > >> >> struct engine_arg *arg OVS_UNUSED) > >> >> @@ -4631,6 +4668,7 @@ en_pflow_output_run(struct engine_node *node, > void *data) > >> >> struct physical_ctx p_ctx; > >> >> init_physical_ctx(node, rt_data, non_vif_data, &p_ctx); > >> >> physical_run(&p_ctx, pflow_table); > >> >> + destroy_physical_ctx(&p_ctx); > >> >> > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> } > >> >> @@ -4675,6 +4713,7 @@ pflow_output_if_status_mgr_handler(struct > engine_node *node, > >> >> bool removed = > sbrec_port_binding_is_deleted(binding); > >> >> if (!physical_handle_flows_for_lport(binding, > removed, &p_ctx, > >> >> > &pfo->flow_table)) { > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> } > >> >> @@ -4684,11 +4723,13 @@ pflow_output_if_status_mgr_handler(struct > engine_node *node, > >> >> bool removed = sbrec_port_binding_is_deleted(pb); > >> >> if (!physical_handle_flows_for_lport(pb, removed, > &p_ctx, > >> >> &pfo->flow_table)) > { > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> } > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> } > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return true; > >> >> } > >> >> > >> >> @@ -4715,11 +4756,13 @@ pflow_output_sb_port_binding_handler(struct > engine_node *node, > >> >> bool removed = sbrec_port_binding_is_deleted(pb); > >> >> if 
(!physical_handle_flows_for_lport(pb, removed, &p_ctx, > >> >> &pfo->flow_table)) { > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> } > >> >> > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return true; > >> >> } > >> >> > >> >> @@ -4739,6 +4782,7 @@ pflow_output_sb_multicast_group_handler(struct > engine_node *node, void *data) > >> >> physical_handle_mc_group_changes(&p_ctx, &pfo->flow_table); > >> >> > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return true; > >> >> } > >> >> > >> >> @@ -4771,6 +4815,7 @@ pflow_output_runtime_data_handler(struct > engine_node *node, void *data) > >> >> if (tdp->tracked_type != TRACKED_RESOURCE_UPDATED) { > >> >> /* Fall back to full recompute when a local datapath > >> >> * is added or deleted. */ > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> > >> >> @@ -4781,12 +4826,14 @@ pflow_output_runtime_data_handler(struct > engine_node *node, void *data) > >> >> lport->tracked_type == TRACKED_RESOURCE_REMOVED ? 
> true: false; > >> >> if (!physical_handle_flows_for_lport(lport->pb, > removed, &p_ctx, > >> >> &pfo->flow_table)) > { > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> } > >> >> } > >> >> > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return true; > >> >> } > >> >> > >> >> @@ -4843,12 +4890,14 @@ pflow_output_activated_ports_handler(struct > engine_node *node, void *data) > >> >> if (pb) { > >> >> if (!physical_handle_flows_for_lport(pb, false, &p_ctx, > >> >> &pfo->flow_table)) > { > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return false; > >> >> } > >> >> tag_port_as_activated_in_engine(pp); > >> >> } > >> >> } > >> >> engine_set_node_state(node, EN_UPDATED); > >> >> + destroy_physical_ctx(&p_ctx); > >> >> return true; > >> >> } > >> >> > >> >> diff --git a/controller/physical.c b/controller/physical.c > >> >> index e93bfbd2cffb..c192aed751d5 100644 > >> >> --- a/controller/physical.c > >> >> +++ b/controller/physical.c > >> >> @@ -71,6 +71,8 @@ struct tunnel { > >> >> static void > >> >> load_logical_ingress_metadata(const struct sbrec_port_binding > *binding, > >> >> const struct zone_ids *zone_ids, > >> >> + size_t n_encap_ips, > >> >> + const char **encap_ips, > >> >> struct ofpbuf *ofpacts_p); > >> >> static int64_t get_vxlan_port_key(int64_t port_key); > >> >> > >> >> @@ -138,14 +140,14 @@ put_resubmit(uint8_t table_id, struct ofpbuf > *ofpacts) > >> >> */ > >> >> static struct chassis_tunnel * > >> >> get_port_binding_tun(const struct sbrec_encap *remote_encap, > >> >> - const struct sbrec_encap *local_encap, > >> >> const struct sbrec_chassis *chassis, > >> >> - const struct hmap *chassis_tunnels) > >> >> + const struct hmap *chassis_tunnels, > >> >> + const char *local_encap_ip) > >> >> { > >> >> struct chassis_tunnel *tun = NULL; > >> >> tun = chassis_tunnel_find(chassis_tunnels, chassis->name, > >> >> remote_encap ? remote_encap->ip : > NULL, > >> >> - local_encap ? 
local_encap->ip : NULL); > >> >> + local_encap_ip); > >> >> > >> >> if (!tun) { > >> >> tun = chassis_tunnel_find(chassis_tunnels, chassis->name, > NULL, NULL); > >> >> @@ -329,7 +331,8 @@ find_additional_encap_for_chassis(const struct > sbrec_port_binding *pb, > >> >> static struct ovs_list * > >> >> get_remote_tunnels(const struct sbrec_port_binding *binding, > >> >> const struct sbrec_chassis *chassis, > >> >> - const struct hmap *chassis_tunnels) > >> >> + const struct hmap *chassis_tunnels, > >> >> + const char *local_encap_ip) > >> >> { > >> >> const struct chassis_tunnel *tun; > >> >> > >> >> @@ -337,8 +340,8 @@ get_remote_tunnels(const struct > sbrec_port_binding *binding, > >> >> ovs_list_init(tunnels); > >> >> > >> >> if (binding->chassis && binding->chassis != chassis) { > >> >> - tun = get_port_binding_tun(binding->encap, NULL, > binding->chassis, > >> >> - chassis_tunnels); > >> >> + tun = get_port_binding_tun(binding->encap, binding->chassis, > >> >> + chassis_tunnels, local_encap_ip); > >> >> if (!tun) { > >> >> static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, 1); > >> >> VLOG_WARN_RL( > >> >> @@ -358,9 +361,9 @@ get_remote_tunnels(const struct > sbrec_port_binding *binding, > >> >> } > >> >> const struct sbrec_encap *additional_encap; > >> >> additional_encap = > find_additional_encap_for_chassis(binding, chassis); > >> >> - tun = get_port_binding_tun(additional_encap, NULL, > >> >> + tun = get_port_binding_tun(additional_encap, > >> >> binding->additional_chassis[i], > >> >> - chassis_tunnels); > >> >> + chassis_tunnels, local_encap_ip); > >> >> if (!tun) { > >> >> static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, 1); > >> >> VLOG_WARN_RL( > >> >> @@ -384,36 +387,48 @@ put_remote_port_redirect_overlay(const struct > sbrec_port_binding *binding, > >> >> struct ofpbuf *ofpacts_p, > >> >> const struct sbrec_chassis > *chassis, > >> >> const struct hmap *chassis_tunnels, > >> >> + size_t n_encap_ips, > >> >> + const char 
**encap_ips, > >> >> struct ovn_desired_flow_table > *flow_table) > >> >> { > >> >> /* Setup encapsulation */ > >> >> - struct ovs_list *tuns = get_remote_tunnels(binding, chassis, > >> >> - chassis_tunnels); > >> >> - if (!ovs_list_is_empty(tuns)) { > >> >> - bool is_vtep_port = !strcmp(binding->type, "vtep"); > >> >> - /* rewrite MFF_IN_PORT to bypass OpenFlow loopback check > for ARP/ND > >> >> - * responder in L3 networks. */ > >> >> - if (is_vtep_port) { > >> >> - put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16, > ofpacts_p); > >> >> - } > >> >> + for (size_t i = 0; i < n_encap_ips; i++) { > >> >> + struct ofpbuf *ofpacts_clone = ofpbuf_clone(ofpacts_p); > >> >> > >> >> + > >> >> + match_set_reg_masked(match, MFF_LOG_ENCAP_ID - MFF_REG0, i > << 16, > >> >> + (uint32_t) 0xFFFF << 16); > >> >> + struct ovs_list *tuns = get_remote_tunnels(binding, chassis, > >> >> + chassis_tunnels, > >> >> + encap_ips[i]); > >> >> + if (!ovs_list_is_empty(tuns)) { > >> >> + bool is_vtep_port = !strcmp(binding->type, "vtep"); > >> >> + /* rewrite MFF_IN_PORT to bypass OpenFlow loopback > check for ARP/ND > >> >> + * responder in L3 networks. 
*/ > >> >> + if (is_vtep_port) { > >> >> + put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16, > >> >> + ofpacts_clone); > >> >> + } > >> >> > >> >> - struct tunnel *tun; > >> >> - LIST_FOR_EACH (tun, list_node, tuns) { > >> >> - put_encapsulation(mff_ovn_geneve, tun->tun, > >> >> - binding->datapath, port_key, > is_vtep_port, > >> >> - ofpacts_p); > >> >> - ofpact_put_OUTPUT(ofpacts_p)->port = tun->tun->ofport; > >> >> + struct tunnel *tun; > >> >> + LIST_FOR_EACH (tun, list_node, tuns) { > >> >> + put_encapsulation(mff_ovn_geneve, tun->tun, > >> >> + binding->datapath, port_key, > is_vtep_port, > >> >> + ofpacts_clone); > >> >> + ofpact_put_OUTPUT(ofpacts_clone)->port = > tun->tun->ofport; > >> >> + } > >> >> + put_resubmit(OFTABLE_LOCAL_OUTPUT, ofpacts_clone); > >> >> + ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100, > >> >> + binding->header_.uuid.parts[0], match, > >> >> + ofpacts_clone, &binding->header_.uuid); > >> >> } > >> >> - put_resubmit(OFTABLE_LOCAL_OUTPUT, ofpacts_p); > >> >> - ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100, > >> >> - binding->header_.uuid.parts[0], match, > ofpacts_p, > >> >> - &binding->header_.uuid); > >> >> - } > >> >> - struct tunnel *tun_elem; > >> >> - LIST_FOR_EACH_POP (tun_elem, list_node, tuns) { > >> >> - free(tun_elem); > >> >> + struct tunnel *tun_elem; > >> >> + LIST_FOR_EACH_POP (tun_elem, list_node, tuns) { > >> >> + free(tun_elem); > >> >> + } > >> >> + free(tuns); > >> >> + > >> >> + ofpbuf_delete(ofpacts_clone); > >> >> } > >> >> - free(tuns); > >> >> } > >> >> > >> >> static void > >> >> @@ -673,7 +688,8 @@ put_replace_chassis_mac_flows(const struct simap > *ct_zones, > >> >> if (tag) { > >> >> ofpact_put_STRIP_VLAN(ofpacts_p); > >> >> } > >> >> - load_logical_ingress_metadata(localnet_port, &zone_ids, > ofpacts_p); > >> >> + load_logical_ingress_metadata(localnet_port, &zone_ids, 0, > NULL, > >> >> + ofpacts_p); > >> >> replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p); > >> >> replace_mac->mac = 
router_port_mac; > >> >> > >> >> @@ -853,7 +869,7 @@ put_zones_ofpacts(const struct zone_ids > *zone_ids, struct ofpbuf *ofpacts_p) > >> >> { > >> >> if (zone_ids) { > >> >> if (zone_ids->ct) { > >> >> - put_load(zone_ids->ct, MFF_LOG_CT_ZONE, 0, 32, > ofpacts_p); > >> >> + put_load(zone_ids->ct, MFF_LOG_CT_ZONE, 0, 16, > ofpacts_p); > >> >> } > >> >> if (zone_ids->dnat) { > >> >> put_load(zone_ids->dnat, MFF_LOG_DNAT_ZONE, 0, 32, > ofpacts_p); > >> >> @@ -1014,9 +1030,23 @@ put_local_common_flows(uint32_t dp_key, > >> >> } > >> >> } > >> >> > >> >> +static size_t > >> >> +encap_ip_to_id(size_t n_encap_ips, const char **encap_ips, > >> >> + const char *ip) > >> >> +{ > >> >> + for (size_t i = 0; i < n_encap_ips; i++) { > >> >> + if (!strcmp(ip, encap_ips[i])) { > >> >> + return i; > >> >> + } > >> >> + } > >> >> + return 0; > >> >> +} > >> >> + > >> >> static void > >> >> load_logical_ingress_metadata(const struct sbrec_port_binding > *binding, > >> >> const struct zone_ids *zone_ids, > >> >> + size_t n_encap_ips, > >> >> + const char **encap_ips, > >> >> struct ofpbuf *ofpacts_p) > >> >> { > >> >> put_zones_ofpacts(zone_ids, ofpacts_p); > >> >> @@ -1026,6 +1056,14 @@ load_logical_ingress_metadata(const struct > sbrec_port_binding *binding, > >> >> uint32_t port_key = binding->tunnel_key; > >> >> put_load(dp_key, MFF_LOG_DATAPATH, 0, 64, ofpacts_p); > >> >> put_load(port_key, MFF_LOG_INPORT, 0, 32, ofpacts_p); > >> >> + > >> >> + /* Default encap_id 0. 
*/ > >> >> + size_t encap_id = 0; > >> >> + if (encap_ips && binding->encap) { > >> >> + encap_id = encap_ip_to_id(n_encap_ips, encap_ips, > >> >> + binding->encap->ip); > >> >> + } > >> >> + put_load(encap_id, MFF_LOG_ENCAP_ID, 16, 16, ofpacts_p); > >> >> } > >> >> > >> >> static const struct sbrec_port_binding * > >> >> @@ -1070,7 +1108,7 @@ setup_rarp_activation_strategy(const struct > sbrec_port_binding *binding, > >> >> match_set_dl_type(&match, htons(ETH_TYPE_RARP)); > >> >> match_set_in_port(&match, ofport); > >> >> > >> >> - load_logical_ingress_metadata(binding, zone_ids, &ofpacts); > >> >> + load_logical_ingress_metadata(binding, zone_ids, 0, NULL, > &ofpacts); > >> >> > >> >> encode_controller_op(ACTION_OPCODE_ACTIVATION_STRATEGY_RARP, > >> >> NX_CTLR_NO_METER, &ofpacts); > >> >> @@ -1384,7 +1422,7 @@ enforce_tunneling_for_multichassis_ports( > >> >> } > >> >> > >> >> struct ovs_list *tuns = get_remote_tunnels(binding, chassis, > >> >> - chassis_tunnels); > >> >> + chassis_tunnels, > NULL); > >> >> if (ovs_list_is_empty(tuns)) { > >> >> free(tuns); > >> >> return; > >> >> @@ -1446,6 +1484,8 @@ consider_port_binding(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >> >> const struct sbrec_chassis *chassis, > >> >> const struct physical_debug *debug, > >> >> const struct if_status_mgr *if_mgr, > >> >> + size_t n_encap_ips, > >> >> + const char **encap_ips, > >> >> struct ovn_desired_flow_table *flow_table, > >> >> struct ofpbuf *ofpacts_p) > >> >> { > >> >> @@ -1479,9 +1519,10 @@ consider_port_binding(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >> >> ofpact_put_CT_CLEAR(ofpacts_p); > >> >> put_load(0, MFF_LOG_DNAT_ZONE, 0, 32, ofpacts_p); > >> >> put_load(0, MFF_LOG_SNAT_ZONE, 0, 32, ofpacts_p); > >> >> - put_load(0, MFF_LOG_CT_ZONE, 0, 32, ofpacts_p); > >> >> + put_load(0, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p); > >> >> struct zone_ids peer_zones = get_zone_ids(peer, ct_zones); > >> >> - load_logical_ingress_metadata(peer, &peer_zones, 
ofpacts_p); > >> >> + load_logical_ingress_metadata(peer, &peer_zones, > n_encap_ips, > >> >> + encap_ips, ofpacts_p); > >> >> put_load(0, MFF_LOG_FLAGS, 0, 32, ofpacts_p); > >> >> put_load(0, MFF_LOG_OUTPORT, 0, 32, ofpacts_p); > >> >> for (int i = 0; i < MFF_N_LOG_REGS; i++) { > >> >> @@ -1697,7 +1738,8 @@ consider_port_binding(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >> >> * as we're going to remove this with ofpbuf_pull() later. > */ > >> >> uint32_t ofpacts_orig_size = ofpacts_p->size; > >> >> > >> >> - load_logical_ingress_metadata(binding, &zone_ids, > ofpacts_p); > >> >> + load_logical_ingress_metadata(binding, &zone_ids, > n_encap_ips, > >> >> + encap_ips, ofpacts_p); > >> >> > >> >> if (!strcmp(binding->type, "localport")) { > >> >> /* mark the packet as incoming from a localport */ > >> >> @@ -1904,7 +1946,7 @@ consider_port_binding(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >> >> } else { > >> >> put_remote_port_redirect_overlay( > >> >> binding, mff_ovn_geneve, port_key, &match, ofpacts_p, > >> >> - chassis, chassis_tunnels, flow_table); > >> >> + chassis, chassis_tunnels, n_encap_ips, encap_ips, > flow_table); > >> >> } > >> >> out: > >> >> if (ha_ch_ordered) { > >> >> @@ -2102,7 +2144,7 @@ consider_mc_group(struct ovsdb_idl_index > *sbrec_port_binding_by_name, > >> >> > >> >> int zone_id = simap_get(ct_zones, port->logical_port); > >> >> if (zone_id) { > >> >> - put_load(zone_id, MFF_LOG_CT_ZONE, 0, 32, &ofpacts); > >> >> + put_load(zone_id, MFF_LOG_CT_ZONE, 0, 16, &ofpacts); > >> >> } > >> >> > >> >> const char *lport_name = (port->parent_port && > *port->parent_port) ? 
> >> >> @@ -2278,7 +2320,10 @@ physical_eval_port_binding(struct > physical_ctx *p_ctx, > >> >> p_ctx->patch_ofports, > >> >> p_ctx->chassis_tunnels, > >> >> pb, p_ctx->chassis, &p_ctx->debug, > >> >> - p_ctx->if_mgr, flow_table, &ofpacts); > >> >> + p_ctx->if_mgr, > >> >> + p_ctx->n_encap_ips, > >> >> + p_ctx->encap_ips, > >> >> + flow_table, &ofpacts); > >> >> ofpbuf_uninit(&ofpacts); > >> >> } > >> >> > >> >> @@ -2402,7 +2447,10 @@ physical_run(struct physical_ctx *p_ctx, > >> >> p_ctx->patch_ofports, > >> >> p_ctx->chassis_tunnels, binding, > >> >> p_ctx->chassis, &p_ctx->debug, > >> >> - p_ctx->if_mgr, flow_table, &ofpacts); > >> >> + p_ctx->if_mgr, > >> >> + p_ctx->n_encap_ips, > >> >> + p_ctx->encap_ips, > >> >> + flow_table, &ofpacts); > >> >> } > >> >> > >> >> /* Handle output to multicast groups, in tables 40 and 41. */ > >> >> diff --git a/controller/physical.h b/controller/physical.h > >> >> index 1f1ed55efa16..7fe8ee3c18ed 100644 > >> >> --- a/controller/physical.h > >> >> +++ b/controller/physical.h > >> >> @@ -66,6 +66,8 @@ struct physical_ctx { > >> >> struct shash *local_bindings; > >> >> struct simap *patch_ofports; > >> >> struct hmap *chassis_tunnels; > >> >> + size_t n_encap_ips; > >> >> + const char **encap_ips; > >> >> struct physical_debug debug; > >> >> }; > >> >> > >> >> diff --git a/include/ovn/logical-fields.h > b/include/ovn/logical-fields.h > >> >> index 272277ec4fa0..d3455a462a0d 100644 > >> >> --- a/include/ovn/logical-fields.h > >> >> +++ b/include/ovn/logical-fields.h > >> >> @@ -37,7 +37,9 @@ enum ovn_controller_event { > >> >> #define MFF_LOG_SNAT_ZONE MFF_REG12 /* conntrack snat zone for > gateway router > >> >> * (32 bits). */ > >> >> #define MFF_LOG_CT_ZONE MFF_REG13 /* Logical conntrack zone for > lports > >> >> - * (32 bits). */ > >> >> + * (0..15 of the 32 bits). */ > >> >> +#define MFF_LOG_ENCAP_ID MFF_REG13 /* Encap ID for lports > >> >> + * (16..31 of the 32 bits). 
*/ > >> >> #define MFF_LOG_INPORT MFF_REG14 /* Logical input port (32 > bits). */ > >> >> #define MFF_LOG_OUTPORT MFF_REG15 /* Logical output port (32 > bits). */ > >> >> > >> >> diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml > >> >> index 96294fe2b795..bfd8680cedfc 100644 > >> >> --- a/ovn-architecture.7.xml > >> >> +++ b/ovn-architecture.7.xml > >> >> @@ -1247,8 +1247,8 @@ > >> >> chassis. This is initialized to 0 at the beginning of the > logical > >> >> <!-- Keep the following in sync with MFF_LOG_CT_ZONE in > >> >> ovn/lib/logical-fields.h. --> > >> >> - ingress pipeline. OVN stores this in Open vSwitch extension > register > >> >> - number 13. > >> >> + ingress pipeline. OVN stores this in the lower 16 bits of > the Open > >> >> + vSwitch extension register number 13. > >> >> </dd> > >> >> > >> >> <dt>conntrack zone fields for routers</dt> > >> >> @@ -1263,6 +1263,20 @@ > >> >> traffic (for SNATing) in Open vSwitch extension register > number 12. > >> >> </dd> > >> >> > >> >> + <dt>Encap ID for logical ports</dt> > >> >> + <dd> > >> >> + A field that records an ID that indicates which encapsulation > IP should > >> >> + be used when sending packets to a remote chassis, according > to the > >> >> + original input logical port. This is useful when there are > multiple IPs > >> >> + available for encapsulation. The value only has local > significance and is > >> >> + not meaningful between chassis. This is initialized to 0 at > the beginning > >> >> + of the logical > >> >> + <!-- Keep the following in sync with MFF_LOG_ENCAP_ID in > >> >> + ovn/lib/logical-fields.h. --> > >> >> + ingress pipeline. OVN stores this in the higher 16 bits of > the Open > >> >> + vSwitch extension register number 13. 
> >> >> + </dd> > >> >> + > >> >> <dt>logical flow flags</dt> > >> >> <dd> > >> >> The logical flags are intended to handle keeping context > between > >> >> diff --git a/tests/ovn.at b/tests/ovn.at > >> >> index 243fe0b8246c..e7f0c9681f60 100644 > >> >> --- a/tests/ovn.at > >> >> +++ b/tests/ovn.at > >> >> @@ -30335,19 +30335,33 @@ AT_CLEANUP > >> >> > >> >> > >> >> OVN_FOR_EACH_NORTHD([ > >> >> -AT_SETUP([multiple encap ips tunnel creation]) > >> >> +AT_SETUP([multiple encap ips selection based on VIF's encap_ip]) > >> >> +AT_SKIP_IF([test $HAVE_SCAPY = no]) > >> >> ovn_start > >> >> net_add n1 > >> >> > >> >> +ovn-nbctl ls-add ls1 > >> >> + > >> >> # 2 HVs, each with 2 encap-ips. > >> >> +# lspXY is the Yth LSP on hvX, with encap-ip 192.168.0.XY. > >> >> for i in 1 2; do > >> >> sim_add hv$i > >> >> as hv$i > >> >> ovs-vsctl add-br br-phys-$j > >> >> ovn_attach n1 br-phys-$j 192.168.0.${i}1 > >> >> ovs-vsctl set open . > external_ids:ovn-encap-ip=192.168.0.${i}1,192.168.0.${i}2 > >> >> + > >> >> + for j in 1 2; do > >> >> + ovs-vsctl add-port br-int vif$i$j -- set Interface vif$i$j \ > >> >> + external_ids:iface-id=lsp$i$j \ > >> >> + external_ids:encap-ip=192.168.0.$i$j \ > >> >> + options:tx_pcap=hv$i/vif$i$j-tx.pcap > options:rxq_pcap=hv$i/vif$i$j-rx.pcap > >> >> + ovn-nbctl lsp-add ls1 lsp$i$j -- lsp-set-addresses lsp$i$j > "f0:00:00:00:00:$i$j 10.0.0.$i$j" > >> >> + > >> >> + done > >> >> done > >> >> > >> >> +wait_for_ports_up > >> >> check ovn-nbctl --wait=hv sync > >> >> > >> >> check_tunnel_port() { > >> >> @@ -30376,6 +30390,41 @@ check_tunnel_port hv2 br-int > [email protected]%192.168.0.21 > >> >> check_tunnel_port hv2 br-int [email protected]%192.168.0.22 > >> >> check_tunnel_port hv2 br-int [email protected]%192.168.0.22 > >> >> > >> >> +vif_to_hv() { > >> >> + case $1 in dnl ( > >> >> + vif[[1]]?) echo hv1 ;; dnl ( > >> >> + vif[[2]]?) 
echo hv2 ;; dnl ( > >> >> + *) AT_FAIL_IF([:]) ;; > >> >> + esac > >> >> +} > >> >> + > >> >> +# check_packet_tunnel SRC_VIF DST_VIF > >> >> +# Verify that a packet from SRC_VIF to DST_VIF goes through the > corresponding > >> >> +# tunnel that matches the VIFs' encap_ip configurations. > >> >> +check_packet_tunnel() { > >> >> + local src=$1 dst=$2 > >> >> + local src_mac=f0:00:00:00:00:$src > >> >> + local dst_mac=f0:00:00:00:00:$dst > >> >> + local src_ip=10.0.0.$src > >> >> + local dst_ip=10.0.0.$dst > >> >> + local local_encap_ip=192.168.0.$src > >> >> + local remote_encap_ip=192.168.0.$dst > >> >> + local packet=$(fmt_pkt "Ether(dst='${dst_mac}', > src='${src_mac}')/ \ > >> >> + IP(dst='${dst_ip}', src='${src_ip}')/ \ > >> >> + ICMP(type=8)") > >> >> + hv=`vif_to_hv vif$src` > >> >> + as $hv > >> >> + echo "vif$src -> vif$dst should go through tunnel > $local_encap_ip -> $remote_encap_ip" > >> >> + tunnel_ofport=$(ovs-vsctl --bare --column=ofport find interface > options:local_ip=$local_encap_ip options:remote_ip=$remote_encap_ip) > >> >> + AT_CHECK([test $(ovs-appctl ofproto/trace br-int > in_port=vif$src $packet | grep "output\:" | awk -F ':' '{ print $2 }') == > $tunnel_ofport]) > >> >> +} > >> >> + > >> >> +for i in 1 2; do > >> >> + for j in 1 2; do > >> >> + check_packet_tunnel 1$i 2$j > >> >> + done > >> >> +done > >> >> + > >> >> OVN_CLEANUP([hv1],[hv2]) > >> >> AT_CLEANUP > >> >> ]) > >> >> -- > >> >> 2.38.1 > >> >> > >> >> _______________________________________________ > >> >> dev mailing list > >> >> [email protected] > >> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> >> > >> > > >> > Thanks, > >> > Ales > >> > > >> > -- > >> > > >> > Ales Musil > >> > > >> > Senior Software Engineer - OVN Core > >> > > >> > Red Hat EMEA > >> > > >> > [email protected] > > > > > > > > -- > > > > Ales Musil > > > > Senior Software Engineer - OVN Core > > > > Red Hat EMEA > > > > [email protected] > -- Ales Musil Senior Software Engineer - OVN Core 
Red Hat EMEA <https://www.redhat.com> [email protected] <https://red.ht/sig>
