On Sun, Nov 13, 2016 at 11:15 PM, Gurucharan Shetty <[email protected]> wrote:
> When multiple gateway routers exist, a packet can > enter any gateway router. Once the packet reaches its > destination, its reverse direction should be via the > same gateway router. This is achieved by doing a SNAT > of the packet in the inward direction (towards logical space) > with a IP address of the gateway router such that packet travels back > to the same gateway router. > > To do the above, we introduce two new options in the logical router. > > options:dnat_force_snat_ip=$IP will force SNAT any packet to $IP if > it has been previously DNATted. > > options:lb_force_snat_ip=$IP will force SNAT any packet to $IP if > it has been previously load-balanced. > Overall it looks very good. Three comments inline. Signed-off-by: Gurucharan Shetty <[email protected]> > --- > v2->v3: > Adds a unit test for load-balancing and multiple gateway routers. > --- > ovn/lib/logical-fields.c | 4 + > ovn/lib/logical-fields.h | 5 + > ovn/northd/ovn-northd.8.xml | 28 +++- > ovn/northd/ovn-northd.c | 98 ++++++++++- > ovn/ovn-nb.xml | 25 +++ > tests/system-ovn.at | 386 ++++++++++++++++++++++++++++++ > ++++++++++++++ > 6 files changed, 543 insertions(+), 3 deletions(-) > > diff --git a/ovn/lib/logical-fields.c b/ovn/lib/logical-fields.c > index d4578c3..ef5188a 100644 > --- a/ovn/lib/logical-fields.c > +++ b/ovn/lib/logical-fields.c > @@ -88,6 +88,10 @@ ovn_init_symtab(struct shash *symtab) > char flags_str[16]; > snprintf(flags_str, sizeof flags_str, "flags[%d]", > MLF_ALLOW_LOOPBACK_BIT); > expr_symtab_add_subfield(symtab, "flags.loopback", NULL, flags_str); > + snprintf(flags_str, sizeof flags_str, "flags[%d]", > + MLF_FORCE_SNAT_FOR_DNAT_BIT); > + expr_symtab_add_subfield(symtab, "flags.force_snat_for_dnat", NULL, > + flags_str); > > /* Connection tracking state. */ > expr_symtab_add_field(symtab, "ct_mark", MFF_CT_MARK, NULL, false); > diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h > index a1f1da6..1681fff 100644 > --- a/ovn/lib/logical-fields.h > +++ b/ovn/lib/logical-fields.h > @@ -47,6 +47,7 @@ void ovn_init_symtab(struct shash *symtab); > enum mff_log_flags_bits { > MLF_ALLOW_LOOPBACK_BIT = 0, > MLF_RCV_FROM_VXLAN_BIT = 1, > + MLF_FORCE_SNAT_FOR_DNAT_BIT = 2, > You chose to define an explicit bit for the DNAT case, but not for the LB case. If there is only going to be a bit defined for one of the cases, I prefer that the explicit bit be for the LB case. The LB bit will be required for the implementation of centralized load balancing on an otherwise distributed router. > }; > > /* MFF_LOG_FLAGS_REG flag assignments */ > @@ -59,6 +60,10 @@ enum mff_log_flags { > * VXLAN encapsulation. Egress port information is available for > * Geneve and STT tunnel types. */ > MLF_RCV_FROM_VXLAN = (1 << MLF_RCV_FROM_VXLAN_BIT), > + > + /* Indicate that a packet needs a force SNAT in the gateway router > when > + * DNAT has taken place. */ > + MLF_FORCE_SNAT_FOR_DNAT = (1 << MLF_FORCE_SNAT_FOR_DNAT_BIT), > }; > > #endif /* ovn/lib/logical-fields.h */ > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml > index df53d4c..860995b 100644 > --- a/ovn/northd/ovn-northd.8.xml > +++ b/ovn/northd/ovn-northd.8.xml > @@ -1153,6 +1153,14 @@ icmp4 { > </p> > > <p> > + If the Gateway router has been configured to force SNAT (any > + previously DNATted or Load-balanced packets) to <var>B</var>, > + a priority-100 flow matches <code>ip && > + ip4.dst == <var>B</var></code> with an action <code>ct_snat; > + next;</code>. > + </p> > + > + <p> > A priority-0 logical flow with match <code>1</code> has actions > <code>next;</code>. > </p> > @@ -1194,7 +1202,11 @@ icmp4 { > to change the destination IP address of a packet from > <var>A</var> to > <var>B</var>, a priority-100 flow matches <code>ip && > ip4.dst == <var>A</var></code> with an action > - <code>flags.loopback = 1; ct_dnat(<var>B</var>);</code>. > + <code>flags.loopback = 1; ct_dnat(<var>B</var>);</code>. If the > + Gateway router is configured to force SNAT any DNATed packet, > + the above action will be replaced by > + <code>flags.force_snat_for_dnat = 1; flags.loopback = 1; > + ct_dnat(<var>B</var>);</code>. > </li> > > <li> > @@ -1433,6 +1445,20 @@ arp { > <ul> > <li> > <p> > + If the Gateway router in the OVN Northbound database has been > + configured to force SNAT a packet (that has been previously > DNATted) > + to <var>B</var>, a priority-110 flow matches > + <code>flags.force_snat_for_dnat == 1 && ip</code> with > an > + action <code>ct_snat(<var>B</var>);</code>. > + </p> > + <p> > + If the Gateway router in the OVN Northbound database has been > + configured to force SNAT a packet (that has been previously > + load-balanced) to <var>B</var>, a priority-100 flow matches > + <code>ct.dnat && ip</code> with an action > + <code>ct_snat(<var>B</var>);</code>. > + </p> > + <p> > This proposal uses an explicit flag force_snat_for_dnat for the DNAT case, then assumes that all other cases where ct.dnat is set are for load balancing. Typically, target scenarios would have both dnat_force_snat_ip and lb_force_snat_ip specified, or neither specified. However, there is a corner case when lb_force_snat_ip is specified but dnat_force_snat_ip is not specified. In that case, wouldn't this code force SNAT using the lb_force_snat_ip in all cases including DNAT? For each configuration in the OVN Northbound database, that asks > to change the source IP address of a packet from an IP address > of > <var>A</var> or to change the source IP address of a packet that > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > index 437da9f..0d7d6f8 100644 > --- a/ovn/northd/ovn-northd.c > +++ b/ovn/northd/ovn-northd.c > @@ -3452,6 +3452,31 @@ op_put_v6_networks(struct ds *ds, const struct > ovn_port *op) > ds_put_cstr(ds, "}"); > } > > +static const char * > +get_force_snat_ip(struct ovn_datapath *od, const char *key_type, ovs_be32 > *ip) > +{ > + char *key = xasprintf("%s_force_snat_ip", key_type); > + const char *ip_address = smap_get(&od->nbr->options, key); > + free(key); > + > + if (ip_address) { > + ovs_be32 mask; > + char *error = ip_parse_masked(ip_address, ip, &mask); > + if (error || mask != OVS_BE32_MAX) { > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); > + VLOG_WARN_RL(&rl, "bad ip %s in options of router "UUID_FMT"", > + ip_address, UUID_ARGS(&od->key)); > + free(error); > + *ip = 0; > + return NULL; > + } > + return ip_address; > + } > + > + *ip = 0; > + return NULL; > +} > + > static void > build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, > struct hmap *lflows) > @@ -3673,8 +3698,26 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > > sset_destroy(&all_ips); > > - ovs_be32 *snat_ips = xmalloc(sizeof *snat_ips * > op->od->nbr->n_nat); > + /* A gateway router can have 2 SNAT IP addresses to force DNATed > and > + * LBed traffic respectively to be SNATed. In addition, there > can be > + * a number of SNAT rules in the NAT table. */ > + ovs_be32 *snat_ips = xmalloc(sizeof *snat_ips * > + (op->od->nbr->n_nat + 2)); > size_t n_snat_ips = 0; > + > + ovs_be32 snat_ip; > + const char *dnat_force_snat_ip = get_force_snat_ip(op->od, "dnat", > + &snat_ip); > + if (dnat_force_snat_ip) { > + snat_ips[n_snat_ips++] = snat_ip; > + } > + > + const char *lb_force_snat_ip = get_force_snat_ip(op->od, "lb", > + &snat_ip); > + if (lb_force_snat_ip) { > + snat_ips[n_snat_ips++] = snat_ip; > + } > + > for (int i = 0; i < op->od->nbr->n_nat; i++) { > const struct nbrec_nat *nat; > > @@ -3845,6 +3888,12 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > continue; > } > > + ovs_be32 snat_ip; > + const char *dnat_force_snat_ip = get_force_snat_ip(od, "dnat", > + &snat_ip); > + const char *lb_force_snat_ip = get_force_snat_ip(od, "lb", > + &snat_ip); > + > /* A set to hold all ips that need defragmentation and tracking. > */ > struct sset all_ips = SSET_INITIALIZER(&all_ips); > > @@ -3981,7 +4030,13 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > ds_clear(&match); > ds_put_format(&match, "ip && ip4.dst == %s", > nat->external_ip); > ds_clear(&actions); > - ds_put_format(&actions,"flags.loopback = 1; > ct_dnat(%s);", > + if (dnat_force_snat_ip) { > + /* Indicate to the future tables that a DNAT has taken > + * place and a force SNAT needs to be done in the > Egress > + * SNAT table. */ > + ds_put_format(&actions, "flags.force_snat_for_dnat = > 1; "); > + } > + ds_put_format(&actions, "flags.loopback = 1; > ct_dnat(%s);", > nat->logical_ip); > ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100, > ds_cstr(&match), ds_cstr(&actions)); > @@ -4006,6 +4061,45 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > } > } > > + /* Handle force SNAT options set in the gateway router. */ > + if (dnat_force_snat_ip) { > + /* If a packet with destination IP address as that of the > + * gateway router (as set in options:dnat_force_snat_ip) is > seen, > + * UNSNAT it. */ > + ds_clear(&match); > + ds_put_format(&match, "ip && ip4.dst == %s", > dnat_force_snat_ip); > + ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 100, > + ds_cstr(&match), "ct_snat; next;"); > Since the rest of the gateway router NAT rule code is organized by table, IMO it would be better to put the code above in the Ingress UNSNAT table section. + > + /* Higher priority rules to force SNAT with the IP addresses > + * configured in the Gateway router. This only takes effect > + * when the packet has already been DNATed once. */ > + ds_clear(&match); > + ds_put_format(&match, "flags.force_snat_for_dnat == 1 && ip"); > + ds_clear(&actions); > + ds_put_format(&actions, "ct_snat(%s);", dnat_force_snat_ip); > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 110, > + ds_cstr(&match), ds_cstr(&actions)); > + } > + if (lb_force_snat_ip) { > + /* If a packet with destination IP address as that of the > + * gateway router (as set in options:lb_force_snat_ip) is > seen, > + * UNSNAT it. */ > + ds_clear(&match); > + ds_put_format(&match, "ip && ip4.dst == %s", > lb_force_snat_ip); > + ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 100, > + ds_cstr(&match), "ct_snat; next;"); > Since the rest of the gateway router NAT rule code is organized by table, IMO it would be better to put the code above in the Ingress UNSNAT table section. Mickey + > + /* Load balanced traffic (a subset of DNATed traffic) will > have > + * ct.dnat set. Force SNAT it. */ > + ds_clear(&match); > + ds_put_format(&match, "ct.dnat && ip"); > + ds_clear(&actions); > + ds_put_format(&actions, "ct_snat(%s);", lb_force_snat_ip); > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 100, > + ds_cstr(&match), ds_cstr(&actions)); > + } > + > /* Re-circulate every packet through the DNAT zone. > * This helps with three things. > * > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml > index 49fbe00..3e40881 100644 > --- a/ovn/ovn-nb.xml > +++ b/ovn/ovn-nb.xml > @@ -924,6 +924,31 @@ > router. > </p> > </column> > + <column name="options" key="dnat_force_snat_ip"> > + <p> > + If set, indicates the IP address to use to force SNAT a packet > + that has already been DNATed in the gateway router. When > multiple > + gateway routers are configured, a packet can potentially enter > any > + of the gateway router, get DNATted and eventually reach the > logical > + switch port. For the return traffic to go back to the same > gateway > + router (for unDNATing), the packet needs a SNAT in the first > place. > + This can be achieved by setting the above option with a gateway > + specific IP address. > + </p> > + </column> > + <column name="options" key="lb_force_snat_ip"> > + <p> > + If set, indicates the IP address to use to force SNAT a packet > + that has already been load-balanced in the gateway router. When > + multiple gateway routers are configured, a packet can > potentially > + enter any of the gateway router, get DNATted as part of the > load- > + balancing and eventually reach the logical switch port. > + For the return traffic to go back to the same gateway router > (for > + unDNATing), the packet needs a SNAT in the first place. This > can be > + achieved by setting the above option with a gateway specific IP > + address. > + </p> > + </column> > </group> > > <group title="Common Columns"> > <snip> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
