On Fri, Sep 21, 2018 at 11:06 PM Guru Shetty <[email protected]> wrote:
> >
> >
> > I have tried to make sense of this patch series a few times. I think
> > adding increasing complications like this will make gateway code
> > unmaintainable. The whole gateway redirect chassis already makes it
> > un-understandable and now this will mean that no one will be able to
> > understand it as we go forward. [/rant off]
> >
>
> Sorry about the "rant" if it came out as rude - was not my intention. I
> will try to offer something more constructive. When you initially looked at
> the gateway redirect code, how long did it take for you to understand it?
> My main concern is not with your code, but the underlying code. I initially
> reviewed the underlying redirect code. And it took me a good 2 days to
> understand it. Back then, we wanted something working and it looked like
> something that can eventually be simplified. But, we have been adding more
> things to it. And right now, if this patch series is added, it will likely
> get very very hard to understand it as we go forward. Do you reckon that a
> year from now, you will be able to debug a corner case in the code?
>
> If there are others who have understood the "redirect" gateway code easily,
> please do chime in.
>
> My suggestion is to re-look at the redirect gateway design and see whether
> the code can be simplified. Or whether we can write detailed documentation
> about it to make it easy to understand. Only once we are satisfied with it,
> we can attempt to add more stuff to it.
>
>
I always forget the gateway code and the design. (May be because I haven't
worked
on this part of the code much) and I have to refresh my memory everytime.
I agree that we need to have a detailed documentation. May be "a journey of
a packet "
kind of documentation in ovn-architecture would also help.
The proposed solution in this patch series, If I understand it correctly
runs the router pipeline
in the source chassis and then resumes it on the gateway chassis. Recently
I was looking into
SR-IOV kind of scenarios, where the VM traffic by passes the host kernel,
and I think the
proposed solution will not work. In order to support N/S traffic for SR-IOV
instances
we have to find probably another way and that would also complicate the
code further.
However I think it is possible to have one solution to support gateway
traffic for
VLAN tenant networks which covers SR-IOV and the normal case. I think the
router pipeline
should be executed only on the gateway chassis. I will have a discussion
with
Anil on this, this monday and see if it's possible.
But unfortunately, supporting these use cases will add some amount of
complexity :(.
Thanks
Numan
>
> >
> >> ---
> >>
> >> v6->v7:
> >> * Added this patch
> >>
> >>
> >> ovn/controller/physical.c | 60
> >> ++++++++++++++++++++++++++++++++++++++++++---
> >> ovn/northd/ovn-northd.8.xml | 10 ++++++++
> >> ovn/northd/ovn-northd.c | 29 ++++++++++++++++++++++
> >> ovn/ovn-architecture.7.xml | 4 ++-
> >> 4 files changed, 99 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> >> index f269a1d..1f41f59 100644
> >> --- a/ovn/controller/physical.c
> >> +++ b/ovn/controller/physical.c
> >> @@ -190,7 +190,9 @@ get_zone_ids(const struct sbrec_port_binding
> *binding,
> >> static void
> >> put_local_common_flows(uint32_t dp_key, uint32_t port_key,
> >> bool nested_container, const struct zone_ids
> >> *zone_ids,
> >> - struct ofpbuf *ofpacts_p, struct hmap
> *flow_table)
> >> + struct ofpbuf *ofpacts_p, struct hmap
> *flow_table,
> >> + struct local_datapath *ld,
> >> + const struct hmap *local_datapaths)
> >> {
> >> struct match match;
> >>
> >> @@ -221,11 +223,63 @@ put_local_common_flows(uint32_t dp_key, uint32_t
> >> port_key,
> >> }
> >> }
> >>
> >> + struct ofpbuf *clone = NULL;
> >> + clone = ofpbuf_clone(ofpacts_p);
> >> +
> >> /* Resubmit to table 34. */
> >> put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p);
> >> ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, 0,
> >> &match, ofpacts_p);
> >>
> >> + /* For a reply packet from gateway with VLAN switch port as
> >> destination
> >> + * (excluding localnet_port and external VLAN networks), gateway
> >> router
> >> + * will use gateway MAC address as source MAC instead of router
> >> internal
> >> + * port MAC, so that external switches can learn gateway MAC
> address.
> >> + * Here (before packet is given to the port) we replace router
> >> gateway
> >> + * MAC address with router internal port MAC. */
> >> + if (ld->localnet_port && (port_key !=
> >> ld->localnet_port->tunnel_key)) {
> >> + for (int i = 0; i < ld->n_peer_dps; i++) {
> >> + struct local_datapath *peer_ldp = get_local_datapath(
> >> + local_datapaths, ld->peer_dps[i]->peer_dp->tunnel_key);
> >> + const struct sbrec_port_binding *crp;
> >> + crp = peer_ldp->chassisredirect_port;
> >> + if (!crp) {
> >> + continue;
> >> + }
> >> +
> >> + if (strcmp(smap_get(&crp->options, "distributed-port"),
> >> + ld->peer_dps[i]->peer->logical_port) &&
> >> + (port_key != ld->peer_dps[i]->patch->tunnel_key)) {
> >> + for (int j = 0; j < crp->n_mac; j++) {
> >> + struct lport_addresses laddrs;
> >> + if (!extract_lsp_addresses(crp->mac[j], &laddrs)) {
> >> + continue;
> >> + }
> >> + match_set_dl_src(&match, laddrs.ea);
> >> + destroy_lport_addresses(&laddrs);
> >> + break;
> >> + }
> >> + for (int j = 0; j < ld->peer_dps[i]->peer->n_mac; j++)
> {
> >> + struct lport_addresses laddrs;
> >> + uint64_t mac64;
> >> + if (!extract_lsp_addresses(
> >> + ld->peer_dps[i]->peer->mac[j], &laddrs)) {
> >> + continue;
> >> + }
> >> + mac64 = eth_addr_to_uint64(laddrs.ea);
> >> + put_load(mac64,
> >> + MFF_ETH_SRC, 0, 48, clone);
> >> + destroy_lport_addresses(&laddrs);
> >> + break;
> >> + }
> >> + put_resubmit(OFTABLE_CHECK_LOOPBACK, clone);
> >> + ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 150,
> 0,
> >> + &match, clone);
> >> + }
> >> + }
> >> + }
> >> + ofpbuf_delete(clone);
> >> +
> >> /* Table 34, Priority 100.
> >> * =======================
> >> *
> >> @@ -330,7 +384,7 @@ consider_port_binding(struct ovsdb_idl_index
> >> *sbrec_chassis_by_name,
> >>
> >> struct zone_ids binding_zones = get_zone_ids(binding,
> ct_zones);
> >> put_local_common_flows(dp_key, port_key, false, &binding_zones,
> >> - ofpacts_p, flow_table);
> >> + ofpacts_p, flow_table, ld,
> >> local_datapaths);
> >>
> >> match_init_catchall(&match);
> >> ofpbuf_clear(ofpacts_p);
> >> @@ -531,7 +585,7 @@ consider_port_binding(struct ovsdb_idl_index
> >> *sbrec_chassis_by_name,
> >>
> >> struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
> >> put_local_common_flows(dp_key, port_key, nested_container,
> >> &zone_ids,
> >> - ofpacts_p, flow_table);
> >> + ofpacts_p, flow_table, ld,
> >> local_datapaths);
> >>
> >> /* Table 0, Priority 200, 150 and 100.
> >> * ==============================
> >> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> >> index 8fa5272..876c121 100644
> >> --- a/ovn/northd/ovn-northd.8.xml
> >> +++ b/ovn/northd/ovn-northd.8.xml
> >> @@ -2013,6 +2013,16 @@ next;
> >> </li>
> >>
> >> <li>
> >> + A priority-100 logical flow with match
> >> + <code>inport == <var>GW</var> &&
> >> + flags.rcv_from_vlan == 1</code> has actions
> >> + <code>eth.dst = <var>E</var>; next;</code>, where
> >> + <var>GW</var> is the logical router distributed gateway
> >> + port and <var>E</var> is the MAC address of router
> >> + distributed gateway port.
> >> + </li>
> >> +
> >> + <li>
> >> For each NAT rule in the OVN Northbound database that can
> >> be handled in a distributed manner, a priority-100 logical
> >> flow with match <code>ip4.src == <var>B</var> &&
> >> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> >> index bcf0b66..d012bb8 100644
> >> --- a/ovn/northd/ovn-northd.c
> >> +++ b/ovn/northd/ovn-northd.c
> >> @@ -4419,6 +4419,15 @@ add_route(struct hmap *lflows, const struct
> >> ovn_port *op,
> >> } else {
> >> ds_put_format(&actions, "ip%s.dst", is_ipv4 ? "4" : "6");
> >> }
> >> +
> >> + if (op->peer && op->peer->od->localnet_port &&
> >> + op->od->l3dgw_port && op->od->l3redirect_port &&
> >> + (op != op->od->l3redirect_port) &&
> >> + (op != op->od->l3dgw_port)) {
> >> + ds_put_format(&match, " && is_chassis_resident(%s)",
> >> + op->od->l3redirect_port->json_key);
> >> + ds_put_format(&actions, "; flags.rcv_from_vlan = 1");
> >> + }
> >> ds_put_format(&actions, "; "
> >> "%sreg1 = %s; "
> >> "eth.src = %s; "
> >> @@ -6131,6 +6140,26 @@ build_lrouter_flows(struct hmap *datapaths,
> struct
> >> hmap *ports,
> >> op->lrp_networks.ipv6_addrs[i].network_s,
> >> op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
> >> }
> >> +
> >> + /* For a reply packet from gateway with VLAN switch port as
> >> + * destination, replace router internal port MAC with router
> >> gateway
> >> + * MAC address, so that external switches can learn gateway MAC
> >> + * address. Later before delivering the packet to the port,
> >> + * controller will replace the gateway MAC with router internal
> >> port
> >> + * MAC in table 33. */
> >> + if (op->od->l3dgw_port && (op == op->od->l3dgw_port) &&
> >> + op->od->l3redirect_port) {
> >> + ds_clear(&actions);
> >> + ds_clear(&match);
> >> + ds_put_format(&match, "inport == %s", op->json_key);
> >> + ds_put_format(&match, " && flags.rcv_from_vlan == 1");
> >> + ds_put_format(&match, " && is_chassis_resident(%s)",
> >> + op->od->l3redirect_port->json_key);
> >> + ds_put_format(&actions,
> >> + "eth.src = %s; next;",
> op->lrp_networks.ea_s);
> >> + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_GW_REDIRECT, 100,
> >> + ds_cstr(&match), ds_cstr(&actions));
> >> + }
> >> }
> >>
> >> /* Convert the static routes to flows. */
> >> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> >> index ad2101c..0de41d2 100644
> >> --- a/ovn/ovn-architecture.7.xml
> >> +++ b/ovn/ovn-architecture.7.xml
> >> @@ -1067,7 +1067,9 @@
> >>
> >> <p>
> >> Flows in table 33 resemble those in table 32 but for logical
> >> ports that
> >> - reside locally rather than remotely. For unicast logical
> output
> >> ports
> >> + reside locally rather than remotely. If these are VLAN ports
> and
> >> + packet has router gateway port MAC address as source, replace
> it
> >> with
> >> + router internal port MAC address. For unicast logical output
> >> ports
> >> on the local hypervisor, the actions just resubmit to table 34.
> >> For
> >> multicast output ports that include one or more logical ports
> on
> >> the
> >> local hypervisor, for each such logical port <var>P</var>, the
> >> actions
> >> --
> >> 1.8.3.1
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev