On Sat, Nov 3, 2018 at 5:55 PM Numan Siddique <[email protected]> wrote:
> Hi Ankur, > Thanks for the review and for the comments. Please see below > > Thanks > Numan > > > On Fri, Nov 2, 2018 at 6:21 AM Ankur Sharma <[email protected]> > wrote: > >> Hi Numan, Mark, >> >> Thanks for the patch. >> Description explains the problem statement and solution really well. >> >> I have following comments: >> >> a. Regarding the solution: >> i. I think referring to a non connected port in a logical datapath >> pipeline is probably not the right way. >> i.e as per the description, in ls0 pipeline we are referring to >> lr0-public. >> lr0-public is not a peer interface of ls0-lr0 patch port pair, >> hence ls0 pipeline should be totally agnostic to it (just my 2 cents). >> >> table=16(ls_in_l2_lkup), priority=50, match=(eth.dst == >> 00:00:00:00:af:12 && is_chassis_resident("cr-lr0-public")), >> action=(outport = "sw0-lr0"; output;) >> > > If we want the packet to enter the router pipeline on the chassis which is > hosting the gateway port, then I couldn't find any otherway. > > >> >> b. Nit: Patch 2 in this series does not look related to patch 1 (unless >> I am missing something), >> if current then we should have a separate patch for it. >> > > Agree. But since both the patches are related to VLAN I thought I would > club it. > >> >> c. Nit: Any new config that we are adding, we should add corresponding >> ovn-nbctl CLI. >> This way non openstack deployments can also play with the >> feature 😊 >> > > We could add a new command in ovn-nbctl - "ovn-nbctl lrp-set-options". > Generally I tend to use the > generic DB commands - set/add/remove/clear/destroy. In this case we can use > "ovn-nbctl set logical_router_port <LRP_NAME> > options:reside-on-redirect-chassis=true". > >> >> =================================================== >> >> Regarding the solution: >> ------------------------------ >> >> As I understand, we are trying to achieve following here: >> a. For a distributed virtual router, we want some router interface to be >> fully centralized, rather than distributed. >> i.e instead of creating 2 versions of a router port (lrp-* and >> cr-lrp*), one distributed and one centralized, >> We want the lrp-* itself to be centralized. >> i.e in your case, from vlan backed logical switch, you want to enter >> router pipeline only on the gateway chassis and NOT on the source chassis. >> >> b. Just wanted to propose an alternate approach here, which we >> implemented for a slightly different use case. >> Looks like this alternate approach would help in your scenario as >> well. >> >> Alternate Approach: >> -------------------------- >> a. Convert the port pair between logical switch and logical router to be >> of type "l3gateway", rather than "patch". >> b. i.e in your configuration, lr0-sw0 and sw0-lr0 should be implemented >> as type l3gateway rather than patch. >> c. In other words, we are simulating a centralized router (for specific >> peer logical switches) in a distributed router. >> We will be discussing this approach in OVS Conf as well: >> > > I actually thought about this approach. I remember Goushai Li had > submitted a patch to support multiple gateway ports. > https://patchwork.ozlabs.org/patch/884351/. Unfortunately I didn't review > it when it was submitted (although I had promised to > look into it). I somehow missed it. When I re looked it, I thought that > may be it's not the right approach since the VLAN tenant > networks are internal tenant networks and they are not externally > reachable networks and semantically doesn't seem right > to set type as "l3gateway" when these VLAN networks don't provide external > connectivity. If multiple VLAN networks are added > to a logical router with one providing external connectivity, I am not > sure how NATting would be handled there. > > > > > >> https://ovsfall2018.sched.com/event/IO9w/connectivity-for-external-networks-on-the-overlay?iframe=no&w=100%&sidebar=yes&bg=no >> >> =================================================== >> >> Please feel free to point, if I missed something here. >> Please feel free to comment on the proposed alternative. >> >> Thanks >> >> Regards, >> Ankur >> >> >> -----Original Message----- >> From: [email protected] <[email protected]> >> On Behalf Of Numan Siddique >> Sent: Monday, October 15, 2018 2:38 AM >> To: Mark Michelson <[email protected]> >> Cc: ovs dev <[email protected]> >> Subject: Re: [ovs-dev] [PATCH 1/2] ovn: Avoid tunneling for VLAN packets >> redirected to a gateway chassis >> >> On Sat, Oct 13, 2018 at 1:26 AM Mark Michelson <mailto: >> [email protected]> wrote: >> >> > Hi Numan, >> > >> > The patch does a good job of explaining the routing behavior and the >> > tunneling problem solved within. >> > >> > Prior to the patch, you can have a distributed gateway router with a >> > redirect-chassis port set on it. This allows for east-west traffic to >> > have an optimal direct path between hypervisors, but for the >> > north-south use case, when the traffic is redirected to the >> > redirect-chassis, the traffic is encapsulated. >> > >> > With this patch, you add the reside-on-redirect-chassis option to >> > router ports. This essentially makes all traffic destined for the >> > router port get redirected to the gateway chassis prior to running the >> > router pipeline. This removes the encapsulation issue, but it also >> > means that east-west traffic is now also centralized. >> > >> >> That's right. That's the trade off to solve this issue for VLAN tenant >> networks. >> >> >> > >> > I'm curious what the current behavior is when you specify a gateway >> > router by setting options:chassis. Specifically, I'm curious about how >> > it compares if you define a router where the "external" port has >> > options:redirect-chassis set on it and all other ports have >> > options:reside-on-redirect-chassis set on them. Have you essentially >> > just created the same thing? Or is there some subtle difference? >> > >> >> There is a difference. In the case of gateway router (options:chassis >> set) scenario, it is expected that the tenant VLAN logical switches will be >> connected to a normal router and this normal router will be connected to >> the gateway router via a transit switch. >> So the east west traffic will be distributed, but for the North/South >> traffic, the packet on the source chassis enters logical switch pipeline -> >> normal router pipeline -> transit switch pipeline. And then the packet is >> sent to the chassis hosting the gateway router via the tunnel port. On the >> gateway chassis, packet enters the transit switch pipeline -> gateway >> router pipeline -> provider network pipeline. >> >> Thanks >> Numan >> >> >> >> >> >> >> > On 10/05/2018 01:14 PM, mailto:[email protected] wrote: >> > > From: Numan Siddique <mailto:[email protected]> >> > > >> > > An OVN deployment can have multiple logical switches each with a >> > > localnet port connected to a distributed logical router with one >> > > logical router port providing external connectivity (provider >> > > network) and others used as tenant networks with VLAN tagging. >> > > >> > > As reported in [1], external traffic from these VLAN tenant networks >> > > are tunnelled to the gateway chassis (chassis hosting a distributed >> > > gateway port which applies NAT rules). As part of the discussion in >> > > [1], there were few possible solutions proposed by Russell [2]. This >> > > patch implements the first option in [2]. >> > > >> > > With this patch, a new option 'reside-on-redirect-chassis' in >> 'options' >> > > column of Logical_Router_Port table is added. If the value of this >> > > option is set to 'true' and if the logical router also have a >> > > distributed gateway port, then routing for this logical router port >> > > is centralized in the chassis hosting the distributed gateway port. >> > > >> > > If a logical switch 'sw0' is connected to a router 'lr0' with the >> > > router port - 'lr0-sw0' with the address - "00:00:00:00:af:12 >> > 192.168.1.1" >> > > , and it has a distributed logical port - 'lr0-public', then the >> > > below logical flow is added in the logical switch pipeline of 'sw0' >> > > if the 'reside-on-redirect-chassis' option is set on 'lr-sw0' - >> > > >> > > table=16(ls_in_l2_lkup), priority=50, match=(eth.dst == >> > > 00:00:00:00:af:12 && >> > is_chassis_resident("cr-lr0-public")), >> > > action=(outport = "sw0-lr0"; output;) >> > > >> > > With the above flow, the packet doesn't enter the router pipeline in >> > > the source chassis. Instead the packet is sent out via the localnet >> > > port of 'sw0'. The gateway chassis upon receiving this packet, runs >> > > the logical router pipeline applying NAT rules and sends the traffic >> > > out via the localnet port of the provider network. The gateway >> > > chassis will also reply to the ARP requests for the router port IPs. >> > > >> > > With this approach, we avoid redirecting the external traffic to the >> > > gateway chassis via the tunnel port. There are a couple of drawbacks >> > > with this approach: >> > > >> > > - East - West routing is no more distributed for the VLAN tenant >> > > networks if 'reside-on-redirect-chassis' option is defined >> > > >> > > - 'dnat_and_snat' NAT rules with 'logical_mac' and 'logical_port' >> > > columns defined will not work for the VLAN tenant networks. >> > > >> > > This approach is taken for now as it is simple. If there is a >> > > requirement to support distributed routing for these VLAN tenant >> > > networks, we can explore other possible solutions. >> > > >> > > [1] - >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch. >> > org_pipermail_ovs-2Ddiscuss_2018-2DApril_046543.html&d=DwICAg&c=s883Gp >> > UCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=mE2qy >> > wxdjIadgBjj3-Xsjbt7jiXYD543pSHvwWZn5sg&s=TupRua9WlqBQG00wHQofyUYjEPymD >> > aCMGNnZisJ-KyY&e= >> > > [2] - >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch. >> > org_pipermail_ovs-2Ddiscuss_2018-2DApril_046557.html&d=DwICAg&c=s883Gp >> > UCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=mE2qy >> > wxdjIadgBjj3-Xsjbt7jiXYD543pSHvwWZn5sg&s=ufwXW9yvvqyU0Uc4YG3VaNekB5ieu >> > 5EpBrRGcK_j0-k&e= >> > > >> > > Reported-at: >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch. >> > org_pipermail_ovs-2Ddiscuss_2018-2DApril_046543.html&d=DwICAg&c=s883Gp >> > UCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=mE2qy >> > wxdjIadgBjj3-Xsjbt7jiXYD543pSHvwWZn5sg&s=TupRua9WlqBQG00wHQofyUYjEPymD >> > aCMGNnZisJ-KyY&e= >> > > Reported-by: venkata anil <mailto:[email protected]> >> > > Co-authored-by: venkata anil <mailto:[email protected]> >> > > Signed-off-by: Numan Siddique <mailto:[email protected]> >> > > Signed-off-by: venkata anil <mailto:[email protected]> >> > > --- >> > > ovn/northd/ovn-northd.8.xml | 30 ++++ >> > > ovn/northd/ovn-northd.c | 71 +++++++--- >> > > ovn/ovn-architecture.7.xml | 160 +++++++++++++++++++++ >> > > ovn/ovn-nb.xml | 43 ++++++ >> > > tests/ovn.at | 273 >> ++++++++++++++++++++++++++++++++++++ >> > > 5 files changed, 561 insertions(+), 16 deletions(-) >> > > >> > > diff --git a/ovn/northd/ovn-northd.8.xml >> > > b/ovn/northd/ovn-northd.8.xml index 7352c6764..f52699bd3 100644 >> > > --- a/ovn/northd/ovn-northd.8.xml >> > > +++ b/ovn/northd/ovn-northd.8.xml >> > > @@ -874,6 +874,25 @@ output; >> > > resident. >> > > </li> >> > > </ul> >> > > + >> > > + <p> >> > > + For the Ethernet address on a logical switch port of type >> > > + <code>router</code>, when that logical switch port's >> > > + <ref column="addresses" table="Logical_Switch_Port" >> > > + db="OVN_Northbound"/> column is set to <code>router</code> >> and >> > > + the connected logical router port specifies a >> > > + <code>reside-on-redirect-chassis</code> and the logical >> router >> > > + to which the connected logical router port belongs to has a >> > > + <code>redirect-chassis</code> distributed gateway logical >> > router >> > > + port: >> > > + </p> >> > > + >> > > + <ul> >> > > + <li> >> > > + The flow for the connected logical router port's Ethernet >> > > + address is only programmed on the >> > <code>redirect-chassis</code>. >> > > + </li> >> > > + </ul> >> > > </li> >> > > >> > > <li> >> > > @@ -1179,6 +1198,17 @@ output; >> > > upstream MAC learning to point to the >> > > <code>redirect-chassis</code>. >> > > </p> >> > > + >> > > + <p> >> > > + For the logical router port with the option >> > > + <code>reside-on-redirect-chassis</code> set (which is >> > centralized), >> > > + the above flows are only programmed on the gateway port >> > instance on >> > > + the <code>redirect-chassis</code> (if the logical router >> has a >> > > + distributed gateway port). This behavior avoids generation >> > > + of multiple ARP responses from different chassis, and >> allows >> > > + upstream MAC learning to point to the >> > > + <code>redirect-chassis</code>. >> > > + </p> >> > > </li> >> > > >> > > <li> >> > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index >> > > 31ea5f410..3998a898c 100644 >> > > --- a/ovn/northd/ovn-northd.c >> > > +++ b/ovn/northd/ovn-northd.c >> > > @@ -4426,13 +4426,32 @@ build_lswitch_flows(struct hmap *datapaths, >> > struct hmap *ports, >> > > ds_put_format(&match, "eth.dst == "ETH_ADDR_FMT, >> > > ETH_ADDR_ARGS(mac)); >> > > if (op->peer->od->l3dgw_port >> > > - && op->peer == op->peer->od->l3dgw_port >> > > - && op->peer->od->l3redirect_port) { >> > > - /* The destination lookup flow for the router's >> > > - * distributed gateway port MAC address should >> only >> > be >> > > - * programmed on the "redirect-chassis". */ >> > > - ds_put_format(&match, " && >> is_chassis_resident(%s)", >> > > - >> > op->peer->od->l3redirect_port->json_key); >> > > + && op->peer->od->l3redirect_port >> > > + && op->od->localnet_port) { >> > > + bool add_chassis_resident_check = false; >> > > + if (op->peer == op->peer->od->l3dgw_port) { >> > > + /* The peer of this port represents a >> > distributed >> > > + * gateway port. The destination lookup >> > > + flow >> > for the >> > > + * router's distributed gateway port MAC >> > address should >> > > + * only be programmed on the >> > "redirect-chassis". */ >> > > + add_chassis_resident_check = true; >> > > + } else { >> > > + /* Check if the option >> > 'reside-on-redirect-chassis' >> > > + * is set to true on the peer port. If set >> > > + to >> > true >> > > + * and if the logical switch has a localnet >> > port, it >> > > + * means the router pipeline for the >> > > + packets >> > from >> > > + * this logical switch should be run on the >> > chassis >> > > + * hosting the gateway port. >> > > + */ >> > > + add_chassis_resident_check = smap_get_bool( >> > > + &op->peer->nbrp->options, >> > > + "reside-on-redirect-chassis", false); >> > > + } >> > > + >> > > + if (add_chassis_resident_check) { >> > > + ds_put_format(&match, " && >> > is_chassis_resident(%s)", >> > > + >> > op->peer->od->l3redirect_port->json_key); >> > > + } >> > > } >> > > >> > > ds_clear(&actions); @@ -5197,15 +5216,35 @@ >> > > build_lrouter_flows(struct hmap *datapaths, >> > struct hmap *ports, >> > > op->lrp_networks.ipv4_addrs[i].network_s, >> > > op->lrp_networks.ipv4_addrs[i].plen, >> > > op->lrp_networks.ipv4_addrs[i].addr_s); >> > > - if (op->od->l3dgw_port && op == op->od->l3dgw_port >> > > - && op->od->l3redirect_port) { >> > > - /* Traffic with eth.src = >> l3dgw_port->lrp_networks.ea_s >> > > - * should only be sent from the "redirect-chassis", >> so >> > that >> > > - * upstream MAC learning points to the >> > "redirect-chassis". >> > > - * Also need to avoid generation of multiple ARP >> > responses >> > > - * from different chassis. */ >> > > - ds_put_format(&match, " && is_chassis_resident(%s)", >> > > - op->od->l3redirect_port->json_key); >> > > + >> > > + if (op->od->l3dgw_port && op->od->l3redirect_port && >> > op->peer >> > > + && op->peer->od->localnet_port) { >> > > + bool add_chassis_resident_check = false; >> > > + if (op == op->od->l3dgw_port) { >> > > + /* Traffic with eth.src = >> > l3dgw_port->lrp_networks.ea_s >> > > + * should only be sent from the >> > > + "redirect-chassis", >> > so that >> > > + * upstream MAC learning points to the >> > "redirect-chassis". >> > > + * Also need to avoid generation of multiple >> > > + ARP >> > responses >> > > + * from different chassis. */ >> > > + add_chassis_resident_check = true; >> > > + } else { >> > > + /* Check if the option >> 'reside-on-redirect-chassis' >> > > + * is set to true on the router port. If set to >> true >> > > + * and if peer's logical switch has a localnet >> > port, it >> > > + * means the router pipeline for the packets from >> > > + * peer's logical switch is be run on the chassis >> > > + * hosting the gateway port and it should reply >> > > + to >> > the >> > > + * ARP requests for the router port IPs. >> > > + */ >> > > + add_chassis_resident_check = smap_get_bool( >> > > + &op->nbrp->options, >> > > + "reside-on-redirect-chassis", false); >> > > + } >> > > + >> > > + if (add_chassis_resident_check) { >> > > + ds_put_format(&match, " && >> is_chassis_resident(%s)", >> > > + op->od->l3redirect_port->json_key); >> > > + } >> > > } >> > > >> > > ds_clear(&actions); >> > > diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml >> > > index 6ed2cf132..998470c34 100644 >> > > --- a/ovn/ovn-architecture.7.xml >> > > +++ b/ovn/ovn-architecture.7.xml >> > > @@ -1372,6 +1372,166 @@ >> > > >> https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_topics_high-2Davailability&d=DwICAg&c=s883GpUCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=mE2qywxdjIadgBjj3-Xsjbt7jiXYD543pSHvwWZn5sg&s=3YEL0T7qW3h-GbKKAAcQ2q6kFtMqXliOiuOLrpKVQsg&e= >> . >> > > </p> >> > > >> > > + <h2>Tenant VLAN networks connected to a Logical Router</h2> >> > > + >> > > + <p> >> > > + It is possible to have multiple logical switches each with a >> > localnet port >> > > + (representing physical networks) connected to a logical router >> > > + in >> > which one >> > > + may provide the external connectivity via a distributed gatewat >> > port and >> > > + the rest of them are used internally (with VLAN tagged). It is >> > expected >> > > + that <code>ovn-bridge-mappings</code> is configured >> > > + appropriately >> > on the >> > > + chassis. >> > > + </p> >> > > + >> > > + <h3>East West routing</h3> >> > > + <p> >> > > + East-West routing between these tenant VLAN logical switches >> > > + works >> > almost >> > > + the same way as normal logical switches. When the VM sends such >> > > + a >> > packet, >> > > + then: >> > > + </p> >> > > + <ol> >> > > + <li> >> > > + The packet enters the ingress pipeline of the logical router >> > datapath >> > > + via the logical router port in the source chassis. >> > > + </li> >> > > + >> > > + <li> >> > > + Routing decision is taken. >> > > + </li> >> > > + >> > > + <li> >> > > + The packet goes out of the integration bridge to the provider >> > bridge ( >> > > + belonging to the destination logical switch) via the localnet >> > port. >> > > + </li> >> > > + >> > > + <li> >> > > + The destination chassis receives the packet via the localnet >> port >> > > + and delivers to the destination VM. >> > > + </li> >> > > + </ol> >> > > + >> > > + <h3>External traffic</h3> >> > > + >> > > + <p> >> > > + The following happens when a VM sends an external traffic >> > > + (which >> > requires >> > > + NATting) and the chassis hosting the VM doesn't have a >> > > + distributed >> > gateway >> > > + port. >> > > + </p> >> > > + >> > > + <ol> >> > > + <li> >> > > + The packet enters the ingress pipeline of the logical router >> > datapath >> > > + via the logical router port in the source chassis. >> > > + </li> >> > > + >> > > + <li> >> > > + Routing decision is taken. Since the gateway router or the >> > distributed >> > > + gateway port doesn't reside in the source chassis, the traffic >> is >> > > + redirected to the gateway chassis via the tunnel port. >> > > + </li> >> > > + >> > > + <li> >> > > + The gateway chassis receives the packet, applies the NAT rules >> and >> > > + forwards it via the localnet port. >> > > + </li> >> > > + </ol> >> > > + >> > > + <p> >> > > + Although this works, the VM traffic is tunnelled. In order for >> it to >> > > + work properly, the MTU of the VLAN tenant networks must be >> > > + lowered >> > to >> > > + account for the tunnel encapsulation. >> > > + </p> >> > > + >> > > + <h2>Centralized routing for VLAN tenant networks</h2> >> > > + >> > > + <p> >> > > + To overcome the tunnel encapsulation problem described in the >> > previous >> > > + section, <code>OVN</code> supports the option of enabling >> > centralized >> > > + routing for VLAN tenant networks. CMS can configure the option >> > > + <ref column="options:reside-on-redirect-chassis" >> > > + table="Logical_Router_Port" db="OVN_NB"/> to <code>true</code> >> > > + for >> > each >> > > + <ref table="Logical_Router_Port" db="OVN_NB"/> which connects to >> the >> > > + logical switch of the VLAN tenant network. This causes the >> gateway >> > > + chassis (hosting the distributed gateway port) to handle all the >> > > + routing for these networks, making it centralized. It will reply >> to >> > > + the ARP requests for the logical router port IPs. >> > > + </p> >> > > + >> > > + <p> >> > > + If the logical router doesn't have a distributed gateway port >> > connecting >> > > + to the provider network, then this option is ignored by >> > <code>OVN</code>. >> > > + </p> >> > > + >> > > + <p> >> > > + The following happens when a VM sends an east-west traffic >> > > + which >> > needs to >> > > + be routed: >> > > + </p> >> > > + >> > > + <ol> >> > > + <li> >> > > + The packet from the VM enters the logical datapath pipeline >> > > + of >> > the source >> > > + VLAN network in the source chassis and is sent out via the >> > localnet port >> > > + (instead of sending it to router pipeline). >> > > + </li> >> > > + >> > > + <li> >> > > + The packet enters the logical datapath pipeline of the source >> VLAN >> > > + network in the gateway chassis and is sent to the logical >> datapath >> > > + pipeline belonging to the logical router. >> > > + </li> >> > > + >> > > + <li> >> > > + Routing decision is taken. >> > > + </li> >> > > + >> > > + <li> >> > > + The packet enters the logical datapath pipeline of the >> destination >> > > + VLAN network. The packet is delivered to the destination VM >> > > + if it >> > resides >> > > + in the same chassis. Otherwise the packet is sent out via the >> > localnet >> > > + port of the destination VLAN network. >> > > + </li> >> > > + >> > > + <li> >> > > + The destination chassis receives the packet via the localnet >> port >> > > + and delivers to the destination VM. >> > > + </li> >> > > + </ol> >> > > + >> > > + <p> >> > > + The following happens when a VM sends an external traffic which >> > requires >> > > + NATting: >> > > + </p> >> > > + >> > > + <ol> >> > > + <li> >> > > + The packet from the VM enters the logical datapath pipeline >> > > + of >> > the source >> > > + VLAN network in the source chassis and is sent out via the >> > localnet port >> > > + (instead of sending it to router pipeline). >> > > + </li> >> > > + >> > > + <li> >> > > + The packet enters the logical datapath pipeline of the source >> VLAN >> > > + network in the gateway chassis and is sent to the logical >> datapath >> > > + pipeline belonging to the logical router. >> > > + </li> >> > > + >> > > + <li> >> > > + Routing decision is taken and NAT rules are applied. >> > > + </li> >> > > + >> > > + <li> >> > > + The packet enters the logical datapath pipeline of the >> > > + provider >> > network >> > > + and is sent out via the localnet port of the provider network. >> > > + </li> >> > > + </ol> >> > > + >> > > + <p> >> > > + For the reverse external traffic, the gateway chassis applies >> > > + the >> > unNATting >> > > + rules and sends the packet via the localnet port of the VLAN >> tenant >> > > + network and the destination chassis receives the packet and >> > delivers to >> > > + the VM. >> > > + </p> >> > > + >> > > <h2>Life Cycle of a VTEP gateway</h2> >> > > >> > > <p> >> > > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml index >> > > 8564ed39c..13ae56e13 100644 >> > > --- a/ovn/ovn-nb.xml >> > > +++ b/ovn/ovn-nb.xml >> > > @@ -1635,6 +1635,49 @@ >> > > chassis to enable high availability. >> > > </p> >> > > </column> >> > > + >> > > + <column name="options" key="reside-on-redirect-chassis"> >> > > + <p> >> > > + Generally routing is distributed in <code>OVN</code>. The >> > packet >> > > + from a logical port which needs to be routed hits the >> > > + router >> > pipeline >> > > + in the source chassis. For the East-West traffic, the >> > > + packet >> > is >> > > + sent directly to the destination chassis. For the outside >> > traffic >> > > + the packet is sent to the gateway chassis. >> > > + </p> >> > > + >> > > + <p> >> > > + When this option is set, <code>OVN</code> considers this >> > > + only >> > if >> > > + </p> >> > > + >> > > + <ul> >> > > + <li> >> > > + The logical router to which this logical router port >> > belongs to >> > > + has a distributed gateway port. >> > > + </li> >> > > + >> > > + <li> >> > > + The peer's logical switch has a localnet port >> (representing >> > > + a tenant VLAN network) >> > > + </li> >> > > + </ul> >> > > + >> > > + <p> >> > > + When this option is set to <code>true</code>, then the >> packet >> > > + which needs to be routed hits the router pipeline in the >> > chassis >> > > + hosting the distributed gateway router port. The source >> > chassis >> > > + pushes out this traffic via the localnet port. With this >> the >> > > + East-West traffic is no more distributed and will always >> > > + go >> > through >> > > + the gateway chassis. >> > > + </p> >> > > + >> > > + <p> >> > > + Without this option set, for any traffic destined to >> > > + outside >> > from a >> > > + logical port which belongs to a logical switch with >> > > + localnet >> > port, >> > > + the source chassis will send the traffic to the gateway >> > chassis via >> > > + the tunnel port instead of the localnet port and this >> > > + could >> > cause MTU >> > > + issues. >> > > + </p> >> > > + </column> >> > > </group> >> > > >> > > <group title="Attachment"> >> > > diff --git a/tests/ovn.at b/tests/ovn.at index 769e09f81..504ba228d >> > > 100644 >> > > --- a/tests/ovn.at >> > > +++ b/tests/ovn.at >> > > @@ -8537,6 +8537,279 @@ OVN_CLEANUP([hv1],[hv2],[hv3]) >> > > >> > > AT_CLEANUP >> > > >> > > +# VLAN traffic for external network redirected through distributed >> > router >> > > +# gateway port should use vlans(i.e input network vlan tag) across >> > hypervisors >> > > +# instead of tunneling. >> > > +AT_SETUP([ovn -- vlan traffic for external network with distributed >> > router gateway port]) >> > > +AT_SKIP_IF([test $HAVE_PYTHON = no]) ovn_start >> > > + >> > > +# Logical network: >> > > +# # One LR R1 that has switches foo (192.168.1.0/24) and # # alice >> > > +(172.16.1.0/24) connected to it. The logical port # # between R1 >> > > +and alice has a "redirect-chassis" specified, # # i.e. it is the >> > > +distributed router gateway port(172.16.1.6). >> > > +# # Switch alice also has a localnet port defined. >> > > +# # An additional switch outside has the same subnet as alice # # >> > > +(172.16.1.0/24), a localnet port and nexthop port(172.16.1.1) # # >> > > +which will receive the packet destined for external network # # >> > > +(i.e 8.8.8.8 as destination ip). >> > > + >> > > +# Physical network: >> > > +# # Three hypervisors hv[123]. >> > > +# # hv1 hosts vif foo1. >> > > +# # hv2 is the "redirect-chassis" that hosts the distributed router >> > gateway port. >> > > +# # hv3 hosts nexthop port vif outside1. >> > > +# # All other tests connect hypervisors to network n1 through >> > > +br-phys >> > for tunneling. >> > > +# # But in this test, hv1 won't connect to n1(and no br-phys in >> > > +hv1), >> > and >> > > +# # in order to show vlans(instead of tunneling) used between hv1 >> > > +and >> > hv2, >> > > +# # a new network n2 created and hv1 and hv2 connected to this >> > > +network >> > through br-ex. >> > > +# # hv2 and hv3 are still connected to n1 network through br-phys. >> > > +net_add n1 >> > > + >> > > +# We are not calling ovn_attach for hv1, to avoid adding br-phys. >> > > +# Tunneling won't work in hv1 as ovn-encap-ip is not added to any >> > bridge in hv1 >> > > +sim_add hv1 >> > > +as hv1 >> > > +ovs-vsctl \ >> > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ >> > > + -- set Open_vSwitch . >> > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ >> > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve,vxlan \ >> > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=192.168.0.1 \ >> > > + -- add-br br-int \ >> > > + -- set bridge br-int fail-mode=secure >> > other-config:disable-in-band=true \ >> > > + -- set Open_vSwitch . >> > > + external-ids:ovn-bridge-mappings=public:br-ex >> > > + >> > > +start_daemon ovn-controller >> > > +ovs-vsctl -- add-port br-int hv1-vif1 -- \ >> > > + set interface hv1-vif1 external-ids:iface-id=foo1 \ >> > > + ofport-request=1 >> > > + >> > > +sim_add hv2 >> > > +as hv2 >> > > +ovs-vsctl add-br br-phys >> > > +ovn_attach n1 br-phys 192.168.0.2 >> > > +ovs-vsctl set Open_vSwitch . >> > external-ids:ovn-bridge-mappings="public:br-ex,phys:br-phys" >> > > + >> > > +sim_add hv3 >> > > +as hv3 >> > > +ovs-vsctl add-br br-phys >> > > +ovn_attach n1 br-phys 192.168.0.3 >> > > +ovs-vsctl -- add-port br-int hv3-vif1 -- \ >> > > + set interface hv3-vif1 external-ids:iface-id=outside1 \ >> > > + options:tx_pcap=hv3/vif1-tx.pcap \ >> > > + options:rxq_pcap=hv3/vif1-rx.pcap \ >> > > + ofport-request=1 >> > > +ovs-vsctl set Open_vSwitch . >> > external-ids:ovn-bridge-mappings="phys:br-phys" >> > > + >> > > +# Create network n2 for vlan connectivity between hv1 and hv2 >> > > +net_add n2 >> > > + >> > > +as hv1 >> > > +ovs-vsctl add-br br-ex >> > > +net_attach n2 br-ex >> > > + >> > > +as hv2 >> > > +ovs-vsctl add-br br-ex >> > > +net_attach n2 br-ex >> > > + >> > > +OVN_POPULATE_ARP >> > > + >> > > +ovn-nbctl create Logical_Router name=R1 >> > > + >> > > +ovn-nbctl ls-add foo >> > > +ovn-nbctl ls-add alice >> > > +ovn-nbctl ls-add outside >> > > + >> > > +# Connect foo to R1 >> > > +ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24 ovn-nbctl >> > > +lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo \ >> > > + type=router options:router-port=foo \ >> > > + -- lsp-set-addresses rp-foo router >> > > + >> > > +# Connect alice to R1 as distributed router gateway port >> > > +(172.16.1.6) >> > on hv2 >> > > +ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.6/24 \ >> > > + -- set Logical_Router_Port alice options:redirect-chassis="hv2" >> > > +ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice >> \ >> > > + type=router options:router-port=alice \ >> > > + -- lsp-set-addresses rp-alice router \ >> > > + >> > > +# Create logical port foo1 in foo >> > > +ovn-nbctl lsp-add foo foo1 \ >> > > +-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2" >> > > + >> > > +# Create logical port outside1 in outside, which is a nexthop >> > > +address # for 172.16.1.0/24 ovn-nbctl lsp-add outside outside1 \ >> > > +-- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.1" >> > > + >> > > +# Set default gateway (nexthop) to 172.16.1.1 ovn-nbctl >> > > +lr-route-add R1 "0.0.0.0/0" 172.16.1.1 alice AT_CHECK([ovn-nbctl >> > > +lr-nat-add R1 snat 172.16.1.6 192.168.1.1/24]) ovn-nbctl set >> > > +Logical_Switch_Port rp-alice options:nat-addresses=router >> > > + >> > > +ovn-nbctl lsp-add foo ln-foo >> > > +ovn-nbctl lsp-set-addresses ln-foo unknown ovn-nbctl >> > > +lsp-set-options ln-foo network_name=public ovn-nbctl lsp-set-type >> > > +ln-foo localnet AT_CHECK([ovn-nbctl set Logical_Switch_Port ln-foo >> > > +tag=2]) >> > > + >> > > +# Create localnet port in alice >> > > +ovn-nbctl lsp-add alice ln-alice >> > > +ovn-nbctl lsp-set-addresses ln-alice unknown ovn-nbctl lsp-set-type >> > > +ln-alice localnet ovn-nbctl lsp-set-options ln-alice >> > > +network_name=phys >> > > + >> > > +# Create localnet port in outside >> > > +ovn-nbctl lsp-add outside ln-outside ovn-nbctl lsp-set-addresses >> > > +ln-outside unknown ovn-nbctl lsp-set-type ln-outside localnet >> > > +ovn-nbctl lsp-set-options ln-outside network_name=phys >> > > + >> > > +# Allow some time for ovn-northd and ovn-controller to catch up. >> > > +# XXX This should be more systematic. >> > > +ovn-nbctl --wait=hv --timeout=3 sync >> > > + >> > > +# Check that there is a logical flow in logical switch foo's >> > > +pipeline # to set the outport to rp-foo (which is expected). >> > > +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep >> > > +ls_in_l2_lkup >> > | \ >> > > +grep rp-foo | grep -v is_chassis_resident | wc -l`]) >> > > + >> > > +# Set the option 'reside-on-redirect-chassis' for foo ovn-nbctl set >> > > +logical_router_port foo >> > options:reside-on-redirect-chassis=true >> > > +# Check that there is a logical flow in logical switch foo's >> > > +pipeline # to set the outport to rp-foo with the condition >> is_chassis_redirect. >> > > +ovn-sbctl dump-flows foo >> > > +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep >> > > +ls_in_l2_lkup >> > | \ >> > > +grep rp-foo | grep is_chassis_resident | wc -l`]) >> > > + >> > > +echo "---------NB dump-----" >> > > +ovn-nbctl show >> > > +echo "---------------------" >> > > +ovn-nbctl list logical_router >> > > +echo "---------------------" >> > > +ovn-nbctl list nat >> > > +echo "---------------------" >> > > +ovn-nbctl list logical_router_port >> > > +echo "---------------------" >> > > + >> > > +echo "---------SB dump-----" >> > > +ovn-sbctl list datapath_binding >> > > +echo "---------------------" >> > > +ovn-sbctl list port_binding >> > > +echo "---------------------" >> > > +ovn-sbctl dump-flows >> > > +echo "---------------------" >> > > +ovn-sbctl list chassis >> > > +echo "---------------------" >> > > + >> > > +for chassis in hv1 hv2 hv3; do >> > > + as $chassis >> > > + echo "------ $chassis dump ----------" >> > > + ovs-vsctl show br-int >> > > + ovs-ofctl show br-int >> > > + ovs-ofctl dump-flows br-int >> > > + echo "--------------------------" >> > > +done >> > > + >> > > +ip_to_hex() { >> > > + printf "%02x%02x%02x%02x" "$@" >> > > +} >> > > + >> > > +foo1_ip=$(ip_to_hex 192 168 1 2) >> > > +gw_ip=$(ip_to_hex 172 16 1 6) >> > > +dst_ip=$(ip_to_hex 8 8 8 8) >> > > +nexthop_ip=$(ip_to_hex 172 16 1 1) >> > > + >> > > +foo1_mac="f00000010203" >> > > +foo_mac="000001010203" >> > > +gw_mac="000002010203" >> > > +nexthop_mac="f00000010204" >> > > + >> > > +# Send ip packet from foo1 to 8.8.8.8 src_mac="f00000010203" >> > > +dst_mac="000001010203" >> > > >> > +packet=${foo_mac}${foo1_mac}08004500001c0000000040110000${foo1_ip}${d >> > +st_ip}0035111100080000 >> > > + >> > > +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet sleep 2 >> > > + >> > > +# ARP request packet for nexthop_ip to expect at outside1 >> > > >> > +arp_request=ffffffffffff${gw_mac}08060001080006040001${gw_mac}${gw_ip >> > +}000000000000${nexthop_ip} >> > > +echo $arp_request >> hv3-vif1.expected cat hv3-vif1.expected > >> > > +expout $PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap >> > > +| grep >> > ${nexthop_ip} | uniq > hv3-vif1 >> > > +AT_CHECK([sort hv3-vif1], [0], [expout]) >> > > + >> > > +# Send ARP reply from outside1 back to the router >> > > +reply_mac="f00000010204" >> > > >> > +arp_reply=${gw_mac}${nexthop_mac}08060001080006040002${nexthop_mac}${ >> > +nexthop_ip}${gw_mac}${gw_ip} >> > > + >> > > +as hv3 ovs-appctl netdev-dummy/receive hv3-vif1 $arp_reply >> > > +OVS_WAIT_UNTIL([ >> > > + test `as hv2 ovs-ofctl dump-flows br-int | grep table=66 | \ >> > > +grep actions=mod_dl_dst:f0:00:00:01:02:04 | wc -l` -eq 1 >> > > + ]) >> > > + >> > > +# VLAN tagged packet with router port(192.168.1.1) MAC as >> > > +destination >> > MAC >> > > +# is expected on bridge connecting hv1 and hv2 >> > > >> > +expected=${foo_mac}${foo1_mac}8100000208004500001c0000000040110000${f >> > +oo1_ip}${dst_ip}0035111100080000 >> > > +echo $expected > hv1-br-ex_n2.expected >> > > + >> > > +# Packet to Expect at outside1 i.e nexthop(172.16.1.1) port. >> > > +# As connection tracking not enabled for this test, snat can't be >> > > +done >> > on the packet. >> > > +# We still see foo1 as the source ip address. But source >> > > +mac(gateway >> > MAC) and >> > > +# dest mac(nexthop mac) are properly configured. >> > > >> > +expected=${nexthop_mac}${gw_mac}08004500001c000000003f110100${foo1_ip >> > +}${dst_ip}0035111100080000 >> > > +echo $expected > hv3-vif1.expected >> > > + >> > > +reset_pcap_file() { >> > > + local iface=$1 >> > > + local pcap_file=$2 >> > > + ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap >> > > +\ options:rxq_pcap=dummy-rx.pcap >> > > + rm -f ${pcap_file}*.pcap >> > > + ovs-vsctl -- set Interface $iface >> > options:tx_pcap=${pcap_file}-tx.pcap \ >> > > +options:rxq_pcap=${pcap_file}-rx.pcap >> > > +} >> > > + >> > > +as hv1 reset_pcap_file br-ex_n2 hv1/br-ex_n2 as hv3 reset_pcap_file >> > > +hv3-vif1 hv3/vif1 sleep 2 as hv1 ovs-appctl netdev-dummy/receive >> > > +hv1-vif1 $packet sleep 2 >> > > + >> > > +# On hv1, the packet should not go from vlan switch pipleline to >> > > +router # pipleine as hv1 ovs-ofctl dump-flows br-int >> > > + >> > > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep >> > "priority=100,reg15=0x1,metadata=0x2" \ >> > > +| grep actions=clone | grep -v n_packets=0 | wc -l], [0], [[0 >> > > +]]) >> > > + >> > > +# On hv1, table 32 check that no packet goes via the tunnel port >> > > +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 \ >> > > +| grep "NXM_NX_TUN_ID" | grep -v n_packets=0 | wc -l], [0], [[0 >> > > +]]) >> > > + >> > > +ip_packet() { >> > > + grep "1010203f00000010203" >> > > +} >> > > + >> > > +# Check vlan tagged packet on the bridge connecting hv1 and hv2 >> > > +with the # foo1's mac. >> > > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-ex_n2-tx.pcap | >> > ip_packet | uniq > hv1-br-ex_n2 >> > > +cat hv1-br-ex_n2.expected > expout >> > > +AT_CHECK([sort hv1-br-ex_n2], [0], [expout]) >> > > + >> > > +# Check expected packet on nexthop interface $PYTHON >> > > +"$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep >> > ${foo1_ip}${dst_ip} | uniq > hv3-vif1 >> > > +cat hv3-vif1.expected > expout >> > > +AT_CHECK([sort hv3-vif1], [0], [expout]) >> > > + >> > > +OVN_CLEANUP([hv1],[hv2],[hv3]) >> > > +AT_CLEANUP >> > > + >> > > AT_SETUP([ovn -- IPv6 ND Router Solicitation responder]) >> > > AT_KEYWORDS([ovn-nd_ra]) >> > > AT_SKIP_IF([test $HAVE_PYTHON = no]) >> > > >> > >> > >> _______________________________________________ >> dev mailing list >> mailto:[email protected] >> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=s883GpUCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=mE2qywxdjIadgBjj3-Xsjbt7jiXYD543pSHvwWZn5sg&s=vk8-2EI8-njSdNsgLyP81K8HEZOJfSxugzH3JpXsMUM&e= >> > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
