On Wed, 7 Nov 2018 at 15:01, Guru Shetty <[email protected]> wrote:
>
>
> On Fri, 19 Oct 2018 at 15:17, Ankur Sharma <[email protected]> wrote:
>
>> Hi Guru,
>>
>> Thanks for taking a look.
>> Please find the detailed explanation of the problem statement inline.
>>
>> Thanks
>>
>> Regards,
>> Ankur
>>
>> *From:* Guru Shetty <[email protected]>
>> *Sent:* Friday, October 19, 2018 9:35 AM
>> *To:* Ankur Sharma <[email protected]>
>> *Cc:* ovs-dev <[email protected]>
>> *Subject:* Re: [ovs-dev] OVN based distributed virtual routing for VLAN
>> backed networks
>>
>> On Tue, 16 Oct 2018 at 15:43, Ankur Sharma <[email protected]> wrote:
>>
>> Hi,
>>
>> We have put some effort into evaluating the use of OVN for
>> Distributed Virtual Routing (DVR) on VLAN-backed networks.
>>
>> Would you mind explaining the above statement in a lot more detail? I
>> would like to understand the problem well before looking at the proposed
>> solution.
>>
>> [ANKUR]:
>> a. OVN provides logical routing and switching, but was designed for the
>>    scenario where the logical switch is of type "overlay",
>>    i.e., packets going on the wire are always encapsulated (the exception
>>    being communication with an external network, via NAT).
>>
>> b. Our proposal is to enhance OVN to support cases where the logical
>>    switch is of type "vlan",
>>    i.e., there is no encapsulation and the "inner" packet goes on the
>>    wire "as is".
>>
>> c. We evaluated the current OVN implementation for both logical switches
>>    and logical routers.
>>
>> d. Our proposal is meant to highlight the gaps (as of now) and how we
>>    plan to fix them.
>
> Let me summarize my understanding of your mail; you can poke holes in it
> if I am wrong, and probably rephrase your problem statement.
>
> Current state of OVN
> ================
>
> The following is the current state of OVN's "localnet" feature. I haven't
> used the "localnet" feature for a long time, and this is my understanding
> based on reading the man pages and the localnet test in tests/ovn.at
> today.
>
> OVN currently supports logical switches and lets you interconnect them
> with logical routers. In a particular logical switch, you can have many
> logical ports, which are backed by VMs running on different hypervisors.
> VMs across multiple hypervisors that are backed by OVN logical ports talk
> only via overlay networks. In addition, you can add a logical port of
> type "localnet" to these logical switches. This "localnet" logical port
> has a "tag" associated with it. The localnet port is used to let the VMs
> of an OVN logical switch talk to other network endpoints that are not in
> OVN (e.g., a bare-metal machine) but are in the same subnet. When OVN
> (i.e., br-int) receives an ARP broadcast packet from an OVN logical port
> in a logical switch, it will be sent to all other logical ports,
> including the "localnet" port.
>
> You can potentially connect two of these logical switches via an OVN
> router. The OVN-known logical ports will talk to each other via overlay
> networks. But if you want to exit the network, you need to go out of a
> gateway port.
>
> You can also use an "l2gateway" to connect to multiple VLAN-backed
> machines in the same logical switch that are outside OVN.
>
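> For reference, a localnet port with a tag is typically wired up along
> these lines (a sketch; the switch, port, network, and bridge names are
> placeholders):
>
>   ovn-nbctl lsp-add ls0 ln-ls0
>   ovn-nbctl lsp-set-type ln-ls0 localnet
>   ovn-nbctl lsp-set-addresses ln-ls0 unknown
>   ovn-nbctl lsp-set-options ln-ls0 network_name=physnet1
>   ovn-nbctl set Logical_Switch_Port ln-ls0 tag_request=101
>
> together with a per-chassis bridge mapping that tells ovn-controller
> which OVS bridge backs that physical network:
>
>   ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0
>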
I had a talk with Numan in IRC and I think I understand the problem
better. So my above understanding was clearly wrong. Let me re-summarize
the problem statement in another mail.

>
> My understanding of your problem statement
> ===================================
>
> You want to achieve the same feature set as what currently exists in
> OVN, but you don't want to use overlay networks when 2 known logical
> ports of OVN (say, backed by VMs) exist in a "vlan backed logical
> switch". Is that the only difference? If that is the only difference, my
> question is "why?". Why do you want to avoid overlay networks? Do you
> gain anything else out of it? What is it?
>
>
>>
>> We would like to take it forward with the community.
>>
>> We understand that some of the work could overlap with existing patches
>> in review.
>>
>> We would appreciate feedback and would be happy to update our patches
>> to avoid known overlaps.
>>
>> This email explains the proposal. We will follow it up with patches.
>> Each "CODE CHANGES" section summarizes the change that the
>> corresponding patch would contain.
>>
>>
>> DISTRIBUTED VIRTUAL ROUTING FOR VLAN BACKED NETWORKS
>> ======================================================
>>
>>
>> 1. OVN Bridge Deployment
>> ------------------------------------
>>
>> Our design follows the following OVN bridge deployment model
>> (please refer to figure "OVN Bridge deployment"):
>> i.  br-int ==> OVN-managed bridge.
>>     br-pif ==> learning bridge, where physical NICs are connected.
>>
>> ii. Any packet that should be on the physical network travels from
>>     BR-INT to BR-PIF via patch ports (localnet ports).
>>
>> 2. Layer 2
>> -------------
>>
>> DESIGN:
>> ~~~~~~~
>> a. Leverage the localnet logical port type as the patch port between
>>    br-int and br-pif.
>> b. Each VLAN-backed logical switch will have a localnet port connected
>>    to it.
>> c. Tagging and untagging of VLAN headers happen at the localnet port
>>    boundary.
>>
>> PIPELINE EXECUTION:
>> ~~~~~~~~~~~~~~~~~~~
>> a. Unlike the Geneve-encap-based solution, where we execute the ingress
>>    pipeline on the source chassis and the egress pipeline on the
>>    destination chassis, for VLAN-backed logical switches the packet
>>    will go through the ingress pipeline on the destination chassis as
>>    well.
>>
>> PACKET FLOW (Figure 1 shows the topology and Figure 2 shows the packet
>> flow):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. VM1 sends unicast traffic (destined to VM2_MAC) to br-int.
>> b. For br-int, the destination MAC is not local, hence it forwards the
>>    packet to the localnet port (by design), which is attached to
>>    br-pif. This is the stage at which the VLAN tag is added. br-pif
>>    forwards the packet to the physical interface.
>> c. br-pif on the destination chassis sends the received traffic to the
>>    patch ports on br-int (as unicast or unknown unicast).
>> d. br-int does a VLAN tag check, strips the VLAN header, and sends
>>    the packet to the ingress pipeline of the corresponding datapath.
>>
>>
>> KEY DIFFERENCES AS COMPARED TO OVERLAY:
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. No encapsulation.
>> b. Both ingress and egress pipelines of the logical switch are executed
>>    on both the source and the destination hypervisor (unlike overlay,
>>    where the ingress pipeline is executed on the source hypervisor and
>>    the egress on the destination).
>>
>> CODE CHANGES:
>> ~~~~~~~~~~~~~
>> a. ovn-nb.ovsschema:
>>    1. Add a new column to table Logical_Switch.
>>    2. Column name would be "type".
>>    3. Values would be either "vlan" or "overlay", with "overlay" being
>>       the default.
>>
>> b. ovn-nbctl:
>>    1. Add a new CLI command which sets the "type" of a logical switch
>>       (usage sketched below):
>>       ovn-nbctl ls-set-network-type SWITCH TYPE
>>
>> c. ovn-northd:
>>    1. Add a new enum to the ovn_datapath struct, which will indicate
>>       whether the logical switch datapath type is overlay or vlan.
>>    2. Populate a new key-value pair in the southbound database for
>>       Datapath_Bindings of the Logical_Switch.
>>    3. Key-value pair: <logical-switch-type, "vlan" or "overlay">; the
>>       default will be overlay.
>>
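>> For illustration, the proposed CLI and its effect might be exercised
>> roughly like this (a sketch; "ls1" is a placeholder, and the exact
>> southbound key name is part of this proposal, not existing OVN):
>>
>>    # Mark a logical switch as VLAN-backed.
>>    ovn-nbctl ls-set-network-type ls1 vlan
>>
>>    # ovn-northd would then reflect the type on the switch's southbound
>>    # datapath binding, observable with something like:
>>    ovn-sbctl --columns=external_ids find Datapath_Binding external_ids:name=ls1
>>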
>> 3. Layer 3 East West
>> --------------------
>>
>> DESIGN:
>> ~~~~~~~
>> a. Since the router port is distributed and there is no encapsulation,
>>    packets with the router port MAC as the source MAC cannot go on the
>>    wire.
>> b. We propose replacing the router port MAC with a chassis-specific
>>    MAC whenever a packet goes on the wire.
>> c. The number of chassis_macs per chassis could depend on the number of
>>    physical NICs and the corresponding bond policy on br-pif.
>>
>>    As of now, we propose only one chassis_mac per chassis
>>    (shared by all resident logical routers). However, we are analyzing
>>    whether br-pif's bond policy would require more MACs per chassis.
>>
>> PIPELINE EXECUTION:
>> ~~~~~~~~~~~~~~~~~~~
>> a. For a DVR E-W flow, both ingress and egress pipelines of the
>>    logical_router execute on the source chassis only.
>>
>> PACKET FLOW (Figure 3 shows the topology and Figure 4 shows the packet
>> flow):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. VM1 sends a packet (destined to IP2) to br-int.
>> b. On the source hypervisor, the packet goes through the following
>>    pipelines:
>>    1. Ingress: logical-switch 1
>>    2. Egress: logical-switch 1
>>    3. Ingress: logical-router
>>    4. Egress: logical-router
>>    5. Ingress: logical-switch 2
>>    6. Egress: logical-switch 2
>>
>>    On the wire, the packet goes out with the destination logical
>>    switch's VLAN. As mentioned in the design, the source MAC (RP2_MAC)
>>    is replaced with CHASSIS_MAC and the destination MAC is that of VM2.
>>
>> c. The packet reaches the destination chassis and enters the
>>    logical-switch 2 pipeline in br-int.
>> d. The packet goes through the logical-switch 2 pipeline (both ingress
>>    and egress) and gets forwarded to VM2.
>>
>> CODE CHANGES:
>> ~~~~~~~~~~~~~
>> a. ovn-sb.ovsschema:
>>    1. Add a new column to the table Chassis.
>>    2. Column name would be "chassis_macs", type being string, with no
>>       limit on the range of values.
>>    3. This column will hold a list of chassis-unique MACs.
>>    4. This table will be populated from ovn-controller.
>>
>> b. ovn-sbctl:
>>    1. CLI to add/delete chassis_macs to/from the southbound database.
>>
>> c. ovn-controller:
>>    1. Read chassis MACs from the OVS Open_vSwitch table and populate
>>       the southbound database.
>>    2. In table=65, add a new flow at priority 150, which does the
>>       following (see the sketch below):
>>       a. Match: source_mac == router_port_mac, metadata ==
>>          destination_logical_switch, logical_outport == localnet_port
>>       b. Action: replace the source MAC with chassis_mac, add the VLAN
>>          tag.
>>
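>> For illustration, the new table=65 flow might look roughly as follows
>> (a sketch; the MACs, VLAN ID, metadata/register values encoding the
>> datapath and logical outport, and the OpenFlow port number are all
>> placeholders):
>>
>>    table=65, priority=150, metadata=0x5, reg15=0x2,
>>              dl_src=00:00:01:01:02:04
>>      actions=mod_dl_src:aa:bb:cc:dd:ee:01, push_vlan:0x8100,
>>              set_field:4196->vlan_vid, output:3
>>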
>> 4. Layer 3 North South (NO NAT)
>> -------------------------------
>>
>> DESIGN:
>> ~~~~~~~
>> a. For talking to an external network endpoint, we will need a gateway
>>    on the OVN DVR.
>> b. We propose to use the gateway_chassis construct to achieve this (a
>>    configuration sketch follows this section).
>> c. The LRP will be attached to gateway chassis(es), and only on the
>>    active chassis will we respond to ARP requests for the LRP IP from
>>    the underlay network.
>> d. If NATing (i.e., keeping state) is not involved, traffic need not
>>    always go via the gateway chassis; i.e., traffic from an OVN chassis
>>    to the external network can exit directly.
>>
>> PIPELINE EXECUTION:
>> ~~~~~~~~~~~~~~~~~~~
>> a. From an endpoint on an OVN chassis to an endpoint on the underlay:
>>    i. Like DVR E-W, logical_router ingress and egress pipelines are
>>       executed on the source chassis.
>>
>> b. From an endpoint on the underlay to an endpoint on an OVN chassis:
>>    i. logical_router ingress and egress pipelines are executed on the
>>       gateway chassis.
>>
>> PACKET FLOW, LS ENDPOINT TO UNDERLAY ENDPOINT (Figure 5 shows the
>> topology):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. The packet flow in this case is exactly the same as Layer 3 E-W.
>>
>> PACKET FLOW, UNDERLAY ENDPOINT TO LS ENDPOINT (Figure 5 shows the
>> topology and Figure 6 shows the packet flow):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. The gateway for endpoints behind the DVR will be resident only on
>>    the gateway chassis.
>> b. Unicast packets will come to the gateway chassis, with the
>>    destination MAC being RP2_MAC.
>> c. From then on, it is like the L3 E-W flow.
>>
>> CODE CHANGES:
>> ~~~~~~~~~~~~~
>> a. ovn-northd:
>>    1. Changes to respond to the VLAN-backed router port's ARP from the
>>       uplink only if it is on a gateway chassis.
>>    2. Changes to make sure that, in the absence of NAT configuration,
>>       OVN-chassis-to-external-network traffic does not go via the
>>       gateway chassis.
>>
>> b. ovn-controller:
>>    1. Send out GARPs from the active gateway chassis, advertising the
>>       VLAN-backed router port (the one which has a gateway chassis
>>       attached to it).
>>
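>> For context, the existing gateway_chassis CLI referenced above is used
>> roughly like this (a sketch; the router port and chassis names are
>> placeholders, and the trailing numbers are priorities):
>>
>>    ovn-nbctl lrp-set-gateway-chassis lr0-public gw-chassis-1 20
>>    ovn-nbctl lrp-set-gateway-chassis lr0-public gw-chassis-2 10
>>
>> The highest-priority chassis that is up becomes the active one, which
>> is where this proposal would answer ARPs for the LRP IP and send GARPs
>> from.
>>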
>> 5. Layer 3 North South (NAT)
>> ----------------------------
>>
>> SNAT, DNAT, SNAT_AND_DNAT (without external mac):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. Our proposal aligns with the following patch series, which is out
>>    for review:
>>    http://patchwork.ozlabs.org/patch/952119/
>>
>> b. However, our implementation deviates from that proposal in the
>>    following areas:
>>    i. Usage of lr_in_ip_routing:
>>       Our implementation sets the redirect flag after the routing
>>       decision is taken. This is to ensure that a user-entered static
>>       route will not affect the redirect decision (unless it is meant
>>       to).
>>
>>    ii. Using the tenant VLAN ID for "redirection":
>>        Our implementation uses the external network router port's
>>        (the router port that has a gateway chassis attached to it)
>>        VLAN ID for redirection. This is because the chassisredirect
>>        port is NOT on the tenant network, and logically the packet is
>>        being forwarded to the chassisredirect port.
>>
>> SNAT_AND_DNAT (with external mac):
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> a. The current OVN implementation of not going via the gateway chassis
>>    aligns with our design, and it worked fine.
>>
>> This is just an initial proposal. We have identified more areas that
>> should be worked on; we will submit patches (and put forth
>> topics/designs for discussion) as we make progress.
>>
>> Thanks
>>
>> Regards,
>> Ankur
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
