On Wed, 7 Nov 2018 at 15:01, Guru Shetty <[email protected]> wrote:

>
>
> On Fri, 19 Oct 2018 at 15:17, Ankur Sharma <[email protected]>
> wrote:
>
>> Hi Guru,
>>
>> Thanks for taking a look.
>> Please find the detailed explanation of problem statement inline.
>>
>> Thanks
>>
>> Regards,
>> Ankur
>>
>>
>>
>> *From:* Guru Shetty <[email protected]>
>> *Sent:* Friday, October 19, 2018 9:35 AM
>> *To:* Ankur Sharma <[email protected]>
>> *Cc:* ovs-dev <[email protected]>
>> *Subject:* Re: [ovs-dev] OVN based distributed virtual routing for VLAN
>> backed networks
>>
>>
>>
>>
>>
>> On Tue, 16 Oct 2018 at 15:43, Ankur Sharma <[email protected]>
>> wrote:
>>
>> Hi,
>>
>> We have spent some effort evaluating the use of OVN for Distributed
>> Virtual Routing (DVR) on VLAN-backed networks.
>>
>>
>>
>> Would you mind explaining the above statement in more detail? I would
>> like to understand the problem well before looking at the proposed
>> solution.
>>
>> [ANKUR]:
>> a. OVN provides logical routing and switching, but was designed for the
>> scenario where the logical switch is of type “overlay”,
>>
>>      i.e. packets going on the wire are always encapsulated (the
>> exception being communication with an external network, via NAT).
>>
>> b. Our proposal is to enhance OVN to support cases where the logical
>> switch is of type “vlan”,
>>
>>      i.e. there is no encapsulation and the “inner” packet goes on the
>> wire “as is”.
>>
>>
>>
>> c. We evaluated the current OVN implementation for both logical switches
>> and logical routers.
>>
>>
>>
>> d. Our proposal is meant to highlight the gaps (as of now) and how we
>> plan to fix them.
>>
>
> Let me summarize my understanding of your mail; you can poke holes in it
> if I am wrong, and perhaps rephrase your problem statement.
>
> Current state of OVN
> ================
>
> The following is the current state of OVN's "localnet" feature. I haven't
> used the "localnet" feature for a long time, and this is my understanding
> based on reading the man pages and the localnet test in tests/ovn.at today.
>
> OVN currently supports logical switches and lets you interconnect them
> with logical routers. In a particular logical switch, you can have many
> logical ports, each backed by a VM running on a different hypervisor. The
> VMs across multiple hypervisors that are backed by OVN logical ports only
> talk via overlay networks. In addition, you can add a logical port of type
> "localnet" to these logical switches. This "localnet" logical port has a
> "tag" associated with it. The localnet port is used to let the VMs of an
> OVN logical switch talk to other network endpoints that are not in OVN
> (e.g., a bare-metal machine) but are in the same subnet. When OVN (i.e.,
> br-int) receives an ARP broadcast packet from an OVN logical port in a
> logical switch, the packet is sent to all other logical ports, including
> the "localnet" port.
>
> You can potentially connect two of these logical switches via an OVN
> router. The OVN-known logical ports will talk to each other via overlay
> networks. But if you want to exit the network, you need to go out of a
> gateway port.
>
> You can also use an "l2gateway" to connect to multiple VLAN-backed
> machines in the same logical switch that are outside OVN.
>

I had a talk with Numan on IRC and I think I understand the problem better,
so my above understanding was clearly wrong. Let me re-summarize the problem
statement in another mail.





>
> My understanding of your problem statement
> ===================================
>
> You want to achieve the same feature set as what currently exists in OVN,
> but you don't want to use overlay networks when 2 known logical ports of
> OVN (say, backed by VMs) exist in a "vlan backed logical switch". Is that
> the only difference? If that is the only difference, my question is "why?".
> Why do you want to avoid overlay networks? Do you gain anything else out of
> it? What is it?
>
>>
>>
>>
>> We would like to take it forward with the community.
>>
>> We understand that some of this work could overlap with existing patches
>> in review.
>>
>> We would appreciate feedback and will be happy to update our patches to
>> avoid known overlaps.
>>
>> This email explains the proposal. We will follow it up with patches. Each
>> "CODE CHANGES" section summarizes the change that the corresponding patch
>> would contain.
>>
>>
>> DISTRIBUTED VIRTUAL ROUTING FOR VLAN BACKED NETWORKS
>> ======================================================
>>
>>
>> 1. OVN Bridge Deployment
>> ------------------------------------
>>
>> Our design follows the following ovn-bridge deployment model
>> (please refer to figure OVN Bridge deployment); a minimal setup sketch
>> follows below.
>>     i. br-int ==> OVN-managed bridge.
>>        br-pif ==> learning bridge, to which the physical NICs are
>>        connected.
>>
>>    ii. Any packet that should be on the physical network will travel
>>        from br-int to br-pif via patch ports (localnet ports).
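>>
>>    The following is a minimal sketch of this deployment, assuming a
>>    physical NIC eth1 and a bridge-mapping name physnet1 (both names are
>>    illustrative, not part of the proposal):
>>
>>        # Create the learning bridge and attach the physical NIC to it.
>>        ovs-vsctl add-br br-pif
>>        ovs-vsctl add-port br-pif eth1
>>        # Have ovn-controller create the br-int<->br-pif patch ports for
>>        # localnet ports whose network_name is physnet1.
>>        ovs-vsctl set open_vswitch . \
>>            external-ids:ovn-bridge-mappings=physnet1:br-pif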
>>
>> 2. Layer 2
>> -------------
>>
>>    DESIGN:
>>    ~~~~~~~
>>    a. Leverage the localnet logical port type as the patch port between
>>        br-int and br-pif.
>>    b. Each VLAN-backed logical switch will have a localnet port connected
>>        to it.
>>    c. Tagging and untagging of VLAN headers happens at the localnet port
>>        boundary (see the example below).
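>>
>>    As a hedged example, a VLAN-backed logical switch with a tagged
>>    localnet port could be created as follows (switch name, network name,
>>    and VLAN ID 100 are illustrative):
>>
>>        ovn-nbctl ls-add ls-vlan100
>>        ovn-nbctl lsp-add ls-vlan100 ln-vlan100
>>        ovn-nbctl lsp-set-type ln-vlan100 localnet
>>        ovn-nbctl lsp-set-addresses ln-vlan100 unknown
>>        ovn-nbctl lsp-set-options ln-vlan100 network_name=physnet1
>>        # The "tag" column makes the localnet boundary add/strip VLAN 100.
>>        ovn-nbctl set Logical_Switch_Port ln-vlan100 tag=100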
>>
>>    PIPELINE EXECUTION:
>>    ~~~~~~~~~~~~~~~~~~~
>>    a. Unlike the Geneve-encap-based solution, where we execute the
>>        ingress pipeline on the source chassis and the egress pipeline on
>>        the destination chassis, for VLAN-backed logical switches the
>>        packet will go through the ingress pipeline on the destination
>>        chassis as well.
>>
>>    PACKET FLOW (Figure 1. shows topology and Figure 2. shows the packet
>>    flow):
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. VM1 sends unicast traffic (destined to VM2_MAC) to br-int.
>>    b. For br-int, the destination MAC is not local, hence it will forward
>>        the packet to the localnet port (by design), which is attached to
>>        br-pif. This is the stage at which the VLAN tag is added. br-pif
>>        forwards the packet to the physical interface.
>>    c. br-pif on the destination chassis sends the received traffic to the
>>        patch ports on br-int (as unicast or unknown unicast).
>>    d. br-int does the VLAN tag check, strips the VLAN header, and sends
>>        the packet to the ingress pipeline of the corresponding datapath.
>>
>>
>>    KEY DIFFERENCES AS COMPARED TO OVERLAY:
>>    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. No encapsulation.
>>    b. Both the ingress and egress pipelines of the logical switch are
>>        executed on both the source and destination hypervisors (unlike
>>        overlay, where the ingress pipeline is executed on the source
>>        hypervisor and the egress pipeline on the destination).
>>
>>    CODE CHANGES:
>>    ~~~~~~~~~~~~~
>>    a. ovn-nb.ovsschema:
>>         1. Add a new column to table Logical_Switch.
>>         2. Column name would be "type".
>>         3. Values would be either "vlan" or "overlay", with "overlay"
>>             being the default.
>>
>>    b. ovn-nbctl:
>>         1. Add a new CLI command that sets the "type" of a logical
>>             switch (usage sketched below):
>>             ovn-nbctl ls-set-network-type SWITCH TYPE
>>
>>    c. ovn-northd:
>>         1. Add a new enum to the ovn_datapath struct, which will indicate
>>             whether the logical_switch datapath type is overlay or vlan.
>>         2. Populate a new key-value pair in the southbound database for
>>             Datapath_Bindings of the Logical_Switch.
>>         3. Key-value pair: <logical-switch-type, "vlan" or "overlay">,
>>             with "overlay" as the default.
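>>
>>    Assuming the proposed CLI lands as described, usage would look like
>>    this (switch name illustrative); the southbound side could then be
>>    inspected on the switch's Datapath_Binding:
>>
>>        ovn-nbctl ls-set-network-type ls-vlan100 vlan
>>        # Expected effect (a sketch, per item c above): ovn-northd copies
>>        # the type into the southbound Datapath_Binding, roughly as
>>        #   external_ids:logical-switch-type="vlan"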
>>
>>
>> 3. Layer 3 East West
>> --------------------
>>
>>    DESIGN:
>>    ~~~~~~~
>>    a. Since the router port is distributed and there is no encapsulation,
>>        packets with the router port MAC as source MAC cannot go on the
>>        wire.
>>    b. We propose replacing the router port MAC with a chassis-specific
>>        MAC whenever the packet goes on the wire (see the sketch below).
>>    c. The number of chassis_macs per chassis could depend on the number
>>        of physical NICs and the corresponding bond policy on br-pif.
>>
>>       As of now, we propose only one chassis_mac per chassis
>>       (shared by all resident logical routers). However, we are analyzing
>>       whether br-pif's bond policy would require more MACs per chassis.
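>>
>>       As a sketch of how a chassis MAC might be configured (the code
>>       changes below read it from the OVS Open_vSwitch table; the
>>       external-id key name and format here are assumptions, not settled
>>       names):
>>
>>           ovs-vsctl set open_vswitch . \
>>               external-ids:ovn-chassis-mac-mappings="physnet1:aa:bb:cc:dd:ee:01"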
>>
>>    PIPELINE EXECUTION:
>>    ~~~~~~~~~~~~~~~~~~~
>>    a. For a DVR E-W flow, both the ingress and egress pipelines of the
>>        logical_router execute on the source chassis only.
>>
>>    PACKET FLOW (Figure 3. shows topology and Figure 4. shows the packet
>>    flow):
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. VM1 sends a packet (destined to IP2) to br-int.
>>    b. On the source hypervisor, the packet goes through the following
>>        pipelines:
>>       1. Ingress: logical-switch 1
>>       2. Egress:  logical-switch 1
>>       3. Ingress: logical-router
>>       4. Egress:  logical-router
>>       5. Ingress: logical-switch 2
>>       6. Egress:  logical-switch 2
>>
>>       On the wire, the packet goes out with the destination logical
>>       switch's VLAN. As mentioned in the design, the source MAC (RP2_MAC)
>>       is replaced with CHASSIS_MAC, and the destination MAC is that of
>>       VM2.
>>
>>    c. The packet reaches the destination chassis and enters the
>>        logical-switch 2 pipeline in br-int.
>>    d. The packet goes through the logical-switch 2 pipeline (both ingress
>>        and egress) and gets forwarded to VM2.
>>
>>    CODE CHANGES:
>>    ~~~~~~~~~~~~~
>>    a. ovn-sb.ovsschema:
>>         1. Add a new column to the table Chassis.
>>         2. Column name would be "chassis_macs", of type string, with no
>>             limit on the range of values.
>>         3. This column will hold a list of chassis-unique MACs.
>>         4. This column will be populated by ovn-controller.
>>
>>    b. ovn-sbctl:
>>         1. CLI to add/delete chassis_macs to/from the southbound
>>             database.
>>
>>    c. ovn-controller:
>>         1. Read the chassis MACs from the OVS Open_vSwitch table and
>>             populate the southbound database.
>>         2. In table=65, add a new flow at priority 150, which will do the
>>             following (a rough sketch follows this list):
>>            a. Match: source_mac == router_port_mac, metadata ==
>>                destination_logical_switch, logical_outport == localnet_port
>>            b. Action: replace the source MAC with chassis_mac, add the
>>                VLAN tag.
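>>
>>    Roughly, the proposed priority-150 flow might take the following
>>    shape (an illustrative ovs-ofctl rendering with placeholder values;
>>    in practice ovn-controller would install the flow itself). In OVN's
>>    physical pipeline, table 65 maps logical ports to physical ones and
>>    reg15 holds the logical outport:
>>
>>        ovs-ofctl add-flow br-int \
>>            "table=65, priority=150, metadata=0x2, reg15=0x3, \
>>             dl_src=00:00:02:01:02:03, \
>>             actions=mod_dl_src:aa:bb:cc:dd:ee:01,mod_vlan_vid:100,output:17"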
>>
>>
>> 4. LAYER 3 North South (NO NAT)
>> -------------------------------
>>
>>    DESIGN:
>>    ~~~~~~~
>>    a. To talk to an external network endpoint, we will need a gateway
>>       on the OVN DVR.
>>    b. We propose to use the gateway_chassis construct to achieve this
>>       (see the example below).
>>    c. The LRP will be attached to gateway chassis(es), and only on the
>>        active chassis will we respond to ARP requests for the LRP IP from
>>        the underlay network.
>>    d. If NATing (keeping state) is not involved, then traffic need not
>>        always go via the gateway chassis, i.e. traffic from an OVN
>>        chassis to the external network need not go via the gateway
>>        chassis.
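>>
>>    For reference, attaching gateway chassis to an LRP uses the existing
>>    ovn-nbctl command (port name, chassis names, and priorities are
>>    illustrative):
>>
>>        # The higher-priority chassis is active; chassis-2 takes over if
>>        # chassis-1 fails.
>>        ovn-nbctl lrp-set-gateway-chassis lr0-public chassis-1 20
>>        ovn-nbctl lrp-set-gateway-chassis lr0-public chassis-2 10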
>>
>>    PIPELINE EXECUTION:
>>    ~~~~~~~~~~~~~~~~~~~
>>    a. From an endpoint on an OVN chassis to an endpoint on the underlay:
>>       i. Like DVR E-W, the logical_router ingress and egress pipelines
>>          are executed on the source chassis.
>>
>>    b. From an endpoint on the underlay to an endpoint on an OVN chassis:
>>       i. The logical_router ingress and egress pipelines are executed on
>>          the gateway chassis.
>>
>>    PACKET FLOW, LS ENDPOINT to UNDERLAY ENDPOINT (Figure 5. shows
>>    topology):
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. The packet flow in this case is exactly the same as the Layer 3
>>        E-W flow.
>>
>>
>>    PACKET FLOW, UNDERLAY ENDPOINT to LS ENDPOINT (Figure 5. shows
>>    topology and Figure 6. shows the packet flow):
>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. The gateway for endpoints behind the DVR will be resident only on
>>        the gateway chassis.
>>    b. Unicast packets will come to the gateway chassis, with the
>>        destination MAC being RP2_MAC.
>>    c. From then on, it is like the L3 E-W flow.
>>
>>    CODE CHANGES:
>>    ~~~~~~~~~~~~~
>>    a. ovn-northd:
>>         1. Changes to respond to ARP requests for a VLAN-backed router
>>            port from the uplink only if it is on a gateway chassis.
>>         2. Changes to make sure that, in the absence of NAT
>>            configuration, OVN-chassis-to-external-network traffic does
>>            not go via the gateway chassis.
>>
>>    b. ovn-controller:
>>         1. Send out GARPs from the active gateway chassis, advertising
>>            the VLAN-backed router port's address (for the port that has
>>            a gateway chassis attached to it).
>>
>>
>> 5. LAYER 3 North South (NAT)
>> ----------------------------
>>
>>    SNAT, DNAT, SNAT_AND_DNAT (without external mac):
>>    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. Our proposal aligns with the following patch series, which is out
>>        for review: http://patchwork.ozlabs.org/patch/952119/
>>
>>    b. However, our implementation deviates from that proposal in the
>>        following areas:
>>       i. Usage of lr_in_ip_routing:
>>          Our implementation sets the redirect flag after the routing
>>          decision is taken. This ensures that a user-entered static route
>>          will not affect the redirect decision (unless it is meant to).
>>
>>      ii. Using the tenant VLAN ID for "redirection":
>>          Our implementation uses the external network router port's VLAN
>>          ID (the router port that has the gateway chassis attached to it)
>>          for redirection. This is because the chassisredirect port is NOT
>>          on the tenant network, and logically the packet is being
>>          forwarded to the chassisredirect port.
>>
>>
>>    SNAT_AND_DNAT (with external mac):
>>    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>    a. The current OVN implementation of not going via the gateway
>>        chassis aligns with our design, and it worked fine in our testing
>>        (a configuration sketch follows).
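>>
>>    For completeness, the dnat_and_snat-with-external-MAC case referred
>>    to here is configured with the existing ovn-nbctl command (router
>>    name, addresses, logical port, and MAC are illustrative):
>>
>>        # Arguments: ROUTER TYPE EXTERNAL_IP LOGICAL_IP LOGICAL_PORT
>>        # EXTERNAL_MAC. With the logical port and external MAC given,
>>        # the NAT is handled on the VM's own chassis instead of being
>>        # redirected via the gateway chassis.
>>        ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.10 10.0.0.5 \
>>            vm1 f0:00:00:01:02:03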
>>
>>
>> This is just an initial proposal. We have identified more areas that
>> need work; we will submit patches (and put forth topics/designs for
>> discussion) as we make progress.
>>
>>
>> Thanks
>>
>> Regards,
>> Ankur
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
