Hi Guru,

Sure, providing more explanation.

Q. What are we trying to solve?
Ans. Getting distributed routing to work for vlan backed networks through OVN.

Q. Where is the disconnect with respect to OVN's capabilities for the above
task?

Ans. OVN has gaps in how it forwards packets "correctly/efficiently" in the
absence of encapsulation (VXLAN, STT or GENEVE).

Following are the known gaps:

L3 E-W
======
a. Because a router port is distributed, the router port mac should not be
used as the source mac in the absence of encapsulation. Our proposal is to
replace the router port mac with a chassis-specific unique mac whenever an
unencapsulated packet originating from a router port goes on the wire.
This was explained in the following email:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353179.html


b. Sending ARP reply on wire.

As of now, OVN consumes ARP replies from a VM that are destined to the router
port (because the router port is also present locally on the VM's chassis).
Because the ARP reply is NOT seen on the wire, a physical switch will never
learn the VM's mac (unless the VM is involved in L2 communication as well).

As a result, DVR-routed traffic will always be flooded by the TOR (top-of-rack
switch), because the dest mac is that of the VM, which the TOR never learned.
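
To illustrate gap (b), here is a minimal Python sketch of a MAC-learning switch
standing in for the TOR. The class, port numbers and mac addresses are made up
for illustration and do not come from OVN or from any patch; the only point is
that a destination mac the switch has never seen as a source gets flooded.

```python
class LearningSwitch:
    """Toy model of a TOR that learns source macs per port (illustrative only)."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # mac -> port on which it was last seen as a source

    def receive(self, in_port, src_mac, dst_mac):
        """Learn src_mac, then return the sorted list of output ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return sorted(self.ports - {in_port})
        return [out_port]


# Hypothetical layout: Hyp-A on port 1, Hyp-B (hosting VM-B) on port 2,
# an unrelated host on port 3.
tor = LearningSwitch(ports=[1, 2, 3])
CHASSIS_A_MAC = "aa:bb:cc:00:00:01"
VM_B_MAC = "52:54:00:00:00:0b"

# VM-B's ARP reply to the router port was consumed on Hyp-B, so the TOR has
# never seen VM_B_MAC as a source and floods the routed frame.
flooded = tor.receive(in_port=1, src_mac=CHASSIS_A_MAC, dst_mac=VM_B_MAC)

# Once any frame sourced by VM-B reaches the wire, the TOR learns its port.
tor.receive(in_port=2, src_mac=VM_B_MAC, dst_mac=CHASSIS_A_MAC)
unicast = tor.receive(in_port=1, src_mac=CHASSIS_A_MAC, dst_mac=VM_B_MAC)
```

In this toy model the first routed frame goes out of every other port, and only
after VM-B's mac has appeared on the wire is the traffic sent to port 2 alone.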


L3 N-S
======

a. For vlan backed networks, NAT is NOT required to talk to the "outside"
physical network (for overlay networks it is). Hence, OVN requires some
changes in this area as well.

b. DO NOT respond to ARP requests arriving from the uplink for any ROUTER
PORT, unless on the gateway chassis.

c. When a gateway chassis failover happens, advertise the router port mac as
well.
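
Points (b) and (c) can be sketched roughly as below. This is only an
illustration in Python, not OVN code: the function names and the dict layout
are hypothetical, and using a gratuitous ARP for the advertisement in (c) is an
assumption about how the router port mac would be announced.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def should_answer_router_arp(target_ip, router_port_ips, from_uplink,
                             is_gateway_chassis):
    """Decide whether this chassis answers an ARP request for a router port."""
    if target_ip not in router_port_ips:
        return False  # not a router port address, nothing to answer here
    if from_uplink:
        # Point (b): requests arriving from the physical network are answered
        # only by the gateway chassis.
        return is_gateway_chassis
    # Requests from local VMs are still answered locally (distributed router).
    return True


def garp_on_failover(router_port_mac, router_port_ip):
    """Point (c): describe a gratuitous ARP sent by the new gateway chassis."""
    return {
        "dst_mac": BROADCAST_MAC,
        "sender_mac": router_port_mac,
        "sender_ip": router_port_ip,
        "target_ip": router_port_ip,  # sender ip == target ip marks a GARP
    }


# Hypothetical addresses for illustration.
ROUTER_IPS = {"10.0.0.1"}
uplink_non_gw = should_answer_router_arp("10.0.0.1", ROUTER_IPS, True, False)
uplink_gw = should_answer_router_arp("10.0.0.1", ROUTER_IPS, True, True)
local_vm = should_answer_router_arp("10.0.0.1", ROUTER_IPS, False, False)
garp = garp_on_failover("00:00:01:00:00:01", "10.0.0.1")
```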



L3 N-S NAT
==========

a. The current OVN implementation uses geneve encap (geneve options) to carry
metadata to the gateway chassis (where SNAT happens).

b. In the absence of encapsulation, OVN should be enhanced to still support
NAT on the gateway chassis.


===========================================================


Our initial proposal has details as well:

https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html


As I mentioned, the problem statement we are trying to solve is "Distributed
Virtual Routing For VLAN Backed Networks".
As part of this, we have identified some gaps, which we intend to fix.


As we progress further, we will have to add some features as well.
But, as of now we are focused on getting basic functionality to work correctly 
first.



Please feel free to put forth any further queries/concerns you have; I will be
happy to explain.

Thanks again for the review.

Regards,
Ankur



________________________________
From: Guru Shetty <g...@ovn.org>
Sent: Monday, November 12, 2018 9:58:07 AM
To: Ankur Sharma
Cc: ovs dev; Numan Siddique; Ben Pfaff
Subject: Re: VLAN tenant network patches



On Sun, 11 Nov 2018 at 21:02, Ankur Sharma <ankur.sha...@nutanix.com> wrote:

Hi Guru,

Thanks for spending time in understanding the proposal and drafting your 
understanding as well.
Thanks Numan for pitching in.

Some comments (trying to keep them as brief as possible).

a. On a high level, we are trying to do following:
    "Distributed router functionality for vlan backed networks"

I guess, there is a big disconnect then. OVN currently does "distributed router 
for VLAN backed networks". Do you disagree? If so, please explain.



b. This would include changes/analysis for E-W traffic and N-S traffic.


c. Some of the changes are specific to the characteristics of a distributed
router and some are specific to the OVN way of doing things.


d. The points we have discussed thus far are a subset of the changes,

    i.e. a vlan backed DVR (or logical router) would be more than just
replacing the router port mac with a chassis mac.


e. Numan's changes DO NOT conflict/overlap with what we have proposed so far 
and hence should be discussed/reviewed independently.

    His changes are solving a very specific problem.

    His changes are to "mimic" a centralized router in a distributed router.

    i.e. to execute the router pipeline on a centralized chassis, while the
router is still distributed.
    I have provided my feedback here:
    https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/353701.html 



Providing some more comments inline.

Thanks again Guru, Numan, Mark and Han for spending time on the proposal and 
providing feedback.
I am preparing a v2, which will have changes till E-W.



Regards,
Ankur








________________________________
From: Guru Shetty <g...@ovn.org>
Sent: Friday, November 9, 2018 11:45 AM
To: ovs dev
Cc: Ankur Sharma; Numan Siddique; Ben Pfaff
Subject: VLAN tenant network patches

I have tried to summarize the problem statement that Numan and Ankur are trying 
to solve here based on my understanding so far. Please correct me and I will 
revise it along.

Current feature set in OVN.
==========================

A logical switch should only have one localnet logical port. If a logical
switch has a logical port of type "localnet", then all traffic for that logical
switch avoids overlays. So in essence, this is only useful when all the
hypervisors are in the same broadcast domain.  Currently there are no known
problems as long as logical switches are not connected to any logical routers.


When 2 logical switches (each with a localnet port) are connected to a logical
router, we still push all east-west traffic to the underlay. The source
hypervisor executes the pipeline of all 3 logical datapaths and then pushes the
traffic to the underlay via the localnet port (with its corresponding vlan tag)
of the third logical datapath, i.e. the destination logical switch.

The above topology creates a problem for the underlying hardware switch,
because it can now see the same mac address of the distributed router coming
from 2 different hypervisors as the source mac address of packets on the wire.
According to Ankur, there are physical switches which can detect a source mac
address coming from different ports and limit it. But this looks like it is
configurable in physical switches.
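
As a rough illustration of what the physical switch observes, the Python sketch
below counts how often a source mac "moves" between switch ports. It is not
taken from any patch; the port numbers and mac addresses are hypothetical, and
the per-chassis macs stand in for the chassis-specific mac discussed in this
thread.

```python
def count_mac_moves(frames):
    """Count how many times a source mac shows up on a different port
    than the one where it was last seen. frames is a list of
    (ingress_port, src_mac) tuples in arrival order."""
    last_port = {}
    moves = 0
    for in_port, src_mac in frames:
        if src_mac in last_port and last_port[src_mac] != in_port:
            moves += 1
        last_port[src_mac] = in_port
    return moves


# Hypothetical values: two hypervisors attached to switch ports 1 and 2.
ROUTER_MAC = "00:00:01:00:00:01"
CHASSIS_MACS = {1: "aa:bb:cc:00:00:01", 2: "aa:bb:cc:00:00:02"}
arrival_ports = [1, 2, 1, 2]  # routed frames alternating between hypervisors

# Every hypervisor uses the shared router mac as the source mac.
with_router_mac = [(port, ROUTER_MAC) for port in arrival_ports]
# Every hypervisor uses its own chassis-specific mac instead.
with_chassis_mac = [(port, CHASSIS_MACS[port]) for port in arrival_ports]

moves_router = count_mac_moves(with_router_mac)
moves_chassis = count_mac_moves(with_chassis_mac)
```

With the shared router mac, every alternation between hypervisors registers as
a move; with one mac per chassis, the switch sees no moves at all.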


For N/S traffic, currently traffic is punted to the gateway chassis via an
overlay tunnel. There is a use case where you want to avoid overlay tunnels.
This is because for a "localnet" topology you can keep the MTU of the inner VM
the same as the underlay MTU. But when you have overlays just for one class of
traffic, this becomes a problem.
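
The MTU concern can be made concrete with a small calculation. The Python
sketch below assumes Geneve over IPv4 with the standard header sizes; the
8 bytes of Geneve options and the 1500-byte underlay MTU are assumptions for
the example, not values stated in this thread.

```python
# Standard header sizes in bytes.
OUTER_IPV4 = 20
OUTER_UDP = 8
GENEVE_BASE = 8
INNER_ETHERNET = 14


def inner_mtu(underlay_mtu, use_overlay, geneve_option_bytes=0):
    """Largest inner IP packet that fits in one underlay packet."""
    if not use_overlay:
        # localnet: the frame goes on the wire as-is, only vlan tagged.
        return underlay_mtu
    overhead = (OUTER_IPV4 + OUTER_UDP + GENEVE_BASE
                + geneve_option_bytes + INNER_ETHERNET)
    return underlay_mtu - overhead


# Assumed values: 1500-byte underlay MTU, 8 bytes of Geneve options.
localnet_mtu = inner_mtu(1500, use_overlay=False)
tunnel_mtu = inner_mtu(1500, use_overlay=True, geneve_option_bytes=8)
```

Under these assumptions E-W traffic on localnet could use the full 1500 bytes,
while N/S traffic punted through the tunnel would fit only 1442, so the VM MTU
would have to be lowered for the sake of one class of traffic.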

So both Ankur's and Numan's patches try to tackle the above 2 problems.

To re-summarize
Problem 1: External switch getting confused about the machine on which OVN 
router mac address resides. But this is only source mac address coming from 
different hypervisors (not destination mac).

[ANKUR]:
We are trying to do more than just replacing a router port mac with a chassis 
mac. We are trying to get a distributed routing functionality working via OVN 
for vlan backed networks.
Not using the router port mac, is one of the first problems that has to be 
solved.
For a production deployment, we might need some more changes/analysis.


Problem 2: When packet has destination IP address outside OVN router known 
subnets, it is being currently sent via overlay tunnel. This would need MTU 
configuration for inner VMs.

Numan's patch:
==============

Numan's patches try to solve the above 2 problems by doing the following.
1. When VM-A (on Hyp-A) in switch-A tries to talk to VM-B in switch-B (Hyp-B)
(switch-A and switch-B are connected with a router), Hyp-A will execute the
switch-A pipeline and push the traffic out of the localnet port with the
router's mac address as destination.
2. The router chassis will receive the packet, execute the switch-A pipeline
again, then the router pipeline and then the switch-B pipeline, and push the
packet out of switch-B's localnet port.  Now Hyp-B receives the traffic,
executes the switch-B pipeline again and the packet gets delivered.

The result is that all east west traffic is centralized and has an extra hop.
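
The extra hop can be seen by writing out which chassis runs which pipeline. The
Python sketch below is only an illustration of the two steps above next to the
existing distributed behaviour; the function names and chassis labels are
hypothetical, and the assumption that the destination hypervisor runs the
switch-B pipeline again in the distributed case is mine.

```python
def centralized_path(src_hyp, router_chassis, dst_hyp):
    """Steps 1 and 2 above: routing happens on the router chassis."""
    return [
        (src_hyp, ["switch-A"]),
        (router_chassis, ["switch-A", "router", "switch-B"]),
        (dst_hyp, ["switch-B"]),
    ]


def distributed_path(src_hyp, dst_hyp):
    """Source hypervisor runs all three logical datapaths itself."""
    return [
        (src_hyp, ["switch-A", "router", "switch-B"]),
        (dst_hyp, ["switch-B"]),
    ]


def wire_hops(path):
    """Number of times the packet crosses the underlay between chassis."""
    return len(path) - 1


central = centralized_path("Hyp-A", "Router-Chassis", "Hyp-B")
distributed = distributed_path("Hyp-A", "Hyp-B")
central_hops = wire_hops(central)
distributed_hops = wire_hops(distributed)
```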

[ANKUR]:
Yes, Numan's approach is to mimic a centralized router, while the vlan backed 
logical switch is still connected to a distributed logical router (i.e 
connecting ports are of type "patch").



Ankur's proposal:
==============

Though the complete patches do not exist, Ankur wants to solve problem 1 by
having a chassis specific MAC. So when a packet leaves a hypervisor for
east-west routing, it uses a unique mac. The disadvantage with this proposal is
that the VM (i.e. logical port) will see the mac of its first hop router change
continuously, which may have some yet to be clearly defined side-effects (for
example, more ARP requests from the VM).

[ANKUR]:
Just want to clarify that a tcp/ip stack would NEVER populate its ARP cache
based on IP packets. It would rely on ARP (/GARP) to resolve the gateway mac,
and ARP queries for the router port (gateway) ip will always be answered by OVN
with the router port mac only.
i.e. using the chassis mac as source mac WOULD NOT impact any functionality of
a VM's networking stack. However, it could still be desirable NOT to show the
chassis mac to a VM.  We intend to solve that as well, but our first
implementation does not look clean/scalable. We will submit it for review
anyway, but not in the first series.
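
A toy model of the behaviour described above, in Python. The class and the
addresses are hypothetical and the model deliberately simplifies a real host
stack; it only shows that, if the ARP cache is updated from ARP replies alone,
routed packets arriving with a chassis mac as source leave the gateway entry
untouched.

```python
class HostStack:
    """Simplified VM network stack: the ARP cache changes only on ARP."""

    def __init__(self):
        self.arp_cache = {}  # ip -> mac

    def on_arp_reply(self, sender_ip, sender_mac):
        self.arp_cache[sender_ip] = sender_mac

    def on_ip_packet(self, src_ip, src_mac):
        # The source mac of a received IP packet is not used for learning.
        return None

    def next_hop_mac(self, gateway_ip):
        return self.arp_cache.get(gateway_ip)


# Hypothetical addresses for illustration.
GATEWAY_IP = "10.0.0.1"
ROUTER_PORT_MAC = "00:00:01:00:00:01"
CHASSIS_MAC = "aa:bb:cc:00:00:02"

vm = HostStack()
vm.on_arp_reply(GATEWAY_IP, ROUTER_PORT_MAC)   # OVN answers with router mac
vm.on_ip_packet("10.0.1.5", CHASSIS_MAC)       # routed packet, chassis mac
gateway_mac = vm.next_hop_mac(GATEWAY_IP)
```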

Problem 2 is solved similarly to what Numan has in his patches, although there
are small changes in implementation. It is not clear whether one code is more
complicated than the other. But it looks like Ankur’s patches will avoid the
extra hop for east-west traffic.

Numan is perfectly fine with Ankur’s patches (after they are sent, reviewed and
tested) if they satisfy his problem statements. But he does prefer his patches
reviewed and merged if there is a delay in Ankur's patches (and possibly
reverted later, if there is an alternative).

[ANKUR] My patches and Numan's are not related to each other and should not
be seen as "either or".
Numan's patch is trying to solve a very specific case.
It should be reviewed independently and should not be blocked because of my
changes.
Management plane / data center architecture would drive which approach to take.
As a platform, OVN should support both.





_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
