Re: [nvo3] Comments on Draft Geneve

Anton Ivanov (antivano) Mon, 03 Mar 2014 00:06:08 -0800

Hi all,

I would like to address one more issue which has been omitted so far from the 
background to the discussion.


If we restrict the use cases to virtualization (which is the remit of NVO3), 
the assumption that variable length options are "easy" to implement in software 
is valid if and only if they are constant length for the duration of a session. 
Otherwise it is incorrect.

If you work purely in software with no VMs involved, e.g. a software switch which 
takes pseudowires from the network and writes to pseudowires with a variable 
length header, parsing Geneve is trivial: you allocate big enough buffers and 
play with offsets. The code for that has been polished over the years; it is 
standard kernel buffer handling on all OSes (or its equivalent on switches), 
nothing new here.
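
The "big enough buffer plus offsets" approach can be sketched roughly as below. This is only an illustration and not the wire format from the draft: the 8-byte fixed part, the 6-bit option length counted in 4-byte words, and the name split_packet are all assumptions made for the example.

```python
BASE_LEN = 8  # assumed size of the fixed part of the header, in bytes


def split_packet(buf: bytes):
    """Split a received packet into (options, payload) views without copying.

    Assumed layout (hypothetical): the low 6 bits of the first byte give the
    length of the options area in 4-byte words.
    """
    if len(buf) < BASE_LEN:
        raise ValueError("packet shorter than the fixed header")
    opt_len = (buf[0] & 0x3F) * 4
    payload_off = BASE_LEN + opt_len
    if len(buf) < payload_off:
        raise ValueError("options run past the end of the packet")
    view = memoryview(buf)
    return view[BASE_LEN:payload_off], view[payload_off:]


# Two option words (8 bytes) followed by the tenant payload.
pkt = bytes([0x02]) + bytes(7) + b"OPT1OPT2" + b"payload"
options, payload = split_packet(pkt)
print(bytes(options), bytes(payload))
```

The only per-packet work is reading one length field and computing one offset, which is why this case is cheap when the data never has to leave the switch's own buffers.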

If you have to pass that data into a VM, this changes the picture: you want 
that data to be page aligned so you can page it in without copying it. This is 
trivial if your header is constant for the duration of the session. You get the 
header and the data separately by knowing the offsets. The APIs to do that 
are there; it does not matter whether you are doing it in userspace (POSIX 
vector IO and its Microsoft equivalent) or in kernel space (scatter-gather IO). 
It is easy.
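
A minimal userspace sketch of the constant-header case, using POSIX vector IO through Python's os.readv on a Unix-like system. The pipe stands in for the network side, the anonymous mmap stands in for a page that would be handed to the guest, and the 8-byte header length and the names receive_split and demo are assumptions for illustration only.

```python
import mmap
import os


def receive_split(fd: int, header_len: int):
    """Read one packet so the header and payload land in separate buffers."""
    header = bytearray(header_len)
    # Anonymous mmap memory is page aligned; it stands in for a guest page.
    page = mmap.mmap(-1, mmap.PAGESIZE)
    n = os.readv(fd, [header, page])
    return bytes(header), page, max(n - header_len, 0)


def demo():
    r, w = os.pipe()
    try:
        # Sender side: an 8-byte header followed by the tenant frame.
        os.writev(w, [b"HDR-0123", b"tenant frame"])
        header, page, payload_len = receive_split(r, header_len=8)
        payload = page[:payload_len]
        page.close()
        return header, payload
    finally:
        os.close(r)
        os.close(w)


print(demo())
```

Because the header length is known before the read is issued, the payload starts at offset 0 of its own page and no header bytes ever land in it.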

If your header is variable length during the session and you do not know the 
size for a particular packet, you have to page in the whole buffer and supply 
the driver with an offset on where to start. This means that you have to zero 
the bits of the header which would otherwise "leak into" the VM every time 
and/or do some copying. If you do not zero them, you have a security issue: the 
VM sees its overlay header and/or metadata that may have a security use. The 
same applies if you can write directly to the VM address space instead of 
paging in buffers via the MMU. Zeroing 256+ bytes on every pass tends to add up 
to quite a few CPU cycles over time.
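
To make the per-packet cost concrete, here is a rough sketch of the scrubbing step and the memory traffic it implies. The function names and the packet rate of one million packets per second are assumptions chosen for the example, not measurements.

```python
def scrub_header(page: bytearray, payload_off: int) -> None:
    """Overwrite everything before the payload with zeros, in place."""
    page[:payload_off] = bytes(payload_off)


def zeroing_bandwidth(header_bytes: int, packets_per_second: int) -> int:
    """Bytes written per second purely to scrub headers."""
    return header_bytes * packets_per_second


# The whole packet was paged in, so 256 bytes of header sit before the payload.
page = bytearray(b"\xAA" * 256 + b"tenant frame")
scrub_header(page, 256)
print(page[256:], zeroing_bandwidth(256, 1_000_000))
```

With those assumed numbers the scrub alone writes 256,000,000 bytes per second, on top of the actual forwarding work.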

So from an implementation perspective as far as variable size headers are 
concerned, there is little difference between software in a virtualized 
environment and hardware. They have very similar restrictions (unless you want 
to sacrifice 40% of your performance to an interim copy). Provided that you 
want performance of course.

Going back to Geneve: if the header is constant for the duration of the 
session, it is no different from what has been done in L2TP and what is being 
done in SFC. There is no technical merit in perpetrating it. If the header is 
variable, then we have one of two cases:

1. The draft may need an IPR statement already at this stage. I do not feel 
comfortable discussing a spec that looks like it has been submarined so you 
need a specific piece of IPR to implement it with an acceptable performance.

2. A spec that is specifically tailored to a single NPU/NIC to ship from a 
single (un)known vendor. This is similarly not something we should be 
discussing (once again - IPR statement there too).

Brgds,

A.


On 02/03/14 23:30, Phil Bedard wrote:
I've read most of the posts in this thread as an operator who may be looking at 
an overlay solution in the future.

So the crux of the discussion is whether to extend the functionality of an 
existing protocol or introduce a brand new protocol.

I would like to see the VNI space extended to 32 bits instead of 24 in whatever 
encapsulation method is being chosen.  24 seems like a holdover from the 
802.1ah I-SID value and other adapted tunnel protocol limitations and I'm not 
sure it's really necessary anymore.

I also believe there has to be a protocol identifier in the encapsulation 
header identifying what comes next.  Static provisioning of this kind of 
information at the endpoints or midpoints in the case of monitoring gear, etc. 
is too cumbersome and not extensible.   I think Tom said it initially, but I 
also don't believe inserting an Ethernet header just for the sake of it is 
efficient and the overlay encapsulation protocol should be able to encapsulate 
IP directly.

I do not think the metadata should be a part of the encapsulation protocol; the 
encapsulation header should be a fixed length.  I think the majority of simple 
overlay networks will not require additional metadata information and will 
likely be using the encapsulation with nothing following it but IP packets or 
Ethernet frames.  Having a variable length suffix is just going to add 
implementation headaches for hardware vendors and will be a quick way to see it 
not get adopted, IMHO.  If someone needs additional hardware support for the 
next header, whether it be a security integrity header or some sort of 
additional metadata, let that be sorted out elsewhere.

Just my 2c.

-Phil


From: Pankaj Garg <[email protected]>
Date: Sunday, March 2, 2014 at 2:06 PM
To: "Larry Kreeger (kreeger)" <[email protected]>, 
"[email protected]" <[email protected]>
Subject: Re: [nvo3] Comments on Draft Geneve

My responses are inline marked with PG.

From: Larry Kreeger (kreeger) [mailto:[email protected]]
Sent: Sunday, March 2, 2014 9:16 PM
To: Pankaj Garg; [email protected]
Subject: Re: Comments on Draft Geneve

My responses are inline marked with LK>.  - Larry

From: Pankaj Garg <[email protected]>
Date: Saturday, March 1, 2014 4:22 AM
To: Larry Kreeger <[email protected]>, 
"[email protected]" <[email protected]>
Subject: RE: Comments on Draft Geneve

My comments are inline marked with [PG].

From: nvo3 [mailto:[email protected]] On Behalf Of Larry Kreeger (kreeger)
Sent: Saturday, March 1, 2014 3:28 AM
To: [email protected]
Subject: [nvo3] Comments on Draft Geneve

I see that a healthy discussion has broken out around draft-gross-geneve-00 
which I see has a slot in the agenda for the NVO3 WG meeting on Monday.  Here 
are my thoughts.

I will be comparing Geneve to an encapsulation that is near and dear to my 
heart, VXLAN.  When I do this, I see an encapsulation that is very similar to 
VXLAN (e.g. uses UDP, uses a 24-bit segment identifier at the same offset).  I 
see three things that Geneve adds beyond what is available in 
draft-mahalingam-dutt-dcops-vxlan:

1) The ability to encapsulate any protocol with an Ethertype (not just Ethernet 
frames), by adding a Protocol Type field.  This is certainly useful, and has 
already been covered in draft-quinn-vxlan-gpe as a backward compatible 
extension to VXLAN by using a P bit flag to signal its presence.  The field is 
even at the same offset as draft-quinn-vxlan-gpe, but is missing the P bit for 
backwards compatibility.

[PG] The backward compatibility argument is invalid since a frame with P bit 
set (let me call it VXLAN V2) cannot be processed by the older endpoint, thus 
having no backward compatibility.

LK> By backward compatibility, I mean that new implementations of VXLAN (VXLAN 
V2 as you call it) can understand packets sent by older implementations (VXLAN 
V1) as well as from new ones.  If older endpoints could understand the future 
bits, I would call that forward compatibility.

[PG] My point was that the VXLAN V2 endpoint would have to support generating 
and understanding VXLAN V1 format packets. Is that much different from an 
endpoint supporting both Geneve and VXLAN V1?

[PG] Essentially, what you are saying is that one can generate packets in VXLAN 
V1 for older endpoints and VXLAN V2 for newer endpoints. So the question is, why 
is VXLAN V2 better than Geneve? In fact, switching on a top level UDP port 
provides a cleaner processing pipeline.

LK> By enhancing VXLAN, there is no need to get a new UDP port assigned and all 
the current parsing logic for VXLAN V1 can be applied.

[PG] I am not sure if allocating a new port is the meta issue here. The main 
issue here seems to be whether a new protocol should _require_ support for 
VXLAN V1 or not. Coming from the NVGRE side, the same argument would apply to 
Geneve, where one can say that Geneve should be backward compatible with NVGRE. 
I feel this might be a slippery slope where a new protocol cannot start with a 
clean slate.
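
The sense of "backward compatible" that Larry describes above (a new parser that still accepts packets from older senders) can be sketched as below. This is a hedged illustration only: the position of the P bit (0x04), the 16-bit protocol field in bytes 2-3, and the choice to report old-format packets as Ethertype 0x6558 (Transparent Ethernet Bridging) are assumptions for the example, not text from either draft.

```python
import struct

P_BIT = 0x04           # assumed position of the "protocol type present" flag
ETH_BRIDGING = 0x6558  # Ethertype for Transparent Ethernet Bridging


def parse_header(hdr: bytes):
    """Return (vni, protocol) for either a V1 or a P-bit-extended header."""
    if len(hdr) < 8:
        raise ValueError("header must be at least 8 bytes")
    flags, _reserved, proto, vni_word = struct.unpack("!BBHI", hdr[:8])
    vni = vni_word >> 8  # the VNI is the top 24 bits of the second word
    if flags & P_BIT:
        return vni, proto
    # Old senders leave the protocol field as zero and always carry Ethernet.
    return vni, ETH_BRIDGING


v1 = bytes([0x08, 0, 0, 0]) + struct.pack("!I", 5000 << 8)
v2 = bytes([0x0C, 0]) + struct.pack("!HI", 0x0800, 5000 << 8)
print(parse_header(v1), parse_header(v2))
```

An old endpoint receiving the second packet would ignore the P bit and treat the IPv4 payload as an Ethernet frame, which is the gap Pankaj is pointing at.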

2) The addition of an OAM bit to signal that the packet should be processed by 
the tunnel endpoint and not forwarded to a tenant.  This also seems useful, and 
seems identical in usage to the (IMO, poorly named) "Router Alert" bit 
extension to VXLAN covered in (the currently expired) 
draft-singh-nvo3-vxlan-router-alert.

[PG] Yes, the OAM bit usage is similar. However, this is another extension 
which is incompatible with older implementations of VXLAN, thus breaking 
backward compatibility.

LK> Again, I would call what you are referring to "forward compatibility".

3) Last, but not least is the addition of a variable length options field, 
which the draft suggests is used to carry metadata along with the payload.  As 
mentioned by some others, IMO, the encapsulation transport header is not the 
right place to define and carry metadata.  Architecturally, metadata should be 
defined independent of transport so it can be carried inside of whatever 
transport is desired (e.g. VXLAN, NVGRE, MPLSoGRE, L2TPV3 etc).  One example of 
an effort to do this is in the Network Service Header draft 
(draft-quinn-sfc-nsh) being discussed in the SFC WG.  I am guessing that since 
the Geneve options field is optional, that the metadata it contains is not 
related to basic network connectivity, but more to providing higher level 
network services (aka Service Functions).  The Network Service Header contains 
two separate parts, the service path (used to guide the packets through the 
service chain) and context (metadata).  I can certainly see the context part of 
NSH being used to carry metadata even if the service chain is null (all 
services are fully distributed to the tunnel endpoints).

[PG] The meta-data should be defined by their respective group. Different 
encapsulation protocols can carry those meta-data in their headers as needed. 
One clear example of how Geneve is better is that it can carry that meta-data 
without breaking hardware offloads, whereas VXLAN and NVGRE cannot do that. Btw 
I want to be clear, Geneve is not defining the meta-data, and it is not tying 
meta-data to Geneve, it is only defining a general purpose ability to carry 
meta-data, which is tremendously useful to have in the encapsulation header.

On a side note, I don’t believe that the design of NSH is suitable for carrying 
general purpose meta-data. In fact, in its current definition, it is not 
defining service chaining primitives clearly either. However, we can discuss 
that in the SFC forum and focus the discussion on the encapsulation header in 
this forum.

In short, I don't see anything in Geneve that cannot be accomplished by using 
the backward compatible extensions to VXLAN proposed in draft-quinn-vxlan-gpe 
and draft-singh-nvo3-vxlan-router-alert, combined with the addition of NSH.

[PG] Yes, one can put multiple (incompatible) extensions on top of VXLAN and 
achieve many things that Geneve is supporting. But at that point, aren’t we 
creating a new encapsulation format altogether? This new protocol with all such 
extensions would require new hardware and new software, break existing NIC 
offloads, etc., and still carry the legacy baggage with no clear advantage. At 
that point, I am not sure why it would be better.

LK> As I wrote above, extending VXLAN allows the same UDP port to be used and 
reuse of the existing VXLAN parsing logic.

[PG] The crux of the discussion seems to be whether Geneve should have a mode 
that is compatible with VXLAN V1 or not. Even though it might be a slippery 
slope, I think it is something to think about and debate further.

When the current NVO3 WG charter was being written, there seemed to be 
consensus that we have no shortage of encapsulation options, but what was 
lacking was a standard control plane.  The Geneve draft seems to turn that on 
its head by saying "There is a clear advantage in settling on a data format: 
most of the protocols are only superficially different and there is little 
advantage in duplicating effort.  However, the same cannot be said of control 
planes, which are diverse in very fundamental ways.  The case for 
standardization is also less clear given the wide variety in requirements, 
goals, and deployment scenarios.".  I agree with the first part of this, so why 
define a completely new, non-backward compatible encapsulation?  I disagree 
with the second part, since this is clearly the goal of the NVO3 WG.

I see that there is an agenda slot to discuss the Geneve draft, but I'm not 
clear what the goals are of the authors within the IETF since the draft name 
does not target it to any particular WG, and it is currently marked as 
"Informational".  I would suggest that the authors consider extending currently 
implemented encapsulations rather than starting from scratch, e.g. by moving a 
few bits around in the first word of the Geneve header, it could be made 
backward compatible with VXLAN.

Thanks, Larry
_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3

