If I understood vSphere manuals and discussions on various blogs/forums correctly, VMware solved most of this problem a long time ago with I/O shares and a few other features ... but don't trust a networking guy to know anything about storage :)

On 8/28/12 7:10 PM, Jon Hudson wrote:
Dead on.

Anytime you have a fan-in/fan-out type traffic flow carrying filesystem traffic, 
one person can ruin the party for everyone. FCoE is a perfect example, where a 
pause frame sent on an aggregation link can end up impacting many initiators. Or, 
at the controller level of any array, you can get a traffic jam of sorts on poorly 
designed and laid out subsystems. Or too few lines for food at an IETF social.

Lots can be done with queues etc. to mitigate the issue, but it is always 
something to be mindful of, especially if your remote filesystem is not just a 
mounted LUN but the main system/boot LUN and you have Windows paging over the 
wire.
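
A minimal sketch of the head-of-line blocking effect being described here
(and in Dimitri's mail below), assuming service time is simply proportional
to request size; the tenants, request sizes and round-robin scheduler are
purely illustrative. With a single shared FIFO (one storage session
multiplexing all tenants), tenant B's small reads complete only after
tenant A's large writes; with a queue per tenant, B barely notices A.

    from collections import deque

    # Hypothetical workload: tenant A issues two large writes, tenant B two
    # small reads; service time is proportional to request size (in "ticks").
    requests = [("A", 64), ("A", 64), ("B", 4), ("B", 4)]

    def shared_fifo(reqs):
        # One storage session shared by all VMs: strict FIFO, so B's small
        # requests complete only after A's large ones (head-of-line blocking).
        t, done = 0, {}
        for tenant, size in reqs:
            t += size
            done.setdefault(tenant, []).append(t)
        return done

    def per_tenant_rr(reqs):
        # One queue per tenant, serviced round-robin one tick at a time:
        # B finishes almost immediately, regardless of A's backlog.
        queues, t, done = {}, 0, {}
        for tenant, size in reqs:
            queues.setdefault(tenant, deque()).append(size)
        while any(queues.values()):
            for tenant, q in queues.items():
                if q:
                    t += 1
                    q[0] -= 1
                    if q[0] == 0:
                        q.popleft()
                        done.setdefault(tenant, []).append(t)
        return done

    print("shared FIFO  :", shared_fifo(requests))    # B done at t=132, 136
    print("per-tenant RR:", per_tenant_rr(requests))  # B done at t=8, 16

The real fix obviously lives in the storage stack (per-VM queues, I/O shares
and the like), not in a toy scheduler, but the ordering effect is the same.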

On Aug 28, 2012, at 9:55 AM, "Stiliadis, Dimitrios 
(Dimitri)"<[email protected]>  wrote:

FCoE is clearly not a requirement ...

But there is something to be said about storage (and I should have responded
in the other email about this): in general, storage isolation is done at the
storage level and not at the network layer, so we can ignore it.

If we take a storage server that exports a file system that is mounted
by a hypervisor, and multiple tenants have their VMs in this file system,
then a single network connection between the hypervisor and the storage
device could potentially lead to head-of-line blocking and allow one
tenant to influence the performance of another tenant. If my memory
serves me correctly, VMware for example can only use two or four iSCSI
initiators that have to be shared by the different VMs of the hypervisor,
and thus traffic from multiple tenants is multiplexed onto the same
network flow ... This means that storage drivers/devices have to take
care of traffic isolation. And this can be perfectly fine in
point-to-point situations, but it can get interesting in multiplexed
scenarios ...

(but we just don't want the storage guys to blame the network guys for
performance issues ;)

Dimitri

On 8/28/12 9:44 AM, "Ivan Pepelnjak"<[email protected]>  wrote:



In sane real-life designs the virtual network overlay solution would not
transport FCoE. I'm also positive someone will come up with exactly that
requirement sooner rather than later :D

On 8/28/12 6:40 PM, Aldrin Isaac wrote:

The question regarding FCoE is whether overlay solutions need to
transport it.  I think the answer is no.  If something operates at the
underlay level then it isn't in scope for NVO3, including DCB.

On Tuesday, August 28, 2012, Somesh Gupta wrote:



-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of
Ivan Pepelnjak
Sent: Tuesday, August 28, 2012 12:22 AM
To: Stiliadis, Dimitrios (Dimitri)
Cc: Black, David; [email protected]; Linda Dunbar
Subject: Re: [nvo3] Let's refocus on real world (was: Comments on Live
Migration and VLAN-IDs)

Dimitri,

We're more in agreement than it might seem. I might have my doubts
about
the operational viability of the OpenStack-to-baremetal use case you
described below, but I'm positive someone will try to do that as well.

In any case, regardless of whether we're considering VMs or bare-metal
servers, in the simplest scenario the server-to-NVE connection is a
point-to-point link, usually without VLAN tagging.

In the VM/hypervisor case, the NVE is implemented in the hypervisor soft
switch; in the bare-metal server case, it has to be implemented in the
ToR switch.
This is certainly only today's restriction. If NVO3 takes off, there
could well be a pseudo-driver in Linux that implements the NVE (like a
VLAN driver) without much additional overhead.

It's important to keep in mind the limitations of the ToR switches to
ensure whatever solution we agree upon will be implementable in ToR
switches as well, but it makes absolutely no sense to assume NVE will
not be in the hypervisor (because someone wants to support a customer
having a decade-old VLAN-only hypervisor soft switch).

As for ToR switch capabilities, Dell has demonstrated NVGRE support and
Arista is right now showing off a hardware VXLAN VTEP prototype, so I
guess it's safe to assume next-generation merchant silicon will support
GRE- and UDP-based encapsulations well before we agree on what the NVO3
solution should be.

Finally, can at least some of us agree that the topology that makes the
most sense is a direct P2P link between the (VM or bare-metal) server and
the NVE, using VLAN tagging only when a server participating in multiple
L2 CUGs has interface limitations?

Kind regards,
Ivan

On 8/27/12 6:55 AM, Stiliadis, Dimitrios (Dimitri) wrote:
Ivan:

I agree and at the same time disagree with some of the statements
below. I would like to understand your view.

See inline:

On 8/25/12 8:22 AM, "Ivan Pepelnjak"<[email protected]>   wrote:

On 8/24/12 11:11 PM, Linda Dunbar wrote:
[...]

But most, if not all, data centers today don't have hypervisors that
can encapsulate the NVo3-defined header. The deployment of 100%
NVo3-header-based servers won't happen overnight. One thing is for sure:
you will see data centers with mixed types of servers for a very long
time.

If NVEs are in the ToR, you will see a mixed scenario of blade servers,
servers with simple virtual switches, or even IEEE 802.1Qbg's VEPA. So
it is necessary for NVo3 to deal with the "L2 Site" defined in this
draft.
There are two hypothetical ways of implementing NVO3: existing layer-2
technologies (with the well-known scaling properties that prompted the
creation of the NVO3 working group) or a something-over-IP encapsulation.

I might be myopic, but from what I see most data centers today (at
least based on the market shares of individual vendors) don't have ToR
switches that would be able to encapsulate MAC frames or IP datagrams in
UDP, GRE or MPLS envelopes. I am not familiar enough with the commonly
used merchant silicon hardware to understand whether that's a software
or hardware limitation. In any case, I wouldn't expect switch vendors to
roll out NVO3-like something-over-IP solutions any time soon.

On the hypervisor front, VXLAN has been shipping for months, NVGRE is
included in the next version of Hyper-V, and MAC-over-GRE is available
(with Open vSwitch) for both KVM and Xen. Open vSwitch is also part of
the standard Linux kernel distribution and thus available to any other
Linux-based hypervisor product.

So: all major hypervisors have MAC-over-IP solutions, each one using a
proprietary encapsulation because there's no standard way of doing it,
and yet we're spending time discussing and documenting the history of
the evolution of virtual networking. Maybe we should be a bit more
forward-looking, acknowledge the world has changed, and come up with a
relevant hypervisor-based solution.
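
To make the "proprietary encapsulation" point concrete, here is a minimal
sketch (not anyone's actual implementation) of the tenant-ID headers that
VXLAN and NVGRE each prepend to the original MAC frame, as described in
their respective drafts: both carry a 24-bit tenant identifier, but VXLAN
rides over UDP while NVGRE is a GRE variant, so the resulting byte strings
are mutually unintelligible. The helper names below are mine.

    import struct

    def vxlan_header(vni: int) -> bytes:
        # VXLAN: 8-byte header carried over UDP; the I flag (0x08) marks a
        # valid 24-bit VNI, everything else is reserved.
        assert 0 <= vni < 1 << 24
        return struct.pack("!II", 0x08 << 24, vni << 8)

    def nvgre_header(vsid: int, flow_id: int = 0) -> bytes:
        # NVGRE: GRE header (IP protocol 47) with the Key bit set, protocol
        # type 0x6558 (Transparent Ethernet Bridging), and the key field
        # carrying a 24-bit VSID plus an 8-bit FlowID.
        assert 0 <= vsid < 1 << 24 and 0 <= flow_id < 256
        return struct.pack("!HHI", 0x2000, 0x6558, (vsid << 8) | flow_id)

    # The same tenant ID produces two incompatible encapsulations:
    print(vxlan_header(5000).hex())   # 0800000000138800
    print(nvgre_header(5000).hex())   # 2000655800138800
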
Correct, and here is where the IETF as a standards body fails. There is
no easy way (any time soon) for a VXLAN-based solution to talk to an
NVGRE, MAC/GRE, CloudStack MAC/GRE or STT (you forgot this one) based
solution.
Proprietary approaches that drive enterprises into vendor lock-in. And
instead of trying to address the first problem, which is about
"interoperability",




_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3