Re: [nvo3] LAG/ECMP load-balancing problems facing overlay networks

Lucy yong Mon, 30 Sep 2013 09:24:29 -0700

Hi Jeff,

Please see inline.

-----Original Message-----
From: Jeff Wheeler [mailto:[email protected]] 
Sent: Monday, September 30, 2013 8:18 AM
To: Lucy yong
Cc: Lizhong Jin; [email protected]
Subject: Re: [nvo3] LAG/ECMP load-balancing problems facing overlay networks

On Tue, Sep 24, 2013 at 10:32 AM, Lucy yong <[email protected]> wrote:
> So this is about payload flow forwarding.  Will a payload flow be 
> forwarded to any CPU core, or must be designated to one of cores?

"It depends on the NIC," as well as on the configuration and NIC driver.  The 
popular Intel 82599 will deliver all GRE traffic with the same outer SRC & DST 
to one Rx queue, which will be served by only one CPU core.
GRE-in-UDP should have L4 entropy, as you've mentioned, so the entropy 
information will allow traffic to be distributed to multiple queues / CPUs.
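
To illustrate the point about L4 entropy, here is a minimal Python sketch. It
is not the 82599's actual queue-selection logic (real NICs commonly use a
Toeplitz hash for RSS); SHA-256 stands in as a generic hash. NUM_QUEUES, the
addresses, and the helper names rx_queue and entropy_port are assumptions made
up for illustration.

```python
import hashlib

NUM_QUEUES = 8  # hypothetical number of Rx queues on the NIC


def _hash32(text):
    """Generic 32-bit hash; a stand-in for the NIC's real hash function."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


def rx_queue(outer_src, outer_dst, udp_src_port=None):
    """Pick an Rx queue using only the outer fields the NIC can parse."""
    key = f"{outer_src}|{outer_dst}"
    if udp_src_port is not None:
        # GRE-in-UDP: the outer UDP source port carries the flow entropy.
        key += f"|{udp_src_port}"
    return _hash32(key) % NUM_QUEUES


def entropy_port(inner_flow):
    """Derive an outer UDP source port (49152-65535) from the inner flow."""
    return 49152 + _hash32(repr(inner_flow)) % 16384


# 64 inner flows carried between the same pair of tunnel endpoints.
inner_flows = [("10.0.0.1", "10.0.0.2", 6, 1000 + i, 80) for i in range(64)]

# Plain GRE: only outer SRC & DST are visible, so every packet hashes alike.
gre_queues = {rx_queue("192.0.2.1", "192.0.2.2") for _ in inner_flows}

# GRE-in-UDP: the per-flow source port spreads packets across queues.
udp_queues = {
    rx_queue("192.0.2.1", "192.0.2.2", entropy_port(f)) for f in inner_flows
}

print("GRE queues used:       ", sorted(gre_queues))
print("GRE-in-UDP queues used:", sorted(udp_queues))
```

With plain GRE the set of queues always has one element, whatever the inner
traffic looks like; with the UDP source port in the hash key the same flows
are spread over several queues.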

It's true that this NIC (and others) can store flow-to-queue affinity state 
on-die, but in truth, this feature is not useful for NVO3 unless the guest OS 
is trusted to perform its own NVO3 encapsulation (violating many folks' 
security model.)

This is really not a problem today, because it is not ideal for the hypervisor 
implementing NVO3 to be executed on all CPUs anyway -- otherwise the presence 
of much network traffic will starve the guest OS and its applications of CPU, 
cause the CPU caches to be affected badly (probably thrashing most of the L1 
ICACHE each time a packet arrives), and so on.

When it becomes a problem is when NIC vendors decide to implement NVO3 
offloading.  At that point, multiple CPUs can be used effectively as long as 
sufficient information can be parsed from the NVO3 transport header, or the 
inner-header, to determine which DMA queue should receive each packet.  A 
"random" scheme (we are throwing around the word entropy) is stupid, because in 
many cases, multiple guest VMs will be running on one physical host 
(hypervisor) and these VMs can have affinity to certain CPU cores.
[Lucy] We should distinguish flow forwarding from ECMP. This is about flow 
forwarding, not ECMP. Does a NIC implementing NVO3 offloading mean VXLAN/NVGRE 
encap/decap and the tunnel function, or other functions as well? When 
implementing NVO3 offloading, does that mean the hypervisor implementing 
vSwitching needs to run on one CPU and all packets need to pass through this 
hypervisor first?

In these cases, it is desirable to direct traffic for VM#1 to CPU#1,
VM#2 to CPU#2, and so on, depending on how the systems folks have allocated 
resources.
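
The VM-to-CPU steering described above could be sketched as a fixed lookup
rather than a hash. This is only an illustration: the port numbers, the
STEERING_TABLE layout, and the HYPERVISOR_CPU fallback are hypothetical and do
not describe any existing NIC feature.

```python
from typing import NamedTuple


class Termination(NamedTuple):
    vm: str   # guest that owns this NVO3 termination
    cpu: int  # core (and DMA queue) the guest is pinned to


HYPERVISOR_CPU = 0  # assumed core reserved for traffic with no known owner

# Hypothetical table: one fixed outer UDP destination port per termination.
STEERING_TABLE = {
    50001: Termination("VM#1", 1),
    50002: Termination("VM#2", 2),
}


def steer(udp_dst_port: int) -> int:
    """Return the CPU whose queue should receive a packet for this port."""
    entry = STEERING_TABLE.get(udp_dst_port)
    return entry.cpu if entry is not None else HYPERVISOR_CPU


for port in (50001, 50002, 12345):
    print(port, "-> CPU", steer(port))
```

Because the key is fixed per termination, every packet for a given VM reaches
the core that VM is pinned to, which a randomized entropy value cannot
guarantee.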

Entropy here is really not smart all the time.  It is a useful option for some 
scenarios, but others will require a fixed L4 header per NVO3 termination, so 
the appropriate VM/CPU can receive packets from the NIC efficiently, without 
executing a hypervisor at all.
[Lucy] Flow entropy relates to the ECMP LB process and does not apply to flow 
forwarding, i.e. flow entropy can't be used in flow forwarding. Thus, the 
thread title and the content discussed are different topics.

Thank you very much for the explanation.

Lucy

--
Jeff S Wheeler <[email protected]>
Sr Network Operator  /  Innovative Network Concepts
_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
