Hi Jeff,

Please see inline.
-----Original Message-----
From: Jeff Wheeler [mailto:[email protected]]
Sent: Monday, September 30, 2013 8:18 AM
To: Lucy yong
Cc: Lizhong Jin; [email protected]
Subject: Re: [nvo3] LAG/ECMP load-balancing problems facing overlay networks

On Tue, Sep 24, 2013 at 10:32 AM, Lucy yong <[email protected]> wrote:
> So this is about payload flow forwarding. Will a payload flow be
> forwarded to any CPU core, or must it be designated to one of the cores?

"It depends on the NIC," as well as on configuration and the NIC driver. The popular Intel 82599 will deliver all GRE traffic with the same outer SRC & DST to one Rx queue, which is served by only one CPU core. GRE-in-UDP should have L4 entropy, as you've mentioned, so the entropy information will allow traffic to be distributed across multiple queues / CPUs.

It's true that this NIC (and others) can store flow-to-queue affinity state on-die, but in truth this feature is not useful for NVO3 unless the guest OS is trusted to perform its own NVO3 encapsulation (which violates many folks' security model).

This is really not a problem today, because it is not ideal for the hypervisor implementing NVO3 to execute on all CPUs anyway -- otherwise heavy network traffic will starve the guest OS and its applications of CPU, cause the CPU caches to be affected badly (probably thrashing most of the L1 ICACHE each time a packet arrives), and so on.

It becomes a problem when NIC vendors decide to implement NVO3 offloading. At that point, multiple CPUs can be used effectively as long as sufficient information can be parsed from the NVO3 transport header, or the inner header, to determine which DMA queue should receive each packet. A "random" scheme (we are throwing around the word entropy) is stupid, because in many cases multiple guest VMs will be running on one physical host (hypervisor), and these VMs can have affinity to certain CPU cores.

[Lucy] We should distinguish between flow forwarding and ECMP.
This is about flow forwarding, not ECMP. Does NIC-implemented NVO3 offloading mean VXLAN/NVGRE encap/decap and the tunnel function, or other things as well? When the NIC implements NVO3 offloading, does that mean the hypervisor implementing vSwitching needs to run on one CPU, and that all packets must pass through this hypervisor first?

In these cases, it is desirable to direct traffic for VM#1 to CPU#1, VM#2 to CPU#2, and so on, depending on how the systems folks have allocated resources. Entropy here is really not smart all the time. It is a useful option for some scenarios, but others will require a fixed L4 header per NVO3 termination, so the appropriate VM/CPU can receive packets from the NIC efficiently, without executing a hypervisor at all.

[Lucy] Flow entropy relates to the ECMP load-balancing process and does not apply to flow forwarding, i.e. flow entropy can't be used in flow forwarding. Thus, the thread title and the discussed content are different topics. Thank you very much for the explanation.

Lucy

--
Jeff S Wheeler <[email protected]>
Sr Network Operator / Innovative Network Concepts

_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
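To make Jeff's queue-selection point concrete, here is a minimal Python sketch of the behavior discussed above. It uses a simplified hash as a stand-in for the NIC's real RSS function (the Intel 82599 uses a Toeplitz hash, not SHA-256), and the IP addresses and ports are made-up examples; it only illustrates why plain GRE with identical outer headers lands on a single Rx queue, while GRE-in-UDP, whose encapsulator writes per-flow entropy into the UDP source port, spreads flows across queues.

```python
# Toy illustration (NOT the 82599's actual Toeplitz RSS hash) of Rx
# queue selection from the header fields the NIC can see and hash.
import hashlib

NUM_QUEUES = 8  # assumed number of Rx queues / serving CPU cores


def rx_queue(outer_src, outer_dst, l4_src=None, l4_dst=None):
    """Pick an Rx queue from the hashable header fields.

    Plain GRE exposes no L4 ports to the NIC, so only the outer IP
    pair contributes and every inner flow hashes identically.
    """
    key = f"{outer_src}|{outer_dst}|{l4_src}|{l4_dst}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % NUM_QUEUES


# Plain GRE between two tunnel endpoints: 100 inner flows, one queue,
# so a single CPU core serves the whole tunnel.
gre_queues = {rx_queue("10.0.0.1", "10.0.0.2") for _ in range(100)}

# GRE-in-UDP: per-flow entropy in the UDP source port fans the same
# 100 inner flows out across the Rx queues (4754 is the GRE-in-UDP
# destination port; the source-port range is an arbitrary example).
udp_queues = {rx_queue("10.0.0.1", "10.0.0.2", 49152 + f, 4754)
              for f in range(100)}

# GRE collapses to 1 queue; GRE-in-UDP typically spreads over several.
print(len(gre_queues), len(udp_queues))
```

This also shows the flip side Jeff raises: a fixed L4 header per NVO3 termination (rather than per-flow entropy) would make `rx_queue` deterministic per VM, letting the NIC steer each VM's traffic to the queue owned by that VM's CPU core.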
