---------- Forwarded message ----------
From: Lucy yong <[email protected]>
To: Jeff Wheeler <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: Mon, 23 Sep 2013 17:06:43 +0000
Subject: Re: [nvo3] LAG/ECMP load-balancing problems facing overlay networks

Jeff,

I agree that supporting LAG/ECMP at the NIC->host interface is the issue for both VXLAN and NVGRE. However, if NVGRE uses GRE-in-UDP, it will make future NIC enhancement for this capability much simpler, since both encapsulations would then use the UDP src port for the flow entropy that is used by LB. Why do you think that inner payload inspection is necessary for ECMP LB?

[Lizhong] NIC LB is quite different from LB in a router/switch. NIC LB is called flow steering, and it does not rely on a hash value to steer traffic, but on a flow table that steers traffic to a specific CPU core. Usually the key of the flow table is the 4-tuple of the payload: src/dst IP address and UDP/TCP port. That is why the NIC always has to parse the payload header. I should point out that how to steer VXLAN/NVGRE traffic is still under development, and may differ from what I described. Hope the explanation helps.

Regards
Lizhong

Thanks,
Lucy

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jeff Wheeler
Sent: Sunday, September 22, 2013 1:13 PM
Cc: [email protected]
Subject: [nvo3] LAG/ECMP load-balancing problems facing overlay networks

On Sat, Sep 21, 2013 at 11:57 AM, Lucy yong <[email protected]> wrote:
> * Requirement: For performance reasons, multipath over LAG and ECMP
> paths SHOULD be supported.
>
> VXLAN supports common five tuple based LB. NVGRE requests LB to use
> GRE header, which is not commonly supported by underlying IP network.

I'd like to note that it is becoming increasingly common for the underlying L3 or L2 network to support load-balancing based on GRE inner-header fields. In fact, this can be largely relied on for high-bandwidth GRE tunnels over the public Internet (!) today, and within many datacenter networks.

Where this is not so reliable is in the NIC->Host interface. This is an issue facing NVGRE, and in some circumstances, VXLAN also.
For example, the most popular 10GE server NIC chipsets will deliver all NVGRE packets to a host DMA ring buffer based on the GRE outer-header only. One example is the Intel 82599, which everyone must be familiar with to have an informed opinion on vRouter-related topics.

Even if the NIC is configured with several DMA rings, and the vRouter is able to use different CPU cores to service those rings for distribution of the work, all NVGRE traffic (having the same outer-header) will arrive at only one DMA ring, and other CPU cores may not be utilized.

If vRouters are used to support network-heavy applications, NIC vendors must be encouraged to support additional header inspection which may then be used for load-balancing across host DMA rings. One would hope the same vendors will eventually implement hardware offload of VXLAN/NVGRE entirely, and these two requirements share a common underlying need for the NIC to understand the VXLAN/NVGRE encapsulation. In other words, if a NIC vendor plans to support deeper header inspection for load-balancing, they have already done some of the work needed to support hardware offload.

This list is very focused on procedural items and support in the networks, but nearly zero attention is paid to NIC->Host interface issues. For overlay technologies to deliver acceptable performance for network-heavy workloads, the NIC->Host interface must be more intelligent than it is today. Next-generation NICs must implement new capabilities. It is therefore useful to consider NICs in any discussion of load-balancing across the datacenter network, if for no other reason than to foster greater understanding of this challenge.

--
Jeff S Wheeler <[email protected]>
Sr Network Operator / Innovative Network Concepts

_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
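
Editor's sketch: the ring-selection problem discussed in this thread can be illustrated with a toy model. It is not taken from any NIC datasheet. It assumes hash-based ring selection (not the flow-table steering Lizhong describes), uses CRC32 as a stand-in for whatever hash a real NIC applies, and invents the ring count, addresses, and flow set for illustration. The protocol numbers (47 for GRE, 17 for UDP, 6 for TCP) and UDP port 4789 for VXLAN are the assigned values.

```python
import zlib
from collections import Counter

NUM_RINGS = 4  # hypothetical number of host DMA rings / CPU cores


def pick_ring(*fields):
    """Map header fields to a ring index. CRC32 is a stand-in, not a real NIC hash."""
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % NUM_RINGS


# Hypothetical tunnel endpoints and 64 inner TCP flows between two VMs.
OUTER_SRC, OUTER_DST = "192.0.2.1", "192.0.2.2"
inner_flows = [("10.0.0.1", "10.0.0.2", 6, 40000 + i, 80) for i in range(64)]


def rings_outer_only(flows):
    # NVGRE seen by a NIC that reads only the outer header:
    # same outer IPs, protocol 47 (GRE), no ports, so every packet gets the same key.
    return Counter(pick_ring(OUTER_SRC, OUTER_DST, 47) for _ in flows)


def rings_udp_entropy(flows):
    # VXLAN or GRE-in-UDP style: the sender derives the outer UDP source port
    # from the inner flow, so the outer 5-tuple differs per inner flow.
    counts = Counter()
    for flow in flows:
        src_port = 49152 + zlib.crc32(repr(flow).encode()) % 16384
        counts[pick_ring(OUTER_SRC, OUTER_DST, 17, src_port, 4789)] += 1
    return counts


def rings_inner_inspection(flows):
    # A NIC that parses past the encapsulation and keys on the inner 5-tuple.
    return Counter(pick_ring(*flow) for flow in flows)


if __name__ == "__main__":
    print("outer header only :", dict(rings_outer_only(inner_flows)))
    print("UDP src-port      :", dict(rings_udp_entropy(inner_flows)))
    print("inner inspection  :", dict(rings_inner_inspection(inner_flows)))
```

In this model the outer-header-only case always places all 64 flows on a single ring, because the key never changes. The other two cases have per-flow keys, so they would normally be expected to spread across rings; how evenly depends on the hash.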
