On Jan 11, 2011, at 10:14 AM, Shane Amante wrote:
> What causes pain and/or worry to us operators is when someone launches a 
> large *individual* "macro-flow"[1] at the network that start to represent a 
> decent fraction of the overall capacity of a physical component-link 
> underlying a LAG and/or ECMP path, (e.g.: and the following are only an 
> *example*: 200 Mbps, 500 Mbps, 1 Gbps, etc. on a 10 Gbps link).  
> Unfortunately, due to the fact that load-hashing algorithms are stateless 
> (and, thus, non-adaptive), that means that "well-behaved" microflows (think: 
> casual Web surfing, e-mail, etc.) are still co-mingled with those large, 
> fat-flows across all component-links in a common LAG/ECMP path *without* 
> taking into account BW utilization of individual component-links.  So, there 
> is a much higher probability (or, oftentimes, certainty) of congestion and 
> packet loss causing pain to all users on the one (or, more) component-links 
> with fat, macro-flows on them that I can't capacity plan for and I can't 
> easily react to.
> 
> -shane
> 
> [1] Examples are: IPvX in IPvX tunneling, GRE, IPSec, "WAN acceleration" type 
> products that are used for extremely fast large file xfers, etc.

Yes. And there are similar issues in data centers, where load balancing is also 
used but in a different way.

There is another way for a video flow to become a relatively large chunk of a 
link: go closer to the access. "Why should I care about a 5/10/20 Mbps...on a 
10 Gbps"? You shouldn't. But how about a 20 Mbps data flow for each access 
customer when you have sized for an aggregate of 25 Mbps per customer? That can 
be quite a bit different. For the record, when we were developing our 
telepresence product, we had to look at the pacing of traffic coming out of the 
camera/codec, because we found that a 5 Mbps data flow with 10 Mbps peaks could 
momentarily swamp a 100 Mbps link. Obvious enough to me (hint: peak on a short 
timescale != average on a long timescale), but very counterintuitive to my 
colleagues.
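
As a rough illustration of the peak-versus-average point, the sketch below 
assumes a codec that emits each video frame as one back-to-back burst at line 
rate, with no pacing. The 30 frames/s figure and the function name are 
assumptions made for the example, not details of the telepresence product.

```python
def peak_rate_bps(avg_bps, frames_per_sec, link_bps, window_s):
    """Highest rate observed over any window of window_s seconds.

    Assumed traffic model (hypothetical): one burst per frame, sent at
    line rate, repeating every 1/frames_per_sec seconds. Requires
    avg_bps <= link_bps so that bursts do not overlap.
    """
    period_s = 1.0 / frames_per_sec
    # Time the link is fully busy while one frame is being sent.
    burst_s = (avg_bps / frames_per_sec) / link_bps
    # The worst-case window starts exactly at the start of a burst.
    full_periods = int(window_s // period_s)
    remainder_s = max(0.0, window_s - full_periods * period_s)
    busy_s = full_periods * burst_s + min(remainder_s, burst_s)
    return busy_s * link_bps / window_s


if __name__ == "__main__":
    for window_s in (0.001, 0.010, 1.0 / 30, 1.0):
        rate = peak_rate_bps(5e6, 30, 100e6, window_s)
        print(f"window {window_s * 1000:8.2f} ms -> {rate / 1e6:7.2f} Mbps")
```

Under these assumptions each frame is about 167 kbit and occupies the 100 Mbps 
link completely for roughly 1.7 ms, so a 1 ms measurement sees a saturated 
link while a 1 s measurement sees 5 Mbps.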

There are ways to make stateless hashes change the way they hash. If you're 
using a CRC as a hash generator, for example, it starts with an initial value 
placed in a register. Change the initial value, and all the hashes change. If 
you find the distribution not to your liking, change the value, and see if you 
like that any better. The sad part is that there is no easy way to "calculate 
an initial value I will like"; you have to try them all, or at least 
occasionally try a different one.
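
A minimal sketch of that trial-and-error approach, using Python's 
`zlib.crc32`, whose optional second argument is the starting value of the 
checksum. The flow keys, the per-flow rates, and the helper names 
(`pick_link`, `link_load`, `best_seed`) are made up for illustration; a real 
device hashes header fields in hardware and need not use CRC-32.

```python
import zlib


def pick_link(flow_key: bytes, num_links: int, seed: int) -> int:
    """Stateless choice of component link: CRC-32 of the flow key,
    computed from the given initial value, reduced modulo the link count."""
    return zlib.crc32(flow_key, seed) % num_links


def link_load(flows, num_links, seed):
    """Sum the offered load (Mbps) that lands on each component link."""
    load = [0] * num_links
    for key, mbps in flows.items():
        load[pick_link(key, num_links, seed)] += mbps
    return load


def best_seed(flows, num_links, candidates):
    """Try each candidate initial value and keep the one whose busiest
    link carries the least traffic. There is no shortcut: just try them."""
    return min(candidates, key=lambda s: max(link_load(flows, num_links, s)))


if __name__ == "__main__":
    # Hypothetical flows: two fat tunnels plus some small web/mail flows.
    flows = {b"gre:192.0.2.1>198.51.100.1": 1000,
             b"ipsec:192.0.2.2>198.51.100.2": 500}
    flows.update({b"tcp:192.0.2.%d:443" % i: 2 for i in range(10, 40)})
    for seed in range(4):
        print(seed, link_load(flows, 3, seed))
    print("chosen seed:", best_seed(flows, 3, range(16)))
```

One caveat on the sketch: CRC is linear, so for keys of equal length a new 
initial value XORs the same constant into every hash. If the link index is 
simply the low bits of the hash, that only renumbers the links, which is why 
the example reduces modulo a link count that is not a power of two.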

Getting predictable behavior tends to come down to what Ipsilon proposed a 
decade plus ago: identify the important data flows, do something intelligent 
with them, and run the rest statistically.
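
One way to picture that split, as a sketch only: flows above some rate 
threshold are placed explicitly on the least-loaded component link, and 
everything else is hashed statelessly. The threshold, the greedy 
largest-first placement, and the name `place_flows` are assumptions for the 
example; they are not taken from the Ipsilon work or from the RFCs listed 
below.

```python
import zlib


def place_flows(flows, num_links, heavy_threshold_mbps):
    """Return (placement, load): the link chosen per flow and Mbps per link.

    Small flows are hashed statelessly; flows at or above the threshold
    are pinned, largest first, to whichever link is least loaded so far.
    """
    load = [0] * num_links
    placement = {}
    heavy = {}
    for key, mbps in flows.items():
        if mbps >= heavy_threshold_mbps:
            heavy[key] = mbps              # handle explicitly below
        else:
            link = zlib.crc32(key) % num_links
            placement[key] = link          # run the rest statistically
            load[link] += mbps
    for key, mbps in sorted(heavy.items(), key=lambda kv: kv[1], reverse=True):
        link = min(range(num_links), key=lambda i: load[i])
        placement[key] = link
        load[link] += mbps
    return placement, load


if __name__ == "__main__":
    # Hypothetical traffic mix: two macro-flows and twenty small flows.
    flows = {b"tunnel-a": 1000, b"tunnel-b": 1000}
    flows.update({b"web-%d" % i: 1 for i in range(20)})
    placement, load = place_flows(flows, 2, heavy_threshold_mbps=200)
    print(placement[b"tunnel-a"], placement[b"tunnel-b"], load)
```

The cost is state: something has to measure flows and remember where the 
large ones were pinned, which is exactly what a pure hash avoids.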

http://www.ietf.org/rfc/rfc2098.txt
2098 Toshiba's Router Architecture Extensions for ATM : Overview. Y.
     Katsube, K. Nagami, H. Esaki. February 1997. (Format: TXT=43622
     bytes) (Status: INFORMATIONAL)

http://www.ietf.org/rfc/rfc2129.txt
2129 Toshiba's Flow Attribute Notification Protocol (FANP)
     Specification. K. Nagami, Y. Katsube, Y. Shobatake, A. Mogi, S.
     Matsuzawa, T. Jinmei, H. Esaki. April 1997. (Format: TXT=41137 bytes)
     (Status: INFORMATIONAL)

--------------------------------------------------------------------
IETF IPv6 working group mailing list
[email protected]
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
