Re: Flow Labels: what problem are we solving?

Shane Amante Tue, 11 Jan 2011 11:16:44 -0800

Hi Thomas,

Token operator here ... :-)  See below.

On Jan 11, 2011, at 06:41 MST, Thomas Narten wrote:
> Sorry to get back to basics, but I have not followed all the Flow
> Label discussions or read all the drafts. I have read
> 
>      draft-ietf-6man-flow-ecmp-00.txt
>      draft-ietf-6man-flow-update-01.txt
> 
> pretty carefully and I still don't quite understand what real problem
> we are trying to solve - and thus, whether the proposed changes
> actually help or are a no op.
> 
> Is there a document that speaks to this?
> 
> Question:
> 
> I understand the value  of ECMP type load balancing. But how much of a
> problem is it today (with IPv6) if the Flow Label is not used?

Today, it's a significant one, especially so for tunneled traffic, e.g.: IPvX 
in IPv6, GRE, IPSec, LISP, etc. and WAN acceleration products for fast file 
xfer's.  In the future, it could significantly inhibit (even, prohibit) the 
IETF from developing new protocols that either use new IPv6 Extension Headers 
or Destination Options, since Core/Edge router/switch HW (primarily built 
around ASIC's) cannot (easily) be adapted to recognize the granularity of 
'individual flows' in those new protocols -- worst-case, existing HW cannot be 
adapted and would require swapping it out, which is essentially a 
non-starter.  Even when existing core HW can be adapted, through SW changes, to 
recognize the granularity of new protocols, it could be upwards of a decade 
before the cycle completes: from coaxing/prodding/cajoling vendors to develop 
the capability in SW, through operators testing and eventually deploying the 
capability in their networks.

I would also raise the "architectural purity" argument of: do you really want 
millions of routers (and, L2 switches) that are really just supposed to forward 
based solely on IP source address, IP destination address (and, IP Traffic 
Class) *attempting* to go deeper into the packet (headers) to discern useful, 
granular information that could be used as input-keys for load-balancing across 
LAG and ECMP paths?  Unfortunately, every device manufacturer is going to make 
their own decisions on what they can (or, will) be able to look at in the IP 
and Extension Headers as input-keys for a load-balancing hash, but the 
end-result will be (and, is currently) a mess in terms of deployment and 
operation.  (As an example, some vendors can read entire IPv6 addresses from an 
IPv6 header in new(er) HW, others can only read parts of the v6 address, etc.)  
Finally, there is this jewel in RFC 2460:
---cut here---
   With one exception, extension headers are not examined or processed
   by any node along a packet's delivery path, until the packet reaches
   the node (or each of the set of nodes, in the case of multicast)
   identified in the Destination Address field of the IPv6 header.
[...]
   The exception referred to in the preceding paragraph is the Hop-by-
   Hop Options header, which carries information that must be examined
   and processed by every node along a packet's delivery path, including
   the source and destination nodes.
---cut here---
I wasn't around during the creation of RFC 2460 (so, please 
correct or inform me if I'm wrong), but it seems to at least imply (if not, 
mandate?) that intermediate nodes (such as routers and L2 switches) shouldn't 
be trying to interpret the characteristics of the upper-layer protocols being 
transported.  This would make sense when viewed in light of the end-to-end 
principle, but perhaps I'm taking too strict an interpretation.
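
To illustrate what "going deeper into the packet" entails, here is a rough 
Python sketch (not taken from either draft) of the walk a node has to perform 
just to locate the upper-layer header.  The name find_upper_layer is made up 
for illustration, and only the four RFC 2460 extension headers with a simple 
length encoding are handled; AH, ESP, or anything defined later simply ends 
the walk.

```python
# Extension headers from RFC 2460 that this sketch knows how to skip.
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60
SKIPPABLE = (HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS)


def find_upper_layer(packet: bytes):
    """Return (protocol, offset, headers_walked) for the first header
    that is not one of the skippable extension headers."""
    if len(packet) < 40:
        raise ValueError("shorter than the fixed IPv6 header")
    next_header = packet[6]   # Next Header field of the fixed header
    offset = 40               # the fixed header is always 40 bytes
    walked = 0
    while next_header in SKIPPABLE:
        if offset + 8 > len(packet):
            raise ValueError("truncated extension header")
        if next_header == FRAGMENT:
            length = 8                              # Fragment header is fixed size
        else:
            length = (packet[offset + 1] + 1) * 8   # Hdr Ext Len, in 8-octet units
        next_header = packet[offset]                # type of the header that follows
        offset += length
        walked += 1
    return next_header, offset, walked
```

The point is that the offset of the transport header depends on the contents 
of the packet, so it has to be discovered one header at a time, and the walk 
stops at the first header type the device was not built to understand.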

> If you hash on just the 5 tuple (excluding the flow label),  you get
> (I assume) the equivalent of what you have in IPv4 today. Why is that
> not good enough?

For one, HW can't glean a 5-tuple when it encounters tunneled packets.  Second, 
it inhibits/prohibits the IETF from developing new upper-layer protocols 
(if/when the need should arise) and getting them deployed in a reasonable 
timeframe.  Lastly, I would like to get to a point where I can tell my 
router/switch vendors to build more simple (and, thus, more cost-effective) HW, 
because they don't have to keep piling complexity into their ASIC's to 
recognize the various legacy and new permutations of transport-layer protocols 
for input-keys for LAG and ECMP load-balancing hashes. Instead, I would ideally 
be able to tell them: just use {IP src, IP dst and IPv6 flow-label} as 
input-keys -- oh, and look at that ... they're all at fixed offsets in the IPv6 
header and at the very beginning of the packet so the time (and, amount of 
memory) for you to copy that region of the packet to extremely expensive SRAM 
(packet buffer memory) has just been reduced (reducing cycle times to process 
the packet, etc.).  Now, I will grant you this latter point is a bit of 
wishful thinking for now, given that we'll need to be successful in getting 
these drafts agreed upon and published.  Then, hosts (and, 1st-hop routers) 
will need to start writing useful flow-labels (and, hopefully, firewalls or 
their administrators do not screw around and write zero back over the 
flow-label).  However, I'm of the opinion that if you don't start to make a 
small change now, you will never see an improvement down the road.
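
As a rough sketch of the {IP src, IP dst, flow-label} idea, the Python below 
reads all three inputs from fixed offsets in the first 40 bytes and picks a 
LAG/ECMP member link.  The helper names (make_header, flow_key, pick_link) are 
hypothetical, and CRC-32 is only a stand-in for whatever hash a real 
forwarding plane would use; none of this comes from the drafts.

```python
import ipaddress
import struct
import zlib


def make_header(src: str, dst: str, flow_label: int, next_header: int = 41) -> bytes:
    """Build a bare 40-byte IPv6 header (payload length 0) for experiments.
    Next Header 41 means an encapsulated IPv6 packet, i.e. a tunnel."""
    first_word = (6 << 28) | (flow_label & 0xFFFFF)  # version 6, traffic class 0
    return (struct.pack("!IHBB", first_word, 0, next_header, 64)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def flow_key(packet: bytes) -> bytes:
    """Collect the hash inputs; every field sits at a fixed offset."""
    if len(packet) < 40:
        raise ValueError("shorter than the fixed IPv6 header")
    (first_word,) = struct.unpack_from("!I", packet, 0)
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    flow_label = first_word & 0xFFFFF      # low 20 bits of the first 32-bit word
    # Bytes 8..23 are the source address, bytes 24..39 the destination address.
    return packet[8:40] + flow_label.to_bytes(3, "big")


def pick_link(packet: bytes, n_links: int) -> int:
    """Map a packet to one of n_links member links (CRC-32 as a stand-in hash)."""
    return zlib.crc32(flow_key(packet)) % n_links


# One pair of tunnel endpoints carrying many inner flows, each flow
# marked with its own flow label by the encapsulating node.
tunnel = [make_header("2001:db8::1", "2001:db8::2", label) for label in range(1, 65)]
links_used = {pick_link(p, 4) for p in tunnel}
```

With the flow label zeroed on every packet, the same tunnel would present a 
single key and therefore land on a single link, which is the tunneled-traffic 
problem described above.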

> Also, splitting flows across different links would seem to have value
> primarily if you have a single source (or rather single src/dest pair)
> generating a *lot* of traffic/flows, i.e., so that if you split
> traffic from that source/dest pair, you see measurable load-splitting.
> 
> Is this happening in practice today? Can operators please speak to
> this? And if it is a problem, is it primarily with tunneled traffic
> (where the tunnel aggregates many flows), or is it really between
> individual pairs of nodes that are sending a *lot* of traffic to each
> other? Are there examples of this?
> 
> (I'm not necessarily opposed to this work going forward, but I'm not
> entirely convinced we are solving a real problem. Help me please.)

I think I've answered these questions above; however, if you still aren't 
satisfied with those answers please let me know.  

Lastly, I would add that LAG and ECMP have been around for several [dozen] 
years and will remain with us indefinitely.  IOW, even if 100 GbE were 
cost-effective and deployed today, I know of several operators who will still 
be using LAG or ECMP over Nx 100 GbE trunks in order to continue to carry the 
traffic demand on their networks.  So, in summary, this is not a problem that 
is going away with 100 GbE, or beyond.

-shane
--------------------------------------------------------------------
IETF IPv6 working group mailing list
[email protected]
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
