Hi Wes,

Thanks for your comments. Please find responses below.

>> In order for this to be operationally manageable, especially in the case of 
>> on-router processing of this rebalancing, there has to be an easy way for 
>> the operator to access the information about what's happening - what the 
>> result would be if the flows were balanced according to the hash vs what is 
>> happening as a result of rebalancing, so that they can chase down things 
>> like rebalancing misses or situations where this local optimization is 
>> creating a problem elsewhere in the path because that device did something 
>> different in its attempts to balance better, etc. It may also be that this 
>> info is necessary to properly tune the frequency of sampling, the thresholds 
>> for things like long-lived vs short-lived flows, etc. to the specific 
>> network where it is being used. 
 
To determine component link imbalance, we use interface counters. To track the 
effectiveness of the scheme, we also have other monitoring information, such as 
1) the number of times rebalancing was done and 2) the time since the last 
rebalancing event. Is there anything else that you think would be helpful?
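
To make the above concrete, here is a rough sketch of how the interface 
counters and the two monitoring values could be combined. It is only an 
illustration: the names (RebalanceMonitor, link_utilization, imbalance) are 
hypothetical, using the standard deviation of utilization as the imbalance 
measure is an assumption, and counter wrap is not handled.

```python
import time
from dataclasses import dataclass
from statistics import pstdev
from typing import Optional


@dataclass
class RebalanceMonitor:
    """Holds the two monitoring values: rebalance count and time of last event."""
    rebalance_count: int = 0
    last_rebalance_time: Optional[float] = None

    def record_rebalance(self, now=None):
        # Called each time flows are moved between component links.
        self.rebalance_count += 1
        self.last_rebalance_time = time.monotonic() if now is None else now

    def seconds_since_last_rebalance(self, now=None):
        # None means no rebalancing event has been recorded yet.
        if self.last_rebalance_time is None:
            return None
        now = time.monotonic() if now is None else now
        return now - self.last_rebalance_time


def link_utilization(prev_bytes, curr_bytes, interval_s, capacity_bps):
    """Per-link utilization (0..1) from two readings of interface byte counters."""
    return {
        link: (curr_bytes[link] - prev_bytes[link]) * 8 / interval_s / capacity_bps[link]
        for link in curr_bytes
    }


def imbalance(utilization):
    """Assumed metric: standard deviation of utilization across component links."""
    return pstdev(list(utilization.values()))
```

An operator-facing display could show the imbalance value next to the 
rebalance count and the time since the last rebalancing event.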
 
>> I realize that in the model you've proposed, we're somewhat limited because 
>> this is using sampled flow data instead of the realtime packet hash. It may 
>> be that this drives a requirement for the granularity of data being brought 
>> into the system in the external mode, and some requirements about the level 
>> of information available via the UI (or SNMP or XML or whatever) in the 
>> automatic hardware-based mode.
 
For large flow detection, we propose two modes: 1) Sampling Techniques and 2) 
Automatic Hardware Recognition. The primary difference between the two modes in 
terms of performance is the time taken to recognize the long-lived large flows; 
there is no loss of accuracy in either mode. In the automatic hardware-based 
mode for detecting the large flows, it is best that the load balancing happen 
in the router. Are you looking for anything else?
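
As a rough illustration of the sampling mode, the sketch below flags flows 
whose estimated rate over one observation interval meets a threshold. It 
assumes 1-in-N packet sampling with sampled byte counts scaled up by N; the 
function name, its parameters, and the flow key format are hypothetical and 
not taken from the draft.

```python
from collections import defaultdict


def detect_large_flows(samples, sampling_rate, interval_s, min_rate_bps):
    """Return {flow_key: estimated_rate_bps} for flows at or above min_rate_bps.

    samples: iterable of (flow_key, packet_bytes) for packets picked by the sampler.
    sampling_rate: N for 1-in-N packet sampling (assumed sampler behaviour).
    interval_s: length of the observation interval in seconds.
    """
    # Sum the sampled bytes seen for each flow during the interval.
    sampled_bytes = defaultdict(int)
    for flow_key, packet_bytes in samples:
        sampled_bytes[flow_key] += packet_bytes

    large = {}
    for flow_key, nbytes in sampled_bytes.items():
        # Scale by the sampling rate to estimate the flow's actual bit rate.
        est_rate_bps = nbytes * sampling_rate * 8 / interval_s
        if est_rate_bps >= min_rate_bps:
            large[flow_key] = est_rate_bps
    return large
```

A flow that stays in the result across several consecutive intervals would 
then be treated as a long-lived large flow and become a candidate for 
rebalancing.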

Thanks,
Ramki

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
George, Wes
Sent: Monday, September 16, 2013 2:02 PM
To: Melinda Shore; [email protected]
Cc: [email protected]
Subject: Re: [OPSAWG] WG last call for "Mechanisms for Optimal LAG/ECMP 
Component Link Utilization in Networks"

My apologies for missing the deadline on this, but I got my wires crossed and 
reviewed this doc in response to something I saw on IETF LC (which was actually 
on a different draft), so I figured the least I can do is provide my feedback 
on the *right* draft/thread/list, and then swear off email and go home for the 
day before I can do any more damage... sigh.
--------
one substantive comment:

I think within the operational considerations (and possibly the info model), 
you need some discussion of diagnostics and troubleshooting, both for on-box 
and off-box implementations. How do I see that it's working properly, and how 
do I diagnose problems when it's not?
One of the problems with the existing hashing algorithms is that they are often 
opaque, such that it's not clear what the device is doing, whether the hashing 
is working properly and the flows are of the sort that create imbalanced 
distribution, or whether hashing has broken somehow -- occasionally you can get 
info, but it's usually hidden commands, with difficult-to-interpret responses, 
and it's not like most vendors publish their "secret sauce" optimizations of 
hashing so that it's easy to predict what will happen given a certain set of 
flows.
In order for this to be operationally manageable, especially in the case of 
on-router processing of this rebalancing, there has to be an easy way for the 
operator to access the information about what's happening - what the result 
would be if the flows were balanced according to the hash vs what is happening 
as a result of rebalancing, so that they can chase down things like rebalancing 
misses or situations where this local optimization is creating a problem 
elsewhere in the path because that device did something different in its 
attempts to balance better, etc. It may also be that this info is necessary to 
properly tune the frequency of sampling, the thresholds for things like 
long-lived vs short-lived flows, etc. to the specific network where it is being 
used.
I realize that in the model you've proposed, we're somewhat limited because 
this is using sampled flow data instead of the realtime packet hash. It may be 
that this drives a requirement for the granularity of data being brought into 
the system in the external mode, and some requirements about the level of 
information available via the UI (or SNMP or XML or whatever) in the automatic 
hardware-based mode.

Thanks
Wes George


> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf
> Of Melinda Shore
> Sent: Saturday, August 24, 2013 4:07 PM
> To: [email protected]
> Cc: [email protected]
> Subject: [OPSAWG] WG last call for "Mechanisms for Optimal LAG/ECMP
> Component Link Utilization in Networks"
>
> This is to announce the start of working group last call on:
>
>     Mechanisms for Optimal LAG/ECMP Component Link Utilization in
>         Networks
>
> http://datatracker.ietf.org/doc/draft-ietf-opsawg-large-flow-load-
> balancing/
>
> It is intended for publication as an informational RFC.
>
> Please give it a careful read and provide any feedback to this
> mailing list by September 9, 2013
>
> Thanks,
>
> opsawg chairs
> _______________________________________________
> OPSAWG mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/opsawg
