Iftekhar,

Thank you for the detailed comments. Responses to individual
comments/suggestions are inline.
This is a long email message, so if I missed anything, please point
out what I missed. Please let me know if you are in general agreement
with the responses and if so I'll make specific proposals for
replacement text or propose more detailed action items.

Best Regards,

Curtis

In message <d7d7ab44c06a2440b716f1f1f5e70ae534638...@sv-exdb-prod1.infinera.com>
Iftekhar Hussain writes:

> Dear Authors,
>
> Please find below some comments on the
> http://tools.ietf.org/html/draft-so-yong-rtgwg-cl-framework-05.
>
> ----------------------------
>
> 2.1. Flow Identification
>
> "Operator may have other objectives such as ...composite link energy
> saving, and etc. These new requirements are described in
> [I-D.ietf-rtgwg-cl-requirement]"
>
> Comment:
>
> I don't recall any energy saving related requirement (or discussion)
> in the [I-D.ietf-rtgwg-cl-requirement. Suggest removing the text
> "energy saving etc."

You are entirely correct that there are no energy savings mentioned in
the requirements document. This wording came from one of the other
authors, but I think I understand the intent. There is a daily cycle
(and to a lesser extent a weekly cycle) to traffic levels. During the
low traffic hours (generally late at night, local time) traffic can be
rebalanced to reduce the number of active component links. Where it is
possible to depower interfaces, this could result in energy savings.

I'm OK with removing this example, though it is a very good example
and a goal that we should be pursuing. Another option is to add a
"MAY" in the requirements document somewhere that indicates that "Load
balancing MAY be used during sustained low traffic periods to reduce
the number of active component links for the purpose of power
reduction". I'll wait for further comments on this before making a
suggestion on how we will proceed.

> 2.2. Composite Link in Control Plane
>
> "LDP follows the IGP, therefore failure to forward on the IGP path
> will often result in loss of connectivity if the IGP adjacency is not
> withdrawn when an LDP FEC is refused. This is a pathologic case that
> can occur if LDP is carried natively and there is a high volume of LDP
> traffic. This situation can be avoided by carrying LDP within RSVP-TE
> LSP."
>
> Comment:
>
> Is it the loss of connectivity referring to LDP control plane
> connectivity only or LDP signaled LSP data plane or both?

The bottom line is that admission control cannot be applied to LDP
without loss of connectivity. Complete loss of connectivity for a
subset of traffic, no matter how low its priority, is unacceptable.
LDP does not support TE. There are two options. Either don't carry
enough LDP traffic for this situation to occur (this is the case when
a small subset of traffic is high priority traffic carried within LDP,
such as VOIP or enterprise VPN traffic mixed with a much larger volume
of Internet traffic). The alternative is to carry the LDP traffic
within RSVP-TE LSP when enough LDP traffic is aggregated such that TE
is needed.

The goal was to refer to this problem without going into an
excessively long explanation. If you think it is too brief and
therefore unclear, then I'll reword it.

> Comment:
>
> "Composite link capacity is aggregated capacity and MAY be larger than
> individual component link capacity."

Before the MAY insert "LSP capacity" and the sentence makes more
sense. I'll reread this and see if that is what was intended.

> Composite link aggregate capacity should always be larger than
> individual component link capacity. Did I misunderstand?

The statement as written is not useful. The sum of positive quantities
is always greater than the quantities being summed. Thanks for
pointing out this statement. I'll look at the context and figure out
the intent.

> "...1. If no other information is available the largest microflow is
> bound by one of the following:"
>
> Suggested text:
>
> If no other information is available the largest microflow is bound
> (i.e., signaling extensions don't indicate a bound) by one of the
> following:

I'll reword this. The other information could be signaled or
configured.

> Comment:
>
> Section 2.2 provides architectural guidelines e.g., " ...Available
> capacity in other component links MUST be used to carry impacted
> traffic. The available bandwidth after failure MUST be advertised
> immediately to avoid looped crankback" and provides illustrative
> examples e.g.,"... no microflow larger than 10 Gb/s will be present on
> the RSVP-TE LSP that aggregate traffic across the core, even if the
> core interfaces are 100 Gb/s interfaces."

Be careful not to take the last sentence out of the context of the
example (where no interface feeding the core is greater than 10 Gb/s).

> In contrast, section 2.1 appears to be little bit too vague e.g., "
> ... technique of grouping flows, such as hashing on the flow
> identification criteria, becomes essential to reduce the stored state,
> and is an essential scaling technique. Other means of grouping flows
> may be possible"
>
> Suggestion:
>
> Add some examples which flow identification scheme to use for
> composite links or add a reference to section 4.2 which discusses flow
> identification trade-offs.

The intent in Section 2.1 is to make three points. First, tracking
every flow is not scalable. Second, IP src/dst address hashing has
proven (in over two decades of experience) to be an excellent way of
identifying groups of flows (for reasons briefly listed). Third, if
you find a better way to identify groups of flows, then use it. We
don't want to require the use of IP address hashing, but wanted to
strongly encourage its use, given its long history of successful
deployment. I'll look at rewording the section.

> 4.1.4. Requirements for Contained LSP
>
> Comment:
>
> Add reference to specific relevant to requirements in
> I-D.ietf-rtgwg-cl-requirement] similar to section 4.1.2 and 4.1.3.

OK. The sentence "[I-D.ietf-rtgwg-cl-requirement] calls for new LSP
constraints." needs followup indicating which requirements are being
referred to.

> 4.2. Data Plane Challenges
>
> Comment:
>
> Minor typo: "very course..." should be "very coarse..."

Of coarse. How could I miss that? :-)

I searched for other uses of the word "course" and found quite a few
that should be "coarse". Thanks for the catch.

> Comment:
>
> "In practice using the MPLS label stack alone has proven too course
> to acheive a reasonably good load balance, due to bin-packing issues
> and discrpencies between signaled bandwidth and actual traffic loads
> on LSP."
>
> Suggest adding a reference to an IETF standard or best practices
> document that discusses these issues.

OK, if I can find something. This has been discussed on and off on the
mailing lists for about a decade. There may be something in the
original link bundling RFC or elsewhere.

> Comment:
>
> Sections 4.2, 2.1, 2.3, appear to be somewhat redundant. Suggest
> trimming section 2.1 and absorbing that information in section 4.2.

Thanks for pointing this out. I'll reread and try to reduce
redundancy, using cross references where appropriate rather than
repeating a point.

> 3. Architecture Tradeoffs
>
> Comment:
>
> "Composite Link is applicable to large networks, and therefore
> scalability must be a major consideration."
>
> What is a typical definition of a large network? Suggestion: Add an
> example (or a forward reference to section 3.3.1 which has an example)

Any network that one of your customers wants to build would be a good
example of large. :-) [But not a good definition of large.]

By "large" we mean the type of networks that service providers or
large content providers are building today.
In the future we may find that enterprises are building large enough
private networks to have certain scaling problems that had previously
only been experienced by service providers and content providers. A
good example of that in the past is overuse of bridging. Some of the
NSFNET regional networks in the late 1980s used bridging but
experienced severe scaling issues very quickly. It wasn't until the
early to mid 1990s that large enterprises began having similar
problems.

The point of the phrase "Composite Link is applicable to large
networks" is that if you need parallel links because the single link
capacity du jour is too small, then your network is large relative to
most other networks. Considering the magnitude of "large" where
10 Gb/s links are too small, the phrase "and therefore scalability
must be a major consideration" may seem almost rhetorical to some with
deployment experience with IP networks.

The reason that the statement here is brief is that there are a number
of IETF base documents dating back to the mid-1990s on the importance
of scalability. These realizations came from deployment experience.
Despite this, someone occasionally makes a naive comment about
processors getting faster and memory getting larger, ignoring the fact
that network growth combined with order N log N or N-squared growth in
computation or memory requirements far outpaces Moore's Law
improvements. I think at least one somewhat ancient RFC indicates that
scalability must be considered in every document in the IETF routing
area. If I find it I'll cite it. It will probably have Brian Carpenter
or Christian Huitema on the author list, making it easier to find.
Maybe Fred Baker. I think it was an IAB or IESG RFC.

> 3.1. Scalability Motivations
>
> Comment:
>
> "....a large routing change to be accomplished more quickly,"
>
> What is a typical definition of a large routing change? Suggestion:
> Add an example.
Not entirely serious: Hurricane Katrina resulted in a number of "large
routing changes". There was the collapse of the Raritan River bridge
in NJ taking out fibers on both sides of the bridge (thought to be
redundant), an earthquake east of San Diego taking out three of four
fiber paths, etc.

This is again a relative term that has been used for quite some time.
A customer circuit going down or a new customer coming online results
in a tiny routing change. A metro fault may be handled entirely in the
metro and have no effect at all on routing elsewhere in the provider
network or global network. Major fiber outages generally result in
large routing changes.

In an MPLS context, a large routing change is one that affects a large
number of LSP. A classic example is where one hop along one of two or
three east-west paths across the continental US goes down (or comes
up, though this can be made far less disruptive). Many LSP traverse
the small number of US east-west paths and terminate in a large number
of medium to large sized cities along either coast.

Again, I'll look at clarifying, but in fewer words than this response.

> 7.2.5. Dynamic Multipath Balance
>
> Comment:
>
> "...uses a course granularity, the adjustments would have to be
> equally course, in the worst case moving entire LSP"
>
> Minor typo "course" should be "coarse".

s/course/coarse/ with selective replacement.

> 7. Required Protocol Extensions and Mechanisms
>
> Comment/suggestion:
>
> Organizing section 7 as a matrix containing something like:
> Requirement#, Existing Mechanisms, Gaps, Ongoing/new extensions...,
> might be more clearer. AT a minimum, add a traceability to each
> requirement in [I-D.ietf-rtgwg-cl-requirement] and the section number
> in framework document that addresses each requirement (or set of
> requirement).

Section 7.2 groups the requirements but there is nowhere that lists
every requirement in numeric order and points to where in the grouping
it falls.
It should be easy enough to add that.

At the moment, potential issues with the requirements or in this
document are listed in Section 7.3. This is a different approach,
listing the omissions rather than cross referencing. Ideally we
address all of the issues and drop this section.

Section 7.4 lists the requirements by protocol or functionality
affected. This, as stated in the first paragraph, is for the benefit
of implementors. Subsections of 7.4 refer back to the groupings in
subsections of 7.2. It would not be difficult to put in the reverse
cross references (functional group refers to protocol change or
functionality supporting it).

> Is there a minimal set of requirements which must be met to form a
> deployable (useful) composite link based solution?

That's a great question. "Enough to be useful" is interpreted
differently by different network operators deploying a network. For an
equipment vendor the best approach is to plan to implement everything
over time, under the assumption that every feature will eventually
appear in a check box in someone's RFI/RFP/RFQ, but ask current key
customers which features to prioritize. That process may yield a
different answer for different equipment vendors, depending on who
their current key customers are.

BTW- Our friends at Verizon made sure that almost every requirement is
a MUST, so ask them to identify the requirements that they are not
really serious about. Wear protective fireproof undergarments. :-)

> -----------------------
>
> Regards,
> Iftekhar

_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
