I am in the process of working through the comments received.
I have started with Alia's comments, which I have addressed as indicated
below.
I will post this as version -05. The other comments, and any comments on
the changes made in -05, will be incorporated in -06.
On 06/12/2013 23:03, Alia Atlas wrote:
I've given this draft a thorough reading (except for commenting on the
cost-based algorithm that Pushpasis already gave feedback on).
I have two major concerns. First, I see no discussion at all about
how multipoint interfaces are handled. Nothing indicates that they
are out of scope for protection nor does the discussion describe how
to handle avoiding the pseudo-nodes or related computation.
This is clarified in the Abstract, the ending of the Introduction and
with some additional text in Section 4.3.
Second, as a standards-track document, I see almost no description of
what MUST or even SHOULD be implemented. What would conformance mean?
Please look at version -05 and let us know if this is now addressed.
If not, please indicate what we need to address.
Here are my detailed comments - in going through the draft order, not
by priority.
0) Sec 2: Can you add a paragraph describing the extended-P space and
Q-space for the example in Figure 2? It's fine to have the
definitions, but it is jarring to run into Q-space in Sec 4 without
any other reference. For instance, just adding:
"In [Figure 1], S can reach A, B, and C without going via E; these
form S's extended P-space. The routers that can reach E without going
through S-E will be E's Q-space; these are D and C. B has equal-cost
paths via B-A-S-E and B-C-D-E and so may go through S-E. The single
node in both S's P-space and E's Q-space is C; thus node C is selected
as the repair tunnel's end-point. For more details on the reasoning
behind these calculations, please see [Sec 4.2]"
Added
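
To make the suggested paragraph concrete, here is a minimal sketch of the
calculation. The figure is not reproduced in this message, so the topology
is an assumption: a six-node ring S-A-B-C-D-E-S with symmetric unit link
costs, which is consistent with the equal-cost paths quoted for B. The
function names are hypothetical, and the distance comparison is one
possible way to test whether a shortest path crosses S-E, not necessarily
the method in the draft.

```python
import heapq

def spf(graph, src):
    """Dijkstra: shortest distance from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def p_space(graph, root, s, e):
    """Nodes that root reaches with no shortest path crossing link s->e.

    A shortest path root->n crosses s->e exactly when
    d(root, s) + cost(s, e) + d(e, n) == d(root, n).
    S and E themselves are left out as they are not useful repair endpoints.
    """
    d_root = spf(graph, root)
    d_e = spf(graph, e)
    via_link = d_root[s] + graph[s][e]
    return {n for n in graph
            if n not in (s, e) and d_root[n] < via_link + d_e[n]}

def extended_p_space(graph, s, e):
    """Union of the P-spaces of S and of S's neighbours other than E."""
    result = set()
    for root in [s] + [n for n in graph[s] if n != e]:
        result |= p_space(graph, root, s, e)
    return result

def q_space(graph, s, e):
    """Nodes that reach E with no shortest path crossing link s->e.

    Assumes symmetric link costs, so d(n, x) == d(x, n).
    """
    d_s = spf(graph, s)
    d_e = spf(graph, e)
    return {n for n in graph
            if n not in (s, e) and d_e[n] < d_s[n] + graph[s][e]}

def ring(nodes):
    """Build an undirected ring with unit costs (assumed topology)."""
    g = {n: {} for n in nodes}
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        g[a][b] = 1
        g[b][a] = 1
    return g

RING = ring(["S", "A", "B", "C", "D", "E"])
ext_p = extended_p_space(RING, "S", "E")
q = q_space(RING, "S", "E")
print(sorted(ext_p), sorted(q), sorted(ext_p & q))
```

Under these assumptions the extended P-space is A, B and C, the Q-space is
C and D, and the only candidate repair tunnel end-point is C, which matches
the suggested text.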
1) Sec 3, 2nd paragraph, 1 sentence:
"A tunneled repair path tunnels traffic to some staging point in the
network from which it is assumed that, in the absence of multiple
failures, it will travel to its destination using normal forwarding
without looping back."
First, can you replace "assumed" with precalculated or the like? This
isn't an assumption - it is based on shortest-path forwarding and is
carefully determined.
Done
Second, "in the absence of multiple failures" should be "in the
absence of a worse failure (e.g. node or SRLG) or multiple failures".
Done
2) Sec 3.1: For clarity, could you add a sentence indicating that the
tunnel itself must also avoid going over the link S-E?
Done
3) Sec 3.2, 2nd paragraph: "In order for S to obtain
the correct inner label it is necessary to establish a directed LDP
session[RFC5036] to the tunnel end point."
Shouldn't that be a "targeted LDP session"? I don't see directed at
all in RFC5036 and am personally only familiar with them being called
targeted LDP sessions.
Done
4) Sec 3.2 3rd paragraph: "The performance of the encapsulation and
decapsulation is efficient as encapsulation is just a push of one
label (like conventional MPLS TE FRR) and the decapsulation occurs
naturally at the penultimate hop before the repair tunnel endpoint."
Decapsulation can occur at the penultimate hop, if the repair tunnel
endpoint so specifies.
The text now says ...the decapsulation is normally configured
to occur at the penultimate hop...
5) Sec 3.2 3rd paragraph: "The time to establish the TLDP session and
acquire labels will limit the speed at which a new tunnel can be put
into service, but this will not be a problem in normal operation."
Can you please put some justification as to why this will not be a
problem in normal operation? Is your assumption that the added delay
before the network is fully repaired acceptable because failure
analysis studies indicate the separation between failures is generally
at least X seconds and the time to establish a TLDP session and
exchange labels (how many) is no more than Y seconds (in a router that
is also dealing with reconvergence of all protocols beyond the IGP)?
The extra communication clearly adds some delay before protection is
available and a blanket assertion that it doesn't matter is more
marketing than engineering.
It now says
"The time to establish the TLDP session and acquire labels will limit the
speed at which a new tunnel can be put into service. This is not
anticipated to be a problem in normal operation since the managed
introduction and removal of links is relatively rare as is the incidence
of failure in a well managed network."
6) Given that this is a standards-track draft, is there a reason that
a minimum-to-implement tunnel type isn't specified? I don't see/know
of any references for how a router can know if a remote router can
decapsulate IP-in-IP at speed or support GRE or even support a tLDP
session.
There will normally be a preferred IP tunnel type in a particular IP network
and I would assume that was the type preferred for this purpose as well.
If we specify GRE and the network prefers IP-IP we cause more problems
than we solve. It therefore seems best to leave the IP tunnel type to the
preference of the operator.
Section 7 discusses the LDP case.
7) Sec 4.1: This seems to be ignoring the protection of per-prefix
LFA, which is a superset of "link LFA". More critically, as you know,
doing per-prefix LFA can allow the protection of multi-homed prefixes.
Can you please update this section to indicate how and if remote LFA
applies to multi-homed prefixes? As written, it sounds like it does
not. It also sounds as if remote LFA is required even if there is a
per-prefix LFA for all destinations going via S-E.
The text now says:
Tunneled repair paths (which may be calculated per-prefix) are only
required for links which do not have a link or per-prefix LFA.
8) Sec 4.2.1, 2nd paragraph: "We will proceed as follows: we will
describe how to compute the set of routers which can be reached from S
without traversing S-E."
For clarity, can you change this to: "how to compute the set of
routers which can be reached from S on the shortest-path tree without
traversing S-E"? That may help remove some of the confusion when you
then define extended P-space - where clearly S can reach as well.
Please make the same change in the first sentence of 4.2.1.1.
I have reworded section 4.2.1 and included this change.
9) Sec 4.2.1.1: First, can you explain why you remove those nodes that
are ECMP in the draft? I assume it is because such nodes require a
directed first hop to avoid accidentally going through S-E.
Second, after describing "excising the sub-tree" - what about a quick
clue such as: "For example, if an SPF computation stores at each node
the next-hops to be used to reach that node from S, then one can
simply add a node to P-space if none of its next-hops are S-E"
The text has been reworded. Please check it.
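
The next-hop approach suggested in the comment can be sketched as follows.
This is only an illustration under stated assumptions: strictly positive
link costs, a single link between S and E (so a first hop of E means the
path uses S-E), and an assumed six-node unit-cost ring S-A-B-C-D-E-S in
place of the draft's figure. The function names are hypothetical.

```python
import heapq

def spf_first_hops(graph, src):
    """Dijkstra that records, for each node, every first hop out of src
    that lies on some equal-cost shortest path to that node."""
    dist = {src: 0}
    hops = {src: set()}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            # Leaving src, the first hop is v itself; otherwise inherit u's.
            via = {v} if u == src else hops[u]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                hops[v] = set(via)
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                hops[v] |= via  # equal-cost path: merge first hops
    return dist, hops

def p_space_from_hops(graph, s, e):
    """Add a node to P-space only if none of its first hops is E.

    A node reached by ECMP where one path starts S-E is excluded, since
    traffic tunnelled to it might be forwarded over the failed link.
    """
    _, hops = spf_first_hops(graph, s)
    return {n for n, h in hops.items() if n != s and e not in h}

# Assumed topology: undirected ring with unit costs.
nodes = ["S", "A", "B", "C", "D", "E"]
RING = {n: {} for n in nodes}
for a, b in zip(nodes, nodes[1:] + nodes[:1]):
    RING[a][b] = 1
    RING[b][a] = 1

_, first_hops = spf_first_hops(RING, "S")
print(sorted(first_hops["C"]), sorted(p_space_from_hops(RING, "S", "E")))
```

In this ring C is reached from S at equal cost via A and via E, so it
is left out of S's P-space, which contains only A and B.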
10) How are multi-point interfaces handled? I see no discussion of
that in Sec 4.2.1.1 or Sec 4.2.1.2.
I am not sure further text is required. Please explain the problem
you are concerned about.
11) In Sec 5, I am confused by the example and I believe it is because
there are typos?
"When a failure occurs on the link between PE1 and P2, PE1 does not
have an LFA for traffic reachable via P1. Similarly, by symmetry, if
the link between PE2 and P1 fails, PE2 does not have an LFA for
traffic reachable via P2."
But PE2->P1 has a cost of 1000 and PE1's path to P1 isn't affected by
the failure of the link PE1-P2... Could you please correct or clarify?
Figure corrected.
12) Sec 6: Given that [I-D.bryant-ipfrr-tunnels] is expired and not
intended for publication and this draft is heading towards RFC, is
there a reason not to accurately and fully pull out the problem
description into this draft?
I am in the process of sending this to the ISE with some de-conflict
notes in the introduction.
Given that text and the existence of the node-protecting draft I think
that pulling text into this draft will make this draft unbalanced WRT the
problem it addresses.
13) Sec 6: The same problems can occur with SRLG failures. Can you
please add a brief paragraph describing how the different cases apply
for SRLG failure? Also, please expand on the analysis recommendation.
I have added a sentence which I think addresses the issue succinctly.
14) Sec 7, 2nd paragraph: reference is to "Section 2" instead of
"Figure 2".
Done
15) Sec 8: please update the "date of this draft" to be an actual date
- as the draft is revised and progresses, this would falsely imply
more recent data than will be there.
Done
16) Sec 8: Can you please define what is meant by "protected
destinations" and "guaranteed node-protected destinations"? Is the
first if a destination is protected from all sources against all link
failures? Is it something more limited such as PLR/dest pairs?
This is clarified in the text
17) Sec 8.4: typo - "saleability" should be "scaleability".
Done
18) Sec 8.4: The end of this section
"In the very few cases where P and Q spaces have an empty
intersection, one could select the closest node in the Q space and
signal an explicitly-routed RSVP TE LSP to that Q node. A directed
LDP session is then established with the selected Q node and the rest
of the solution is identical to that described elsewhere in this
document. Alternatively the segment routing technology being defined
in the IETF may be used to carry the traffic between non-collocated P
and Q nodes [I-D.filsfils-rtgwg-segment-routing-use-cases],
[I-D.filsfils-rtgwg-segment-routing],
[I-D.gredler-rtgwg-igp-label-advertisement]."
seems to be describing functionality that isn't fully specified or
mentioned elsewhere in the draft and that doesn't fall within RTGWG's
charter. If we include explicit routing rather than hop-by-hop, we've
known how to do that for years. At a minimum, it'd be useful to put
this paragraph in a section by itself that is "Potential Coverage
Improvements Via Explicit Routing" or the like.
I have used the word "potential"
- Stewart