Re: AD review comments on draft-ietf-rtgwg-remote-lfa-08

Stewart Bryant Tue, 16 Dec 2014 09:58:00 -0800

On 12/12/2014 00:00, Alia Atlas wrote:

Alia, thank you for your review.

Here are my responses and the changes made in -09.

Minor Comments:

1) In Sec 2, 3rd paragraph, in the sentence:
"The single node in both S's P-space and E's Q-space is C; thus node C is selected as the repair tunnel's end-point."
it should be "S's extended P-space"

Correct - changed

2) In Sec 2, it says: "The non-failure traffic distribution is not disrupted by the provision of such a tunnel since it is only used for repair traffic and MUST NOT be used for normal traffic." This is obviously correct and good - but I think it would be very useful to clarify that OAM traffic to test the rLFA may transit the tunnel at any time. Otherwise, the MUST NOT could cause some confusion - depending on how one thinks about "normal traffic".

This now says:

The non-failure traffic distribution is not disrupted by the provision of such a tunnel since it is only used for repair traffic and MUST NOT be used for normal traffic. Note that OAM traffic specifically to verify the viability of the repair MAY traverse the tunnel prior to a failure.

I used "viability" rather than, for example, "availability" to cover any form of OAM test (CC, CV, delay, jitter, ...).

I toyed with saying "normal data traffic" and not adding the OAM sentence, but that would have allowed routing and network management traffic (other than OAM) which we also need to exclude.

3) In Sec 3: I can't parse "Examples of worse failures are node failures (see Section 6), and through the failure of a shared risk link group (SRLG), the through the independent concurrent failure of multiple links, and these are out of scope for this specification."
I think you mean "Examples of worse failures are node failures (see Section 6), the failure of a shared risk link group (SRLG), the independent concurrent failures of multiple links; protecting against such worse failures is out of scope for this specification." I would add in the failure of broadcast interfaces and NBMA interfaces for completeness, even though that was mentioned in Sec 2.

This now says:

Examples of worse failures are node failures (see Section 6), the failure of a shared risk link group (SRLG), the independent concurrent failures of multiple links, broadcast or non-broadcast multi-access (NBMA) links [Section 2]; protecting against such worse failures is out of scope for this specification.

4) In Sec 4.2: "Provided both these requirements are met, packets forwarded over the repair tunnel will reach their destination and will not loop." Please change to: "will not loop after the single link failure". Of course, looping can happen if a worse failure than protected against occurs - as with LFA. This could also be mitigated by requiring that the PQ node is downstream of the PLR, as is mentioned in Sec 4.2.2.

Correct

This now says:

Provided both these requirements are met, packets forwarded over the repair tunnel will reach their destination, and will not loop after a single link failure.

5) In Sec 4.2.1.2: "This may be calculated by computing an SPT at each of S's neighbors (excluding E) and excising the subtree reached via the path N->S->E." As described here, a node Y that is reached via N->S->A would be considered to be in S's extended P-space. I realize that one would assume that Y would be in S's P-space anyway and thus it is safe to not care about this edge case. However, the ECMP considerations make it more complex so please at a minimum add in the same caveat as in Sec 4.2.1.2 "(including those routers which are members of an ECMP that includes link S-E)" suitably modified. In the cost-based version in Compute_Extended_P_Space, this is handled by ignoring any potential node from N whose shortest path goes back through S. It'd be nice if the two methods were consistent.

I have changed the text to:

This may be calculated by computing an SPT at each of S's neighbors (excluding E) and excising the subtree reached via the path N->S->E. Note this will excise those routers which are reachable through all ECMPs that includes link S-E.

I am not sure that this clarification is strictly needed since "removal of the subtree reached via the path N->S->E" would include "those routers which are members of any ECMP that includes link S-E".

Would it be less confusing if we changed "excising the subtree reached" to "excising the routers reached"?
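
To make the excision concrete, here is a minimal sketch of a cost-based reading of it. This is my own illustration, not the pseudocode from the draft. It assumes a symmetric six-node unit-cost ring S-A-B-C-D-E-S (a hypothetical topology chosen to be consistent with the statement quoted in comment 1), and the helper names dijkstra, make_ring, extended_p_space and q_space are invented for the example. A router y is kept for neighbor N only when N's shortest distance to y is strictly less than the cost of any path forced over S-E; the strict inequality is what drops routers reached via an ECMP that includes S-E.

```python
import heapq


def dijkstra(graph, src):
    """Shortest-path costs from src; graph is {node: {neighbor: cost}}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def make_ring(nodes):
    """Hypothetical topology: undirected unit-cost ring over the given nodes."""
    graph = {n: {} for n in nodes}
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        graph[a][b] = 1
        graph[b][a] = 1
    return graph


def extended_p_space(graph, s, e):
    """Routers some neighbor N of S (N != E) reaches with no shortest path over S-E."""
    dist = {n: dijkstra(graph, n) for n in graph}
    cost_se = graph[s][e]
    result = set()
    for n in graph[s]:
        if n == e:
            continue
        for y in graph:
            if y in (s, e):
                continue
            # Strict '<' also excludes ECMP sets that include link S-E.
            if dist[n][y] < dist[n][s] + cost_se + dist[e][y]:
                result.add(y)
    return result


def q_space(graph, s, e):
    """Routers that reach E with no shortest path over S-E (symmetric costs assumed)."""
    dist = {n: dijkstra(graph, n) for n in graph}
    cost_se = graph[s][e]
    return {y for y in graph
            if y not in (s, e) and dist[y][e] < dist[y][s] + cost_se}


ring = make_ring(["S", "A", "B", "C", "D", "E"])
pq_nodes = extended_p_space(ring, "S", "E") & q_space(ring, "S", "E")
print(sorted(pq_nodes))
```

On this ring the sketch gives {A, B, C} for S's extended P-space and {C, D} for E's Q-space, so C is the only PQ node. D is excluded from the extended P-space precisely because A reaches it over an ECMP that includes S-E.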

6) In Sec 4.2.2: "As described in [RFC5286], always selecting a PQ node that is downstream with respect to the repairing node, prevents the formation of loops when the failure is worse than expected." Could you clarify that the PQ node is downstream with respect to the repairing node and the destination - rather than the proxy destination E? I'm fairly certain that the latter wouldn't work (but don't have an example topology created). If you disagree, let me know and I'll work on creating one. This is the constraint that is expressed in Apply_Downstream_Constraint().

I don't think there is a problem in practice: if PQ needed to be downstream to E WRT S, D_opt(PQ,E) < D_opt(S,E) would apply, and in a unit cost network there would be no PQ nodes since we would need D_opt(PQ,E) < 1, i.e. a link metric from PQ to E of less than one. PQ nodes would be so rare that this would not be a practical solution.


I have changed the text to:

As described in [RFC5286], always selecting a PQ node that is downstream to the destination with respect to the repairing node, prevents the formation of loops when the failure is worse than expected. The use of downstream nodes reduces the repair coverage, and operators are advised to determine whether adequate coverage is achieved before enabling this selection feature.
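
As a rough illustration of the difference between the two readings, the sketch below applies the downstream inequality, D_opt(PQ,D) < D_opt(S,D), on the same hypothetical six-node unit-cost ring S-A-B-C-D-E-S. The names RING, ring_distance and is_downstream are invented for the example and are not taken from the draft's Apply_Downstream_Constraint().

```python
# Hypothetical unit-cost ring; adjacent entries (and last/first) are linked.
RING = ["S", "A", "B", "C", "D", "E"]


def ring_distance(a, b):
    """Hop count between two routers on the unit-cost ring."""
    hops = abs(RING.index(a) - RING.index(b))
    return min(hops, len(RING) - hops)


def is_downstream(pq, repairing_node, destination, distance=ring_distance):
    """True if pq is strictly closer to the destination than the repairing node."""
    return distance(pq, destination) < distance(repairing_node, destination)


# PQ node C, repairing node S, measured against the real destination D.
print(is_downstream("C", "S", "D"))
# Same PQ node measured against the proxy destination E (the far end of S-E).
print(is_downstream("C", "S", "E"))
```

Measured against the real destination D, C qualifies (1 < 2). Measured against the proxy destination E, it cannot (2 is not less than 1), which is the unit-cost argument above, and it also shows how the downstream constraint can reduce coverage for some destinations.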

7) In Sec 4.3: "The reader is referred to [I-D.psarkar-rtgwg-rlfa-node-protection] for further information on the use of RLFA for node repairs." Can you add "and broadcast or NBMA link repairs"? Do you feel that is accurate?

I cannot see any text on broadcast or NBMA in that draft, which is now draft-ietf-rtgwg-rlfa-node-protection (I have updated the reference in the text).


I have made no text change on the substantive point.

8) In Sec 6: s/"When the failure is a node failure rather than a link failure"/"When the failure is a node failure rather than a point-to-point link failure"

Done

9) In Sec 6: "Alternatively one might choose to assume that the probability of a node failure and microloops forming is sufficiently rare that the case can be ignored." Can you please clarify from microloops to "microloops forming due to use of alternates"? We know that in cases where a rLFA is necessary, that neighbor isn't loop-free and so regular microloops due to reconvergence will form.

It took a while to understand the comment but I think I know what you mean.

I have changed the text to:

Alternatively one might choose to assume that the probability of a node failure is sufficiently rare that the issue of looping RLFA repairs can be ignored.

10) In Sec 7: "In the absence of a protocol to learn the preferred IP address for targeted LDP, an LSR should attempt a targeted LDP session with the Router ID [RFC2328] [RFC5305] [RFC5340], unless it is configured otherwise." Can you please add in some text for how this would work for IPv6? I believe that there are current drafts discussing carrying Routable IP addresses (e.g. http://datatracker.ietf.org/doc/draft-ietf-ospf-routable-ip-address/). We know that there is interest in having IPv6 only networks with MPLS - so it'd be good not to create new gaps.


It now says:

In the absence of a protocol to learn the preferred IP address for targeted LDP, an LSR should attempt a targeted LDP session with the Router ID [RFC2328] [RFC5305] [RFC5340] [RFC6119] [I-D.ietf-ospf-routable-ip-address], unless it is configured otherwise.

11) In Sec 8.4: "In an MPLS network, this is achieved without any scaleability impact, as the tunnels to the PQ nodes are always present as a property of an LDP-based deployment." The targeted LDP sessions don't have a scaleability impact? That the repair tunnels don't need to be specifically created as new tunnels, I agree with - but this statement is overselling. Please make the technical point more clearly.

I have cut this back to:

As shown in the table, remote LFA provides close to 100% prefix protection against link failure in 11 of the 14 topologies studied, and provides a significant improvement in two of the remaining three cases. Note that in an MPLS network the tunnels to the PQ nodes are always present as a property of an LDP-based deployment.

12) In Sec 9: I feel like here is a good place at least mention the issues with microloops from reconvergence. Since reconvergence after rLFA is going to result in a local microloop (depending on timing), at least a reference to https://tools.ietf.org/html/draft-litkowski-rtgwg-uloop-delay-03 with a recommendation to consider it is important. Otherwise, the rLFA repair happens and then traffic microloops and is lost. The fact that these local microloops occur with real impact much more with rLFA (or any advanced FRR technique) is an important management consideration.

I have added the following new para:

When the network re-converges, microloops [RFC5715] may form due to transient inconsistencies in the router FIBs. If it is determined that microloops are a significant issue in the deployment, then a suitable loop free convergence methods such as one of those described in [RFC5715], [RFC6976] or [I-D.litkowski-rtgwg-uloop-delay] should be implemented.

13) Sec 12: Saying "To prevent their use as an attack vector the repair tunnel endpoints SHOULD be assigned from a set of addresses that are not reachable from outside the routing domain." is basically empty words without more behind Sec 7 default of using Router IDs. Can you find a reference that talks about a BCP for Router IDs not being reachable addresses outside the routing domain? Can you describe how to use the IGP extensions?

Router IDs are used for T-LDP and normal MPLS security applies.

Again with MPLS repair tunnels normal MPLS security applies.

The Section 12 reference was to IP tunnels in an IP rather than MPLS network. I have changed the text to:


The security considerations of [RFC 5286] also apply.

Targeted LDP sessions and MPLS tunnels are normal features of an MPLS network and their use in this application raises no additional security concerns.

To prevent their use as an attack vector IP repair tunnel endpoints (where used) SHOULD be assigned from a set of addresses that are not reachable from outside the routing domain.

Nits:
a) In Sec 4.2.1.1: "The exclusion of routers reachable via an ECMP that includes S-E prevents the forwarding subsystem attempting to a repair endpoint via the failed link S-E."
s/attempting to a repair/from attempting to use a repair

Done

b) In Sec 10: "We propose "Remote LFA" as a natural second step." This is going to be an RFC - so rather than propose, try specify.

I have changed this to:

The purpose of LFA FRR technology is to provide for a simple FRR solution when such a solution is possible. The first step along this simplicity approach was "local" LFA [RFC5286]. This specification of "Remote LFA" is a natural second step.

Hopefully these resolutions are acceptable to all. If not, please let me know.


New version at http://datatracker.ietf.org/doc/draft-ietf-rtgwg-remote-lfa/

Diffs at http://www.ietf.org/rfcdiff?url1=draft-ietf-rtgwg-remote-lfa-08&difftype=--html&submit=Go!&url2=draft-ietf-rtgwg-remote-lfa-09


- Stewart

