Benoit Claise's No Objection on draft-ietf-rtgwg-mrt-frr-architecture-09: (with COMMENT)

Benoit Claise Thu, 04 Feb 2016 02:36:44 -0800

Benoit Claise has entered the following ballot position for
draft-ietf-rtgwg-mrt-frr-architecture-09: No Objection


When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-mrt-frr-architecture/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

We're lucky to have two OPS DIR reviews for this document.

From Fred Baker:

In my view, although I have concerns as I am about to state, I consider
the draft to be ready for IESG review and potential publication as an RFC
at Proposed Standard. I have no specific issues I would like to see
addressed, nor do I believe the technology or draft to be fundamentally
flawed.


Speaking in general terms, this draft describes a solution for the
problem posed in RFC 5714, which is to say a solution for fast reroute in
a network whose routing is implemented using IS-IS and LDP. It is not the
only possible solution. In terms of graph theory, we might define a
"connected graph" as a set of "nodes" and a set of "links" that
interconnect them, such that every node is connected via some sequence of
links and nodes to each of the other nodes in the connected graph. The
Maximally Redundant Tree model seeks to divide the connected graph into
two or more connected sub-graphs, each of which connects the same set of
nodes, but using sets of interconnecting links whose intersection set is
null, or is at least minimized. In the event that a link in one connected
sub-graph fails, the network can continue to use another connected
sub-graph to guide routing during the outage.
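
As an illustration of the property described above, here is a minimal Python sketch. It assumes a hypothetical four-node topology and treats links as undirected, following the simplified description in this review rather than the algorithm draft (which builds rooted trees). The names NODES, BLUE, RED, is_connected, and shared_links are invented for the example.

```python
from collections import defaultdict

def is_connected(nodes, links):
    """Return True if every node can reach every other node over `links`."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:  # iterative depth-first search
        n = stack.pop()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return nodes <= seen

def shared_links(links_a, links_b):
    """Return the links present in both sets, ignoring link direction."""
    def norm(links):
        return {frozenset(link) for link in links}
    return norm(links_a) & norm(links_b)

# Hypothetical topology: a four-node ring plus both diagonals.
NODES = {"A", "B", "C", "D"}
BLUE = [("A", "B"), ("B", "C"), ("C", "D")]
RED = [("A", "D"), ("D", "B"), ("A", "C")]

if __name__ == "__main__":
    print(is_connected(NODES, BLUE), is_connected(NODES, RED))
    print(shared_links(BLUE, RED))
    # Fail link B-C: BLUE is partitioned, RED still connects every node.
    blue_after_failure = [link for link in BLUE if link != ("B", "C")]
    print(is_connected(NODES, blue_after_failure), is_connected(NODES, RED))
```

In this sketch the two link sets have an empty intersection, so any single link failure leaves at least one of the two sub-graphs intact.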

There are obvious degenerate cases, in which the sets of links in
sub-graphs are forced to overlap to some degree, or some nodes are not
found in all sub-graphs. Part of the architecture is designed to identify
those cases (which might occur, for example, in the presence of multiple
simultaneous failures, or when the network is inherently deficient for
reasons unrelated to and perhaps in violation of the mathematics) and
handle them as best it can.

As one might imagine, this is not trivial. My first comment on reading
the architecture (and on reading the algorithm, which is a separate
document) is that the algorithm is complex, and therefore (like anything
that is complex) prone to errors and failures of various kinds, and
potentially has failure modes that have not yet been detected. This is
not to be considered as a strike against it, but a point of caution; the
operator using the approach wants to ensure that s/he has the tools
necessary to monitor network health, and to quickly discover and correct
errors if and when they occur. The algorithm draft contains several
proofs of correctness for various parts or in various cases, and refers
to papers containing such proofs, with the intent of minimizing the
inherent risk. That said, to my knowledge there is not a global proof of
correctness, as there is, for example, for the Shortest Path First algorithm
or other algorithms used in the network. The risk is therefore not zero.

From the perspective of the IETF, that is precisely the reason a protocol
like this should be used operationally at the Proposed Standard level,
updated as needed, and ultimately re-released as an Internet Standard
when the algorithm and implementations have been operationally proven.


With that introduction, the first question in my mind is whether the
description is such that two implementors are likely to be able to
implement interoperable implementations, or whether ambiguities or lack
of clarity would prevent that. This draft identifies two proprietary
prototype implementations, by Huawei and Juniper, which, if they are
interoperable, would address the question to a considerable degree. The
draft does not, however, describe interoperability testing between them,
which at least suggests that such testing has yet to be done. On this
score, given the complexity of the design, I personally would be greatly
comforted by a test report along the lines of RFC 1246. Since such tests
usually find text that needs tweaking, I might suggest that publication
as an RFC be delayed until such testing can be performed and the
lessons learned, whatever they are, incorporated in the documents.
Failing that, experience leads me to believe that there will be
subsequent documents that update or obsolete these.

The corollary question in my mind is whether an operator reading the
architecture will be able to figure out how to effectively use it. On
this score, I give the draft a thumbs-up. It is well written, the various
issues are raised and dealt with, and the ramifications are in my view
clear.



Now the review from Nevil Brownlee:

This is a long draft, presenting the MRT-FRR architecture, and exploring
in some detail the design alternatives that were possible during that
process.

There are many acronyms used throughout the draft that will work well
for readers familiar with Routing in general, and MPLS in particular.
Others will find it useful to keep a browser window at hand!  For me,
PLR (Point of Local Repair) was new.

In section 11.1, the equations that test whether a path is loop-free
for nodes S and F use D_opt() as an abbreviation for Distance_opt()
[RFC 5286] - I understand the authors wish to get these equations onto
single lines, but the phrase "where D_opt() means Distance_opt()" would
be helpful.
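
For readers unfamiliar with the notation, the basic loop-free condition of RFC 5286 (Inequality 1) is Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D). The sketch below evaluates that inequality on a hypothetical four-node topology. It illustrates the RFC 5286 condition only, not the exact equations in section 11.1 of the draft, and the names d_opt, is_loop_free, GOOD, and BAD are invented for the example.

```python
import heapq

INF = float("inf")

def d_opt(graph, src, dst):
    """Shortest-path distance from src to dst (Dijkstra); INF if unreachable."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, n = heapq.heappop(heap)
        if n == dst:
            return d
        if d > dist.get(n, INF):
            continue  # stale heap entry
        for m, cost in graph.get(n, {}).items():
            nd = d + cost
            if nd < dist.get(m, INF):
                dist[m] = nd
                heapq.heappush(heap, (nd, m))
    return INF

def is_loop_free(graph, s, n, d):
    """RFC 5286 Inequality 1: neighbor N does not route back through S to reach D."""
    return d_opt(graph, n, d) < d_opt(graph, n, s) + d_opt(graph, s, d)

# Hypothetical topology: S reaches D via E (primary) or via neighbor N.
GOOD = {
    "S": {"E": 1, "N": 1},
    "E": {"S": 1, "D": 1},
    "N": {"S": 1, "D": 2},
    "D": {"E": 1, "N": 2},
}
# Same topology, but the N-D link is so costly that N prefers to go via S.
BAD = {
    "S": {"E": 1, "N": 1},
    "E": {"S": 1, "D": 1},
    "N": {"S": 1, "D": 5},
    "D": {"E": 1, "N": 5},
}

if __name__ == "__main__":
    print(is_loop_free(GOOD, "S", "N", "D"))
    print(is_loop_free(BAD, "S", "N", "D"))
```

In the second topology N's shortest path to D runs back through S, so N cannot serve as a loop-free alternate for S.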

Throughout the draft the phrase "protocol extensions to .. will be
defined elsewhere" appears. Similarly, the IANA Considerations section
defines an MPLS Multi-Topology Identifiers Registry, but says that
codepoints in it will be defined elsewhere.  Clearly this draft is
the first in what will become a cluster of RFCs.

On the Operations side of things, section 1.2 notes that "MRT-FRR
supports partial deployment."  That will allow Operators to deploy
it in stages (one MRT Island at a time?).

Further, several sections consider the possibility of "link-protecting
alternates causing route looping"; it seems that MRT-FRR should remain
loop-free.

Section 13, Implementation Status [to be removed by the RFC Editor],
demonstrates that at least two implementations exist. Clearly that has
helped the authors to work through the design decisions I commented on
above.

Section 14, Operational Considerations, works through the most important
of the decisions an Operator will need to make if they plan to implement
MRT-FRR - this seems very useful.

Overall, the draft is well-written and easy to read (apart from its
high acronym density). I believe it is ready for publication as an RFC.


_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
