Hello,
I have been selected as the Routing Directorate reviewer for this draft. The
Routing Directorate seeks to review all routing or routing-related drafts as
they pass through IETF last call and IESG review. The purpose of the review is
to provide assistance to the Routing ADs. For more information about the
Routing Directorate, please see
http://www.ietf.org/iesg/directorate/routing.html
Although these comments are primarily for the use of the Routing ADs, it would
be helpful if you could consider them along with any other IETF Last Call
comments that you receive, and strive to resolve them through discussion or by
updating the draft.
Document: draft-ietf-grow-ops-reqs-for-bgp-error-handling-05.txt
Reviewer: Geoff Huston
Review Date: 11 September 2012
IETF LC End Date: 13 September 2012
Intended Status: Informational
Summary:
I have some major concerns about this document that I think should be resolved
before publication. I also have some minor concerns that relate to
the manner of expression of these requirements. I do not have major concerns
about the technical content of the requirements described in this document.
Comments:
This document is not clearly written and is difficult to understand.
The requirements are scattered across voluminous text, which is unhelpful. I
would've preferred to read a document which managed to enumerate the same
requirements in under 8 pages of text, while the current count of 28 pages
appears to be consumed by prolix and repetitive text that contributes neither
to the precision of the description of the requirements, nor to the description
of the rationale for the requirements.
At its current length, and with its density of expression and level of
repetition, I would suggest that its utility to future readers is
unfortunately compromised. This is a shame, as a well-considered set of
operational requirements for BGP error handling is buried within this
document.
Major Issues:
There is a major issue here in terms of the overall readability, and a lack of
conciseness in expression and of clear structuring of the subject material in
an organised and coherent manner.
More specifically, I take issue with the classification approach used in
Section 2, and I am of the opinion that it should be rewritten to aid clarity
and readability. I find it confusing to see "Critical" and "Semantic" error
classifications. It would make more sense to me to call these categories
"Critical" and "Non-Critical". I would also suggest using this classification
to define the proposed handling - i.e. Critical Errors are such that the BGP
message framing has been lost, and it is necessary to restart the session or
undertake some other error handling mechanism that would re-establish BGP
message framing, and Non-Critical Errors are such that BGP message framing has
NOT been lost, and the error recovery process can be managed through various
forms of local actions and potentially some form of additional BGP
protocol-level interaction that would not require a session tear-down to repair.
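As a rough illustration of the classification I have in mind, the handling
could be derived from a single question, namely whether message framing has
survived. This is a sketch only: the names ErrorClass, classify and
handling_for are hypothetical and do not appear in the draft or in any BGP
implementation.

```python
from enum import Enum


class ErrorClass(Enum):
    """Hypothetical two-way classification suggested in this review."""
    CRITICAL = "Critical"
    NON_CRITICAL = "Non-Critical"


def classify(framing_intact: bool) -> ErrorClass:
    """Classify an error solely by whether BGP message framing survived."""
    return ErrorClass.NON_CRITICAL if framing_intact else ErrorClass.CRITICAL


def handling_for(error_class: ErrorClass) -> str:
    """Map each class to the (illustrative) handling described above."""
    if error_class is ErrorClass.CRITICAL:
        # Framing is lost: restart the session, or use some other
        # mechanism that re-establishes message framing.
        return "re-establish message framing (e.g. session restart)"
    # Framing is intact: recover through local action without a tear-down.
    return "local action, session stays up"
```

The point of the sketch is that the class name itself tells the reader which
handling applies, which "Semantic" does not.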
I am also at a loss to understand the role of section 3 in this requirements
document. It appears to be making the case that the current BGP error handling
approach is ill-suited to operational requirements and that different forms of
error handling should be placed as requirements for the protocol. I would
conventionally expect to see these arguments appear in section 1 of this
document, as part of the argument for the motivation for a new set of error
handling requirements. This is perhaps a specific instance of the previously
mentioned issue that this document could benefit from some careful thought in
the manner of the organisation of the presented material.
I also note that Section 6, Operational Toolset for Monitoring BGP, represents
a scope creep for this document. My concern here is that any general comments
about monitoring BGP would not normally be expected to be enumerated in a
document that was intended to address the requirements for BGP's handling of
error conditions. I am not aware whether the Working Group has considered the
possibility of separately addressing error handling and operational monitoring
in two distinct operational requirements documents, but from my review of
this document it does appear that a case can be made here for this form of
clear delineation.
In any case, I would suggest that the document would benefit from a major
revision that was focussed on a clear enumeration of the requirements for
error handling rather than the current document form of a somewhat less
structured collection of comments on the BGP NOTIFICATION message and its
current method of handling, comments on existing work in progress on error
handling approaches and mechanisms and the inclusion of consideration of error
handling scope, and the considerations behind re-interpretation of certain
forms of erroneous UPDATES as implicit WITHDRAW messages. At present all these
concepts have been added into the document in a manner that tends to blur the
distinction between a description of the requirement itself and the motivation
for this requirement.
Minor Issues:
Last sentence of the abstract: namely the "overview of a set of enhancements to
BGP-4" is inconsistent with the document's purpose as represented in the title
("Requirements for Enhanced Error Handling Behaviour") or in later parts of the
document. Needs revision.
Introduction: first sentence - "numerous incidents..." is imprecise and
uninformative - perhaps dropping this adjective would help here. Also in this
sentence I would suggest changing "due to the" to "as a consequence of the".
Introduction: second sentence - "the deployments of the protocol have changed
within modern networks" does not parse for me. Is this intending to say that
some current implementations of this protocol deviate from the
standards-defined behaviour?
Introduction: This entire section could be reduced by noting that "BGP's
current error handling behaviour, as defined in RFC 4271, defines a single error
handling response, namely that of session reset. This response has significant
impacts within an operational environment. This memo proposes a set of
requirements for further refinement of the standard behaviour for error
handling in BGP."
The remainder of the document would benefit considerably from a similar
editorial pass. It is simply way too prolix and this detracts from the
effectiveness of the document as a description of a set of requirements.
Section 1.1, first sentence "... are designed to be conducive to this role" -
frankly I have no idea what this means. Is it "consistent with this role"? But
even then it makes no sense. Indeed the first sentence defeats me as to its
intended purpose.
Section 1.1, second sentence - there is some jarring imprecision here that
should be deleted - the "relatively small" amount of NLRI information makes no
sense to me as I am unsure to what this "relative" comparison is being made.
Section 1.1, third sentence. This sentence, "In this case, it is the
expectation.." is wordy and terribly expressed. I was thinking of ways to say
this more concisely, but maybe it would be better to remove it completely.
Section 1.1, last sentence. This sentence, about the expectation to be able to
use sub-optimal paths is a bit of a martian for me - the concept is introduced
here without warning and without context - I thought this was a requirements
document for error handling, and it is unclear how this statement relates to
that purpose.
Section 1.1, second paragraph - "Traditional network architectures _use_ an..."
Section 1.1, second paragraph - the author is implying that the requirements
for IGPs and EGPs differ in terms of robustness. It would be helpful if this
claim was substantiated in some manner, in so far as this reviewer does not see
much of a difference at all - both protocols have a very high requirement for
robust operation from this reviewer's perspective.
Section 1.1, third paragraph - yes, BGP carries more information, but the case
that this augmented use provides justification for an altered error handling is
weak, and in my view superfluous to the document's purpose. The previous
paragraph provides adequate motivation and this third paragraph appears to be
another repetition of the basic assertion that "BGP plays a critical role in
network operation, and BGP error handling should not cause a hiatus in the
supply of information provided by the operation of BGP."
Section 1.1, fourth paragraph appears to be saying that: "BGP systems carry
large volumes of information, and the time taken to recover from an
error-triggered session reset is now a significant factor in terms of overall
network robustness. Error handling approaches that limit the scope of error
recovery to those NLRIs mentioned in the erroneous BGP UPDATE message should be
considered within a requirement set for error handling."
It is possible to reduce section 1.1 to two paragraphs and a more concise set
of statements about the problems that the current standards-defined error
handling response poses to network operators.
Section 1.2 - here the first sentence is a restatement of the document's purpose,
already stated in the Abstract and in section 1 - there is no need to restate
it here. The rest of this section is again very wordy. It may be worth
considering a more concise restatement of these requirements, namely that error
handling should avoid the use of session resets where possible, error handling
should, where possible, be limited in scope to those NLRI UPDATEs that can be
associated with the error condition, and where session reset is considered to
be unavoidable, various forms of more graceful session restart should be
considered. Furthermore, as a more general BGP requirement, the inclusion of
mechanisms to allow for operational monitoring of BGP should be stated as an
operational requirement.
Section 2 - I have trouble parsing the structure of this section - perhaps it's
because the first four paragraphs here are a more verbose repetition of the
information presented in sections 2.1.1 and 2.1.2.
Section 3 - paragraph 3 - I am confused by the purpose of the second half of
this paragraph, starting with the sentence beginning with "It should, however,
be considered if this view is valid..." The first half of the paragraph is
discussing the "treat as withdraw" in the context of iBGP, but the second half
of the paragraph does not appear to conclude this discussion.
Section 4 - paragraph 3 - this is an example of an embedded "requirement" that
should be avoided. It would be far clearer to pull out all these requirements
and enumerate them and for each one outline concisely the rationale for the
requirement and its intended effect on the operation of BGP.
Section 4 - paragraph 4 contains another example of this embedded "requirement".
Section 5 - paragraph 2 - This sentence: "Clearly, there is some utility to
this requirement, as error conditions in BGP are, in general, exited from."
What does this mean? I am at a bit of a loss in reading section 5, as, once
more, there are embedded "requirements" and a lot of repetition of material
from earlier sections.
Nits:
Abstract, para 1, sentence 3 - s/strict/standards-defined/ and s/message
causing/message, causing/
Section 1.1, second paragraph ... "As such, BGP has become an IGP" is better
expressed as "As such, iBGP has become an IGP"
Section 1.2 s/UPDATE packet/UPDATE message/
Section 2, first sentence - expand the first use of "DFZ"
Section 2 - why does this document use "BGP-4" and "BGP"? - please pick one
term or the other. I suggest using "BGP" uniformly through the document.