I have been selected as the General Area Review Team (Gen-ART)
reviewer for this draft (for background on Gen-ART, please see
http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
Please resolve these comments along with any other Last Call comments
you may receive.
Document: draft-ietf-nsis-rmd-16.txt
RMD-QOSM - The Resource Management in Diffserv QOS Model
Reviewer: Joel M. Halpern
Review Date: 13-Mar-2010
IETF LC End Date: 22-Mar-2010
IESG Telechat date: N/A
Summary: This document is not ready for publication as an Experimental RFC.
Clarity Issue:
The document makes repeated use of the term Severe congestion. It
seems inevitable that a somewhat fuzzy definition will be used for that,
and I would not have concern about such fuzziness. However, the
definition used in the document, in section 2, presumably with the
understanding and agreement of the working group, is
"congestion that occurs when a node or link fails and the traffic is
rerouted through another node or link." This property (being caused by
node or link failure) has nothing to do with the severity of the
congestion. The text goes on to talk about this type of congestion not
being addressable via admission control.
It is possible that the document means severe congestion (in the more
conventional sense) with the added caveat that it is brought about by
failure. But that is not what the definition says. (If that is indeed
the intent, then clarifying the definition will suffice to resolve this
issue.)
Also, as a lesser matter, there are systems which do address / prevent
element failures from causing severe congestion by using admission
control, so the claim in the definition that it can not be addressed by
admission control is at best misleading. Those systems require very
different behaviors than RMD, so they are presumably inapplicable to
this situation.
Major issues:
Section 3.2.3 on applicability seems to state that although there are
multiple RMD-QOSM schemes, none are mandatory to implement, and that a
domain must use a single scheme throughout. I am not sure if "scheme"
here refers
to this document as distinct from some other document, or refers to the
variations (such as reduced state, and two varieties of stateless) on
interior node behavior. If, as seems to be the case since the following
text defines 5 schemes, it is referring to the interior behavior
choices, it would seem that there needs to be a mandatory-to-implement
scheme in order for this document to promote interoperability rather
than fragmentation of the network.
In this day and age, it seems surprising that the protocol specifies
that the interior messages are to be sent with no security. The IETF is
actively working to improve the security of intra-domain and
inter-domain routing, so this decision seems wrong. (Even for an
experiment.) (Section 4.4, 4th bullet.) At the very least, some
explanation of this choice is necessary.
The text in section 4.1.2 states that the 8-bit overload % field
contains a real value. However, I could not find a description of the
encoding by which a real value (between 0 and 1?) should be
encoded in the message.
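To illustrate why the encoding needs to be specified, here is a minimal
sketch in Python. Both encodings below are hypothetical (neither is
taken from the draft, and the function names are mine); the point is
only that two reasonable readings of "a real value in 8 bits" put
different values on the wire for the same overload level.

```python
def _check_fraction(x: float) -> None:
    # Assumption: the "real value" is a fraction in the range [0, 1].
    if not 0.0 <= x <= 1.0:
        raise ValueError("overload fraction must be within [0, 1]")


def encode_as_percent(x: float) -> int:
    """Hypothetical reading A: integer percentage, using values 0..100."""
    _check_fraction(x)
    return round(x * 100)


def encode_as_scaled(x: float) -> int:
    """Hypothetical reading B: fraction scaled to the full 8-bit range 0..255."""
    _check_fraction(x)
    return round(x * 255)


overload = 0.4  # 40% overload
percent_octet = encode_as_percent(overload)
scaled_octet = encode_as_scaled(overload)
print(percent_octet, scaled_octet)
```

With a 40% overload, the first reading yields 40 and the second yields
102, so implementations that pick different readings would not
interoperate.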
Minor issues:
The measurement based admission control mechanism used here looks
remarkably similar to the classical RSVP Predictive service. Both of
these are based on the assumption that current measured characteristics
are an indicator of future load. It is not at all apparent that there
is any such relationship. It seems that the text ought to include some
indication as to what the basis of suggesting this be used is, and why
it is thought to be meaningful. Even if the argument is "it is worth
trying", it seems worth stating that, and stating why it is thought that
it will work now.
It would probably be helpful to explain why it is necessary or
desirable to use two different RESERVE messages across the same domain,
traversing the same set of devices, with different but closely related
information (particularly in light of the comments about reducing load
on intermediate devices).
The applicability section states that this mechanism can only be used
with the EF DSCP. Is it further the case that it can only be used for
traffic which consistently uses a stable amount of bandwidth (per
reservation)? One of the difficulties with the style of reservation
based on measurement of load is that the endpoint requesting the
measurement must be aware of whether the measurement data includes the
flow being considered for admission. Otherwise, large flows can cause
significant confusion. With very stable flows, as long as the
measurements are not requested too often, this is achievable.
Otherwise, it is not at all clear to this reader how the proposed
mechanism would work (particularly when refreshing a reservation).
Continuing this line of questioning, the mechanism for modification
seems to send the new bandwidth through the stateless intra-domain
routers. Since they are stateless, those routers do not know what the
old reservation was. And the measurements presumably include traffic
under the old reservation. If these are added together, significant
double-counting would seem to occur. (This is listed as minor on the
premise that the protocol presumably actually works, and therefore the
problem is one of reader comprehension, rather than more serious
technical issues.)
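The concern can be made concrete with a small sketch in Python. This is
my reading of the problem, not a description of how RMD actually
behaves: the admission test, the numbers, and the assumption that the
flow is currently sending at its full old reservation are all
hypothetical.

```python
def admits(measured_load: float, requested: float, capacity: float) -> bool:
    """Hypothetical stateless check: admit if measured load plus request fits."""
    return measured_load + requested <= capacity


capacity = 100.0         # link capacity available to the class (made-up units)
other_traffic = 50.0     # load from all other flows
old_reservation = 30.0   # this flow's current reservation, assumed fully used
new_reservation = 40.0   # bandwidth requested by the modification

# A stateless node cannot separate this flow from the rest of the load,
# so its measurement already includes the old reservation.
measured = other_traffic + old_reservation  # 80.0

# Signalling the full new bandwidth counts the old 30 units twice:
# 80 + 40 = 120, which exceeds the capacity of 100.
naive_result = admits(measured, new_reservation, capacity)

# Signalling only the increase avoids that: 80 + 10 = 90.
delta_result = admits(measured, new_reservation - old_reservation, capacity)

print(naive_result, delta_result)
```

Under these assumptions the modification is rejected when the full new
bandwidth is signalled, even though the actual load afterwards (90)
would fit. If the draft avoids this, a sentence saying how would help.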
I was not able to understand the purpose or use of the K bit. I may
have missed it in the dense text. Assuming there is an explanation, a
pointer at the point where the bit is defined to the text which explains
its use would be a very good idea.
Yours,
Joel M. Halpern
Nits/editorial comments: