Dear Joel
Thank you very much for the excellent review!
Below we have tried to resolve all of your comments and issues.
It would be great if you could let us know whether you are satisfied with
these solutions.
On 3/13/2010, "Joel M. Halpern" <[email protected]> wrote:
I have been selected as the General Area Review Team (Gen-ART)
reviewer for this draft (for background on Gen-ART, please see
http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
Please resolve these comments along with any other Last Call comments
you may receive.
Document: draft-ietf-nsis-rmd-16.txt
RMD-QOSM - The Resource Management in Diffserv QOS Model
Reviewer: Joel M. Halpern
Review Date: 13-Mar-2010
IETF LC End Date: 22-Mar-2010
IESG Telechat date: N/A
Summary: This document is not ready for publication as an Experimental RFC.
Clarity Issue:
The document makes repeated use of the term Severe congestion. It
seems inevitable that a somewhat fuzzy definition will be used for that,
and I would not have concern about such fuzziness. However, the
definition used in the document, in section 2, presumably with the
understanding and agreement of the working group, is
"congestion that occurs when a node or link fails and the traffic is
rerouter through another node or link." This property (being caused by
node or link failure) has nothing to do with the severity of the
congestion. The text goes on to talk about this type of congestion not
being addressable via admission control.
It is possible that the document means severe congestion (in the more
conventional sense) with the added caveat that it is brought about by
failure. But that is not what the definition says. (If that is indeed
the intent, then clarifying the definition will suffice to resolve this
issue.)
Also, as a lesser matter, there are systems which do address / prevent
element failures from causing severe congestion by using admission
control, so the claim in the definition that it can not be addressed by
admission control is at best misleading. Such systems require very different
behaviors than RMD, so they are presumably inapplicable to this situation.
Georgios: We would like to change the severe congestion definition given
in Section 2 as follows,
From:
Severe congestion: is a congestion that occurs when a node or link
fails and the traffic is rerouter through another node or link. If no
measures are taken than the node or the link can become severely
congested and all traffic passing through the node or link will
severely degrade in QoS. This type of congestion cannot be solved
using admission control mechanisms.
INTO:
Severe congestion: The congestion situation on a particular link
within the RMD domain in which the real packet queue on that link grows
significantly because traffic re-routed after a link failure has to be
supported by this particular link. A failure in a communication path,
e.g., of a router or a link, causes the routing algorithms to adapt to
this failure by changing the routing decisions to reflect the changes
in the topology and traffic volume. As a result, the re-routed traffic
follows a new path and link, which may result in severely overloaded
nodes and links, as they need to support more traffic than they are able
to for a long time.
Major issues:
Section 3.2.3 on applicability seems to state that although there are
multiple RMD-QOSM schemes, none is mandatory to implement, and that a
domain must use a single scheme throughout. I am not sure if "scheme" here refers
to this document as distinct from some other document, or refers to the
variations (such as reduced state, and two varieties of stateless) on
interior node behavior. If, as seems to be the case since the following
text defines 5 schemes, it is referring to the interior behavior
choices, it would seem that there needs to be a mandatory-to-implement
scheme in order for this document to promote interoperability rather
than fragmentation of the network.
Georgios: We will make the following modifications in order to solve this
issue:
In Section 3.2.3 we will replace the existing paragraph that is
associated with the above issue with the following paragraph:
------------------
A very important consideration on using RMD-QOSM is that within one
RMD domain only one of the following RMD-QOSM schemes can be used at
a time. Thus an RMD router can never process and use two different
RMD-QOSM signaling schemes at the same time.
However, all schemes MUST be implemented within one RMD domain. The
operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PHR
container", see Section 4.1.2, and the "PDR container", see Section
4.1.3, always uses the same value, such that within one RMD domain only
one of the RMD-QOSM schemes described below can be used at a time.
----------------
Moreover, in Sections 4.1.2 and 4.1.3 we will include the new description
of the <SCH> field, which will occupy the rightmost 3 bits of the
second 32-bit payload word. The following text will be used:
------------------
In Section 4.1.2:
--------------------
<SCH>: 3 bits. The <SCH> value specifies which of the 6 RMD
scenarios, see Section 3.2.3, MUST be used within the RMD domain.
The operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PHR
container" always uses the same value, such that within one RMD domain
only one of the RMD-QOSM schemes described below can be used at a time.
All the QNE interior nodes MUST interpret this field before processing
any other "PHR container" payload fields. The currently defined <SCH>
values are:
o 0: RMD-QOSM scheme MUST be: "per flow congestion notification
  based on probing";
o 1: RMD-QOSM scheme MUST be: "per flow RMD NSIS measurement
  based admission control";
o 2: RMD-QOSM scheme MUST be: "per flow RMD reservation based"
  in combination with "severe congestion handling by the RMD-QOSM
  refresh procedure";
o 3: RMD-QOSM scheme MUST be: "per flow RMD reservation based"
  in combination with "severe congestion handling by proportional
  data packet marking";
o 4: RMD-QOSM scheme MUST be: "per aggregate RMD reservation
  based" in combination with "severe congestion handling by the
  RMD-QOSM refresh procedure";
o 5: RMD-QOSM scheme MUST be: "per aggregate RMD reservation
  based" in combination with "severe congestion handling by
  proportional data packet marking";
o 6 – 7: reserved.
The default value of the <SCH> field SHOULD be set to 3.
------------
In Section 4.1.3:
--------------
<SCH>: 3 bits. The <SCH> value specifies which of the 6 RMD
scenarios, see Section 3.2.3, MUST be used within the RMD domain.
The operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PDR
container" always uses the same value, such that within one RMD domain
only one of the RMD-QOSM schemes described below can be used at a time.
All the QNE interior nodes MUST interpret this field before processing
any other "PDR container" payload fields. The currently defined <SCH>
values are:
o 0: RMD-QOSM scheme MUST be: "per flow congestion notification
  based on probing";
o 1: RMD-QOSM scheme MUST be: "per flow RMD NSIS measurement
  based admission control";
o 2: RMD-QOSM scheme MUST be: "per flow RMD reservation based"
  in combination with "severe congestion handling by the RMD-QOSM
  refresh procedure";
o 3: RMD-QOSM scheme MUST be: "per flow RMD reservation based"
  in combination with "severe congestion handling by proportional
  data packet marking";
o 4: RMD-QOSM scheme MUST be: "per aggregate RMD reservation
  based" in combination with "severe congestion handling by the
  RMD-QOSM refresh procedure";
o 5: RMD-QOSM scheme MUST be: "per aggregate RMD reservation
  based" in combination with "severe congestion handling by
  proportional data packet marking";
o 6 – 7: reserved.
The default value of the <SCH> field SHOULD be set to 3.
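For illustration only (this is not part of the proposed draft text), the
<SCH> handling described above could be sketched in Python as follows.
The sketch assumes that "the rightmost 3 bits" means the 3 least
significant bits of the second 32-bit payload word when that word is read
as an unsigned integer in network byte order. The names RmdScheme,
decode_sch and matches_domain_scheme are hypothetical, and rejecting the
reserved values 6 and 7 with an error is an assumption about receiver
behavior that the text does not state.

```python
from enum import IntEnum


class RmdScheme(IntEnum):
    """<SCH> values 0..5 as listed in the proposed text (names are illustrative)."""
    PROBING_CONGESTION_NOTIFICATION = 0
    MEASUREMENT_BASED_ADMISSION = 1
    PER_FLOW_RESERVATION_REFRESH = 2
    PER_FLOW_RESERVATION_MARKING = 3
    PER_AGGREGATE_RESERVATION_REFRESH = 4
    PER_AGGREGATE_RESERVATION_MARKING = 5


# The proposed text says the default SHOULD be 3.
DEFAULT_SCHEME = RmdScheme.PER_FLOW_RESERVATION_MARKING

# Assumption: <SCH> sits in the 3 least significant bits of the word.
SCH_MASK = 0b111


def decode_sch(second_word: int) -> RmdScheme:
    """Extract <SCH> from the second 32-bit payload word of a container."""
    if not 0 <= second_word <= 0xFFFFFFFF:
        raise ValueError("payload word must fit in 32 bits")
    value = second_word & SCH_MASK
    if value > 5:
        # Values 6 and 7 are reserved; treating them as an error is an assumption.
        raise ValueError(f"reserved <SCH> value: {value}")
    return RmdScheme(value)


def matches_domain_scheme(second_word: int, configured: RmdScheme) -> bool:
    """Check a received container against the scheme configured for the domain."""
    return decode_sch(second_word) is configured
```

An interior node would call decode_sch before looking at any other
payload field, which mirrors the requirement that <SCH> is interpreted
first.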
In this day and age, it seems surprising that the protocol specifies
that the interior messages are to be sent with no security. The IETF is
actively working to improve the security of intra-domain and
inter-domain routing, so this decision seems wrong. (Even for an
experiment.) (Section 4.4, 4th bullet.) At the very least, some
explanation of this choice is necessary.
Georgios: You are right, the text needs to be clarified. What we meant
is that RMD-QOSM relies mainly on the security and reliability support
that is provided by the bound end-to-end session, which is running
between the boundaries of the RMD domain (i.e., the RMD-QOSM QNE edges),
and on the security provided by the D-mode.
We would like to change this bullet to:
* When the QNE Ingress needs to send an intra-domain RESERVE
message that is not an initial RESERVE, then the QoS-NSLP sends
this message by including in the GIST API SendMessage primitive such
attributes that the usage of the Datagram Mode is implied, e.g., the
Unreliable attribute. Furthermore, the Local policy attribute is set
such that GIST sends the intra-domain RESERVE message in Q-mode
even if there is a routing state at the QNE Ingress. In this way the
GIST functionality uses its local policy to send the intra-domain
RESERVE message by piggybacking it on a GIST DATA message and sending
it in Q-mode even if there is a routing state for this session. The
intra-domain RESERVE message is piggybacked on the GIST DATA message
that is forwarded and processed by the QNE Interior nodes up to the
QNE Egress.
------------------
Moreover, we will include the following paragraph in the introductory
part of Section 4.4.
------------------
“RMD-QOSM relies on the security and
reliability support that is provided by the bound end-to-end session,
which is running between the boundaries of the RMD domain (i.e., the
RMD-QOSM QNE edges), and the security provided by the D-mode.”
-------------------
The text in section 4.1.2 states that the 8 bit overload % field
contains a real value. However, I could not find a description of the
encoding by which a real value (between 0 and 1?) should be
encoded in the message.
Georgios: We agree that the description of this field is not clear and
that more bits than needed are used to represent this type of overload.
Therefore, we will make the following changes:
In Section 4.1.2 we will make the following change.
Change from:
<Overload %>:
8 bits In case of severe congestion the level of overload is
indicated by the Overload %. Overload % is the percentage of the
measured PHB bit rate that is above the bit rate rate used to detect
a severe congestion. Overload % SHOULD be higher than 0 if S bit is
set. If overload in a node is greater than the overload in a previous
node then Overload % SHOULD be updated. For more details see Section
4.6.1.6.1. Note that this field represents a real parameter.
INTO:
<Overload>:
1 bit. This field is used by the severe congestion handling scheme
that uses the RMD-QOSM refresh procedure. This bit is set when an
overload on a QNE interior node is detected and when this field is
carried by the "PHR_Refresh_Update" container. <Overload>
SHOULD be set to "1" if the <S> bit is set. For more details see
Section 4.6.1.6.1.
In Section 4.1.3, a similar change will be applied to this parameter:
<Overload>:
1 bit. This field is used by the severe congestion handling scheme
that uses the RMD-QOSM refresh procedure. This bit is set when an
overload on a QNE interior node is detected and when this field is
carried by the "PDR_Congestion_Report" container. <Overload>
SHOULD be set to "1" if the <S> bit is set. For more details see
Section 4.6.1.6.1.
-------------
This change to the <Overload> field will be carried through the rest of
the text.
Minor issues:
The measurement based admission control mechanism used here looks
remarkably similar to the classical RSVP Predictive service. Both of
these are based on the assumption that current measured characteristics
are an indicator of future load. It is not at all apparent that there
is any such relationship. It seems that the text ought to include some
indication as to what the basis of suggesting this be used is, and why
it is thought to be meaningful. Even if the argument is "it is worth
trying", it seems worth stating that, and stating why it is thought that
it will work now.
Georgios: You are right about the fact that the measurement based
admission control scheme can only support a predictive service.
We would like to change the following paragraph in section 3.1 from:
The measurement-based algorithm continuously measures traffic levels
and the actual available resources, and admits flows whose resource
needs are within what is available at the time of the request.
INTO:
The measurement-based algorithm continuously measures traffic levels
and the actual available resources, and admits flows whose resource
needs are within what is available at the time of the request. The
measurement-based algorithm is used to support a predictive service
where the service commitment is somewhat less reliable than the service
that can be supported by the reservation-based method. A main assumption
made by such measurement-based admission control mechanisms is
that the aggregated PHB traffic passing through an RMD interior node is
high and that, therefore, current measurement characteristics can be
considered an indicator of future load.
It would probably be helpful to explain why it is necessary or
desirable to use two different RESERVE messages across the same domain,
traversing the same set of devices, with different but closely related
information. (particularly in light of the comments about reducing load
on intermediate devices.)
Georgios: You are right; we will try to clarify this as follows:
In Section 3.1 we would like to change the following text from:
“The basic RMD-QOSM/QoS-NSLP signaling is shown in Figure 3. The
signalling scenarios are accomplished using the QoS-NSLP processing
rules defined in [QoS-NSLP], in combination with the RMF triggers
sent via the QoS-NSLP-RMF API described in [QoS-NSLP]. A RESERVE
message is created by a QNI with an Initiator QSpec describing the
reservation and forwarded along the path towards the QNR.”
INTO:
"The basic RMD-QOSM/QoS-NSLP signaling is shown in Figure 3. The
signalling scenarios are accomplished using the QoS-NSLP processing
rules defined in [QoS-NSLP], in combination with the RMF triggers
sent via the QoS-NSLP-RMF API described in [QoS-NSLP].
Because a different QoS model can be supported within the RMD domain
than the end-to-end QoS model applied at the edges of the RMD domain,
the RMD interior node reduced state reservations can be
updated independently of the per-flow end-to-end reservations, see
Section 4.7 of [QoS-NSLP]. Therefore, two different RESERVE messages
are used within the RMD domain: one RESERVE message that is associated
with the per-flow end-to-end reservations and is used by the edges of
the RMD domain, and one that is associated with the reduced state
reservations within the RMD domain."
The applicability section states that this mechanism can only be used
with the EF DSCP. Is it further the case that it can only be used for
traffic which consistently uses a stable amount of bandwidth (per
reservation)? One of the difficulties with the style of reservation
based on measurement of load is that the endpoint requesting the
measurement must be aware of whether the measurement data includes the
flow being considered for admission. Otherwise, large flows can cause
significant confusion. With very stable flows, as long as the
measurements are not requested too often, this is achievable.
Otherwise, it is not at all clear to this reader how the proposed
mechanism would work (particularly when refreshing a reservation).
Continuing this line of questioning, the mechanism for modification
seems to send the new bandwidth through the stateless intra-domain
routers. Since they are stateless, those routers do not know what the
old reservation was. And the measurements presumably include traffic
under the old reservation. If these are added together, significant
double-counting would seem to occur. (This is listed as minor on the
premise that the protocol presumably actually works, and therefore the
problem is one of reader comprehension, rather than more serious
technical issues.)
Georgios: You are right that the descriptions are not clear. Please note
that with the measurement based scheme the requested peak bandwidth of a
flow is carried by the admission control request. The admission decision
is considered as positive if the currently carried traffic, as
characterized by the measured statistics, plus the requested resources
for the new flow exceeds the system capacity with a probability smaller
than a value alpha. Otherwise, the admission decision is negative. It is
important to emphasize that because the interior nodes are
stateless, they do not store information about previous admission control
requests. This could lead to a situation where the admission control
accuracy is decreased when multiple flows (sharing a common
interior node) request admission control simultaneously. By
applying measurement techniques, see e.g., [JaSh97], [GrTs03], which
use current and past information on NSIS sessions that requested
resources from an NSIS-aware interior node, the decrease in admission
control accuracy can be limited.
Moreover, the RMD measurement based schemes described in this document do
not use any refresh procedures, since these approaches are used in
stateless nodes, see Section 4.6.1.3.
In order to clarify the text we would like to do the following.
The abstract description of the measurement based admission control
mechanism given in Section 3.1 will be enhanced as follows:
We will add the following paragraph in Section 3.1:
“It is important to emphasize that the RMD measurement based schemes
described in this document do not use any refresh procedures, since
these approaches are used in stateless nodes, see Section 4.6.1.3.
With the measurement based scheme the requested peak bandwidth of a flow
is carried by the admission control request. The admission decision is
considered as positive if the currently carried traffic, as
characterized by the measured statistics, plus the requested resources
for the new flow exceeds the system capacity with a probability smaller
than a value alpha. Otherwise, the admission decision is negative. It is
important to emphasize that because the interior nodes are
stateless, they do not store information about previous admission control
requests. This could lead to a situation where the admission control
accuracy is decreased when multiple flows (sharing a common
interior node) request admission control simultaneously. By
applying measurement techniques, see e.g., [JaSh97], [GrTs03], which
use current and past information on NSIS sessions that requested
resources from an NSIS-aware interior node, the decrease in admission
control accuracy can be limited."
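To make the admission rule above concrete, here is a minimal Python
sketch. It is illustrative only: the text does not say how the measured
statistics are modelled, so the sketch assumes the measured aggregate load
is summarized by a mean and a standard deviation and treated as normally
distributed. The function names and parameters are hypothetical. A flow
is admitted when the probability that the measured load plus the requested
peak bandwidth exceeds the capacity is smaller than alpha.

```python
import math


def overflow_probability(mean_load: float, std_load: float,
                         peak_request: float, capacity: float) -> float:
    """P(measured load + requested peak > capacity) under a normal-load assumption."""
    # Headroom left for the existing traffic once the new flow's peak is counted.
    threshold = capacity - peak_request
    if std_load <= 0:
        # Degenerate case: the load is treated as a constant.
        return 1.0 if mean_load > threshold else 0.0
    z = (threshold - mean_load) / std_load
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def admit(mean_load: float, std_load: float, peak_request: float,
          capacity: float, alpha: float) -> bool:
    """Positive admission decision if the overflow probability is below alpha."""
    return overflow_probability(mean_load, std_load, peak_request, capacity) < alpha
```

For example, with a capacity of 100 units, a measured load of mean 60 and
standard deviation 5, and a requested peak of 10, the request is admitted
for alpha = 0.01; with a measured mean of 85 it is rejected. Note that the
function keeps no per-request state, so two requests evaluated at the same
instant both see the same measured load, which is exactly the accuracy
problem described above.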
I was not able to understand the purpose or use of the K bit. I may
have missed it in the dense text. Assuming there is an explanation, a
pointer at the point where the bit is defined to the text which explains
its use would be a very good idea.
Georgios: You are right. The use of the <K> bit is described in Section
4.6.1.5.2.
The description of the <K> bit will be changed as follows:
<K>: 1 bit. When set to "1" it indicates that the resources/bandwidth
carried by a tearing RESERVE MUST NOT be released and the
resources/bandwidth carried by a non-tearing RESERVE MUST NOT be
reserved/refreshed. For more details see Section 4.6.1.5.2.
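As a reading aid only, the <K> bit rule can be written as a small decision
function. The names below are hypothetical. The behavior for <K> set to
"0" (release on a tearing RESERVE, reserve or refresh otherwise) is an
assumption about normal processing and is not stated in the description
above.

```python
from enum import Enum


class InteriorAction(Enum):
    RESERVE_OR_REFRESH = "reserve_or_refresh"
    RELEASE = "release"
    LEAVE_UNCHANGED = "leave_unchanged"


def interior_action(is_tearing: bool, k_bit: bool) -> InteriorAction:
    """What a node does with the bandwidth carried by a RESERVE message."""
    if k_bit:
        # <K> = 1: a tearing RESERVE MUST NOT release, and a non-tearing
        # RESERVE MUST NOT reserve/refresh, so the state stays untouched.
        return InteriorAction.LEAVE_UNCHANGED
    # <K> = 0: assumed normal processing (not spelled out in the text above).
    return InteriorAction.RELEASE if is_tearing else InteriorAction.RESERVE_OR_REFRESH
```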
Best regards,
Georgios
Yours,
Joel M. Halpern
Nits/editorial comments: