Re: [Gen-art] LC review: draft-ietf-nsis-rmd-16.txt

Georgios Karagiannis Tue, 16 Mar 2010 19:29:21 -0700

Dear Joel

Thank you very much for the excellent review!
Below we have tried to resolve all of your comments and issues.
It would be great if you could let us know whether you are satisfied
with these solutions!

On 3/13/2010, "Joel M. Halpern" <[email protected]> wrote:

>I have been selected as the General Area Review Team (Gen-ART)
>reviewer for this draft (for background on Gen-ART, please see
>http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
>
>Please resolve these comments along with any other Last Call comments
>you may receive.
>
>Document: draft-ietf-nsis-rmd-16.txt
>     RMD-QOSM - The Resource Management in Diffserv QOS Model
>Reviewer: Joel M. Halpern
>Review Date: 13-Mar-2010
>IETF LC End Date: 22-Mar-2010
>IESG Telechat date:  N/A
>
>Summary: This document is not ready for publication as an Experimental RFC.
>
>Clarity Issue:
>       The document makes repeated use of the term Severe congestion.  It
>seems inevitable that a somewhat fuzzy definition will be used for that,
>and I would not have concern about such fuzziness.  However, the
>definition used in the document, in section 2, presumably with the
>understanding and agreement of the working group, is
>"congestion that occurs when a node or link fails and the traffic is
>rerouter through another node or link."  This property (being caused by
>node or link failure) has nothing to do with the severity of the
>congestion.  The text goes on to talk about this type of congestion not
>being addressable via admission control.
>It is possible that the document means severe congestion (in the more
>conventional sense) with the added caveat that it is brought about by
>failure.  But that is not what the definition says.  (If that is indeed
>the intent, then clarifying the definition will suffice to resolve this
>issue.)
>Also, as a lesser matter, there are systems which do address / prevent
>element failures from causing severe congestion by using admission
>control, so the claim in the definition that it can not be addressed by
>admission control is at best misleading.  It requires very different
>behaviors than RMD, so they are presumably inapplicable to this situation.

Georgios: We would like to change the severe congestion definition given
in Section 2 as follows,

From:

  Severe congestion: is a congestion that occurs when a node or link
   fails and the traffic is rerouter through another node or link. If no
   measures are taken than the node or the link can become severely
   congested and all traffic passing through the node or link will
   severely degrade in QoS. This type of congestion cannot be solved
   using admission control mechanisms.

INTO:

Severe congestion: The congestion situation on a particular link
within the RMD domain in which the real packet queue on that link grows
significantly because traffic re-routed after a link failure has to be
supported by this particular link. A failure in a communication path,
e.g., of a router or a link, causes the routing algorithms to adapt to
this failure by changing the routing decisions to reflect the changes
in the topology and traffic volume. As a result, the re-routed traffic
will follow a new path and link, which may result in severely overloaded
nodes and links, since for a long time they need to support more traffic
than they are able to.

>
>Major issues:
>       Section 3.2.3 on applicability seems to state that although there are
>Multiple RMD-QOSM schemes, none are mandatory to implement.  And that a
>domain must all use one scheme.  I am not sure if "scheme" here refers
>to this document as distinct from some other document, or refers to the
>variations (such as reduced state, and two varieties of stateless) on
>interior node behavior.  If, as seems to be the case since the following
>text defines 5 schemes, it is referring to the interior behavior
>choices, it would seem that there needs to be a mandatory-to-implement
>scheme in order for this document to promote interoperability rather
>than fragmentation of the network.

Georgios: We will do the following modifications in order to solve this
issue:

In Section 3.2.3 we will replace the existing paragraph that is
associated with the above issue with the following paragraph:

------------------

A very important consideration on using RMD-QOSM is that within one
RMD domain only one of the following RMD-QOSM schemes can be used at
a time. Thus an RMD router can never process and use two different
RMD-QOSM signaling schemes at the same time.
However, all schemes MUST be implemented within one RMD domain. The
operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PHR
container" (see Section 4.1.2) and the "PDR container" (see Section
4.1.3) always uses the same value, so that within one RMD domain only
one of the RMD-QOSM schemes described below can be used at a time.

----------------

Moreover, in Sections 4.1.2 and 4.1.3 we will include the new description
of the <SCH> field, which will occupy the rightmost 3 bits of the
second 32-bit payload word. The following text will be used:

------------------

In Section 4.1.2:

--------------------

  <SCH>: 3 bits.  The <SCH> value specifies which of the 6 RMD
scenarios (see Section 3.2.3) MUST be used within the RMD domain.
The operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PHR
container" always uses the same value, so that within one RMD domain
only one of the RMD-QOSM schemes described below can be used at a time.
All the QNE interior nodes MUST interpret this field before processing
any other "PHR container" payload fields. The currently defined <SCH>
values are:

   o  0:     RMD-QOSM scheme MUST be: "per flow congestion notification
             based on probing";

   o  1:     RMD-QOSM scheme MUST be: "per flow RMD NSIS measurement
             based admission control";

   o  2:     RMD-QOSM scheme MUST be: "per flow RMD reservation based"
             in combination with "severe congestion handling by the
             RMD-QOSM refresh procedure";

   o  3:     RMD-QOSM scheme MUST be: "per flow RMD reservation based"
             in combination with "severe congestion handling by
             proportional data packet marking";

   o  4:     RMD-QOSM scheme MUST be: "per aggregate RMD reservation
             based" in combination with "severe congestion handling by
             the RMD-QOSM refresh procedure";

   o  5:     RMD-QOSM scheme MUST be: "per aggregate RMD reservation
             based" in combination with "severe congestion handling by
             proportional data packet marking";

   o  6 - 7: reserved.

  The default value of the <SCH> field SHOULD be 3.

------------

In Section 4.1.3:

--------------

  <SCH>: 3 bits.  The <SCH> value specifies which of the 6 RMD
scenarios (see Section 3.2.3) MUST be used within the RMD domain.
The operator of an RMD domain MUST pre-configure all the QNE edge nodes
within one domain such that the <SCH> field included in the "PDR
container" always uses the same value, so that within one RMD domain
only one of the RMD-QOSM schemes described below can be used at a time.
All the QNE interior nodes MUST interpret this field before processing
any other "PDR container" payload fields. The currently defined <SCH>
values are:

   o  0:     RMD-QOSM scheme MUST be: "per flow congestion notification
             based on probing";

   o  1:     RMD-QOSM scheme MUST be: "per flow RMD NSIS measurement
             based admission control";

   o  2:     RMD-QOSM scheme MUST be: "per flow RMD reservation based"
             in combination with "severe congestion handling by the
             RMD-QOSM refresh procedure";

   o  3:     RMD-QOSM scheme MUST be: "per flow RMD reservation based"
             in combination with "severe congestion handling by
             proportional data packet marking";

   o  4:     RMD-QOSM scheme MUST be: "per aggregate RMD reservation
             based" in combination with "severe congestion handling by
             the RMD-QOSM refresh procedure";

   o  5:     RMD-QOSM scheme MUST be: "per aggregate RMD reservation
             based" in combination with "severe congestion handling by
             proportional data packet marking";

   o  6 - 7: reserved.

  The default value of the <SCH> field SHOULD be 3.
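
To make the proposed <SCH> encoding concrete, here is a minimal sketch in
Python of how the field could be read from and written to the second
32-bit payload word. Only the 3-bit width, the position (rightmost 3
bits), the value table, and the default of 3 come from the text above.
The function names are hypothetical, and the sketch assumes that
"rightmost" means the least significant bits of the word when it is read
as an unsigned integer in network byte order.

```python
# Sketch only: helper names are hypothetical, not part of the draft.
SCH_MASK = 0b111          # <SCH> occupies the rightmost 3 bits
SCH_DEFAULT = 3           # default value proposed in the text

SCH_SCHEMES = {
    0: "per flow congestion notification based on probing",
    1: "per flow RMD NSIS measurement based admission control",
    2: "per flow RMD reservation based, with severe congestion "
       "handling by the RMD-QOSM refresh procedure",
    3: "per flow RMD reservation based, with severe congestion "
       "handling by proportional data packet marking",
    4: "per aggregate RMD reservation based, with severe congestion "
       "handling by the RMD-QOSM refresh procedure",
    5: "per aggregate RMD reservation based, with severe congestion "
       "handling by proportional data packet marking",
}  # values 6 and 7 are reserved


def get_sch(second_word: int) -> int:
    """Extract <SCH> from the second 32-bit payload word."""
    return second_word & SCH_MASK


def set_sch(second_word: int, sch: int) -> int:
    """Return the word with <SCH> replaced; reserved values are rejected."""
    if sch not in SCH_SCHEMES:
        raise ValueError(f"<SCH> value {sch} is reserved or out of range")
    return (second_word & ~SCH_MASK & 0xFFFFFFFF) | sch


def check_sch(second_word: int, configured_sch: int) -> str:
    """Interior-node check: <SCH> must match the domain-wide setting."""
    sch = get_sch(second_word)
    if sch not in SCH_SCHEMES:
        raise ValueError(f"<SCH> value {sch} is reserved")
    if sch != configured_sch:
        raise ValueError("<SCH> differs from the value configured for "
                         "this RMD domain")
    return SCH_SCHEMES[sch]
```

An interior node would call something like check_sch() before looking at
any other container field, which mirrors the "MUST interpret this field
before processing any other ... payload fields" requirement.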

>
>       In this day and age, it seems surprising that the protocol specifies
>that the interior messages are to be sent with no security.  The IETF is
>actively working to improve the security of intra-domain and
>inter-domain routing, so this decision seems wrong. (Even for an
>experiment.) (Section 4.4, 4th bullet.)  At the very least, some
>explanation of this choice is necessary.

Georgios:  You are right, the text needs to be clarified. What we meant
is that RMD-QOSM relies mainly on the security and reliability support
that is provided by the bound end-to-end session, which is running
between the boundaries of the RMD domain (i.e., the RMD-QOSM QNE edges),
and on the security provided by the D-mode.

We would like to change the specific bullet into:

   * When the QNE Ingress needs to send an intra-domain RESERVE
     message that is not an initial RESERVE, the QoS-NSLP sends this
     message by including in the GIST API SendMessage primitive
     attributes that imply the use of the Datagram Mode, e.g., the
     Unreliable attribute. Furthermore, the Local policy attribute is
     set such that GIST sends the intra-domain RESERVE message in
     Q-mode even if there is a routing state at the QNE Ingress. In
     this way the GIST functionality uses its local policy to send the
     intra-domain RESERVE message by piggybacking it on a GIST DATA
     message and sending it in Q-mode even if there is a routing state
     for this session. The intra-domain RESERVE message is piggybacked
     on the GIST DATA message that is forwarded and processed by the
     QNE Interior nodes up to the QNE Egress.

------------------

Moreover, we will include the following paragraph at the introductory
part of Section 4.4.

------------------

   RMD-QOSM relies on the security and reliability support that is
   provided by the bound end-to-end session, which is running between
   the boundaries of the RMD domain (i.e., the RMD-QOSM QNE edges), and
   on the security provided by the D-mode.

-------------------

>
>       The text in section 4.1.2 states that the 8 bit overload % field
>contains a real value.  However, I could not find a description of the
>encoding by which a real value (between 0 and 1?) should be
>encoded in the message.

Georgios: We agree that the description of this field is not clear and
that more bits than needed are used to represent this type of overload.
Therefore, we will make the following changes:

In Section 4.1.2 we will make the following change.
Change from:
 <Overload %>:
   8 bits In case of severe congestion the level of overload is
   indicated by the Overload %. Overload % is the percentage of the
   measured PHB bit rate that is above the bit rate rate used to detect
   a severe congestion. Overload % SHOULD be higher than 0 if S bit is
   set. If overload in a node is greater than the overload in a previous
   node then Overload % SHOULD be updated. For more details see Section
   4.6.1.6.1. Note that this field represents a real parameter.

INTO:

<Overload>:
   1 bit. This field is used during the severe congestion handling scheme
   that is using the RMD-QOSM refresh procedure. This bit is set when an
   overload on a QNE interior node is detected and when this field is
   carried by the "PHR_Refresh_Update" container. <Overload>
   SHOULD be set to "1" if the <S> bit is set. For more details see
   Section 4.6.1.6.1.

In Section 4.1.3 a similar change will be applied for this parameter:

<Overload>:
   1 bit. This field is used during the severe congestion handling scheme
   that is using the RMD-QOSM refresh procedure. This bit is set when an
   overload on a QNE interior node is detected and when this field is
   carried by the "PDR_Congestion_Report" container. <Overload>
   SHOULD be set to "1" if the <S> bit is set. For more details see
   Section 4.6.1.6.1.

-------------

This change to the <Overload> field will be reflected in the rest of the
text.

>
>Minor issues:
>       The measurement based admission control mechanism used here looks
>remarkably similar to the classical RSVP Predictive service.  Both of
>these are based on the assumption that current measured characteristics
>are an indicator of future load.  It is not at all apparent that there
>is any such relationship.  It seems that the text ought to include some
>indication as to what the basis of suggesting this be used is, and why
>it is thought to be meaningful.  Even if the argument is "it is worth
>trying", it seems worth stating that, and stating why it is thought that
>it will work now.

Georgios: You are right about the fact that the measurement based
admission control scheme can only support a predictive service.

We would like to change the following paragraph in section 3.1 from:

   The measurement-based algorithm continuously measures traffic levels
   and the actual available resources, and admits flows whose resource
   needs are within what is available at the time of the request.

INTO:

The measurement-based algorithm continuously measures traffic levels
and the actual available resources, and admits flows whose resource
needs are within what is available at the time of the request. The
measurement based algorithm is used to support a predictive service
where the service commitment is somewhat less reliable than the service
that can be supported by the reservation based method. A main assumption
that is taken by such measurement based admission control mechanisms is
that the aggregated PHB traffic passing through an RMD interior node is
high and therefore, current measurement characteristics are considered
to be an indicator of future load.

>       It would probably be helpful to explain why it is necessary or
>desirable to use two different RESERVE messages across the same domain,
>traversing the same set of devices, with different but closely related
>information.  (particularly in light of the comments about reducing load
>on intermediate devices.)

Georgios: You are right; we will try to clarify this as follows:

In Section 3.1 we would like to change the following text from:

  The basic RMD-QOSM/QoS-NSLP signaling is shown in Figure 3. The
   signalling scenarios are accomplished using the QoS-NSLP processing
   rules defined in [QoS-NSLP], in combination with the RMF triggers
   sent via the QoS-NSLP-RMF API described in [QoS-NSLP]. A RESERVE
   message is created by a QNI with an Initiator QSpec describing the
   reservation and forwarded along the path towards the QNR.

INTO:

  "The basic RMD-QOSM/QoS-NSLP signaling is shown in Figure 3. The
   signalling scenarios are accomplished using the QoS-NSLP processing
   rules defined in [QoS-NSLP], in combination with the RMF triggers
   sent via the QoS-NSLP-RMF API described in [QoS-NSLP].
   Due to the fact that within the RMD domain a different QoS model can
   be supported than the end-to-end QoS model applied at the edges of the
   RMD domain, the RMD interior node reduced state reservations can be
   updated independently of the per-flow end-to-end reservations, see
   Section 4.7 of [QoS-NSLP]. Therefore, two different RESERVE messages
   are used within the RMD domain: one that is associated with the
   per-flow end-to-end reservations and is used by the edges of the
   RMD domain, and one that is associated with the reduced state
   reservations within the RMD domain."

>       The applicability section states that this mechanism can only be used
>with the EF DSCP.  Is it further the case that it can only be used for
>traffic which consistently uses a stable amount of bandwidth (per
>reservation)?  One of the difficulties with the style of reservation
>based on measurement of load is that the end point requesting the
>measurement must be aware of whether the measurement data includes the
>flow being considered for admission.  Otherwise, large flows can cause
>significant confusion.  With very stable flows, as long as the
>measurements are not requested too often, this is achievable.
>Otherwise, it is not at all clear to this reader how the proposed
>mechanism would work (particularly when refreshing a reservation).
>       Continuing this line of questioning, the mechanism for modification
>seems to send the new bandwidth through the stateless intra-domain
>routers.  Since they are stateless, those routers do not know what the
>old reservation was.  And the measurements presumably include traffic
>under the old reservation.  If these are added together, significant
>double-counting would seem to occur.  (This is listed as minor on the
>premise that the protocol presumably actually works, and therefore the
>problem is one of reader comprehension, rather than more serious
>technical issues.)

Georgios: You are right that the descriptions are not clear. Please note
that with the measurement based scheme the requested peak bandwidth of a
flow is carried by the admission control request. The admission decision
is positive if the currently carried traffic, as characterized by the
measured statistics, plus the requested resources for the new flow
exceeds the system capacity with a probability smaller than a value
alpha. Otherwise, the admission decision is negative. It is important to
emphasize that, because the interior nodes are stateless, they do not
store information on previous admission control requests. This could
decrease the admission control accuracy when multiple flows sharing a
common interior node request admission at the same time. This decrease
in accuracy can be limited by applying measurement techniques, see
e.g., [JaSh97], [GrTs03], which use current and past information on the
NSIS sessions that requested resources from an NSIS aware interior node.
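
As an illustration only, the admission rule described above can be
sketched in Python as follows. The draft does not prescribe how the
overflow probability is computed from the measured statistics; this
sketch assumes that the measured aggregate load is approximately
normally distributed and is summarized by a mean and a standard
deviation. The function and parameter names are hypothetical.

```python
import math


def overflow_probability(mean_load: float, std_load: float,
                         requested_peak: float, capacity: float) -> float:
    """P(measured load + requested peak > capacity).

    Assumption for this sketch: the measured load is Gaussian with the
    given mean and standard deviation (same units as capacity).
    """
    headroom = capacity - requested_peak - mean_load
    if std_load <= 0:
        # Degenerate case: a constant load either fits or it does not.
        return 1.0 if headroom < 0 else 0.0
    # Upper tail of the normal distribution beyond the headroom.
    return 0.5 * math.erfc(headroom / (std_load * math.sqrt(2)))


def admit(mean_load: float, std_load: float, requested_peak: float,
          capacity: float, alpha: float) -> bool:
    """Positive decision if the overflow probability is below alpha."""
    return overflow_probability(mean_load, std_load,
                                requested_peak, capacity) < alpha
```

For example, admit(60, 5, 10, 100, 0.01) gives a positive decision,
while admit(90, 5, 10, 100, 0.01) gives a negative one. Because the node
is stateless, two requests evaluated against the same measurement window
would both see the same mean_load, which is the accuracy problem noted
above.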

Moreover, the RMD measurement based schemes described in this document do
not use any refresh procedures, since these approaches are used in
stateless nodes, see Section 4.6.1.3.

In order to clarify the text we would like to do the following.

The abstract description of the measurement based admission control
mechanism given in Section 3.1 will be enhanced as follows:

We will add the following paragraph in Section 3.1:

It is important to emphasize that the RMD measurement based schemes
described in this document do not use any refresh procedures, since
these approaches are used in stateless nodes, see Section 4.6.1.3.

With the measurement based scheme the requested peak bandwidth of a flow
is carried by the admission control request. The admission decision is
positive if the currently carried traffic, as characterized by the
measured statistics, plus the requested resources for the new flow
exceeds the system capacity with a probability smaller than a value
alpha. Otherwise, the admission decision is negative. It is important to
emphasize that, because the interior nodes are stateless, they do not
store information on previous admission control requests. This could
decrease the admission control accuracy when multiple flows sharing a
common interior node request admission at the same time. This decrease
in accuracy can be limited by applying measurement techniques, see
e.g., [JaSh97], [GrTs03], which use current and past information on the
NSIS sessions that requested resources from an NSIS aware interior node.

>
>       I was not able to understand the purpose or use of the K bit.  I may
>have missed it in the dense text.  Assuming there is an explanation, a
>pointer at the point where the bit is defined to the text which explains
>its use would be a very good idea.

Georgios: You are right. The use of the <K> bit is described in Section
4.6.1.5.2.

The description of the <K> bit will be changed as follows:

<K>: 1 bit. When set to "1" it indicates that the resources/bandwidth
   carried by a tearing RESERVE MUST NOT be released and the
   resources/bandwidth carried by a non tearing RESERVE MUST NOT be
   reserved/refreshed. For more details see Section 4.6.1.5.2.
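
As a rough illustration of the <K> bit rule, the following Python
sketch shows how a node might update a reservation counter when it
processes a RESERVE. The single aggregate counter, the function name,
and the parameter names are assumptions made for this sketch and are
not taken from the draft; the actual procedure is the one in Section
4.6.1.5.2.

```python
def apply_reserve(reserved: int, carried: int,
                  tear: bool, k_bit: bool) -> int:
    """Return the node's reserved bandwidth after processing a RESERVE.

    reserved: bandwidth currently reserved at the node (hypothetical
              aggregate counter)
    carried:  bandwidth carried by the RESERVE message
    tear:     True for a tearing RESERVE, False for a non-tearing one
    k_bit:    value of the <K> bit in the message
    """
    if k_bit:
        # <K> = 1: a tearing RESERVE MUST NOT release resources and a
        # non-tearing RESERVE MUST NOT reserve or refresh them.
        return reserved
    if tear:
        return max(0, reserved - carried)
    return reserved + carried
```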

Best regards,
Georgios

>
>Yours,
>Joel M. Halpern
>
>Nits/editorial comments:
