Thanks. The new revision addresses my comments. I have completed the shepherd write-up.
It can be found at:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-spf-uloop-pb-statement/shepherdwriteup/

There are a few editing items mentioned in the shepherd write-up (and copied below) to be addressed in the next revision, but I will go ahead and submit it to the IESG for publication.

Thanks,
Chris

The following two warnings should be addressed in a future revision.

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text.

  == Outdated reference: draft-ietf-rtgwg-uloop-delay has been published as RFC 8333

All references have been identified as normative or informative. There are currently 4 normative references. Since this is an informational document, it might make sense to classify some or all of those references as informative.

On Wed, May 23, 2018 at 5:15 AM, <[email protected]> wrote:

> Hi Chris,
>
> I have uploaded a new revision. Let me know if it correctly addresses your
> comments.
>
> Brgds,
>
> *From:* Chris Bowers [mailto:[email protected]]
> *Sent:* Monday, April 16, 2018 22:02
> *To:* [email protected]; RTGWG
> *Subject:* draft-ietf-rtgwg-spf-uloop-pb-statement
>
> As part of doing the shepherd write-up for this document, I did a review
> of the draft.
>
> My comments are shown below as a diff on
> draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt.
>
> They can also be viewed at:
> https://github.com/cbowers/outgoing-feedback-on-ietf-drafts-2018/commit/c1c5018f857e9c7c0f4123c3de1e87041178e387
>
> Thanks,
> Chris
>
> =============
>
> diff --git a/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt b/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
> index 353ce3c..3dff746 100644
> --- a/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
> +++ b/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
> @@ -21,7 +21,16 @@ Abstract
>
>   In this document, we are trying to analyze the impact of using
>   different Link State IGP implementations in a single network in
> - regards of micro-loops. The analysis is focused on the SPF triggers
> + regards of micro-loops.
> +
> +=======
> +[CB]
> + In this document, we are trying to analyze the impact of using
> + different Link State IGP implementations in a single network, with
> + respect to micro-loops.
> +
> +========
> + The analysis is focused on the SPF triggers
>   and SPF delay algorithm.
>
>   Requirements Language
> @@ -95,13 +104,39 @@ Table of Contents
>   Link State IGP protocols are based on a topology database on which an
>   SPF (Shortest Path First) algorithm like Dijkstra is implemented to
>   find the optimal routing paths.
> -
> +
> + =====
> + [CB] proposed modified text since the Shortest Path First algorithm and
> + Dijkstra algorithm are essentially synonymous. Also propose to use
> + "consistent set of non-looping routing paths", since shortest path routing
> + is often not optimal from a traffic engineering perspective.
> +
> + [proposed text]
> + Link State IGP protocols are based on a topology database on which the
> + SPF (Shortest Path First) algorithm is run to
> + find a consistent set of non-looping routing paths.
> +
> + =====
> +
>   Specifications like IS-IS ([RFC1195]) propose some optimizations of
>   the route computation (See Appendix C.1) but not all the
>   implementations are following those not mandatory optimizations.
>
> +============
> +[CB] [proposed text]
> +but not all implementations follow those non-mandatory
> +optimizations.
> +=============
> +
>   We will call "SPF trigger", the events that would lead to a new SPF
>   computation based on the topology.
> +
> +============
> +[CB] [proposed text]
> + We will call "SPF triggers", the events that would lead to a new SPF
> + computation based on the topology.
> +=============
> +
>
>   Link State IGP protocols, like OSPF ([RFC2328]) and IS-IS
>   ([RFC1195]), are using multiple timers to control the router behavior
> @@ -118,11 +153,27 @@ Internet-Draft spf-microloop January 2018
>
>   Some of those timers are standardized in protocol specification, some
>   are not especially the SPF computation related timers.
> +
> +============
> +[CB] [proposed text]
> + Some of those timers are standardized in protocol specification, while some
> + are not. The SPF computation related timers have generally remained
> + unspecified.
> +=============
>
>   For non standardized timers, implementations are free to implement it
>   in any way. For some standardized timer, we can also see that rather
>   than using static configurable values for such timer, implementations
>   may offer dynamically adjusted timers to help controlling the churn.
> +
> +============
> +[CB] In the discussion above, it is unclear what the meaning of "timer" is.
> +Is it the numerical value of a timer? Is it the trigger conditions and logic
> +for a timer to start or be reset? Is it the action taken when the timer expires?
> +Perhaps the text could be clarified by referring to "timer behavior" and "timer values".
> +
> +=============
> +
>
>   We will call "SPF delay", the timer that exists in most
>   implementations that specifies the required delay before running SPF
> @@ -138,6 +189,17 @@ Internet-Draft spf-microloop January 2018
>   Some micro-loop mitigation techniques have been defined by IETF (e.g.
>   [RFC6976], [I-D.ietf-rtgwg-uloop-delay]) but are not implemented due
>   to complexity or are not providing a complete mitigation.
> +
> +==========
> +[CB]
> +This paragraph needs to be clearer.
> +[proposed text]
> + Two micro-loop mitigation techniques have been defined by the IETF.
> + [RFC6976] has not been widely implemented, presumably due to the complexity
> + of the technique. [I-D.ietf-rtgwg-uloop-delay] has been implemented.
> + However, it does not prevent all micro-loops that can occur
> + for a given topology and failure scenario.
> +==========
>
>   In multi-vendor networks, using different implementations of a link
>   state protocol may favor micro-loops creation during the convergence
> @@ -185,17 +247,24 @@ Internet-Draft spf-microloop January 2018
>   will forward the traffic to C through B, but as B as not converged
>   yet, B will loop back traffic to A, leading to a micro-loop.
>
> +========
> +[CB]
> +Figure 1 and figure 4 are essentially the same topology, but the nodes
> +have different names. I think it would be much better for the reader of this
> +document to consolidate the two figures into a single figure.
> +========
> +
>   The micro-loop appears due to the asynchronous convergence of nodes
>   in a network when an event occurs.
>
> - Multiple factors (and combination of these factors) may increase the
> + Multiple factors (or a combination of these factors) may increase the
>   probability for a micro-loop to appear:
>
>   o the delay of failure notification: the more B is advised of the
>   failure later than A, the more a micro-loop may have a chance to
>   appear.
>
> - o the SPF delay: most of the implementations supports a delay for
> + o the SPF delay: most implementations support a delay for
>   the SPF computation to try to catch as many events as possible.
>   If A uses an SPF delay timer of x msec and B uses an SPF delay
>   timer of y msec and x < y, B would start converging after A
> @@ -204,8 +273,8 @@ Internet-Draft spf-microloop January 2018
>   o the SPF computation time: mostly a matter of CPU power and
>   optimizations like incremental SPF. If A computes its SPF faster
>   than B, there is a chance for a micro-loop to appear. CPUs are
> - today faster enough to consider SPF computation time as
> - negligeable (order of msec in a large network).
> + today fast enough to consider SPF computation time as
> + negligible (on the order of milliseconds in a large network).
>
>   o the SPF computation order: an SPF trigger can be common to
>   multiple IGP areas or levels (e.g., IS-IS Level1/Level2) or for
> @@ -215,8 +284,8 @@ Internet-Draft spf-microloop January 2018
>   done in A and B for each area/level/topology/SPF-algorithm is
>   different, there is a possibility for a micro-loop to appear.
>
> - o the RIB and FIB prefix insertion speed or ordering: highly
> - implementation dependant.
> + o the RIB and FIB prefix insertion speed or ordering. This is highly
> + dependent on the implementation.
>
>
>
> @@ -225,22 +294,21 @@ Litkowski, et al. Expires July 28, 2018 [Page 4]
>   Internet-Draft spf-microloop January 2018
>
>
> - This document will focus on analysis SPF delay (and associated
> - triggers).
> + This document will focus on analysis of the SPF delay behavior and the associated
> + triggers.
>
>   3. SPF trigger strategies
>
> - Depending of the change advertised in LSP/LSA, the topology may be
> + Depending on the change advertised in an LSPDU or LSA, the topology may be
>   affected or not. An implementation may avoid running the SPF
>   computation (and may only run IP reachability computation instead) if
> - the advertised change is not affecting topology.
> + the advertised change does not affect the topology.
>
>   Different strategies exists to trigger the SPF computation:
>
> - 1. An implementation may always run a full SPF whatever the change
> - to process.
> + 1. An implementation may always run a full SPF for any type of change.
>
> - 2. An implementation may run a full SPF only when required: e.g. if
> + 2. An implementation may run a full SPF only when required. For example, if
>   a link fails, a local node will run an SPF for its local LSP
>   update. If the LSP from the neighbor (describing the same
>   failure) is received after SPF has started, the local node can
> @@ -250,26 +318,28 @@ Internet-Draft spf-microloop January 2018
>   3. If the topology does not change, an implementation may only
>   recompute the IP reachability.
>
> - As pointed in Section 1, SPF optimizations are not mandatory in
> - specifications, leading to multiple strategies to be implemented.
> + As noted in Section 1, SPF optimizations are not mandatory in
> + specifications. This has led to the implementation of
> + different strategies.
>
>   4. SPF delay strategies
>
>   Implementations of link state routing protocols use different
> - strategies to delay the SPF computation. We usually see the
> - following:
> + strategies to delay the SPF computation. The two most
> + common SPF delay behaviors are the following.
>
> - 1. Two steps delay.
> + 1. Two phase delay.
>
>   2. Exponential backoff delay.
>
> - Those behavior will be explained in the next sections.
> + These behaviors are described in the following sections.
>
> -4.1. Two steps SPF delay
> +4.1. Two phase SPF delay
>
> - The SPF delay is managed by four parameters:
> + For the two phase SPF delay, the SPF delay is managed by four parameters:
>
> - o Rapid delay: amount of time to wait before running SPF.
> + o Rapid delay: amount of time to wait before running SPF, after the
> + initial SPF trigger event.
>
>
>
> @@ -281,13 +351,13 @@ Litkowski, et al. Expires July 28, 2018 [Page 5]
>   Internet-Draft spf-microloop January 2018
>
>
> - o Rapid runs: amount of consecutive SPF runs that can use the rapid
> - delay. When the amount is exceeded the delay moves to the slow
> + o Rapid runs: the number of consecutive SPF runs that can use the rapid
> + delay. When the number is exceeded, the delay moves to the slow
>   delay value .
>
>   o Slow delay: amount of time to wait before running SPF.
>
> - o Wait time: amount of time to wait without events before going back
> + o Wait time: amount of time to wait without receiving SPF trigger events before going back
>   to the rapid delay.
>
>   Example: Rapid delay = 50msec, Rapid runs = 3, Slow delay = 1sec,
> @@ -308,7 +378,9 @@ Internet-Draft spf-microloop January 2018
>   | | | | || | |
>   < wait time >
>
> - Figure 2 - Two steps delay algorithm
> + Figure 2 - Two phase delay algorithm
> +
> +
>
>   4.2. Exponential backoff
>
> @@ -394,13 +466,20 @@ Internet-Draft spf-microloop January 2018
>
>
>   for delaying PRC. We consider that E is using a SPF trigger strategy
> - that always compute Full SPF and exponential backoff strategy for SPF
> + that always computes a Full SPF for any change, and uses the exponential backoff strategy for SPF
>   delay (start=150ms, inc=150ms, max=1s)
>
>   We also consider the following sequence of events (note : the time
>   scale does not intend to represent a real router time scale where
>   jitters are introduced to all timers) :
>
> +==========
> +[CB]
> +This note about jitter and time scale (or timeline) is not clear. I suggest describing
> +it in more detail or deleting it.
> +==========
> +
> +
>   o t0=0 ms: a prefix is declared down in the network. We consider
>   this event to happen at time=0.
>
> @@ -487,12 +566,12 @@ Internet-Draft spf-microloop January 2018
>   Route computation event time scale
>
>   In the table above, we can see that due to discrepancies in the SPF
> - management, after multiple events (of a different type), the values
> - of the SPF delay are completely misaligned between nodes leading to
> - long micro-loops creation.
> + management, after multiple events of a different type, the values
> + of the SPF delay are completely misaligned between node S and node E,
> + leading to the creation of micro-loops.
>
> - The same issue can also appear with only single type of events as
> - displayed below:
> + The same issue can also appear with only a single type of event as
> + shown below:
>
>   +--------+--------------------+------------------+------------------+
>   | Time   | Network Event      | Router S events  | Router E events  |
> @@ -587,6 +666,28 @@ Internet-Draft spf-microloop January 2018
>
>   6. Proposed work items
>
> +===============
> +[CB]
> +Since we are publishing this document after the SPF backoff algorithm
> +draft is published, I think the list of three proposed work items below will be
> +confusing. Someone reading this RFC will wonder why the
> +SPF backoff algorithm RFC (which will have an earlier RFC number)
> +doesn't satisfy the list of proposed work items.
> +
> +Perhaps this section should be renamed something like
> +"Benefits of standardized SPF delay behavior", and the list of proposed
> +work items should be removed.
> +
> +It may also make sense to explicitly say that the
> +SPF backoff algorithm draft/RFC is a solution that
> +satisfies this problem statement.
> +And that we are publishing the document in order to
> +capture the reasoning that led to that draft. Text to this
> +effect should probably go in the introduction, instead
> +of this section.
> +
> +===============
> +
>   In order to enhance the current Link State IGP behavior, authors
>   would encourage working on standardization of some behaviours.
>
> @@ -603,14 +704,23 @@ Internet-Draft spf-microloop January 2018
>
>   Using the same event sequence as in figure 2, we may expect fewer
>   and/or shorter micro-loops using standardized implementations.
> +
> +===========
> +[CB] I think the text should refer to one of the previous tables and not Figure 2.
> +Figure 2 shows the two step delay algorithm.
> +===========
>
>   +--------+--------------------+------------------+------------------+
>   | Time   | Network Event      | Router S events  | Router E events  |
>   +--------+--------------------+------------------+------------------+
>   | t0=0   | Prefix DOWN        |                  |                  |
>   | 10ms   |                    | Schedule PRC (in | Schedule SPF (in |
> -
> -
> +
> +===========
> +[CB]
> +It seems like there is a typo here. Presumably router E should schedule a
> +PRC (not an SPF) at 10ms in this table.
> +===========
>
>   Litkowski, et al. Expires July 28, 2018 [Page 11]
>   ^L
> @@ -677,13 +787,48 @@ Internet-Draft spf-microloop January 2018
>   +--------+--------------------+------------------+------------------+
>
>   Route computation event time scale
> -
> +
> +=============
> +[CB]
> +I think the term "time scale" throughout this document is not the right one.
> +Perhaps the term "timeline" would be better or the phrase "sequence of events".
> +=============
> +[CB]
> +There are several different tables with the same caption
> +"Route computation event time scale".
> +Regardless of the replacement term for "time scale", it would be helpful to make a
> +distinction between the tables with each caption. For example, this last
> +table could have a caption like "Route computation when S and E use the
> +same standardized behavior".
> +
> +==========
>   As displayed above, there could be some other parameters like router
>   computation power, flooding timers that may also influence micro-
>   loops. In Figure 4, we consider E to be a bit slower than S, leading
> - to micro-loop creation. Despite of this, we expect that by aligning
> + to micro-loop creation.
> +
> +=================
> +[CB]
> +There is nothing in Figure 4 that shows that E is slower than S.
> +Perhaps it would be clearer to say something like:
> +"In all of the
> +examples in this document comparing the SPF timer behavior of
> +router S and router E, we have made router E a bit slower than
> +router S. This can lead to microloops even when both S and E use
> +a common standardized SPF behavior."
> +=================
> +
> +
> + Despite of this, we expect that by aligning
>   implementations at least on SPF trigger and SPF delay, service
>   provider may reduce the number and the duration of micro-loops.
> +===================
> +[CB]
> +"Despite of this" should read "In spite of this" or "Despite this".
> +Or in this case "However" might be better.
> +
> +s/service provider/service providers/
> +==================
>
>   7. Security Considerations
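
For readers following the comments on section 4.1, the two phase SPF delay described there might be sketched in Python roughly as follows. This is only an illustration, not text from the draft or from any implementation. The function name two_phase_spf_runs is hypothetical, times are integer milliseconds, and two behaviors are assumed because the quoted text does not spell them out: a trigger that arrives while an SPF run is still pending is absorbed by that run, and the wait time is measured between consecutive trigger events.

```python
def two_phase_spf_runs(triggers, rapid_delay, rapid_runs, slow_delay, wait_time):
    """Return the times (ms) at which SPF would run for the given trigger times (ms)."""
    runs = []            # scheduled SPF run times, in order
    consecutive = 0      # SPF runs scheduled since the last quiet period
    last_trigger = None  # time of the most recent trigger event

    for t in sorted(triggers):
        # Assumption: a gap of at least wait_time between trigger events
        # returns the node to the rapid delay.
        if last_trigger is not None and t - last_trigger >= wait_time:
            consecutive = 0
        last_trigger = t

        # Assumption: a trigger seen while a run is still pending is
        # handled by that pending run and schedules nothing new.
        if runs and t <= runs[-1]:
            continue

        # The first rapid_runs runs use the rapid delay, later ones the slow delay.
        delay = rapid_delay if consecutive < rapid_runs else slow_delay
        runs.append(t + delay)
        consecutive += 1

    return runs
```

With the example values quoted from the draft (rapid delay 50, rapid runs 3, slow delay 1000) and an assumed wait time of 2000, triggers at 0, 100, 200, 300 and 400 ms give SPF runs at 50, 150, 250 and 1300 ms: three rapid runs, then one slow run that also covers the trigger at 400 ms.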
