On Fri, Jul 25, 2014 at 04:43:15PM +0000, [email protected] wrote:
| I agree with you that routers are now able to handle multiple events in a short time, but anyway:
| - delaying is always necessary (I can't really see how we can go under 100/150ms) :

the only reason it is necessary is that some implementations (ours
included) do some state compression, such that there is a collection
window of 20ms of local events. strictly speaking that's just an
efficiency/responsiveness tradeoff, and we got some requests in the
past to lower that window.

| If you go to something like 0 or 5 msec, we may fall into implementation issues with constant READ/WRITE contention in the FIB, which often causes corner cases where the router is completely lost in what it must do

again, that's implementation dependent. if you have no state compression
on your RIB/FIB path then you're doomed anyway. state of the art is that
you do not de-queue route 'N(t)' if route 'N(t+1)' is available, but
cancel the N(t) update.

| Moreover, in case of a complex outage (node failure or SRLG failure), multiple LSPs are sent. We must ensure that all LSPs are received before computing SPF to ensure that the target topology is a good one.

see, and this i do not believe is possible. whatever <N> the SPF initial
wait will be, there will always be incidents where the LSP trigger
arrives after <N>, so there will always be corner cases for loops.

| - Increasing delay is also necessary. By experience there are situations where you absolutely need to calm down computation to prevent churn amplification (even today with stronger CPUs)

no doubt - the question is when do we need to kick in that
self-protection layer ... is it after reception of the 2nd LSP or after
the 10th ?

| - we need to ensure that increasing delay steps are not so big, so in case of router unsynchronization (in delay value), the difference would be small.
| - as agreed, we can run more fast scheduled SPFs
|
| Both current implementations (two steps and backoff) do not fulfil these "requirements".

rapid-run in a certain way does, because it is widening your window of
'concurrent, unrelated network events' without going into backoff mode.

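to make the state-compression point above concrete ("do not de-queue
route N(t) if route N(t+1) is available, but cancel the N(t) update"),
here is a minimal sketch in Python. it is not taken from any
implementation: the class name PendingRouteUpdates, its methods and the
prefix/next-hop strings are hypothetical, and a real RIB/FIB path has
far more state than a single per-prefix map.

```python
from collections import OrderedDict


class PendingRouteUpdates:
    """Hypothetical queue between RIB and FIB with state compression."""

    def __init__(self):
        # prefix -> newest next hop that has not been written to the FIB yet
        self._pending = OrderedDict()
        # number of stale updates that were cancelled instead of written
        self.cancelled = 0

    def enqueue(self, prefix, next_hop):
        # If N(t) for this prefix is still waiting, N(t+1) replaces it:
        # the stale update is cancelled rather than de-queued later.
        if prefix in self._pending:
            self.cancelled += 1
            del self._pending[prefix]
        self._pending[prefix] = next_hop

    def drain(self):
        # Hand the FIB writer only the newest state per prefix.
        updates = list(self._pending.items())
        self._pending.clear()
        return updates


queue = PendingRouteUpdates()
queue.enqueue("10.0.0.0/24", "A")  # N(t)
queue.enqueue("10.0.1.0/24", "B")
queue.enqueue("10.0.0.0/24", "C")  # N(t+1) arrives before N(t) was written
print(queue.drain())    # only the newest next hop per prefix is left
print(queue.cancelled)  # how many stale updates never reached the FIB
```

with this kind of queue the FIB writer never sees the intermediate
state, which is the assumed reason a very short SPF delay does not have
to translate into READ/WRITE contention.
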
| I would be in favor of standardizing something really simple with no mathematics involved. Just extend the two steps to an N-step tunable system that the operator can manage.
| Something like defining a WRED profile but without interpolation.
| Example :
|
| Initial : 100 ms, 4 runs
| 2nd : 200 ms, 2 runs
| 3rd : 300 ms, 2 runs
| 4th : 500 ms, 2 runs
| 5th : 700 ms, 2 runs
| 6th : 800 ms, 2 runs
| 7th : 1 s, all subsequent runs

as tony (and others) have pointed out ... it will only work during
nice-weather conditions ... as soon as you have the perfect storm, you
will have micro-loops (and all the pain you try to protect yourself
from) - therefore i'd be much more in favour of 'something else' which
gets us to zero uloops, and for that i am afraid we need to have the
notion of independent forwarding planes (ala make before break) -
trying to synchronize RIB/FIB updates across different
CPUs/routing-stacks/vendors/load conditions is a battle that can never
be won.

/hannes

| -----Original Message-----
| From: rtgwg [mailto:[email protected]] On Behalf Of Hannes Gredler
| Sent: Thursday, July 24, 2014 15:03
| To: DECRAENE Bruno IMT/OLN; Russ White
| Cc: [email protected]
| Subject: RE: On minimizing SPF backoff induced blackouts
|
| hi bruno,
|
| IMO the problem that you're trying to solve is to have the routers still being responsive when multiple events are happening in the network.
|
| wouldn't then a simple fix be to kick in the SPF pacing logic at a much later point in time ? - i.e.
| rather than slowing things down on event #2 or #3, do it at event #<N> (please insert your favorite number for <N>)
|
| /hannes
|
| ________________________________________
| From: rtgwg <[email protected]> on behalf of [email protected] <[email protected]>
| Sent: Thursday, July 24, 2014 20:56
| To: Russ White
| Cc: [email protected]
| Subject: RE: On minimizing SPF backoff induced blackouts
|
| > > I'm willing to be persuaded, I just don't see the argument for
| > > specifying an algorithm with what's been put on the table to this point.
| >
| > Ok. So I guess my presentation/draft was not clear enough.
|
| Let me try again.
| In slide 5: http://tools.ietf.org/agenda/90/slides/slides-90-rtgwg-2.pdf
| - do we agree that it would be better if node A and B both schedule their SPF at roughly the same time? i.e. wait for the same duration?
| - if so, what would be your proposition?
|
| Bruno
|
| _________________________________________________________________________________________________________________________
|
| This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation.
| If you have received this email in error, please notify the sender and delete this message and its attachments.
| As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
| Thank you.
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
