More inline ...

-----Original Message-----
From: Hannes Gredler [mailto:[email protected]] 
Sent: Monday, July 28, 2014 17:07
To: LITKOWSKI Stephane SCE/IBNF
Cc: DECRAENE Bruno IMT/OLN; Russ White; [email protected]
Subject: Re: On minimizing SPF backoff induced blackouts

On Fri, Jul 25, 2014 at 04:43:15PM +0000, [email protected] wrote:
| I agree with you that routers are now able to handle multiple events in a
| short time, but anyway:
| - delaying is always necessary (I can't really see how we can go under
|   100/150ms):

the only reason it is necessary is that some implementations (ours included) do 
some state compression, such that there is a collection window of 20ms of 
local-events. strictly speaking that's just an efficiency/responsiveness tradeoff 
and we got some requests in the past to lower that window.
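
A minimal sketch of the collection window described above, assuming a window opens at the first unbatched local event and closes 20ms later. The function name `batch_events` and the timestamps are hypothetical and purely illustrative; real implementations may define the window differently.

```python
def batch_events(timestamps_ms, window_ms=20):
    """Group event timestamps (in ms) into collection windows.

    A window opens at the first event that is not yet batched and
    closes window_ms later; all events inside it are handled together.
    """
    batches = []
    window_end = None
    for t in sorted(timestamps_ms):
        if window_end is None or t >= window_end:
            # This event falls outside the current window: open a new one.
            batches.append([t])
            window_end = t + window_ms
        else:
            # Still inside the open window: compress into the same batch.
            batches[-1].append(t)
    return batches


# Six local events collapse into three batches of work.
batches = batch_events([0, 5, 19, 20, 45, 50])
```

A smaller `window_ms` reacts faster to the first event but produces more batches, which is the efficiency/responsiveness tradeoff mentioned above.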

|       If you go to something like 0 or 5 msec, we may run into 
| implementation issues with constant READ/WRITE contention in the FIB 
| that often causes corner cases where the router is completely lost in 
| what it must do

again that's implementation dependent. if you have no state-compression on your 
RIB/FIB path then you're doomed anyway. state of the art is that you do not 
de-queue route 'N(t)' if route 'N(t+1)' is available, but cancel the N(t) update.
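
The state compression on the RIB/FIB path described above can be sketched as a queue keyed by prefix, where a newer update for a prefix replaces the older one that is still waiting. This is only an illustration: the class name `CoalescingQueue`, the prefixes, and the next-hop strings are hypothetical and do not describe any vendor's implementation.

```python
from collections import OrderedDict


class CoalescingQueue:
    """FIB update queue that keeps only the newest pending route per prefix."""

    def __init__(self):
        # prefix -> newest pending route, in order of first arrival
        self._pending = OrderedDict()

    def enqueue(self, prefix, route):
        # If N(t) for this prefix is still queued, N(t+1) overwrites it,
        # so the stale update is cancelled rather than written to the FIB.
        self._pending[prefix] = route

    def dequeue(self):
        # Return the oldest pending (prefix, route) pair.
        return self._pending.popitem(last=False)

    def __len__(self):
        return len(self._pending)


q = CoalescingQueue()
q.enqueue("10.0.0.0/24", "via A")   # N(t)
q.enqueue("10.0.1.0/24", "via B")
q.enqueue("10.0.0.0/24", "via C")   # N(t+1) cancels "via A"
first = q.dequeue()
```

In this sketch the replaced prefix keeps its original position in the queue; whether it should move to the back instead is a design choice the thread does not address.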

|       Moreover, in case of a complex outage (node failure or SRLG failure)
| where multiple LSPs are sent, we must ensure that all LSPs are received
| before computing SPF, so that the target topology is a good one.

see, and this i do not believe is possible.
whatever the <N> SPF initial wait will be, there will always be incidents where 
the LSP trigger arrives after <N>, so there will always be corner cases for loops.

[SLI] Right, there may always be some corner cases, but I think the number of 
occurrences is low ... I can only think so because we do not have statistics 
from routers yet :)

| - Increasing delay is also necessary. From experience, there are 
| situations where you absolutely need to calm down computation to 
| prevent churn amplification (even today with stronger CPUs)

no doubt - the question is when do we need to kick in that self-protection 
layer ... is it after reception of the 2nd LSP or after the 10th ?

[SLI] That's the good question :) We have been doing such tuning from 
experience for years now.
The only point is that the behavior must be aligned between all nodes, and we 
must try to figure out the worst case (slowest routers).


| - we need to ensure that the increasing delay steps are not too big, so that
|   if routers get out of sync (in delay value), the difference stays small.
| - as agreed, we can run more fast-scheduled SPFs
| 
| Neither current implementation (two-step and backoff) fulfills these
| "requirements".

rapid-run in a certain way does because it is widening your window of 
'concurrent, unrelated network events'
without going into backup mode.

[SLI] Right, but the gap between rapid-run and backup-mode is really high.


| I would be in favor of standardizing something really simple with no
| underlying mathematics. Just extend the two-step scheme to an N-step tunable
| system that the operator can manage.
| Something like defining a WRED profile but without interpolation.
| Example :
| 
| Initial : 100 ms, 4 runs
| 2nd : 200ms, 2 runs
| 3rd : 300ms, 2 runs
| 4th : 500ms, 2 runs
| 5th : 700ms, 2 runs
| 6th : 800ms, 2 runs
| 7th : 1s, all subsequent runs
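
The N-step profile proposed above amounts to a plain table lookup with no interpolation. The sketch below uses the delays and run counts from the example; the names `PROFILE` and `spf_delay_ms` are hypothetical, and counting runs within one burst of events is an assumption, since the proposal does not say when the counter resets.

```python
# (delay in ms, number of runs) per step, taken from the example above.
# runs=None marks the last step, which applies to all subsequent runs.
PROFILE = [
    (100, 4),
    (200, 2),
    (300, 2),
    (500, 2),
    (700, 2),
    (800, 2),
    (1000, None),
]


def spf_delay_ms(run_index, profile=PROFILE):
    """Return the delay before the run_index-th SPF run (1-based)."""
    if run_index < 1:
        raise ValueError("run_index must be >= 1")
    remaining = run_index
    for delay_ms, runs in profile:
        if runs is None or remaining <= runs:
            return delay_ms
        # This step is used up: move on to the next one.
        remaining -= runs
    # A profile without an open-ended last step falls back to its last delay.
    return profile[-1][0]


# Delays for the first 16 SPF runs of a burst.
delays = [spf_delay_ms(i) for i in range(1, 17)]
```

Because every router holding the same table returns the same delay for the same run number, two routers whose counters differ by one run differ by at most one step in delay, which is the point of keeping the steps small.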

as tony (and others) have pointed out ... it will only work during nice-weather 
conditions ... as soon as you have the perfect-storm, you will have micro loops 
(and all the pain you try to protect yourself) - therefore i'd be much more in 
favour of 'something else' which gets us to zero uloops, and for that i am 
afraid we need to have the notions of independent forwarding planes (ala make 
before break) -

trying to synchronize RIB/FIB update across different 
CPUs/routing-stacks/vendors/load conditions is a battle that can never be won.

[SLI] This is not a definitive solution for microloops, just a simple quick win 
to remove some of them.
I'm still dreaming of a simple definitive solution ... but I don't see any ... 
oFIB or synchronized FIBs do not really seem to have much support from 
implementations ...
In the meantime, I do think there are small and quick areas of improvement, 
even if they are not definitive solutions.


Stephane


| -----Original Message-----
| From: rtgwg [mailto:[email protected]] On Behalf Of Hannes 
| Gredler
| Sent: Thursday, July 24, 2014 15:03
| To: DECRAENE Bruno IMT/OLN; Russ White
| Cc: [email protected]
| Subject: RE: On minimizing SPF backoff induced blackouts
| 
| hi bruno,
| 
| IMO the problem that you're trying to solve is to have the routers still
| being responsive when multiple events are happening in the network.
| 
| wouldn't then a simple fix be to kick in the SPF pacing logic at a much 
| later point in time ? - i.e. rather than slowing things down 
| on event #2 or #3, do it at event #<N> (please insert your favorite 
| number for <N>)
| 
| /hannes
| 
| ________________________________________
| From: rtgwg <[email protected]> on behalf of 
| [email protected] <[email protected]>
| Sent: Thursday, July 24, 2014 20:56
| To: Russ White
| Cc: [email protected]
| Subject: RE: On minimizing SPF backoff induced blackouts
| 
| > > I'm willing to be persuaded, I just don't see the argument for 
| > > specifying an algorithm with what's been put on the table to this point.
| >
| > Ok. So I guess my presentation/draft was not clear enough.
| 
| Let me try again.
| In slide 5: 
| http://tools.ietf.org/agenda/90/slides/slides-90-rtgwg-2.pdf
| - do we agree that it would be better if node A and B both schedule their SPF
|   at roughly the same time? i.e. wait for the same duration?
| - if so, what would be your proposition?
| 
| Bruno
| 
| 
| ______________________________________________________________________
| ___________________________________________________
| 
| Ce message et ses pieces jointes peuvent contenir des informations
| confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites
| ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez
| le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les
| messages electroniques etant susceptibles d'alteration, Orange decline toute
| responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
| 
| This message and its attachments may contain confidential or privileged
| information that may be protected by law; they should not be distributed, used
| or copied without authorisation.
| If you have received this email in error, please notify the sender and delete
| this message and its attachments.
| As emails may be altered, Orange is not liable for messages that have been
| modified, changed or falsified.
| Thank you.
| 
| _______________________________________________
| rtgwg mailing list
| [email protected]
| https://www.ietf.org/mailman/listinfo/rtgwg
| 
