Hi Robert,
Thanks for your comment. Please see my response inline.
From: [email protected] At: 01/29/22 11:14:59 UTC-5:00 To: [email protected]
Cc: Albert Fu (BLOOMBERG/ 120 PARK ) , [email protected],
[email protected], [email protected]
Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" -
draft-ietf-lsr-ospf-bfd-strict-mode-04
Hi Ketan and all,
I support this draft - it is a useful addition.
There are two elements which I would suggest adjusting in the text before
publication.
#1 Overpromise
Even below you say:
> Since there is a issue with forwarding (which is what BFD detects)
and in the text we see:
"In certain other scenarios, a degraded or poor quality link will allow OSPF
adjacency formation to succeed but the BFD session establishment will fail or
the BFD session will flap."
A reader may get the impression that enabling strict mode makes them 100% safe.
Sure, they are safer than before, but not 100% safe.
Real networks prove that there are classes of failures which BFD cannot
detect. And Albert knows them too :)
For example, some emulated circuits can experience periodic drops only at some
MTUs and only when link utilization reaches X %. While there is an ongoing
extension to BFD to pad it with payload, I don't think BFD can usefully
saturate, say, 80% of a 10G link with probes to test it well before allowing
the OSPF adjacency to be established.
[AF] This draft ensures that BFD can be used to detect failure quickly when
there is a complete path failure between the nodes. You are right that there
are many other types of failure that BFD cannot detect. For example, one that
we have seen from time to time is an interface failing to forward packets of
the configured MTU size (we have proposed another draft to address this:
https://datatracker.ietf.org/doc/html/draft-ietf-bfd-large-packets).
There are also other types of failures, such as link errors.
Another error type that we have encountered is pattern-sensitive errors, where
BFD and the routing protocols work fine but user traffic cannot be forwarded
due to bit flips.
I tend to think that these types of errors are best handled outside this draft,
and from previous discussions with various people in the BFD working group,
they are keen to keep BFD as simple as possible.
#2 Timer
What I find missing in the draft is a timer, mutual between the OSPF peers,
that starts after the BFD session is up and holds OSPF back, allowing BFD to do
some more testing before the adjacency is declared established. I think
bringing the OSPF adjacency up immediately after the BFD session is up is not a
good thing. Keep in mind that we are bringing the interface up, so by applying
such a timer we are not dropping packets. In fact, quite the reverse: we are
making sure user packets would not be dropped.
[AF] This is a good point. Both router vendors that I have tested (Cisco &
Juniper) do indeed have a timer mechanism to delay when OSPF is allowed to come
up, which we have tested (useful to guard against flapping links). I am not
sure whether this "hold down" mechanism needs to be included in the draft.
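For illustration only, the hold-down idea discussed above could be sketched
roughly as follows. This is not text from the draft and not any vendor's
implementation; the class name HoldDownGate, its methods, and the hold_seconds
parameter are all hypothetical. The sketch assumes OSPF is allowed to proceed
only once BFD has stayed up continuously for the hold period, and that any BFD
flap restarts the wait.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldDownGate:
    """Hypothetical gate: OSPF may proceed only after BFD has stayed up
    continuously for hold_seconds. All names here are illustrative."""

    hold_seconds: float
    bfd_up_since: Optional[float] = None  # None means BFD is currently down

    def on_bfd_up(self, now: float) -> None:
        # Start the hold-down clock only on a down -> up transition.
        if self.bfd_up_since is None:
            self.bfd_up_since = now

    def on_bfd_down(self) -> None:
        # Any flap clears the clock, so the full hold period must elapse again.
        self.bfd_up_since = None

    def ospf_allowed(self, now: float) -> bool:
        # OSPF adjacency formation is held back while BFD is down or while
        # BFD has not yet been up for the whole hold period.
        if self.bfd_up_since is None:
            return False
        return now - self.bfd_up_since >= self.hold_seconds
```

Timestamps are passed in explicitly so the behaviour is easy to check: with a
10-second hold, a session that comes up at t=0, flaps, and comes up again at
t=8 would not release OSPF until t=18.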
Cheers,
Robert
Thanks,
Albert