Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Jeff Tantsura
Yes/support 

Cheers,
Jeff

> On Jan 27, 2022, at 09:08, Acee Lindem (acee) 
>  wrote:
> 
> 
> LSR WG,
>  
> This begins a two week last call for the subject draft. Please indicate your 
> support or objection on this list prior to 12:00 AM UTC on February 11th, 
> 2022. Also, review comments are certainly welcome.
> Thanks,
> Acee
>  
> ___
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Gyan Mishra
Hi Ketan

Thank you for answering all my questions. I am all set.

Responses in-line

Kind Regards

Gyan
On Sun, Jan 30, 2022 at 11:48 PM Ketan Talaulikar 
wrote:

> Hi Gyan,
>
> Please check inline with KT2.
>
>
> On Mon, Jan 31, 2022 at 1:20 AM Gyan Mishra  wrote:
>
>> Hi Ketan
>>
>> Welcome.  Responses in-line Gyan2
>>
>> Kind Regards
>>
>> On Sun, Jan 30, 2022 at 12:34 PM Ketan Talaulikar 
>> wrote:
>>
>>> Hi Gyan,
>>>
>>> Thanks for your review and your comments/feedback. Please check inline
>>> below for responses.
>>>
>>>
>>> On Sun, Jan 30, 2022 at 12:29 PM Gyan Mishra 
>>> wrote:
>>>

 I support WG Adoption of this draft.

 This is a real world problem that has existed with BFD that operators
 have to deal with where OSPF adjacency comes up before BFD session
 establishes resulting in cases where the link may have L1 issues or maybe a
 dirty link or poor link quality resulting in BFD session establishment
 followed by BFD immediately taking down the link.  With BFD tight timers
 with client protocol registered ends up further exacerbating the issue with
 link flaps resulting in IGP instability.

 This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD
 enabled TLV.

 This issue exists with BGP as well where the protocol registered with
 BFD bootstrapped per RFC 5882 comes up before BFD resulting in
 instability.  I believe this gap still exists for BGP.

>>>
>>> KT> We have
>>> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-bfd-strict-mode/
>>>
>>>
>> Gyan> Understood to solve this issue for OSPF
>>
>>>
 When BFD comes up it performs link integrity test before session
 establishment to detect dirty errored link does not come Up.

 RFC 5880 BFD 3-way session establishment and  does the link integrity
 and  quality test by sending the BFD control packets to validate
 bi-directional forwarding liveliness detection over any media.

 The case mentioned in this draft where the link is dirty, MTU issues or
 forwarding plane issues exist that cause BFD not to establish resulting in
 the use of default protocol timers and slow convergence is a major issue
 for operators being solved with this draft as well as mentioned above where
 BFD does come up after the IGP is just as bad if not worse if the link is a
 dirty errored link resulting in flapping link.

 The main point here as I mentioned is that BFD must validate the link
 integrity before routing protocol comes Up, so that routing protocol does
 not come Up on a dirty errored link, so the blocking of the adjacency
 capabilities solution here nicely solves the issue.


 In this thread it has been mentioned maybe a CLI timer knob as far as
 implementation for delay knob makes sense.

>>>
>>> KT> Please check Sec 5. The terminology used might differ between
>>> implementations.
>>>
>>
>>  Gyan> I see mention of BFD dampening for poor link quality issues.
>> However if BFD comes Up and establishes that would mean that at the time
>> the BFD session is established as control packet were received based on
>> interval and timer that the link is in a good state at the time of session
>> establishment. The main issue we are solving here is not allowing OSPF to
>> come Up on a link with packet forwarding issues. At that point when OSPF
>> FSM is initiated and goes to Full state we know the link to be stable as
>> BFD was established successfully prior.  If a link were in bad shape or
>> flapping BFD would not establish and that is the crux of this draft's
>> problem statement.  So I think to some degree this draft does preclude the
>> need for BFD dampening.
>>
>>
>>> I would like to note that one workaround used by operators is using RFC
 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
 control protocol is in fact blocked and BFD comes up first.  This is a
 workaround used putting even individual single links in a bundle to prevent
 the issue from happening.

 I would like to note that RFC 5882 Generic Application of BFD does
 state that if all neighbors support BFD then the registered control
 protocol being bootstrapped should be blocked from coming up until BFD
 session is established.  Only in case where all neighbors on a LAN do not
 have BFD enabled, blocking the control protocol from coming Up would
 prevent the control protocol from coming Up on neighbors that don’t have
 BFD enabled.

 So the way I read it implementations following BFD RFC 5882 should have
 been blocking OSPF or ISIS  protocol from coming Up before BFD comes up w/o
 having to require a specification for the explicit block.  Apparently most
 all vendors implementations did not follow RFC 5882 it appears with this
 regard and thus now the requirement for operators for this important
 draft.  I think this 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Gyan,

Please check inline with KT2.


On Mon, Jan 31, 2022 at 1:20 AM Gyan Mishra  wrote:

> Hi Ketan
>
> Welcome.  Responses in-line
>
> Kind Regards
>
> On Sun, Jan 30, 2022 at 12:34 PM Ketan Talaulikar 
> wrote:
>
>> Hi Gyan,
>>
>> Thanks for your review and your comments/feedback. Please check inline
>> below for responses.
>>
>>
>> On Sun, Jan 30, 2022 at 12:29 PM Gyan Mishra 
>> wrote:
>>
>>>
>>> I support WG Adoption of this draft.
>>>
>>> This is a real world problem that has existed with BFD that operators
>>> have to deal with where OSPF adjacency comes up before BFD session
>>> establishes resulting in cases where the link may have L1 issues or maybe a
>>> dirty link or poor link quality resulting in BFD session establishment
>>> followed by BFD immediately taking down the link.  With BFD tight timers
>>> with client protocol registered ends up further exacerbating the issue with
>>> link flaps resulting in IGP instability.
>>>
>>> This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD
>>> enabled TLV.
>>>
>>> This issue exists with BGP as well where the protocol registered with
>>> BFD bootstrapped per RFC 5882 comes up before BFD resulting in
>>> instability.  I believe this gap still exists for BGP.
>>>
>>
>> KT> We have
>> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-bfd-strict-mode/
>>
>>
> Gyan> Understood to solve this issue for OSPF
>
>>
>>> When BFD comes up it performs link integrity test before session
>>> establishment to detect dirty errored link does not come Up.
>>>
>>> RFC 5880 BFD 3-way session establishment and  does the link integrity
>>> and  quality test by sending the BFD control packets to validate
>>> bi-directional forwarding liveliness detection over any media.
>>>
>>> The case mentioned in this draft where the link is dirty, MTU issues or
>>> forwarding plane issues exist that cause BFD not to establish resulting in
>>> the use of default protocol timers and slow convergence is a major issue
>>> for operators being solved with this draft as well as mentioned above where
>>> BFD does come up after the IGP is just as bad if not worse if the link is a
>>> dirty errored link resulting in flapping link.
>>>
>>> The main point here as I mentioned is that BFD must validate the link
>>> integrity before routing protocol comes Up, so that routing protocol does
>>> not come Up on a dirty errored link, so the blocking of the adjacency
>>> capabilities solution here nicely solves the issue.
>>>
>>>
>>> In this thread it has been mentioned maybe a CLI timer knob as far as
>>> implementation for delay knob makes sense.
>>>
>>
>> KT> Please check Sec 5. The terminology used might differ between
>> implementations.
>>
>
>  Gyan> I see mention of BFD dampening for poor link quality issues.
> However if BFD comes Up and establishes that would mean that at the time
> the BFD session is established as control packet were received based on
> interval and timer that the link is in a good state at the time of session
> establishment. The main issue we are solving here is not allowing OSPF to
> come Up on a link with packet forwarding issues. At that point when OSPF
> FSM is initiated and goes to Full state we know the link to be stable as
> BFD was established successfully prior.  If a link were in bad shape or
> flapping BFD would not establish and that is the crux of this draft's
> problem statement.  So I think to some degree this draft does preclude the
> need for BFD dampening.
>
>
>> I would like to note that one workaround used by operators is using RFC
>>> 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
>>> control protocol is in fact blocked and BFD comes up first.  This is a
>>> workaround used putting even individual single links in a bundle to prevent
>>> the issue from happening.
>>>
>>> I would like to note that RFC 5882 Generic Application of BFD does state
>>> that if all neighbors support BFD then the registered control protocol
>>> being bootstrapped should be blocked from coming up until BFD session is
>>> established.  Only in case where all neighbors on a LAN do not have BFD
>>> enabled, blocking the control protocol from coming Up would prevent the
>>> control protocol from coming Up on neighbors that don’t have BFD enabled.
>>>
>>> So the way I read it implementations following BFD RFC 5882 should have
>>> been blocking OSPF or ISIS  protocol from coming Up before BFD comes up w/o
>>> having to require a specification for the explicit block.  Apparently most
>>> all vendors implementations did not follow RFC 5882 it appears with this
>>> regard and thus now the requirement for operators for this important
>>> draft.  I think this implementation discrepancy happened due to normative
>>> language SHOULD Block and not MUST Block is the problem.
>>>
>>
>> KT> There are implementations that provided knobs for this strict
>> mode-like behavior. The draft specifies the procedures for the same to be
>> standardized for 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Les,

I agree with you that mechanisms like dampening and hold-down are best
achieved at the lowest levels (in this case in the monitoring protocol like
BFD) instead of in each routing protocol on the top.

Now whether this means we include/support the signaling of the parameters
for these mechanisms in BFD or whether they are achieved by provisioning
(as done currently by some implementations) is best discussed in the BFD WG.

Thanks,
Ketan


On Mon, Jan 31, 2022 at 1:08 AM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> Here is what you said (emphasis added):
>
>
>
> 
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
> *allowing
> BFD for more testing (with various parameters (for example increasing test
> packet size in some discrete steps)* before OSPF is happy to bring the
> adj. up.
>
> 
>
>
>
> Point #1: If you want BFD to do more testing (such as MTU testing) then
> clearly you need extensions to BFD (such as
> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>
>
>
> Point #2: The existing timers (as Ketan points out are mentioned in
> Section 5) are applied today at the OSPF level precisely because OSPF does
> not currently have strict-mode operation. So in a flapping scenario you
> could see the following behavior:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)OSPF comes back up
>
> d)Link is still unstable – so traffic is being dropped some of the time –
> but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
> enough to keep the OSPF adjacency up)
>
>
>
> So some implementations have chosen to insert a delay following “b”. This
> doesn’t guarantee stability, but hopefully makes it less likely. And
> because OSPF today does NOT wait for BFD to come up, the delay has to be
> implemented at the OSPF level.
>
>
>
> Once you have strict mode support, the sequence becomes:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)BFD comes back up
>
> d)OSPF comes back up
>
>
>
> Now, if the concern is that BFD comes back up while the link is still
> unstable, the way to address that is to put a delay either before BFD
> attempts to bring up a new session or a delay after achieving UP state
> before it signals UP to its clients – such as OSPF. This is a better
> solution because all BFD clients benefit from this. And if the link is still
> unstable, it is more likely that the BFD session will go down during the
> delay period than it would be for OSPF because the BFD timers are
> significantly more aggressive.
>
> (BTW, this behavior can be done w/o a BFD protocol extension – it is
> purely an implementation choice.)
>
>
>
> From a design perspective, dampening is always best done at the lowest
> layer possible. In most cases, interface layer dampening is best. If that
> is not reliable for some reason, then move one layer up – not two layers up.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, January 30, 2022 10:05 AM
> *To:* Ketan Talaulikar 
> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Ketan,
>
>
>
> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>
>
>
> BFD dampening or hold-time are completely orthogonal to my point. Both
> have nothing to do with it.
>
>
>
> Those timers only fire when BFD goes down. In my example BFD does not go
> down. But we want to bring up the client adj. only after X ms/sec/min etc
> ...of normal BFD operation if no failure is detected during that timer.
>
>
>
> This draft indicates that OSPF adjacency will "advance" in the neighbor
> FSM only after BFD reports UP.
>
>
>
> And that is exactly too soon. In fact if you do that today without waiting
> some time (if you retire the current OSPF timer) you will not help at all
> in the case you are trying to address.
>
>
>
> Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF
> adj. will get already established. It is really pretty simple.
>
>
>
> Thx,
>
> Robert.
>
>
>
> PS. And yes I think ISIS should also get fixed in that respect.
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Robert,

We can update this text in the Introduction section as follows:

OLD

   Note that it is possible
   in some failure scenarios for the network to be in a state such that
   an OSPF adjacency can be established but a BFD session cannot be
   established and maintained.  In certain other scenarios, a degraded
   or poor quality link will allow OSPF adjacency formation to succeed
   but the BFD session establishment will fail or the BFD session will
   flap. In this case, traffic that gets forwarded over such a link may
   experience packet drops while the failure of the BFD session
   establishment would not enable fast routing convergence if the link
   were to go down or flap.

NEW

   A degraded or poor quality link may result in intermittent packet
   drops. In such scenarios, an OSPF adjacency may still get
   established over such a link, but a BFD session may not get
   established or maintained over it given the more aggressive
   monitoring intervals supported by BFD.  The traffic that gets
   forwarded over such a link would experience packet drops and the
   failure of the BFD session establishment would not enable fast
   routing convergence.  Frequent OSPF adjacency flaps may occur over
   such links as OSPF brings up the adjacency only for it to be
   brought down again by BFD.


Thanks,
Ketan


On Sun, Jan 30, 2022 at 11:41 PM Robert Raszuk  wrote:

> Hi Ketan,
>
> > It explains the scenario of a noisy link that experiences traffic drops.
>
> The point is that BFD may or may not detect noisy links or links with
> "degraded or poor quality". There are many failure scenarios - especially
> brownouts - where BFD will continue to run just fine over a link and where
> at the same time user data will experience very poor performance.
>
> So stating in the RFC that BFD may help to detect such cases is simply
> very misleading (to say it gently :).
>
> And you are stating so exactly in the below sentence:
>
> *"In certain other scenarios, a degraded or poor quality link will allow
> OSPF adjacency formation to succeed*
> *but the BFD session establishment will fail or the BFD session will flap.*
>
> Thx,
> R.
>
>
> On Sun, Jan 30, 2022 at 6:03 PM Ketan Talaulikar 
> wrote:
>
>> Hi Robert,
>>
>> Thanks for your review and comments.
>>
>> This email is in response to your first point "overpromise".
>>
>> First, there is no text in the draft that "overpromises" that the strict
>> mode of operation detects "all forwarding" issues. We are talking about BFD
>> and its capabilities are well-known. It is not in the scope of this
>> document to discuss BFD capabilities and shortcomings (e.g. the MTU issue
>> you describe).
>>
>> The draft text that you have asked to remove is important. It explains
>> the scenario of a noisy link that experiences traffic drops. I am aware of
>> issues in production networks, where we've had OSPF adjacency flaps
>> continuously or sporadically due to OSPF adjacency coming up somehow but
>> then BFD bringing it down. This causes routing churn and service
>> degradation. This is one of the key drivers for this draft.
>>
>> However, welcome any text clarifications/suggestions for improving the
>> document.
>>
>> Thanks,
>> Ketan
>>
>>>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Robert,

The draft text refers to dampening and hold-down. The latter can be used
also for the initial session bring-up. The descriptions of those BFD
mechanisms are outside the scope of this document. If this needs to be
standardized at the IETF, then (IMHO) it would be best taken up in the BFD
WG so it can be leveraged for other protocols and use-cases as well.

We can update the text in Sec 5 to clarify this aspect as below.

OLD

   In network deployments with noisy links or those with packet loss,
   BFD sessions may flap frequently.  In such scenarios, OSPF strict-
   mode for BFD may be deployed in conjunction with a BFD dampening or
   hold-down mechanism to avoid frequent adjacency flaps that cause
   routing churn.

NEW

   In network deployments with noisy links with packet loss,
   BFD sessions may flap frequently.  In such scenarios, OSPF
   strict-mode for BFD may be deployed in conjunction with mechanisms
   such as hold-down (to delay initial adjacency bring up) and
   dampening (to avoid frequent adjacency flaps) in BFD to avoid
   frequent OSPF adjacency flaps that cause routing churn. The details
   of these BFD mechanisms are outside the scope of this document.



Thanks,
Ketan


On Sun, Jan 30, 2022 at 11:35 PM Robert Raszuk  wrote:

> Hi Ketan,
>
> I would like to point out that the draft discusses the BFD "dampening" or
>> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
>> include such mechanisms in a protocol-agnostic manner.
>>
>
> BFD dampening or hold-time are completely orthogonal to my point. Both
> have nothing to do with it.
>
> Those timers only fire when BFD goes down. In my example BFD does not go
> down. But we want to bring up the client adj. only after X ms/sec/min etc
> ...of normal BFD operation if no failure is detected during that timer.
>
> This draft indicates that OSPF adjacency will "advance" in the neighbor
>> FSM only after BFD reports UP.
>>
>
> And that is exactly too soon. In fact if you do that today without waiting
> some time (if you retire the current OSPF timer) you will not help at all
> in the case you are trying to address.
>
> Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF
> adj. will get already established. It is really pretty simple.
>
> Thx,
> Robert.
>
> PS. And yes I think ISIS should also get fixed in that respect.
>
>>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Gyan Mishra
Hi Ketan

Minor cleanup on my part: changing the normative language from SHOULD to
MUST.  This is critical as we want to make sure interoperability works,
with no wiggle room or loopholes for misinterpreting the specification.

The main motivation is to fix the problem of OSPF starting before BFD,
implemented so that there is backwards compatibility.  I think you
should break the scenarios out into P2P and LAN.  This is how I would
reword.

   For Multi-access interfaces when BFD is enabled, but the strict-mode for
   operation has been signaled by multiple but not all neighbors, then an
   implementation MUST start the BFD session establishment only in
   2-Way state or higher state.  This makes it possible for an OSPF
   router to support BFD operation in both strict-mode and normal mode
   across different interfaces or even different neighbors.



   For interface with P2P subnet when BFD is enabled, but strict-mode for
   operation has been signaled by both neighbors, then an
   implementation MUST start the BFD session establishment at init state.

   In case one end of P2P link supports Strict mode and the other neighbor

   does not then BFD would come up in Normal mode.
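
The proposed wording above can be read as a small decision table. The sketch below is one hedged reading of it, not text from the draft: the function name, the `IfType` enum, and the labels `"strict"`, `"normal"`, `"mixed"` and `"unspecified"` are all hypothetical, and cases the wording does not cover are returned as unspecified rather than guessed.

```python
from enum import Enum


class IfType(Enum):
    P2P = "p2p"
    MULTI_ACCESS = "multi-access"


def bfd_start_state(if_type, strict_signaled):
    """Return (mode, OSPF neighbor state at which BFD establishment starts).

    strict_signaled holds one bool per router on the link (local router
    included): True if that router signaled strict-mode.
    """
    everyone = all(strict_signaled)
    someone = any(strict_signaled)
    if if_type is IfType.P2P:
        if everyone:
            # Both ends signaled strict-mode: start BFD at Init.
            return ("strict", "Init")
        # One end lacks strict-mode: normal mode. The wording above does
        # not say at which state BFD starts in this case.
        return ("normal", None)
    if someone and not everyone:
        # Multi-access, strict-mode signaled by some but not all
        # neighbors: start BFD only in 2-Way or higher.
        return ("mixed", "2-Way")
    # All or none signaled on a multi-access link: not covered above.
    return ("unspecified", None)


print(bfd_start_state(IfType.P2P, [True, True]))
print(bfd_start_state(IfType.MULTI_ACCESS, [True, True, False]))
```

Writing the cases out this way makes the gaps visible: the wording does not state what happens on a multi-access link where every neighbor signals strict-mode.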


Kind Regards


On Sun, Jan 30, 2022 at 2:50 PM Gyan Mishra  wrote:

> Hi Ketan
>
> Welcome.  Responses in-line
>
> Kind Regards
>
> On Sun, Jan 30, 2022 at 12:34 PM Ketan Talaulikar 
> wrote:
>
>> Hi Gyan,
>>
>> Thanks for your review and your comments/feedback. Please check inline
>> below for responses.
>>
>>
>> On Sun, Jan 30, 2022 at 12:29 PM Gyan Mishra 
>> wrote:
>>
>>>
>>> I support WG Adoption of this draft.
>>>
>>> This is a real world problem that has existed with BFD that operators
>>> have to deal with where OSPF adjacency comes up before BFD session
>>> establishes resulting in cases where the link may have L1 issues or maybe a
>>> dirty link or poor link quality resulting in BFD session establishment
>>> followed by BFD immediately taking down the link.  With BFD tight timers
>>> with client protocol registered ends up further exacerbating the issue with
>>> link flaps resulting in IGP instability.
>>>
>>> This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD
>>> enabled TLV.
>>>
>>> This issue exists with BGP as well where the protocol registered with
>>> BFD bootstrapped per RFC 5882 comes up before BFD resulting in
>>> instability.  I believe this gap still exists for BGP.
>>>
>>
>> KT> We have
>> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-bfd-strict-mode/
>>
>>
> Gyan> Understood to solve this issue for OSPF
>
>>
>>> When BFD comes up it performs link integrity test before session
>>> establishment to detect dirty errored link does not come Up.
>>>
>>> RFC 5880 BFD 3-way session establishment and  does the link integrity
>>> and  quality test by sending the BFD control packets to validate
>>> bi-directional forwarding liveliness detection over any media.
>>>
>>> The case mentioned in this draft where the link is dirty, MTU issues or
>>> forwarding plane issues exist that cause BFD not to establish resulting in
>>> the use of default protocol timers and slow convergence is a major issue
>>> for operators being solved with this draft as well as mentioned above where
>>> BFD does come up after the IGP is just as bad if not worse if the link is a
>>> dirty errored link resulting in flapping link.
>>>
>>> The main point here as I mentioned is that BFD must validate the link
>>> integrity before routing protocol comes Up, so that routing protocol does
>>> not come Up on a dirty errored link, so the blocking of the adjacency
>>> capabilities solution here nicely solves the issue.
>>>
>>>
>>> In this thread it has been mentioned maybe a CLI timer knob as far as
>>> implementation for delay knob makes sense.
>>>
>>
>> KT> Please check Sec 5. The terminology used might differ between
>> implementations.
>>
>
>  Gyan> I see mention of BFD dampening for poor link quality issues.
> However if BFD comes Up and establishes that would mean that at the time
> the BFD session is established as control packet were received based on
> interval and timer that the link is in a good state at the time of session
> establishment. The main issue we are solving here is not allowing OSPF to
> come Up on a link with packet forwarding issues. At that point when OSPF
> FSM is initiated and goes to Full state we know the link to be stable as
> BFD was established successfully prior.  If a link were in bad shape or
> flapping BFD would not establish and that is the crux of this draft's
> problem statement.  So I think to some degree this draft does preclude the
> need for BFD dampening.
>
>
>> I would like to note that one workaround used by operators is using RFC
>>> 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
>>> control protocol is in fact blocked and BFD comes up first.  This is a
>>> workaround used putting even individual single links in a bundle to prevent
>>> the issue from 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Les,

> the way to address that is to put a delay either before BFD attempts to
> bring up a new session

No this will not work. The BFD session must be fully up and BFD has to have
a chance for normal operation for X units of time. (By normal I mean with
existing or new BFD extensions which is out of scope of this discussion).

> or a delay after achieving UP state before it signals UP to its clients –
> such as OSPF.

This is exactly what I am describing. Except you think that now BFD should
hold off on a per-client or per-OSPF-neighbor basis, and I think that it is
the clients who should hold off from reacting to the signaled UP state.

The way you are suggesting puts unnecessary burden on BFD where from BFD
POV link went up at t0 and never went down. It is the client who may need
to delay his action depending on the nature of the client.

At least we got to the point that both of us are clear on the topic.
Before that, seeing dampening or hold times brought into the discussion
only indicated that there was a mismatch in understanding. As for your
examples, imagine that this is a new interface and BFD was never up on it
before. The behavior should be identical.
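
The condition both positions share, that BFD must stay Up for some hold period before the adjacency proceeds, can be sketched as follows. This is a minimal illustration under stated assumptions: the `UpHold` class, its method names, and the 2-second hold are hypothetical and do not come from the draft or from any BFD or OSPF implementation. Whether such a gate lives in BFD (delaying the Up signal to all clients) or in each client (delaying its reaction) is the point under debate; the sketch takes no side.

```python
class UpHold:
    """Report ready only after BFD has stayed Up for hold_s seconds."""

    def __init__(self, hold_s):
        self.hold_s = hold_s
        self.up_since = None  # time of the current uninterrupted Up period

    def on_bfd(self, is_up, now):
        """Record a BFD state report observed at time `now` (seconds)."""
        if is_up:
            if self.up_since is None:
                self.up_since = now
        else:
            # Any Down report restarts the hold period.
            self.up_since = None

    def ready(self, now):
        """True once BFD has been continuously Up for the hold period."""
        return self.up_since is not None and now - self.up_since >= self.hold_s


gate = UpHold(hold_s=2.0)
gate.on_bfd(True, now=0.0)
print(gate.ready(now=0.2))   # False: Up for only 0.2 s
gate.on_bfd(False, now=0.2)  # session drops 200 ms after coming Up
gate.on_bfd(True, now=1.0)
print(gate.ready(now=2.5))   # False: hold period restarted at 1.0 s
print(gate.ready(now=3.0))   # True: Up for 2.0 s without a failure
```

The same gate works for a brand-new interface where BFD was never up before, since it keys only on the start of the current Up period.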

Thx,
R.

On Sun, Jan 30, 2022 at 8:38 PM Les Ginsberg (ginsberg) 
wrote:

> Robert –
>
>
>
> Here is what you said (emphasis added):
>
>
>
> 
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
> *allowing
> BFD for more testing (with various parameters (for example increasing test
> packet size in some discrete steps)* before OSPF is happy to bring the
> adj. up.
>
> 
>
>
>
> Point #1: If you want BFD to do more testing (such as MTU testing) then
> clearly you need extensions to BFD (such as
> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>
>
>
> Point #2: The existing timers (as Ketan points out are mentioned in
> Section 5) are applied today at the OSPF level precisely because OSPF does
> not currently have strict-mode operation. So in a flapping scenario you
> could see the following behavior:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)OSPF comes back up
>
> d)Link is still unstable – so traffic is being dropped some of the time –
> but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
> enough to keep the OSPF adjacency up)
>
>
>
> So some implementations have chosen to insert a delay following “b”. This
> doesn’t guarantee stability, but hopefully makes it less likely. And
> because OSPF today does NOT wait for BFD to come up, the delay has to be
> implemented at the OSPF level.
>
>
>
> Once you have strict mode support, the sequence becomes:
>
>
>
> a)BFD goes down
>
> b)OSPF goes down in response to BFD
>
> c)BFD comes back up
>
> d)OSPF comes back up
>
>
>
> Now, if the concern is that BFD comes back up while the link is still
> unstable, the way to address that is to put a delay either before BFD
> attempts to bring up a new session or a delay after achieving UP state
> before it signals UP to its clients – such as OSPF. This is a better
> solution because all BFD clients benefit from this. And if the link is still
> unstable, it is more likely that the BFD session will go down during the
> delay period than it would be for OSPF because the BFD timers are
> significantly more aggressive.
>
> (BTW, this behavior can be done w/o a BFD protocol extension – it is
> purely an implementation choice.)
>
>
>
> From a design perspective, dampening is always best done at the lowest
> layer possible. In most cases, interface layer dampening is best. If that
> is not reliable for some reason, then move one layer up – not two layers up.
>
>
>
>Les
>
>
>
>
>
> *From:* Robert Raszuk 
> *Sent:* Sunday, January 30, 2022 10:05 AM
> *To:* Ketan Talaulikar 
> *Cc:* Les Ginsberg (ginsberg) ; Acee Lindem (acee) <
> a...@cisco.com>; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>; lsr 
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Ketan,
>
>
>
> I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>
>
>
> BFD dampening or hold-time are completely orthogonal to my point. Both
> have nothing to do with it.
>
>
>
> Those timers only fire when BFD goes down. In my example BFD does not go
> down. But we want to bring up the client adj. only after X ms/sec/min etc
> ...of normal BFD operation if no failure is detected during that timer.
>
>
>
> This draft indicates that OSPF adjacency will "advance" in the neighbor
> FSM only after BFD reports UP.
>
>
>
> And that is exactly too soon. In fact if you do that today without waiting
> some time (if you retire the current OSPF timer) you will not help at all
> in the case you are trying to address.
>
>
>
> Reason 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Gyan Mishra
Hi Ketan

Welcome.  Responses in-line

Kind Regards

On Sun, Jan 30, 2022 at 12:34 PM Ketan Talaulikar 
wrote:

> Hi Gyan,
>
> Thanks for your review and your comments/feedback. Please check inline
> below for responses.
>
>
> On Sun, Jan 30, 2022 at 12:29 PM Gyan Mishra 
> wrote:
>
>>
>> I support WG Adoption of this draft.
>>
>> This is a real-world problem with BFD that operators have to deal with:
>> the OSPF adjacency comes up before the BFD session establishes.  On a link
>> with L1 issues, a dirty link, or poor link quality, the BFD session then
>> establishes and immediately takes the link down.  Tight BFD timers with a
>> registered client protocol further exacerbate the issue, with link flaps
>> resulting in IGP instability.
>>
>> This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD enabled
>> TLV.
>>
>> This issue exists with BGP as well where the protocol registered with BFD
>> bootstrapped per RFC 5882 comes up before BFD resulting in instability.  I
>> believe this gap still exists for BGP.
>>
>
> KT> We have
> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-bfd-strict-mode/
>
>
Gyan> Understood to solve this issue for OSPF

>
>> When BFD comes up it performs link integrity test before session
>> establishment to detect dirty errored link does not come Up.
>>
>> RFC 5880 BFD 3-way session establishment and  does the link integrity and
>>  quality test by sending the BFD control packets to validate bi-directional
>> forwarding liveliness detection over any media.
>>
>> The case mentioned in this draft where the link is dirty, MTU issues or
>> forwarding plane issues exist that cause BFD not to establish resulting in
>> the use of default protocol timers and slow convergence is a major issue
>> for operators being solved with this draft as well as mentioned above where
>> BFD does come up after the IGP is just as bad if not worse if the link is a
>> dirty errored link resulting in flapping link.
>>
>> The main point here as I mentioned is that BFD must validate the link
>> integrity before routing protocol comes Up, so that routing protocol does
>> not come Up on a dirty errored link, so the blocking of the adjacency
>> capabilities solution here nicely solves the issue.
>>
>>
>> In this thread it has been mentioned maybe a CLI timer knob as far as
>> implementation for delay knob makes sense.
>>
>
> KT> Please check Sec 5. The terminology used might differ between
> implementations.
>

 Gyan> I see mention of BFD dampening for poor link quality issues.
However if BFD comes Up and establishes that would mean that at the time
the BFD session is established as control packet were received based on
interval and timer that the link is in a good state at the time of session
establishment. The main issue we are solving here is not allowing OSPF to
come Up on a link with packet forwarding issues. At that point when OSPF
FSM is initiated and goes to Full state we know the link to be stable as
BFD was established successfully prior.  If a link were in bad shape or
flapping, BFD would not establish, and that is the crux of this draft's
problem statement.  So I think to some degree this draft does preclude the
need for BFD dampening.


> I would like to note that one workaround used by operators is using RFC
>> 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
>> control protocol is in fact blocked and BFD comes up first.  This is a
>> workaround that puts even individual single links in a bundle to prevent
>> the issue from happening.
>>
>> I would like to note that RFC 5882 Generic Application of BFD does state
>> that if all neighbors support BFD then the registered control protocol
>> being bootstrapped should be blocked from coming up until BFD session is
>> established.  Only in case where all neighbors on a LAN do not have BFD
>> enabled, blocking the control protocol from coming Up would prevent the
>> control protocol from coming Up on neighbors that don’t have BFD enabled.
>>
>> So the way I read it, implementations following BFD RFC 5882 should have
>> been blocking OSPF or ISIS from coming Up before BFD comes up, without
>> requiring a specification for the explicit block.  It appears that most
>> vendor implementations did not follow RFC 5882 in this regard, hence the
>> need operators have for this important draft.  I think this implementation
>> discrepancy happened because the normative language is SHOULD block rather
>> than MUST block.
>>
>
> KT> There are implementations that provided knobs for this strict
> mode-like behavior. The draft specifies the procedures for the same to be
> standardized for multi-vendor interop and more importantly the ability to
> signal/negotiate this mode of operation with neighbors.
>
>
Gyan> It may be good to reference RFC 5882 and state what the BFD
specification  says with regards to 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Les Ginsberg (ginsberg)
Robert –

Here is what you said (emphasis added):


But the timer I am suggesting is not related to BFD operation, but to OSPF 
(and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
allowing BFD for more testing (with various parameters (for example increasing 
test packet size in some discrete steps) before OSPF is happy to bring the adj. 
up.


Point #1: If you want BFD to do more testing (such as MTU testing) then clearly 
you need extensions to BFD (such as 
https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )

Point #2: The existing timers (as Ketan points out are mentioned in Section 5) 
are applied today at the OSPF level precisely because OSPF does not currently 
have strict-mode operation. So in a flapping scenario you could see the 
following behavior:

a) BFD goes down
b) OSPF goes down in response to BFD
c) OSPF comes back up
d) Link is still unstable – so traffic is being dropped some of the time – but 
perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often enough to 
keep the OSPF adjacency up)

So some implementations have chosen to insert a delay following “b”. This 
doesn’t guarantee stability, but hopefully makes it less likely. And because 
OSPF today does NOT wait for BFD to come up, the delay has to be implemented at 
the OSPF level.

Once you have strict mode support, the sequence becomes:

a) BFD goes down
b) OSPF goes down in response to BFD
c) BFD comes back up
d) OSPF comes back up

Now, if the concern is that BFD comes back up while the link is still unstable, 
the way to address that is to put a delay either before BFD attempts to bring 
up a new session or a delay after achieving UP state before it signals UP to 
its clients – such as OSPF. This is a better solution because all BFD clients 
benefit from this. And if the link is still unstable, it is more likely that the 
BFD session will go down during the delay period than it would be for OSPF 
because the BFD timers are significantly more aggressive.
(BTW, this behavior can be done w/o a BFD protocol extension – it is purely an 
implementation choice.)

From a design perspective, dampening is always best done at the lowest layer 
possible. In most cases, interface layer dampening is best. If that is not 
reliable for some reason, then move one layer up – not two layers up.
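
The second option above (delay after reaching UP before signalling UP to clients) can be sketched as follows. This is a minimal illustration in Python, not any vendor's implementation: the class name, the `up_delay` knob and the string states are all hypothetical, and time is passed in explicitly so the logic can be followed without a real clock.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class BfdUpNotifier:
    """Withholds the UP notification to clients until the session has
    stayed up for `up_delay` seconds (hypothetical knob)."""
    up_delay: float
    clients: List[Callable[[str], None]] = field(default_factory=list)
    _up_since: Optional[float] = None
    _signalled: bool = False

    def on_session_state(self, state: str, now: float) -> None:
        """Feed the raw BFD session state ("UP" or "DOWN")."""
        if state == "UP":
            if self._up_since is None:
                self._up_since = now
        else:
            # Any drop restarts the delay. Clients only hear DOWN if they
            # had previously been told UP.
            self._up_since = None
            if self._signalled:
                self._signalled = False
                self._notify("DOWN")

    def tick(self, now: float) -> None:
        """Called periodically; signals UP once the delay has elapsed."""
        if (self._up_since is not None and not self._signalled
                and now - self._up_since >= self.up_delay):
            self._signalled = True
            self._notify("UP")

    def _notify(self, state: str) -> None:
        for client in self.clients:
            client(state)
```

Because the delay sits in front of every registered callback, all clients (OSPF, ISIS, BGP) see the same filtered UP event, which is the benefit argued for above.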

   Les


From: Robert Raszuk 
Sent: Sunday, January 30, 2022 10:05 AM
To: Ketan Talaulikar 
Cc: Les Ginsberg (ginsberg) ; Acee Lindem (acee) 
; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu 
; lsr 
Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - 
draft-ietf-lsr-ospf-bfd-strict-mode-04

Hi Ketan,

I would like to point out that the draft discusses the BFD "dampening" or 
"hold-down" mechanism in Sec 5. We are aware of BFD implementations that 
include such mechanisms in a protocol-agnostic manner.

BFD dampening or hold-time are completely orthogonal to my point. Both have 
nothing to do with it.

Those timers only fire when BFD goes down. In my example BFD does not go down. 
But we want to bring up the client adj. only after X ms/sec/min etc ...of 
normal BFD operation if no failure is detected during that timer.

This draft indicates that OSPF adjacency will "advance" in the neighbor FSM 
only after BFD reports UP.

And that is exactly too soon. In fact if you do that today without waiting some 
time (if you retire the current OSPF timer) you will not help at all in the 
case you are trying to address.

Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF adj. 
will get already established. It is really pretty simple.

Thx,
Robert.

PS. And yes I think ISIS should also get fixed in that respect.


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Ketan,

> It explains the scenario of a noisy link that experiences traffic drops.

The point is that BFD may or may not detect noisy links or links with
"degraded or poor quality". There are many failure scenarios - especially
brownouts - where BFD will continue to run just fine over a link and where
at the same time user data will experience very poor performance.

So stating in the RFC that BFD may help to detect such cases is simply very
misleading (to say it gently :).

And you are stating so exactly in the below sentence:

*"In certain other scenarios, a degraded or poor quality link will allow
OSPF adjacency formation to succeed*
*but the BFD session establishment will fail or the BFD session will flap.*

Thx,
R.


On Sun, Jan 30, 2022 at 6:03 PM Ketan Talaulikar 
wrote:

> Hi Robert,
>
> Thanks for your review and comments.
>
> This email is in response to your first point "overpromise".
>
> First, there is no text in the draft that "overpromises" that the strict
> mode of operation detects "all forwarding" issues. We are talking about BFD
> and its capabilities are well-known. It is not in the scope of this
> document to discuss BFD capabilities and shortcomings (e.g. the MTU issue
> you describe).
>
> The draft text that you have asked to remove is important. It explains the
> scenario of a noisy link that experiences traffic drops. I am aware of
> issues in production networks, where we've had OSPF adjacency flaps
> continuously or sporadically due to OSPF adjacency coming up somehow but
> then BFD bringing it down. This causes routing churn and service
> degradation. This is one of the key drivers for this draft.
>
> However, I welcome any text clarifications/suggestions for improving the
> document.
>
> Thanks,
> Ketan
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Ketan,

I would like to point out that the draft discusses the BFD "dampening" or
> "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
> include such mechanisms in a protocol-agnostic manner.
>

BFD dampening or hold-time are completely orthogonal to my point. Both have
nothing to do with it.

Those timers only fire when BFD goes down. In my example BFD does not go
down. But we want to bring up the client adj. only after X ms/sec/min etc
...of normal BFD operation if no failure is detected during that timer.

This draft indicates that OSPF adjacency will "advance" in the neighbor FSM
> only after BFD reports UP.
>

And that is exactly too soon. In fact if you do that today without waiting
some time (if you retire the current OSPF timer) you will not help at all
in the case you are trying to address.

Reason being that perhaps 200 ms after BFD UP it will go down, but OSPF
adj. will get already established. It is really pretty simple.
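
To illustrate the timing argument, here is a small sketch in Python. It assumes a hypothetical client-side wait of `wait` seconds counted from BFD UP, and a made-up event timeline; none of these names come from the draft.

```python
from typing import Iterable, Optional, Tuple


def adjacency_start_time(events: Iterable[Tuple[float, str]],
                         wait: float,
                         horizon: float) -> Optional[float]:
    """Earliest time the client may start its adjacency.

    `events` is a list of (time, "UP"/"DOWN") BFD transitions. The client
    proceeds only after BFD has been continuously UP for `wait` seconds.
    Returns None if that never happens before `horizon`.
    """
    up_since: Optional[float] = None
    for t, state in sorted(events):
        # Did the current UP period already satisfy the wait?
        if up_since is not None and t >= up_since + wait:
            return up_since + wait
        if state == "UP":
            if up_since is None:
                up_since = t
        else:
            up_since = None  # a drop restarts the wait
    if up_since is not None and up_since + wait <= horizon:
        return up_since + wait
    return None


# BFD comes UP, drops 200 ms later, then recovers at t=3 s.
timeline = [(0.0, "UP"), (0.2, "DOWN"), (3.0, "UP")]
no_wait = adjacency_start_time(timeline, wait=0.0, horizon=10.0)
with_wait = adjacency_start_time(timeline, wait=1.0, horizon=10.0)
```

With no wait, the adjacency starts at t=0 and is hit by the drop at 200 ms. With a one-second wait, it starts only at t=4, after the flap has passed.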

Thx,
Robert.

PS. And yes I think ISIS should also get fixed in that respect.



Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Gyan,

Thanks for your review and your comments/feedback. Please check inline
below for responses.


On Sun, Jan 30, 2022 at 12:29 PM Gyan Mishra  wrote:

>
> I support WG Adoption of this draft.
>
> This is a real-world problem with BFD that operators have to deal with:
> the OSPF adjacency comes up before the BFD session establishes.  On a link
> with L1 issues, a dirty link, or poor link quality, the BFD session then
> establishes and immediately takes the link down.  Tight BFD timers with a
> registered client protocol further exacerbate the issue, with link flaps
> resulting in IGP instability.
>
> This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD enabled
> TLV.
>
> This issue exists with BGP as well where the protocol registered with BFD
> bootstrapped per RFC 5882 comes up before BFD resulting in instability.  I
> believe this gap still exists for BGP.
>

KT> We have
https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-bfd-strict-mode/


>
> When BFD comes up it performs link integrity test before session
> establishment to detect dirty errored link does not come Up.
>
> RFC 5880 BFD 3-way session establishment and  does the link integrity and
>  quality test by sending the BFD control packets to validate bi-directional
> forwarding liveliness detection over any media.
>
> The case mentioned in this draft where the link is dirty, MTU issues or
> forwarding plane issues exist that cause BFD not to establish resulting in
> the use of default protocol timers and slow convergence is a major issue
> for operators being solved with this draft as well as mentioned above where
> BFD does come up after the IGP is just as bad if not worse if the link is a
> dirty errored link resulting in flapping link.
>
> The main point here as I mentioned is that BFD must validate the link
> integrity before routing protocol comes Up, so that routing protocol does
> not come Up on a dirty errored link, so the blocking of the adjacency
> capabilities solution here nicely solves the issue.
>
>
> In this thread it has been mentioned maybe a CLI timer knob as far as
> implementation for delay knob makes sense.
>

KT> Please check Sec 5. The terminology used might differ between
implementations.


>
> I would like to note that one workaround used by operators is using RFC
> 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
> control protocol is in fact blocked and BFD comes up first.  This is a
> workaround that puts even individual single links in a bundle to prevent
> the issue from happening.
>
> I would like to note that RFC 5882 Generic Application of BFD does state
> that if all neighbors support BFD then the registered control protocol
> being bootstrapped should be blocked from coming up until BFD session is
> established.  Only in case where all neighbors on a LAN do not have BFD
> enabled, blocking the control protocol from coming Up would prevent the
> control protocol from coming Up on neighbors that don’t have BFD enabled.
>
> So the way I read it, implementations following BFD RFC 5882 should have
> been blocking OSPF or ISIS from coming Up before BFD comes up, without
> requiring a specification for the explicit block.  It appears that most
> vendor implementations did not follow RFC 5882 in this regard, hence the
> need operators have for this important draft.  I think this implementation
> discrepancy happened because the normative language is SHOULD block rather
> than MUST block.
>

KT> There are implementations that provided knobs for this strict mode-like
behavior. The draft specifies the procedures for the same to be
standardized for multi-vendor interop and more importantly the ability to
signal/negotiate this mode of operation with neighbors.


>
> RFC 5882 excerpt below:
>
> 4.1.  Adjacency Establishment
>
>    If the session state on either the local or remote system (if known)
>    is AdminDown, BFD has been administratively disabled, and the
>    establishment of a control protocol adjacency MUST be allowed.
>
>    BFD sessions are typically bootstrapped by the control protocol,
>    using the mechanism (discovery, configuration) used by the control
>    protocol to find neighbors.  Note that it is possible in some failure
>    scenarios for the network to be in a state such that the control
>    protocol is capable of coming up, but the BFD session cannot be
>    established, and, more particularly, data cannot be forwarded.  To
>    avoid this situation, it would be beneficial not to allow the control
>    protocol to establish a neighbor adjacency.  However, this would
>    preclude the operation of the control protocol in an environment in
>    which not all systems support BFD.
>
>    Therefore, the establishment of control protocol adjacencies SHOULD
>    be blocked 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Aijun,

There is a need for a BFD session to be established between neighboring
routers which directly forward data between them to ensure reachability
between them. That is my understanding of various implementations and
deployments at operators. This is independent of the strict mode of
operation.

Perhaps you have a different requirement for "optimization of BFD sessions
on multi-access networks"? If so, it would be clearer if you could put that
requirement/proposal together as a draft for the WG to review. Also, that
would be in any way independent of this specification since what you are
referring to is the base use of BFD by OSPF.
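
For reference, the gap between BFD sessions and OSPF adjacencies on a multi-access segment can be put in numbers. A minimal sketch in Python, assuming a single Broadcast/NBMA segment with n routers, one elected DR and one BDR, and a BFD session between every pair of neighbors (the function names are only illustrative):

```python
def full_mesh_bfd_sessions(n: int) -> int:
    """One BFD session per pair of routers on the segment."""
    return n * (n - 1) // 2


def dr_bdr_adjacencies(n: int) -> int:
    """Full OSPF adjacencies when routers are adjacent only to DR and BDR.

    Each of the n-2 DRothers is adjacent to both the DR and the BDR,
    and the DR and BDR are adjacent to each other: 2(n-2) + 1 = 2n - 3.
    """
    if n < 2:
        return 0
    return 2 * n - 3


for routers in (2, 3, 5, 10):
    print(routers, full_mesh_bfd_sessions(routers), dr_bdr_adjacencies(routers))
```

The difference between the two counts is the number of DRother-to-DRother pairs, which is where a BFD session exists without a matching full adjacency.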

Thanks,
Ketan


On Sun, Jan 30, 2022 at 7:58 AM Aijun Wang 
wrote:

> Hi, Acee and Ketan:
> No, I don’t want to change the NBMA/Broadcast in OSPF to P2MP mode.
> What I want to express is that you brought up the full mesh BFD sessions
> among the routers within such network type. Is it necessary to bring some
> of them(the BFD sessions between DRothers) to DOWN after the OSPF adjacency
> are established between the DRother and DR/BDR router?
> If the BFD session is bootstrapped after the OSPF adjacency is
> established, there will be no such extra/useless BFD sessions
>
> Aijun Wang
> China Telecom
>
> On Jan 30, 2022, at 02:45, Acee Lindem (acee)  wrote:
>
> 
>
> Speaking as WG member:
>
>
>
> Hi Aijun,
>
> If you want a per-neighbor state and route, you have to use P2MP. The
> scope of this draft isn’t to try and make the NBMA/Broadcast model something
> that it is not. This should be common knowledge and the draft needn’t
> address it. Those of us who remember ATM emulated LANs (which were not
> always symmetrically reliable) will recall using P2MP on an inherently
> multi-access network.
>
> Acee
>
>
>
> *From: *Aijun Wang 
> *Date: *Saturday, January 29, 2022 at 3:46 AM
> *To: *'Ketan Talaulikar' 
> *Cc: *"lsr@ietf.org" , "
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org" <
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org>, 'Albert Fu' <
> af...@bloomberg.net>, Acee Lindem 
> *Subject: *RE: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi, Ketan:
>
> OK, then back to my original question:
>
> If one of the BFD sessions (between DRothers) is DOWN, will it bring DOWN
> the OSPF adjacency (between the DRother and DR/BDR)?
>
> If not, then the traffic between these DRothers will be lost; If yes, it
> seems strange, because the BFD session between the DRother and DR/BDR may
> be still UP.
>
> I think there is some mismatch here between the BFD sessions and the OSPF
> adjacencies in a Broadcast/NBMA network, so some clarification of the
> procedures is needed.
>
>
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* lsr-boun...@ietf.org  *On Behalf Of *Ketan
> Talaulikar
> *Sent:* Saturday, January 29, 2022 4:22 PM
> *To:* Aijun Wang 
> *Cc:* lsr@ietf.org; draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert
> Fu ; Acee Lindem (acee)  40cisco@dmarc.ietf.org>
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Aijun,
>
>
>
> The choice of the term "adjacency" was not accurate in my previous
> response to you. I meant "neighborship".
>
>
>
> That said, the substance of my response still remains the same.
>
>
>
> Thanks,
>
> Ketan
>
>
>
>
>
> On Sat, Jan 29, 2022 at 1:42 PM Aijun Wang 
> wrote:
>
> Hi, Ketan:
>
> For the Broadcast/NBMA network type, if you establish BFD sessions
> before the DR/BDR selection, then there will be full mesh BFD sessions
> within the routers on such media type?
>
> Instead of establishing the BFD sessions with the DR/BDR only, the same as the
> OSPF adjacency relationship? If so, if one of the BFD sessions that is not with
> the DR/BDR is DOWN, what’s the action then?
>
>
>
> KT> I think there is perhaps a misunderstanding of the purpose of BFD use
> with OSPF. Perhaps a careful reading of RFC5882 would help? In short, BFD
> is used to verify bidirectional connectivity between neighbors to ensure
> data may be forwarded between them. OSPF adjacency is built between every
> router in a LAN since they can directly forward packets between themselves.
>
> *[WAJ] In Broadcast/NBMA network, OSPF adjacency is built only between the
> routers and DR/BDR.  *
>
>
>
> Thanks,
>
> Ketan
>
>
>
>
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* Ketan Talaulikar 
> *Sent:* Saturday, January 29, 2022 11:13 AM
> *To:* Aijun Wang 
> *Cc:* Acee Lindem (acee) ; lsr@ietf.org;
> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu <
> af...@bloomberg.net>
> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
>
>
> Hi Aijun,
>
>
>
> Please check inline below.
>
>
>
>
>
> On Sat, Jan 29, 2022 at 7:38 AM Aijun Wang 
> wrote:
>
> Hi, Acee:
>
>
>
> Yes. Then I think the sentence in
> 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Muthu,

Thanks for your review and your support.

Regarding your question, I would like to clarify that this document doesn't
specify BFD operations with OSPF. That was done by RFC5882. Indeed for
virtual links, there would need to be a BFD multi-hop session and the same
would apply to p-t-p unnumbered.

However, I am not sure what specific applicability or operations need to be
called out for Strict Mode of operations for those links.
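
As a rough illustration of the single-hop versus multi-hop distinction raised in the question, here is a short Python sketch. The link-type names and the function are hypothetical. The UDP destination ports are the ones assigned to single-hop BFD (RFC 5881) and multi-hop BFD (RFC 5883). Unnumbered interfaces are deliberately left out, since this thread does not settle how they should be treated.

```python
from enum import Enum
from typing import Tuple

BFD_SINGLE_HOP_PORT = 3784  # RFC 5881
BFD_MULTI_HOP_PORT = 4784   # RFC 5883


class LinkType(Enum):
    """Hypothetical labels for OSPF link types."""
    BROADCAST = "broadcast"
    POINT_TO_POINT = "p2p"
    VIRTUAL_LINK = "virtual-link"


def bfd_session_kind(link: LinkType) -> Tuple[str, int]:
    """Pick the BFD flavor for a link: a virtual link crosses a transit
    area, so its endpoints are not directly connected."""
    if link is LinkType.VIRTUAL_LINK:
        return ("multi-hop", BFD_MULTI_HOP_PORT)
    return ("single-hop", BFD_SINGLE_HOP_PORT)
```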

Thanks,
Ketan


On Sun, Jan 30, 2022 at 12:52 PM Muthu Arul Mozhi Perumal <
muthu.a...@gmail.com> wrote:

> Hi,
>
> I support the draft. A quick question:
> Should it describe the applicability of the mechanism over OSPF virtual
> links and unnumbered interfaces? With virtual links, one would have to
> establish a multi-hop BFD session, so it is slightly different from a BFD
> operational standpoint. For e.g, capability to support single-hop BFD may
> not translate to the capability to support multi-hop BFD..
>
> Regards,
> Muthu
>
> On Thu, Jan 27, 2022 at 10:38 PM Acee Lindem (acee)  40cisco@dmarc.ietf.org> wrote:
>
>> LSR WG,
>>
>>
>>
>> This begins a two week last call for the subject draft. Please indicate
>> your support or objection on this list prior to 12:00 AM UTC on February
>> 11th, 2022. Also, review comments are certainly welcome.
>>
>> Thanks,
>> Acee
>>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Robert,

If I've understood correctly, your point is :
a) That there are several other mechanisms for "link" verification that do
various levels of monitoring from basic reachability to more advanced
metrics (BFD is just one of them)
b) That there are several protocols that can leverage (a) before their
protocol sessions are established (OSPF is just one of them)

One can argue that implementations can (and do) some of the above with
implementation-specific mechanisms (e.g. similar to object-tracking). There
is the part of what monitoring mechanism to be used and then their
respective parameters. This could be added to protocols or done via
provisioning (cfg knobs).

This draft covers just the indication of the use of BFD with OSPF. There is
similar work already published as RFC for ISIS (that Les points out) and
there is a WG draft for the same in BGP.

IMHO, there is value in progressing this draft while the broader
discussions for your points continue.

Thanks,
Ketan


On Sun, Jan 30, 2022 at 8:43 PM Robert Raszuk  wrote:

> Hi Albert,
>
> Thank you for confirming that BFD needs to be kept simple and there is
> already reluctance to add to it. So Les's suggestion to put additional
> logic into BFD is likely not a realistic one.
>
> Your note also confirms my points that there is likely to be different
> holdtime timer requirements depending on the link type and peer type.
>
> With that please notice what Les said:
>
> *And once OSPF strict-mode support becomes widely deployed there won’t be
> a need for such a timer for OSPF either.*
>
> That to me clearly means that he is going to retire the current timer once
> we get that RFC out of the door. That is why I proposed to add it to the
> document. But of course authors will decide.
>
>
> Dear WG,
>
> After thinking about this draft I would suggest that what we really need
> is not a point solution, but a general mechanism which will allow us to
> bring the protocol full up after some time from the moment the test suite
> is up.
>
> BFD is only one way to detect if the path to a peer is up or down. There
> are shipping alternatives to this today which could be used instead of BFD
> (for example any form of object tracking). As the current version of the
> draft says there is need to not only detect if the path is up or down but
> also if it meets quality expectations. New wave of INT tools is becoming
> available to allow us to measure those characteristics today.
>
> So while the draft could still bring BFD as one example of such a tool -
> in my opinion it deserves to be generalized a bit to allow other ways to
> determine if the link over which we are to establish IGP adj. meets the
> requirements.
>
> Kind regards,
> Robert
>
>
> On Sun, Jan 30, 2022 at 3:28 PM Albert Fu (BLOOMBERG/ 120 PARK) <
> af...@bloomberg.net> wrote:
>
>> I feel it is better to keep the standard simple and not add timer delay
>> as part of BFD strict draft, as different customers may have different
>> requirements, and there may also be vendor/platform dependency.
>>
>> For example, in the core where there are a lot of link
>> redundancy/diversity, we could afford to have longer time delay since we
>> can tolerate multiple link failures. For majority of the edge connectivity,
>> there are typically only 2 links - in this situation, we would want a lower
>> time delay.
>>
>> I found the current BFD Strict holddown/dampening mechanism as
>> implemented by the two vendors sufficient for our need. If there is an
>> issue causing BFD to fail during this time, OSPF will take longer time to
>> come up. And the delay only needs to be configured on one side.
>>
>> So, in current implementation, there's some sanity "check" that BFD is
>> stable for a period of time before OSPF can come up.
>>
>> In discussion with the BFD working group on my other MTU draft, there's a
>> keen interest among the WG to keep the BFD simple.
>>
>>
>>
>>
>> From: rob...@raszuk.net At: 01/29/22 15:20:06 UTC-5:00
>> To: ginsb...@cisco.com
>> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) ,
>> a...@cisco.com, ketant.i...@gmail.com,
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org, lsr@ietf.org
>> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
>> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>> Hi Les,
>>
>> > Discussion of how to make BFD failure detection more robust belongs in
>> the BFD WG
>> > If you do not want the BFD session to come back up too quickly after a
>> failure
>>
>> Nothing I suggested is related to any of the above.
>>
>> Let me perhaps provide a very simple example.
>>
>> BFD being used is *AS*IS*.
>>
>> All the operator wants is to run it for say X sec without ever going
>> down before bringing OSPF adj up.
>>
>> That timer and its consistency on both ends clearly belongs to OSPF not
>> to BFD.
>>
>> Now what happens within those 30 sec, what BFD packets are formed and how
>> they are exchanged is all BFD business - but I am not suggesting to include
>> any of those in this 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Robert,

Thanks for your review again and your comments/discussions.

This thread is about your second point "timer".

I would like to point out that the draft discusses the BFD "dampening" or
"hold-down" mechanism in Sec 5. We are aware of BFD implementations that
include such mechanisms in a protocol-agnostic manner.

This draft indicates that OSPF adjacency will "advance" in the neighbor FSM
only after BFD reports UP.  The BFD mechanisms/timers are outside the scope
of this document.
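
The gating described above can be sketched as a simplified neighbor state transition. This is only an illustration in Python: the state set is reduced to three states, the function name is hypothetical, and holding the neighbor in Init until BFD is UP is an assumption about where the FSM is gated rather than a quote from the draft.

```python
from enum import Enum


class NbrState(Enum):
    DOWN = "Down"
    INIT = "Init"
    TWO_WAY = "2-Way"


def next_state(current: NbrState,
               two_way_received: bool,
               strict_mode: bool,
               bfd_up: bool) -> NbrState:
    """Advance from Init to 2-Way on 2-WayReceived.

    In strict mode the transition is held back until BFD reports UP
    (assumed gating point). Without strict mode BFD plays no part here.
    """
    if current is NbrState.INIT and two_way_received:
        if strict_mode and not bfd_up:
            return NbrState.INIT
        return NbrState.TWO_WAY
    return current
```

Note that nothing in this sketch involves BFD timers or hold-down, consistent with the point that those are outside the scope of the document.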

Please also see a further response on this point in your latest email on
this thread.

Thanks,
Ketan


On Sun, Jan 30, 2022 at 2:19 AM Robert Raszuk  wrote:

> Hi Les,
>
> That timer and its consistency on both ends clearly belongs to OSPF not to
>> BFD.
>>
>
>
>> *[LES:] I disagree. The definition of UP state belongs to the BFD
>> protocol/implementation.*
>>
>> *If you don’t want BFD clients (like OSPF) to react “too quickly” then
>> build additional config/logic into your BFD implementation so it does not
>> signal UP state before additional criteria is met – do not make each BFD
>> client (and there could be multiple for a given session) configure its own
>> definition of BFD UP.*
>>
>
> I think we are looking at this from different perspectives.
>
> I am saying bring BFD UP and allow X seconds/minutes/hours to run a
> sequence of testing before bringing OSPF adj up.
>
> You are saying do not declare BFD as UP before all of that testing
> passes. That test sequence could be just running vanilla normal BFD for X
> seconds/minutes/hours.
>
> That would require introducing a completely new BFD state. Worse, that
> timer may be very different on a per type of interface basis as each
> interface type has completely different characteristics. Also such timer
> would need to have a different value on a per BFD client basis. (For
> example OSPF adj UP could be very different than PE-PE BFD for BGP as PULSE
> alternative :)
>
> Sorry I really do not think this belongs to BFD at all. It is a local
> client thing how long from t0 = BFD UP it will wait before proceeding
> further.
>
> And last but not least - such extended testing does not need to kick in
> every time interface flaps. Maybe the operator only wants to run it during
> maintenance windows once per day ? Or once per week ?
>
> But I am not going to even remotely hope I can convince you :) So let's
> forget it.
>
> Cheers,
> R/
>
>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Ketan Talaulikar
Hi Robert,

Thanks for your review and comments.

This email is in response to your first point "overpromise".

First, there is no text in the draft that "overpromises" that the strict
mode of operation detects "all forwarding" issues. We are talking about BFD,
and its capabilities are well known. It is not in the scope of this
document to discuss BFD capabilities and shortcomings (e.g. the MTU issue
you describe).

The draft text that you have asked to remove is important. It explains the
scenario of a noisy link that experiences traffic drops. I am aware of
issues in production networks where OSPF adjacencies have flapped
continuously or sporadically because the OSPF adjacency somehow came up but
BFD then brought it down. This causes routing churn and service
degradation. This is one of the key drivers for this draft.
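
The difference between the two behaviours can be shown with a toy model.
This is only a sketch and not the draft's state machine: the function name
adjacency_events and the "adj-up"/"adj-down" event strings are made up for
illustration, and BFD is reduced to a list of sampled UP/DOWN states.

```python
def adjacency_events(bfd_states, strict_mode):
    """Return the OSPF adjacency transitions caused by a sequence of BFD states.

    bfd_states:  BFD session state ("UP" or "DOWN") at successive instants.
    strict_mode: if True, OSPF waits for BFD to be UP before forming the
                 adjacency; if False, OSPF forms the adjacency regardless of
                 BFD and is torn down only when an established session fails.
    """
    events = []
    adjacency_up = False
    bfd_was_up = False
    for state in bfd_states:
        bfd_up = state == "UP"
        if not adjacency_up:
            # Strict mode gates adjacency formation on the BFD session.
            if bfd_up or not strict_mode:
                adjacency_up = True
                events.append("adj-up")
        elif bfd_was_up and not bfd_up:
            # An established BFD session failed: the adjacency is torn down.
            adjacency_up = False
            events.append("adj-down")
        bfd_was_up = bfd_up
    return events


# A link on which BFD never establishes.
bad_link = ["DOWN", "DOWN", "DOWN"]
print(adjacency_events(bad_link, strict_mode=False))
print(adjacency_events(bad_link, strict_mode=True))
```

In this model, without strict mode the adjacency forms over a link on which
BFD never comes up; with strict mode it never forms.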

However, I welcome any text clarifications/suggestions for improving the
document.

Thanks,
Ketan


On Sun, Jan 30, 2022 at 12:54 AM Robert Raszuk  wrote:

> Hi Acee,
>
> Can you suggest text which with you’d be happy? I’m sure the authors would
>> add you to the acknowledgements.
>>
>
> Actually instead of suggesting any new text I would suggest to delete the
> two below sentences and it will be fine:
>
> *"In certain other scenarios, a degraded or poor quality link will allow
> OSPF adjacency formation to succeed*
> *but the BFD session establishment will fail or the BFD session
> will flap.  In this case, traffic that gets *
> *forwarded over such a link may experience packet drops while the failure
> of the BFD session establishment *
> *would not enable fast routing convergence if the link were to go down or
> flap."*
>
> This could be described but I don’t think it should be normative. This
>> begs the question as to why a hold down timer is not a part of the BFD
>> protocol itself.
>>
>
> There is one - BFD calls it multiplier.
>
> But the timer I am suggesting is not related to BFD operation, but to OSPF
> (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about
> allowing BFD for more testing (with various parameters (for example
> increasing test packet size in some discrete steps) before OSPF is happy to
> bring the adj. up.
>
> Thx,
> R.
>
>>


Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Gyan Mishra
Hi Robert


On Sun, Jan 30, 2022 at 10:13 AM Robert Raszuk  wrote:

> Hi Albert,
>
> Thank you for confirming that BFD needs to be kept simple and there is
> already reluctance to add to it. So Les's suggestion to put additional
> logic into BFD is likely not a realistic one.
>
> Your note also confirms my points that there is likely to be different
> holdtime timer requirements depending on the link type and peer type.
>
> With that please notice what Les said:
>
> *And once OSPF strict-mode support becomes widely deployed there won’t be
> a need for such a timer for OSPF either.*
>
> That to me clearly means that he is going to retire the current timer once
> we get that RFC out of the door. That is why I proposed to add it to the
> document. But of course authors will decide.
>

Gyan> The goal is to make sure the link quality is good by ensuring the BFD
session comes up prior to OSPF coming Up.  Once BFD is Up, the path has been
verified by BFD to be good, so I don’t see any need for an additional
timer.  Is the idea to have a timer just for testing during initial deployment
by operators, and then remove it from the code for widespread implementation
and deployment?  Also, is the idea to have a stable delay period, similar to
the IGP sync delay timer, so that test tools can run successfully first?

>
>
> Dear WG,
>
> After thinking about this draft I would suggest that what we really need
> is not a point solution, but a general mechanism which will allow us to
> bring the protocol full up after some time from the moment the test suite
> is up.
>

Gyan> The IPPM WG has OAM and IOAM test tools that can work, as well as many
other performance monitoring tools that can be integrated, similar to the
integration we are seeing with Flex Algo.  However, BFD is bootstrapped to
the IGP for fast convergence.  So are you suggesting that a general test suite
of applications runs during the delay timer period after BFD comes up?

>
> BFD is only one way to detect if the path to a peer is up or down. There
> are shipping alternatives to this today which could be used instead of BFD
> (for example any form of object tracking). As the current version of the
> draft says there is need to not only detect if the path is up or down but
> also if it meets quality expectations. New wave of INT tools is becoming
> available to allow us to measure those characteristics today.
>

Gyan> Agreed, but are you suggesting that this test suite of performance
monitoring tools gets kicked off after BFD is up and runs during the stable
delay timer period before OSPF comes Up?  Would we add a similar delay
timer and test suite to ISIS to be consistent?

>
> So while the draft could still bring BFD as one example of such a tool -
> in my opinion it deserves to be generalized a bit to allow other ways to
> determine if the link over which we are to establish IGP adj. meets the
> requirements.
>

Gyan> BFD is a requirement for operators as it’s bootstrapped to the
IGP for fast convergence.  So any test suite we are talking about would be
external to BFD and would run during the IGP delay timer, or all the time
once any link is Up.

>
> Kind regards,
> Robert
>
>
> On Sun, Jan 30, 2022 at 3:28 PM Albert Fu (BLOOMBERG/ 120 PARK) <
> af...@bloomberg.net> wrote:
>
>> I feel it is better to keep the standard simple and not add timer delay
>> as part of BFD strict draft, as different customers may have different
>> requirements, and there may also be vendor/platform dependency.
>>
>> For example, in the core where there are a lot of link
>> redundancy/diversity, we could afford to have longer time delay since we
>> can tolerate multiple link failures. For majority of the edge connectivity,
>> there are typically only 2 links - in this situation, we would want a lower
>> time delay.
>>
>> I found the current BFD Strict holddown/dampening mechanism as
>> implemented by the two vendors sufficient for our need. If there is an
>> issue causing BFD to fail during this time, OSPF will take longer time to
>> come up. And the delay only needs to be configured on one side.
>>
>> So, in current implementation, there's some sanity "check" that BFD is
>> stable for a period of time before OSPF can come up.
>>
>> In discussion with the BFD working group on my other MTU draft, there's a
>> keen interest among the WG to keep the BFD simple.
>>
>>
>>
>>
>> From: rob...@raszuk.net At: 01/29/22 15:20:06 UTC-5:00
>> To: ginsb...@cisco.com
>> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) ,
>> a...@cisco.com, ketant.i...@gmail.com,
>> draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org, lsr@ietf.org
>> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
>> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>> Hi Les,
>>
>> > Discussion of how to make BFD failure detection more robust belongs in
>> the BFD WG
>> > If you do not want the BFD session to come back up too quickly after a
>> failure
>>
>> Nothing I suggested is related to any of the above.
>>
>> Let me perhaps provide a very simple 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Gyan Mishra
Sorry I meant publication.

I support publication.

Thanks

Gyan

On Sun, Jan 30, 2022 at 1:59 AM Gyan Mishra  wrote:

>
> I support WG Adoption of this draft.
>
> This is a real world problem that has existed with BFD that operators have
> to deal with where OSPF adjacency comes up before BFD session establishes
> resulting in cases where the link may have L1 issues or maybe a dirty link
> or poor link quality resulting in BFD session establishment followed by BFD
> immediately taking down the link.  With BFD tight timers with client
> protocol registered ends up further exacerbating the issue with link flaps
> resulting in IGP instability.
>
> This draft mirrors the ISIS block solution in RFC 6213   ISIS BFD enabled
> TLV.
>
> This issue exists with BGP as well where the protocol registered with BFD
> bootstrapped per RFC 5882 comes up before BFD resulting in instability.  I
> believe this gap still exists for BGP.
>
> When BFD comes up it performs link integrity test before session
> establishment to detect dirty errored link does not come Up.
>
> RFC 5880 BFD 3-way session establishment and  does the link integrity and
>  quality test by sending the BFD control packets to validate bi-directional
> forwarding liveliness detection over any media.
>
> The case mentioned in this draft where the link is dirty, MTU issues or
> forwarding plane issues exist that cause BFD not to establish resulting in
> the use of default protocol timers and slow convergence is a major issue
> for operators being solved with this draft as well as mentioned above where
> BFD does come up after the IGP is just as bad if not worse if the link is a
> dirty errored link resulting in flapping link.
>
> The main point here as I mentioned is that BFD must validate the link
> integrity before routing protocol comes Up, so that routing protocol does
> not come Up on a dirty errored link, so the blocking of the adjacency
> capabilities solution here nicely solves the issue.
>
>
> In this thread it has been mentioned maybe a CLI timer knob as far as
> implementation for delay knob makes sense.
>
> I would like to note that one workaround used by operators is using RFC
> 7130 BFD over bundle member called “BOB” or per link BFD,  and in that case
> control protocol is in fact blocked and BFD comes up first.  This is a
> workaround used putting even individual single links in a bundle to present
> the issue from happening.
>
> I would like to note that RFC 5882 Generic Application of BFD does state
> that if all neighbors support BFD then the registered control protocol
> being bootstrapped should be blocked from coming up until BFD session is
> established.  Only in case where all neighbors on a LAN do not have BFD
> enabled, blocking the control protocol from coming Up would prevent the
> control protocol from coming Up on neighbors that don’t have BFD enabled.
>
> So the way I read it implementations following BFD RFC 5882 should have
> been blocking OSPF or ISIS  protocol from coming Up before BFD comes up w/o
> having to require a specification for the explicit block.  Apparently most
> all vendors implementations did not follow RFC 5882 it appears with this
> regard and thus now the requirement for operators for this important
> draft.  I think this implementation discrepancy happened due to normative
> language SHOULD Block and not MUST Block is the problem.
>
> RFC 5882 excerpt below:
>
> 4.1 .  Adjacency 
> Establishment
>
>If the session state on either the local or remote system (if known)
>is AdminDown, BFD has been administratively disabled, and the
>establishment of a control protocol adjacency MUST be allowed.
>
>BFD sessions are typically bootstrapped by the control protocol,
>using the mechanism (discovery, configuration) used by the control
>protocol to find neighbors.  Note that it is possible in some failure
>scenarios for the network to be in a state such that the control
>protocol is capable of coming up, but the BFD session cannot be
>established, and, more particularly, data cannot be forwarded.  To
>avoid this situation, it would be beneficial not to allow the control
>protocol to establish a neighbor adjacency.  However, this would
>preclude the operation of the control protocol in an environment in
>which not all systems support BFD.
>
>
>Therefore, the establishment of control protocol adjacencies SHOULD
>be blocked if both systems are willing to establish a BFD session but
>a BFD session cannot be established.  One method for determining that
>both systems are willing to establish a BFD session is if the control
>protocol carries explicit signaling of this fact.  If there is no
>explicit signaling, the willingness to establish a BFD session may be
>determined by means outside the scope of this specification.
>
>If it is believed that the neighboring system does not 

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Robert Raszuk
Hi Albert,

Thank you for confirming that BFD needs to be kept simple and there is
already reluctance to add to it. So Les's suggestion to put additional
logic into BFD is likely not a realistic one.

Your note also confirms my point that hold-timer requirements are likely
to differ depending on the link type and peer type.

With that please notice what Les said:

*And once OSPF strict-mode support becomes widely deployed there won’t be a
need for such a timer for OSPF either.*

That to me clearly means that he is going to retire the current timer once
we get that RFC out of the door. That is why I proposed to add it to the
document. But of course authors will decide.


Dear WG,

After thinking about this draft I would suggest that what we really need is
not a point solution, but a general mechanism which will allow us to bring
the protocol fully up some time after the moment the test suite is up.

BFD is only one way to detect if the path to a peer is up or down. There
are shipping alternatives to this today which could be used instead of BFD
(for example any form of object tracking). As the current version of the
draft says, there is a need not only to detect if the path is up or down but
also whether it meets quality expectations. A new wave of INT tools is
becoming available to allow us to measure those characteristics today.

So while the draft could still present BFD as one example of such a tool - in
my opinion it deserves to be generalized a bit to allow other ways to
determine if the link over which we are to establish IGP adj. meets the
requirements.

Kind regards,
Robert


On Sun, Jan 30, 2022 at 3:28 PM Albert Fu (BLOOMBERG/ 120 PARK) <
af...@bloomberg.net> wrote:

> I feel it is better to keep the standard simple and not add timer delay as
> part of BFD strict draft, as different customers may have different
> requirements, and there may also be vendor/platform dependency.
>
> For example, in the core where there are a lot of link
> redundancy/diversity, we could afford to have longer time delay since we
> can tolerate multiple link failures. For majority of the edge connectivity,
> there are typically only 2 links - in this situation, we would want a lower
> time delay.
>
> I found the current BFD Strict holddown/dampening mechanism as implemented
> by the two vendors sufficient for our need. If there is an issue causing
> BFD to fail during this time, OSPF will take longer time to come up. And
> the delay only needs to be configured on one side.
>
> So, in current implementation, there's some sanity "check" that BFD is
> stable for a period of time before OSPF can come up.
>
> In discussion with the BFD working group on my other MTU draft, there's a
> keen interest among the WG to keep the BFD simple.
>
>
>
>
> From: rob...@raszuk.net At: 01/29/22 15:20:06 UTC-5:00
> To: ginsb...@cisco.com
> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) , a...@cisco.com,
> ketant.i...@gmail.com, draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org,
> lsr@ietf.org
> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
> Hi Les,
>
> > Discussion of how to make BFD failure detection more robust belongs in
> the BFD WG
> > If you do not want the BFD session to come back up too quickly after a
> failure
>
> Nothing I suggested is related to any of the above.
>
> Let me perhaps provide a very simple example.
>
> BFD being used is *AS*IS*.
>
> All the operator wants is to run it for say X sec without ever going
> down before bringing OSPF adj up.
>
> That timer and its consistency on both ends clearly belongs to OSPF not to
> BFD.
>
> Now what happens within those 30 sec, what BFD packets are formed and how
> they are exchanged is all BFD business - but I am not suggesting to include
> any of those in this draft.
>
> Do we have a common understanding so far ?
>
> Hint: Albert already stated that he needs that timer and that both vendors
> provided it via cfg. All that confirms is that timer is needed. All I am
> suggesting (even before being aware of the manual cfg for it) was to
> synchronize the value or pick lower configured between two peers.
>
> Kind regards,
> R.
>
> On Sat, Jan 29, 2022 at 9:08 PM Les Ginsberg (ginsberg) <
> ginsb...@cisco.com> wrote:
>
>> Robert –
>>
>>
>>
>> It is good that you take an active interest in this technology – but I
>> think the suggestions you are making should not be targeted at IGP use of
>> BFD.
>>
>>
>>
>> Discussion of how to make BFD failure detection more robust belongs in
>> the BFD WG – and – as you know – that WG has taken an interest in such
>> problems e.g., MTU.
>>
>>
>>
>> In regards to “dampening” = which I think is the relevant term for the
>> timer related suggestions you are making - this also does not belong in the
>> IGP. If you do not want the BFD session to come back up too quickly after a
>> failure, the proper place to put timers is either at the interface layer or

Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

2022-01-30 Thread Albert Fu (BLOOMBERG/ 120 PARK)
I feel it is better to keep the standard simple and not add a timer delay as
part of the BFD strict-mode draft, as different customers may have different
requirements, and there may also be vendor/platform dependencies.

For example, in the core, where there is a lot of link redundancy/diversity, we
could afford a longer time delay since we can tolerate multiple link
failures. For the majority of edge connectivity, there are typically only 2
links; in this situation, we would want a lower time delay.

I found the current BFD strict holddown/dampening mechanism, as implemented by
the two vendors, sufficient for our needs. If there is an issue causing BFD to
fail during this time, OSPF will take longer to come up. And the delay
only needs to be configured on one side.

So, in the current implementations, there's a sanity "check" that BFD is stable
for a period of time before OSPF can come up.
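
To make the "stable for a period of time" check concrete, here is a minimal
sketch of such a hold-down. It is not taken from either vendor's
implementation: the function name ospf_start_time, its parameters, and the
representation of BFD as a list of timestamped transitions are all
assumptions made for illustration.

```python
def ospf_start_time(bfd_transitions, hold_down, observe_until):
    """Earliest time at which OSPF may come up, or None if it never may.

    bfd_transitions: time-ordered (timestamp_seconds, state) pairs, where
                     state is "UP" or "DOWN".
    hold_down:       seconds BFD must stay UP without interruption.
    observe_until:   time at which the observation window ends.
    """
    up_since = None
    for timestamp, state in bfd_transitions:
        if up_since is not None and timestamp - up_since >= hold_down:
            # BFD already stayed UP long enough before this transition.
            break
        if state == "UP":
            if up_since is None:
                up_since = timestamp
        else:
            # Any failure during the hold-down restarts the wait.
            up_since = None
    if up_since is not None and observe_until - up_since >= hold_down:
        return up_since + hold_down
    return None


# BFD comes up at t=0, fails at t=5, and comes back at t=8; hold-down is 10 s.
print(ospf_start_time([(0, "UP"), (5, "DOWN"), (8, "UP")], 10, 30))
```

A BFD failure inside the hold-down window restarts the wait, which is why
OSPF takes longer to come up in that case.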

In discussion with the BFD working group on my other MTU draft, there's a keen
interest among the WG in keeping BFD simple.


From: rob...@raszuk.net At: 01/29/22 15:20:06 UTC-5:00
To: ginsb...@cisco.com
Cc: Albert Fu (BLOOMBERG/ 120 PARK ), a...@cisco.com, ketant.i...@gmail.com,
draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org, lsr@ietf.org
Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" -
draft-ietf-lsr-ospf-bfd-strict-mode-04

Hi Les,

> Discussion of how to make BFD failure detection more robust belongs in the 
> BFD WG
> If you do not want the BFD session to come back up too quickly after a failure

Nothing I suggested is related to any of the above. 

Let me perhaps provide a very simple example. 

BFD being used is *AS*IS*.  

All the operator wants is to run it for say X sec without ever going down 
before bringing OSPF adj up. 

That timer and its consistency on both ends clearly belongs to OSPF not to BFD. 

Now what happens within those 30 sec, what BFD packets are formed and how they 
are exchanged is all BFD business - but I am not suggesting to include any of 
those in this draft. 

Do we have a common understanding so far ? 

Hint: Albert already stated that he needs that timer and that both vendors 
provided it via cfg. All that confirms is that timer is needed. All I am 
suggesting (even before being aware of the manual cfg for it) was to 
synchronize the value or pick lower configured between two peers. 
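
The "pick lower configured" rule suggested above could look like the sketch
below. The draft defines no such negotiation and no field carrying this
value; the function name effective_hold_down and the idea that a peer may
advertise nothing (None) are purely hypothetical.

```python
from typing import Optional


def effective_hold_down(local_seconds: int, peer_seconds: Optional[int]) -> int:
    """Hold-down to apply before bringing the OSPF adjacency up.

    local_seconds: value configured on this router.
    peer_seconds:  value configured on the peer, or None if unknown.
    """
    if peer_seconds is None:
        # Nothing learned from the peer: fall back to the local value.
        return local_seconds
    # Both sides known: use the lower of the two configured values.
    return min(local_seconds, peer_seconds)


print(effective_hold_down(30, 10))
```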

Kind regards,
R.


On Sat, Jan 29, 2022 at 9:08 PM Les Ginsberg (ginsberg)  
wrote:

 

Robert – 
  
It is good that you take an active interest in this technology – but I think 
the suggestions you are making should not be targeted at IGP use of BFD. 
  
Discussion of how to make BFD failure detection more robust belongs in the BFD 
WG – and – as you know – that WG has taken an interest in such problems e.g., 
MTU. 
  
In regards to “dampening” - which I think is the relevant term for the timer 
related suggestions you are making - this also does not belong in the IGP. If 
you do not want the BFD session to come back up too quickly after a failure, 
the  proper place to put timers is either at the interface layer or in the BFD 
implementation. 
I am familiar with implementations which apply this timer at the protocol level 
(AKA BFD client in this context) and this is done precisely because the 
protocol does NOT have the functionality being defined in this draft. Once you 
have  implemented “wait-for-BFD” logic as defined in this draft you do not need 
additional delay timers in the protocol. 
  
I don’t think the suggestions you are making belong in this document. 
  
Les 
  
  

From: Lsr  On Behalf Of  Robert Raszuk
Sent: Saturday, January 29, 2022 11:25 AM
To: Acee Lindem (acee) 
Cc: Ketan Talaulikar ; 
draft-ietf-lsr-ospf-bfd-strict-m...@ietf.org; Albert Fu ; 
lsr 
Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - 
draft-ietf-lsr-ospf-bfd-strict-mode-04 
  

Hi Acee,

> Can you suggest text which with you’d be happy? I’m sure the authors would add
> you to the acknowledgements.

Actually instead of suggesting any new text I would suggest to delete the two
below sentences and it will be fine:

"In certain other scenarios, a degraded or poor quality link will allow OSPF
adjacency formation to succeed but the BFD session establishment will fail or
the BFD session will flap.  In this case, traffic that gets forwarded over such
a link may experience packet drops while the failure of the BFD session
establishment would not enable fast routing convergence if the link were to go
down or flap."

> This could be described but I don’t think it should be normative. This begs the
> question as to why a hold down timer is not a part of the BFD protocol itself.

There is one - BFD calls it multiplier.

But the timer I am suggesting is not related to BFD operation, but to OSPF 
(and/or ISIS). It is not about BFD sessions being UP or DOWN. It is about 
allowing BFD for more testing (with various