Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04

Robert Raszuk Mon, 31 Jan 2022 11:53:16 -0800

Hi Albert,

On Mon, Jan 31, 2022 at 8:38 PM Albert Fu (BLOOMBERG/ 120 PARK) <
[email protected]> wrote:


> Hi Robert,
>
> Do you mean we should make it mandatory in the draft to stipulate a delay
> time between when OSPF should wait for BFD to come up?
>

No.

The timer is for OSPF to bring the adjacency up only after X has elapsed from
the moment the BFD session came up, provided it stayed up (never went down).

No changes to BFD needed at all.

Trivial to implement on the client side and very useful operationally.
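
As a rough sketch of what I mean (illustrative only, not text from the
draft; the class and method names below are hypothetical, and time is
passed in explicitly to keep the example self-contained), the client-side
logic amounts to:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrictModeHoldTimer:
    """Gate OSPF adjacency bring-up on BFD having stayed Up for hold_seconds."""

    hold_seconds: float
    # Time of the last Down -> Up transition; None while BFD is not Up.
    bfd_up_since: Optional[float] = None

    def on_bfd_up(self, now: float) -> None:
        # Start the timer only on a Down -> Up transition; repeated Up
        # notifications must not restart it.
        if self.bfd_up_since is None:
            self.bfd_up_since = now

    def on_bfd_down(self) -> None:
        # Any BFD failure cancels the timer; the next Up starts from zero.
        self.bfd_up_since = None

    def may_bring_up_adjacency(self, now: float) -> bool:
        # OSPF may advance only if BFD has been continuously Up for the hold.
        if self.bfd_up_since is None:
            return False
        return now - self.bfd_up_since >= self.hold_seconds


timer = StrictModeHoldTimer(hold_seconds=5.0)
timer.on_bfd_up(now=0.0)
timer.on_bfd_down()            # flap: the timer is cancelled
timer.on_bfd_up(now=1.0)       # the timer restarts here
print(timer.may_bring_up_adjacency(now=5.5))  # False
print(timer.may_bring_up_adjacency(now=6.0))  # True
```

With hold_seconds set to zero this reduces to bringing the adjacency up as
soon as BFD reports Up, and nothing on the BFD side has to change.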

Thx,
Robert




> I don't know how others feel, but I tend to agree with the main author of
> this draft, Ketan, that it is best to leave the delay timer out of this draft.
>
> There is already an implicit understanding that BFD must be up before OSPF
> can progress to the adjacency phase.
>
> And I can think of deployments with many redundant links where the delay
> can be a large value, and other scenarios, say sites with only 1 redundant
> link, where it is not desirable for the delay to be too lengthy, to avoid
> both links being down at the same time and cutting communication to the
> site completely.
>
> I have also tested current implementations where the delays do not have to
> match (e.g. one side with delay, and one side no delay).
>
> IMO, it is better not to make the delay a part of the standard.
>
> Thanks
>
> Albert
>
>
> From: [email protected] At: 01/31/22 13:51:56 UTC-5:00
> To: Albert Fu (BLOOMBERG/ 120 PARK ) <[email protected]>
> Cc: [email protected], [email protected], [email protected],
> [email protected], [email protected]
> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>
> Albert,
>
> > It serves as a sanity check that there's indeed a working
> > BFD for a period of time before OSPF adjacency is allowed
> > to progress.
>
> And that is precisely what I am suggesting that should be both mandatory
> and part of this draft. Not an optional nice to have vendor knob.
>
> Clearly it does not belong in the BFD spec or WG, contrary to what Les and
> Ketan are trying to suggest.
>
> Regards,
> R.
>
>
>
>
>
>
>
>
> On Mon, Jan 31, 2022 at 6:52 PM Albert Fu (BLOOMBERG/ 120 PARK) <
> [email protected]> wrote:
>
>> Hi Robert,
>>
>> The BGP BFD hold-time mentioned in the BGP BFD strict mode draft has a
>> different meaning from the holdtime/delay/dampening that has been discussed
>> in this forum thus far.
>>
>> The BGP BFD hold-time, as per the BGP BFD draft below, is user
>> configurable, and is used to bring down the BGP session if the BFD session is
>> not established within a default of 30s, when the negotiated "BGP HoldTimer"
>> is 0.
>>
>>
>> The OSPF hold-time/delay/dampening that we have been discussing so far is
>> the delay from when BFD comes up to when OSPF will be allowed to come up.
>> This, as Ketan mentioned, is outside the scope of this draft.
>>
>> In my testing with both Cisco and Juniper implementations, the OSPF
>> hold-time/delay/dampening timers are quite arbitrary. You could have no
>> delay (which means bring up OSPF asap), or have it configured on one side
>> only. It serves as a sanity check that there's indeed a working BFD for a
>> period of time before OSPF adjacency is allowed to progress.
>>
>> Thanks
>>
>> Albert
>>
>> From: [email protected] At: 01/31/22 09:59:48 UTC-5:00
>> To: Albert Fu (BLOOMBERG/ 120 PARK ) <[email protected]>,
>> [email protected], [email protected]
>> Cc: [email protected], [email protected],
>> [email protected]
>> Subject: Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for BFD"
>> - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>
>> Les & Ketan
>>
>>
>>> Nowadays, it is also common to see "break-in-middle" failures. We
>>> use BFD to detect this sort of failure in under a second. And to dampen
>>> this sort of break-in-middle failure, we will need to use BFD
>>> holdtime/dampening.
>>>
>>
>> Another data point for the above and for this discussion is a draft which
>> Albert co-authors.
>>
>> Ref:
>> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode
>>
>> Please see the below paragraph which clearly says *BGP BFD Hold time*:
>>
>>    If the BFD session does not transition to the Up state, and the
>>    HoldTimer has been negotiated to a non-zero value, the BGP FSM will
>>    close the session appropriately.  If the HoldTimer has been
>>    negotiated to a zero value, the session should be closed after a time
>>    of X.  This time X is referred as "BGP BFD Hold time".  The proposed
>>    default BGP BFD Hold time value is 30 seconds.  The BGP BFD Hold time
>>    value is configurable.
>>
>> To me it is clear that BGP BFD Hold time is on the client side and here
>> affects BGP FSM.
>>
>> Thx,
>> Robert.
>>
>>
>>
>>
>>
>>
>>
>>> From: [email protected] At: 01/30/22 14:38:37 UTC-5:00
>>> To: [email protected], [email protected]
>>> Cc: Albert Fu (BLOOMBERG/ 120 PARK ) <[email protected]>,
>>> [email protected], [email protected],
>>> [email protected]
>>> Subject: RE: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>>
>>> Robert –
>>>
>>>
>>>
>>> Here is what you said (emphasis added):
>>>
>>>
>>>
>>> <snip>
>>>
>>> But the timer I am suggesting is not related to BFD operation, but to
>>> OSPF (and/or ISIS). It is not about BFD sessions being UP or DOWN. It is
>>> about *allowing BFD for more testing (with various parameters (for
>>> example increasing test packet size in some discrete steps)* before
>>> OSPF is happy to bring the adj. up.
>>>
>>> <end snip>
>>>
>>>
>>>
>>> Point #1: If you want BFD to do more testing (such as MTU testing) then
>>> clearly you need extensions to BFD (such as
>>> https://datatracker.ietf.org/doc/draft-ietf-bfd-large-packets/ )
>>>
>>>
>>>
>>> Point #2: The existing timers (which, as Ketan points out, are mentioned
>>> in Section 5) are applied today at the OSPF level precisely because OSPF does
>>> not currently have strict-mode operation. So in a flapping scenario you
>>> could see the following behavior:
>>>
>>>
>>>
>>> a) BFD goes down
>>>
>>> b) OSPF goes down in response to BFD
>>>
>>> c) OSPF comes back up
>>>
>>> d) Link is still unstable – so traffic is being dropped some of the time
>>> – but perhaps OSPF adjacency stays up (i.e., OSPF hellos get through often
>>> enough to keep the OSPF adjacency up)
>>>
>>>
>>>
>>> So some implementations have chosen to insert a delay following “b”.
>>> This doesn’t guarantee stability, but hopefully makes it less likely. And
>>> because OSPF today does NOT wait for BFD to come up, the delay has to be
>>> implemented at the OSPF level.
>>>
>>>
>>>
>>> Once you have strict mode support, the sequence becomes:
>>>
>>>
>>>
>>> a) BFD goes down
>>>
>>> b) OSPF goes down in response to BFD
>>>
>>> c) BFD comes back up
>>>
>>> d) OSPF comes back up
>>>
>>>
>>>
>>> Now, if the concern is that BFD comes back up while the link is still
>>> unstable, the way to address that is to put a delay either before BFD
>>> attempts to bring up a new session or a delay after achieving UP state
>>> before it signals UP to its clients – such as OSPF. This is a better
>>> solution because all BFD clients benefit from this. And if the link is still
>>> unstable, it is more likely that the BFD session will go down during the
>>> delay period than it would be for OSPF because the BFD timers are
>>> significantly more aggressive.
>>>
>>> (BTW, this behavior can be done w/o a BFD protocol extension – it is
>>> purely an implementation choice.)
>>>
>>>
>>>
>>> From a design perspective, dampening is always best done at the lowest
>>> layer possible. In most cases, interface layer dampening is best. If that
>>> is not reliable for some reason, then move one layer up – not two layers up.
>>>
>>>
>>>
>>>    Les
>>>
>>>
>>>
>>>
>>>
>>> *From:* Robert Raszuk <[email protected]>
>>> *Sent:* Sunday, January 30, 2022 10:05 AM
>>> *To:* Ketan Talaulikar <[email protected]>
>>> *Cc:* Les Ginsberg (ginsberg) <[email protected]>; Acee Lindem (acee) <
>>> [email protected]>; [email protected]; Albert
>>> Fu <[email protected]>; lsr <[email protected]>
>>> *Subject:* Re: [Lsr] Working Group Last Call for "OSPF Strict-Mode for
>>> BFD" - draft-ietf-lsr-ospf-bfd-strict-mode-04
>>>
>>>
>>>
>>> Hi Ketan,
>>>
>>>
>>>
>>> I would like to point out that the draft discusses the BFD "dampening"
>>> or "hold-down" mechanism in Sec 5. We are aware of BFD implementations that
>>> include such mechanisms in a protocol-agnostic manner.
>>>
>>>
>>>
>>> BFD dampening or hold-time are completely orthogonal to my point. Both
>>> have nothing to do with it.
>>>
>>>
>>>
>>> Those timers only fire when BFD goes down. In my example BFD does not go
>>> down. But we want to bring up the client adj. only after X ms/sec/min, etc.,
>>> of normal BFD operation, if no failure is detected during that time.
>>>
>>>
>>>
>>> This draft indicates that OSPF adjacency will "advance" in the neighbor
>>> FSM only after BFD reports UP.
>>>
>>>
>>>
>>> And that is exactly too soon. In fact if you do that today
>>> without waiting some time (if you retire the current OSPF timer) you will
>>> not help at all in the case you are trying to address.
>>>
>>>
>>>
>>> The reason is that BFD may go down perhaps 200 ms after reporting UP, but
>>> the OSPF adj. will already have been established. It is really pretty simple.
>>>
>>>
>>>
>>> Thx,
>>>
>>> Robert.
>>>
>>>
>>>
>>> PS. And yes I think ISIS should also get fixed in that respect.
>>>
>>>
>>>
>>
>
>

_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
