Re: Resolving lingering issues with BFD authentication drafts

2024-02-29 Thread Jeffrey Haas
Reshad,


> On Feb 29, 2024, at 2:36 PM, Reshad Rahman  wrote:
> 
> Jeff, 
> 
> The only thing I am still a bit hesitant about is delaying the notification 
> to the BFD clients (that the session is up) until we've successfully moved to 
> the optimized mode. It's not the actual delay, which should be short, but the 
> fact that it's changing the BFD state machine a bit. But I don't see any 
> other way to do this without the risk of bouncing the BFD session. 

It's worth pointing out that the "bfd holddown" features implemented by 
multiple vendors ALREADY do this.  And, when they're in use, the client will 
wait on the order of seconds to minutes for some providers.

So, while I agree that it's a change, and theoretically a scary one, it's well 
deployed already in some form.

In the interest of honesty, such holddown features didn't interoperate 
terribly well in use, and that was the reason for the PROTOCOLS / clients to 
refine how they used BFD - hence the "strict" features.  Again, not a problem 
with the idea of holddown, but rather that RFC 5882 was a bit too general in 
its advice.


> Regarding "until we've successfully switched over to the optimized 
> mechanism", what does a successful switch mean? Does it mean sending detect 
> mult packets with optimized auth?

I believe it means that both sides are using the optimized mode.  It can't be 
only one side since the potential session drop will only happen when each side 
turns on optimization.

-- Jeff



Re: Resolving lingering issues with BFD authentication drafts

2024-02-29 Thread Reshad Rahman
 Jeff, 
The only thing I am still a bit hesitant about is delaying the notification to 
the BFD clients (that the session is up) until we've successfully moved to the 
optimized mode. It's not the actual delay, which should be short, but the fact 
that it's changing the BFD state machine a bit. But I don't see any other way 
to do this without the risk of bouncing the BFD session. 
Regarding "until we've successfully switched over to the optimized mechanism", 
what does a successful switch mean? Does it mean sending detect mult packets 
with optimized auth?
Regards,
Reshad.

On Sunday, February 25, 2024, 06:53:52 PM EST, Jeffrey Haas 
 wrote:  
 
 Reshad,


On Feb 25, 2024, at 5:31 PM, Reshad Rahman  wrote:
 Jeff, overall this looks to be a good way forward, it addresses the main 
concern I had expressed. 

Excellent.

On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas wrote:
- The optimization procedures currently can have BFD go Up with the initial
  stronger authentication, then go down once the optimized mode kicks in.
 That's the scenario where only 1 end supports optimized procedures?

In the current version of the document, yes.  That's an item the suggested 
changes are intended to address.

Possible ways to address these:

For BFD optimization:
[...]


- Optimized authentication should kick in ASAP when we are in the Up state.
  I believe this means that we send out at least Detect Mult packets in the
  strong mechanism and then switch to the optimized mechanism.  This bounds
  the amount of time when we're not running in optimized mode.
 Why do optimized procedures need to kick in asap? Is this in case 
there's an issue with the optimized procedures?

The general concern is not overly delaying the client's idea of when BFD 
transitions to Up.
The suggested changes take us from Up to an internal "pending" state waiting 
for the optimized mode to kick in.  We can theoretically linger there however 
long we like since we've signaled that this change is coming, but it's not 
helpful to the client to linger there longer than necessary.
The suggestion above is really the lowest bound on time we can take for such a 
transition to ensure we can safely transition to ISAAC mode and entrain the 
sequence numbers for the ISAAC algorithm.


- BFD clients that are expecting optimized authentication SHOULD NOT convey 
BFD sessions (not clients)?


Session on a client. :-)

  to their clients that the session is in the Up state until we've
  successfully switched over to the optimized mechanism.  While this seems
  contrary to BFD behavior, it's no different than any of the existing
  "holddown" procedures clients like BGP can implement to ensure that BFD is
  stable for long enough before using the session.
 Is this in case there's an issue with the optimized procedures? If yes, do 
we also need some text for the case where optimized procedures fail? e.g., at a 
certain point we have to stick to strong auth but do we retry eventually (that 
could cause the session to go down if we do)?

From the client's session perspective, BFD simply is Up/Down as normal.
From the protocol perspective, lingering forever waiting for the optimized 
mode to kick in isn't what we'd want.  So, yes, we need some form of default 
timeout recommended for implementors.
If we repeatedly bounce the BFD session from Up to Down only at the transition 
to the optimized mode, we likely want to dampen that behavior. At least with 
the new code points we have a sense that this transition to the optimized mode 
is the actual problem between devices that have agreed to use that 
authentication type.  


Thoughts?
 When transitioning from strong auth to optimized procedures, could we send 
both types of packets when attempting the transition? The aim being to keep 
the BFD session from going down. I haven't thought this through so this may 
well not hold water.

For the ISAAC procedures, the only requirement is that we believe the other 
side of the session has seen at least one Up packet out of the expected Detect 
Mult packets.  That's sufficient for the entraining procedure.
Once we have entrained the ISAAC session, we should be able to flip in and out 
of the optimized mode at will.
The idea I think you're trying to convey is similar to how other protocols 
handle a graceful key rollover.  That's normally done by having the receiver 
be willing to authenticate with both the old and new key during the rollover 
timeframe, not by having the side generating the packets send them twice.
For BFD in particular, sending the same PDU with different auth types would 
probably play havoc with the meticulously increasing sequence number 
requirements.  Further, there's no mechanism we have to convey that we've 
successfully processed the rollover.
-- Jeff
  

Re: Resolving lingering issues with BFD authentication drafts

2024-02-25 Thread Jeffrey Haas
Reshad,


> On Feb 25, 2024, at 5:31 PM, Reshad Rahman  wrote:
> 
> Jeff, overall this looks to be a good way forward, it addresses the main 
> concern I had expressed. 

Excellent.

> On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas  
> wrote:
> - The optimization procedures currently can have BFD go Up with the initial
>   stronger authentication, then go down once the optimized mode kicks in.
> 
>  That's the scenario where only 1 end supports optimized procedures?

In the current version of the document, yes.  That's an item the suggested 
changes are intended to address.

> Possible ways to address these:
> 
> For BFD optimization:
> [...]

> - Optimized authentication should kick in ASAP when we are in the Up state.
>   I believe this means that we send out at least Detect Mult packets in the
>   strong mechanism and then switch to the optimized mechanism.  This bounds
>   the amount of time when we're not running in optimized mode.
> 
>  Why do optimized procedures need to kick in asap? Is this in case 
> there's an issue with the optimized procedures?

The general concern is not overly delaying the client's idea of when BFD 
transitions to Up.

The suggested changes take us from Up to an internal "pending" state waiting 
for the optimized mode to kick in.  We can theoretically linger there however 
long we like since we've signaled that this change is coming, but it's not 
helpful to the client to linger there longer than necessary.

The suggestion above is really the lowest bound on time we can take for such a 
transition to ensure we can safely transition to ISAAC mode and entrain the 
sequence numbers for the ISAAC algorithm.
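
A minimal sketch of the "pending" idea described above, in Python.  It is not
text from either draft: the class and method names are hypothetical, and it
assumes the far end's switch is observed by receiving a packet authenticated
in the optimized mode.  The protocol state machine reaches Up as usual; only
the report to the client is held back until Detect Mult Up packets have gone
out in the strong mode and both sides are running optimized.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AuthMode(Enum):
    STRONG = auto()      # e.g. a meticulous keyed digest
    OPTIMIZED = auto()   # e.g. ISAAC


@dataclass
class OptimizedSession:
    """Hypothetical per-session bookkeeping for the internal pending state."""
    detect_mult: int
    protocol_up: bool = False          # BFD state machine has reached Up
    strong_up_packets_sent: int = 0    # Up packets sent in the strong mode
    local_mode: AuthMode = AuthMode.STRONG
    remote_mode: AuthMode = AuthMode.STRONG

    def on_protocol_up(self) -> None:
        self.protocol_up = True

    def on_up_packet_sent(self) -> None:
        # Send at least Detect Mult Up packets in the strong mode, then
        # switch the local side to the optimized mode.
        if self.protocol_up and self.local_mode is AuthMode.STRONG:
            self.strong_up_packets_sent += 1
            if self.strong_up_packets_sent >= self.detect_mult:
                self.local_mode = AuthMode.OPTIMIZED

    def on_packet_received(self, mode: AuthMode) -> None:
        # Assumption: the mode of the last validated packet tells us what
        # the far end is currently using.
        self.remote_mode = mode

    def client_sees_up(self) -> bool:
        # The client is told Up only once both sides are optimized.
        return (self.protocol_up
                and self.local_mode is AuthMode.OPTIMIZED
                and self.remote_mode is AuthMode.OPTIMIZED)
```

With detect_mult=3, client_sees_up() stays False after the protocol reaches
Up, and only becomes True once three strong Up packets have been sent and an
optimized packet has arrived from the far end.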

> 
> - BFD clients that are expecting optimized authentication SHOULD NOT convey
>  BFD sessions (not clients)?

Session on a client. :-)

>   to their clients that the session is in the Up state until we've
>   successfully switched over to the optimized mechanism.  While this seems
>   contrary to BFD behavior, it's no different than any of the existing
>   "holddown" procedures clients like BGP can implement to ensure that BFD is
>   stable for long enough before using the session.
>  Is this in case there's an issue with the optimized procedures?
> If yes, do we also need some text for the case where optimized procedures 
> fail? e.g., at a certain point we have to stick to strong auth but do we 
> retry eventually (that could cause the session to go down if we do)?

From the client's session perspective, BFD simply is Up/Down as normal.

From the protocol perspective, lingering forever waiting for the optimized mode 
to kick in isn't what we'd want.  So, yes, we need some form of default timeout 
recommended for implementors.

If we repeatedly bounce the BFD session from Up to Down only at the transition 
to the optimized mode, we likely want to dampen that behavior. At least with 
the new code points we have a sense that this transition to the optimized mode 
is the actual problem between devices that have agreed to use that 
authentication type.  
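
As a rough illustration of the timeout and dampening mentioned above, here is
a small Python sketch.  Everything in it is an assumption made for the
example: the class name, the one-second pending timeout, and the exponential
backoff values are placeholders, not recommendations from the drafts.

```python
from dataclasses import dataclass


@dataclass
class TransitionDamper:
    """Hypothetical helper: bounds the pending wait and dampens retries."""
    pending_timeout: float = 1.0   # seconds to wait in pending (assumed)
    base_backoff: float = 1.0      # first retry delay in seconds (assumed)
    max_backoff: float = 60.0      # cap on the retry delay (assumed)
    failures: int = 0              # consecutive failed transitions

    def pending_expired(self, entered_pending_at: float, now: float) -> bool:
        # True once we have lingered in pending for the full timeout.
        return now - entered_pending_at >= self.pending_timeout

    def on_transition_failed(self) -> float:
        # Each consecutive failure doubles the delay before the next
        # attempt, up to max_backoff.  Returns the delay to apply.
        self.failures += 1
        delay = self.base_backoff * 2 ** (self.failures - 1)
        return min(delay, self.max_backoff)

    def on_transition_succeeded(self) -> None:
        self.failures = 0
```

A session that keeps failing at the switch to the optimized mode would then
retry after 1, 2, 4, ... seconds rather than bouncing at full rate.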

> 
> Thoughts?
>  When transitioning from strong auth to optimized procedures, could we 
> send both types of packets when attempting the transition? The aim being to 
> keep the BFD session from going down. I haven't thought this through so this 
> may well not hold water.

For the ISAAC procedures, the only requirement is that we believe the other 
side of the session has seen at least one Up packet out of the expected Detect 
Mult packets.  That's sufficient for the entraining procedure.

Once we have entrained the ISAAC session, we should be able to flip in and out 
of the optimized mode at will.

The idea I think you're trying to convey is similar to how other protocols 
handle a graceful key rollover.  That's normally done by having the receiver 
be willing to authenticate with both the old and new key during the rollover 
timeframe, not by having the side generating the packets send them twice.
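
For readers unfamiliar with that style of rollover, a generic receiver-side
sketch in Python follows.  It is only an illustration of the general
technique: the function names are hypothetical, and HMAC-SHA256 is used as a
stand-in digest, not as BFD's keyed MD5/SHA1 construction.

```python
import hashlib
import hmac


def digest(key: bytes, payload: bytes) -> bytes:
    # Stand-in keyed digest for the example.
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_during_rollover(payload: bytes, received: bytes,
                           old_key: bytes, new_key: bytes) -> bool:
    """Accept a packet authenticated with either key while rolling over.

    The sender signs each packet once, with whichever key it is currently
    using; only the receiver is lenient during the rollover window.
    """
    return any(
        hmac.compare_digest(digest(key, payload), received)
        for key in (new_key, old_key)
    )
```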

For BFD in particular, sending the same PDU with different auth types would 
probably play havoc with the meticulously increasing sequence number 
requirements.  Further, there's no mechanism we have to convey that we've 
successfully processed the rollover.

-- Jeff



Re: Resolving lingering issues with BFD authentication drafts

2024-02-25 Thread Reshad Rahman
 Jeff, overall this looks to be a good way forward, it addresses the main 
concern I had expressed. 
BFD WG, please take a look at the procedures outlined below and provide 
feedback. 
Comments/questions inline.
On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas 
 wrote:  
 
 Here's an attempt to provide a path to resolve the lingering issues in the
authentication drafts.

Core lingering issues:
- The NULL auth method is attackable, but still potentially useful for the
  stability procedures.
- The optimization procedures currently can have BFD go Up with the initial
  stronger authentication, then go down once the optimized mode kicks in.
 That's the scenario where only 1 end supports optimized procedures?
  Right now, the text doesn't place any bounds on how long it might be until
  the optimized procedures are initiated once the session moves to Up.

  The issue here is less about bouncing the BFD session, but the impact on
  BFD clients.

Possible ways to address these:

For BFD optimization:
- We remove no-authentication and NULL-authentication as methods for the
  optimized session.  This leaves us solely with one defined method that
  provides good enough security.  It also leaves us room to add other
  authentications in the future that have similar properties.
- Optimized authentication should kick in ASAP when we are in the Up state.
  I believe this means that we send out at least Detect Mult packets in the
  strong mechanism and then switch to the optimized mechanism.  This bounds
  the amount of time when we're not running in optimized mode.
 Why do optimized procedures need to kick in asap? Is this in case 
there's an issue with the optimized procedures?
- BFD clients that are expecting optimized authentication SHOULD NOT convey 
BFD sessions (not clients)?
  to their clients that the session is in the Up state until we've
  successfully switched over to the optimized mechanism.  While this seems
  contrary to BFD behavior, it's no different than any of the existing
  "holddown" procedures clients like BGP can implement to ensure that BFD is
  stable for long enough before using the session.
 Is this in case there's an issue with the optimized procedures? If yes, do 
we also need some text for the case where optimized procedures fail? e.g., at a 
certain point we have to stick to strong auth but do we retry eventually (that 
could cause the session to go down if we do)?
  This is also not the length of time such holddown features want.  BGP BFD
  holddown is in the multiples of seconds time frame.  I believe we want
  something that is within two Detection Intervals once the session is Up.

  + It should be noted we already require sending out this number of Up
    packets in the strong mode for entraining ISAAC.  However, I'm not sure if
    our procedures are clear on that point.  To be audited.
- How does a client tell that "we are expecting optimized authentication"?

  We define parallel authentication code points for the procedure.  Today,
  our strong meticulous features are meticulous md5 and sha1; code
  points 3 and 5, respectively.

  We allocate two new code points, "ISAAC-optimized meticulous sha-1" and
  "ISAAC-optimized meticulous md5".  When these code points are used, the
  expectation is the strong cipher is used to get the session to the Up
  state, and the session expects to transition to ISAAC afterwards.  Thus,
  we no longer have the opportunity for an implementation that doesn't
  support optimization to have the session half transition to up using the
  strong mode and fail once it attempts the switch to a mode it doesn't
  understand.
 I like it!
- We might want to consider having the shared secret used for both strong
  and optimized mode.  While we've had discussion that we might not want to
  do this, having a common shared secret means that misconfiguration stops
  being the operational consideration that drives the most likely reasons
  for failure of the transition to optimized authentication.
  + This can be a SHOULD for the above reasons.
  + If the operator does not want to use the same shared secret, that's
    still fine. It just means they're accepting the potential additional
    fragility.
- The NULL auth mechanism is moved out of the optimized draft into the
  stability draft.

For BFD stability:
- The NULL auth method is pulled into this document.
- The NULL auth's procedures are slightly updated such that the sequence
  number SHOULD NOT be used for authentication.  Effectively, it transitions
  to a counter.  This avoids the ability to use it for attacking the
  protocol as noted in prior discussion.
- The NULL auth security properties are no worse at that point than no
  authentication.  
- Existing meticulous methods can be used as well - no change.
- ISAAC can be used when optimized mode is in use.  No change. 
  + ISAAC mode cannot be used alone. Its procedures for entraining the
    sequence numbers currently mean it can't be 

Resolving lingering issues with BFD authentication drafts

2024-02-23 Thread Jeffrey Haas
Here's an attempt to provide a path to resolve the lingering issues in the
authentication drafts.

Core lingering issues:
- The NULL auth method is attackable, but still potentially useful for the
  stability procedures.
- The optimization procedures currently can have BFD go Up with the initial
  stronger authentication, then go down once the optimized mode kicks in.
  Right now, the text doesn't place any bounds on how long it might be until
  the optimized procedures are initiated once the session moves to Up.

  The issue here is less about bouncing the BFD session, but the impact on
  BFD clients.

Possible ways to address these:

For BFD optimization:
- We remove no-authentication and NULL-authentication as methods for the
  optimized session.  This leaves us solely with one defined method that
  provides good enough security.  It also leaves us room to add other
  authentications in the future that have similar properties.
- Optimized authentication should kick in ASAP when we are in the Up state.
  I believe this means that we send out at least Detect Mult packets in the
  strong mechanism and then switch to the optimized mechanism.  This bounds
  the amount of time when we're not running in optimized mode.
- BFD clients that are expecting optimized authentication SHOULD NOT convey
  to their clients that the session is in the Up state until we've
  successfully switched over to the optimized mechanism.  While this seems
  contrary to BFD behavior, it's no different than any of the existing
  "holddown" procedures clients like BGP can implement to ensure that BFD is
  stable for long enough before using the session.

  This is also not the length of time such holddown features want.  BGP BFD
  holddown is in the multiples of seconds time frame.  I believe we want
  something that is within two Detection Intervals once the session is Up.

  + It should be noted we already require sending out this number of Up
    packets in the strong mode for entraining ISAAC.  However, I'm not sure if
    our procedures are clear on that point.  To be audited.
- How does a client tell that "we are expecting optimized authentication"?

  We define parallel authentication code points for the procedure.  Today,
  our strong meticulous features are meticulous md5 and sha1; code
  points 3 and 5, respectively.

  We allocate two new code points, "ISAAC-optimized meticulous sha-1" and
  "ISAAC-optimized meticulous md5".  When these code points are used, the
  expectation is the strong cipher is used to get the session to the Up
  state, and the session expects to transition to ISAAC afterwards.  Thus,
  we no longer have the opportunity for an implementation that doesn't
  support optimization to have the session half transition to up using the
  strong mode and fail once it attempts the switch to a mode it doesn't
  understand.
- We might want to consider having the shared secret used for both strong
  and optimized mode.  While we've had discussion that we might not want to
  do this, having a common shared secret means that misconfiguration stops
  being the operational consideration that drives the most likely reasons
  for failure of the transition to optimized authentication.
  + This can be a SHOULD for the above reasons.
  + If the operator does not want to use the same shared secret, that's
    still fine. It just means they're accepting the potential additional
    fragility.
- The NULL auth mechanism is moved out of the optimized draft into the
  stability draft.
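
To make the parallel code point idea concrete, here is a short Python sketch.
The values 3 and 5 are the existing RFC 5880 assignments for meticulous keyed
MD5 and meticulous keyed SHA1.  The two ISAAC-optimized values are
HYPOTHETICAL placeholders chosen for the example; nothing has been allocated.
The function names are likewise made up.

```python
from enum import IntEnum


class AuthType(IntEnum):
    # Existing assignments from RFC 5880.
    METICULOUS_KEYED_MD5 = 3
    METICULOUS_KEYED_SHA1 = 5
    # HYPOTHETICAL placeholders: no values have been allocated for these.
    ISAAC_OPT_METICULOUS_MD5 = 253
    ISAAC_OPT_METICULOUS_SHA1 = 254


# Strong method used to bring the session Up for each optimized code point.
BRINGUP_METHOD = {
    AuthType.ISAAC_OPT_METICULOUS_MD5: AuthType.METICULOUS_KEYED_MD5,
    AuthType.ISAAC_OPT_METICULOUS_SHA1: AuthType.METICULOUS_KEYED_SHA1,
}


def expects_optimized(auth_type: int) -> bool:
    # A session using one of the new code points expects to move to ISAAC
    # once it is Up.
    return auth_type in BRINGUP_METHOD


def accepts(supported: frozenset, configured: int, received: int) -> bool:
    # An implementation that does not know a code point discards the packet,
    # so the session never reaches Up in the first place.
    return received in supported and received == configured
```

An implementation that only supports types 3 and 5 fails accepts() for a
packet carrying one of the new code points, so the mismatch shows up during
session bring-up rather than at the later switch to the optimized mode.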

For BFD stability:
- The NULL auth method is pulled into this document.
- The NULL auth's procedures are slightly updated such that the sequence
  number SHOULD NOT be used for authentication.  Effectively, it transitions
  to a counter.  This avoids the ability to use it for attacking the
  protocol as noted in prior discussion.
- The NULL auth security properties are no worse at that point than no
  authentication.  
- Existing meticulous methods can be used as well - no change.
- ISAAC can be used when optimized mode is in use.  No change. 
  + ISAAC mode cannot be used alone. Its procedures for entraining the
    sequence numbers currently mean it can't be the only authentication.
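
A small Python sketch of the "sequence number as a counter" idea for NULL
auth.  This is an assumption-laden illustration, not draft text: the class is
hypothetical, and it assumes the counter is used only to notice gaps in what
was received.  The point it shows is that the sequence number never causes a
packet to be rejected.

```python
from typing import Optional

SEQ_MODULUS = 2 ** 32  # BFD sequence numbers are 32 bits wide


class NullAuthCounter:
    """Hypothetical receiver: the sequence number counts, it does not gate."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None
        self.lost = 0  # packets inferred missing from gaps in the counter

    def receive(self, seq: int) -> bool:
        if self.last_seq is None:
            self.last_seq = seq
        else:
            gap = (seq - self.last_seq) % SEQ_MODULUS
            # Forward movement (including wraparound) advances the counter.
            # Duplicates and old packets leave it untouched.
            if 0 < gap < SEQ_MODULUS // 2:
                self.lost += gap - 1
                self.last_seq = seq
        # The packet is accepted regardless of its sequence number.
        return True
```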

Lingering cleanup:
- The IANA considerations and the YANG definitions need to be readjusted
  based on where we move things.

Thoughts?

-- Jeff

