Re: Resolving lingering issues with BFD authentication drafts
Reshad,

> On Feb 29, 2024, at 2:36 PM, Reshad Rahman wrote:
>
> Jeff,
>
> The only thing I am still a bit hesitant about is delaying the notification to the BFD clients (that the session is up) until we've successfully moved to the optimized mode. It's not the actual delay, which should be short, but the fact that it's changing the BFD state machine a bit. But I don't see any other way to do this without the risk of bouncing the BFD session.

It's worth pointing out that the "bfd holddown" features implemented by multiple vendors ALREADY do this. And, when it's in use, the client will wait on the order of seconds to minutes for some providers.

So, while I agree that it's a change, and theoretically a scary one, it's well deployed already in some form.

In the interest of honesty, such holddown features didn't interoperate terribly well in their use, and that was the reason for the PROTOCOLS / clients to refine how they used BFD - hence the "strict" features. Again, not a problem with the idea of holddown, but rather that RFC 5882 was a bit too general in its advice.

> Regarding "until we've successfully switched over to the optimized mechanism", what does a successful switch mean? Does it mean sending detect mult packets with optimized auth?

I believe it means that both sides are using the optimized mode. It can't be only one side since the potential session drop will only happen when each side turns on optimization.

-- Jeff
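A minimal sketch of the "both sides are using the optimized mode" reading of a successful switch, assuming each end can tell whether its own transmitted packets and the peer's validated packets use the optimized mechanism. The `SwitchTracker` class and its method names are hypothetical and come from neither draft.

```python
from dataclasses import dataclass


@dataclass
class SwitchTracker:
    """Tracks, per session, whether each direction is in optimized mode.

    Hypothetical helper for illustration only; not defined by any BFD draft.
    """
    local_sending_optimized: bool = False
    peer_seen_optimized: bool = False

    def on_tx(self, optimized: bool) -> None:
        # Record the auth mode of the packet we just transmitted.
        self.local_sending_optimized = optimized

    def on_valid_rx(self, optimized: bool) -> None:
        # Record the auth mode of the last packet from the peer that validated.
        self.peer_seen_optimized = optimized

    @property
    def switched(self) -> bool:
        # The switch counts as successful only when both directions agree.
        return self.local_sending_optimized and self.peer_seen_optimized
```

With this reading, one side turning on optimization alone never counts as a completed switch.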
Re: Resolving lingering issues with BFD authentication drafts
Jeff,

The only thing I am still a bit hesitant about is delaying the notification to the BFD clients (that the session is up) until we've successfully moved to the optimized mode. It's not the actual delay, which should be short, but the fact that it's changing the BFD state machine a bit. But I don't see any other way to do this without the risk of bouncing the BFD session.

Regarding "until we've successfully switched over to the optimized mechanism", what does a successful switch mean? Does it mean sending detect mult packets with optimized auth?

Regards,
Reshad.

On Sunday, February 25, 2024, 06:53:52 PM EST, Jeffrey Haas wrote:

> Reshad,
>
> On Feb 25, 2024, at 5:31 PM, Reshad Rahman wrote:
>
> > Jeff, overall this looks to be a good way forward, it addresses the main concern I had expressed.
>
> Excellent.
>
> > On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas wrote:
> >
> > > - The optimization procedures currently can have BFD go Up with the initial stronger authentication, then go down once the optimized mode kicks in.
> >
> > That's the scenario where only 1 end supports optimized procedures?
>
> In the current version of the document, yes. That's an item the suggested changes are intended to address.
>
> > > Possible ways to address these:
> > >
> > > For BFD optimization:
> > > [...]
> > > - Optimized authentication should kick in ASAP when we are in the Up state. I believe this means that we send out at least Detect Mult packets in the strong mechanism and then switch to the optimized mechanism. This bounds the amount of time when we're not running in optimized mode.
> >
> > Why does optimized procedures need to kick in asap? Is this in case there's an issue with the optimized procedures?
>
> The general concern is not overly delaying the client's idea of when BFD transitions to Up. The suggested changes take us from Up to an internal "pending" state waiting for the optimized mode to kick in. We can theoretically linger there however long we like since we've signaled that this change is coming, but it's not helpful to the client to linger there longer than necessary. The suggestion above is really the lowest bound on time we can take for such a transition to ensure we can safely transition to ISAAC mode and entrain the sequence numbers for the ISAAC algorithm.
>
> > > - BFD clients that are expecting optimized authentication SHOULD NOT convey
> >
> > BFD sessions (not clients)?
>
> Session on a client. :-)
>
> > > to their clients that the session is in the Up state until we've successfully switched over to the optimized mechanism. While this seems contrary to BFD behavior, it's no different than any of the existing "holddown" procedures clients like BGP can implement to ensure that BFD is stable for long enough before using the session.
> >
> > Is this in case there's an issue with the optimized procedures? If yes, do we also need some text for the case where optimized procedures fail? e.g., at a certain point we have to stick to strong auth but do we retry eventually (that could cause the session to go down if we do)?
>
> From the client's session perspective, BFD simply is Up/Down as normal. From the protocol perspective, lingering forever waiting for the optimized mode kicking in isn't what we'd want. So, yes, we need some form of default timeout recommended for implementors.
>
> If we repeatedly bounce the BFD session from Up to Down only at the transition to the optimized mode, we likely want to dampen that behavior. At least with the new code points we have a sense that this transition to the optimized mode is the actual problem between devices that have agreed to use that authentication type.
>
> > > Thoughts?
> >
> > When transitioning from strong auth to optimized procedures, could we send both types of packets when attempting the transition? The aim being to avoid the BFD session from going down. I haven't thought this through so this may well not hold water.
>
> For the ISAAC procedures, the only requirement is that we believe the other side of the session has seen at least one Up packet out of the expected Detect Mult packets. That's sufficient for the entraining procedure. Once we have entrained the ISAAC session, we should be able to flip in and out of the optimized mode at will.
>
> The idea I think you're trying to convey is similar to how other protocols handle a graceful rollover for key. That's normally done by having the rollover timeframe being willing to authenticate with both old and new key, not having the side generating the packets sending it twice.
>
> For BFD in particular, sending the same PDU with different auth types would probably play havoc with the meticulously increasing sequence number requirements. Further, there's no mechanism we have to convey that we've successfully processed the rollover.
>
> -- Jeff
Re: Resolving lingering issues with BFD authentication drafts
Reshad,

> On Feb 25, 2024, at 5:31 PM, Reshad Rahman wrote:
>
> Jeff, overall this looks to be a good way forward, it addresses the main concern I had expressed.

Excellent.

> On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas wrote:
>
> > - The optimization procedures currently can have BFD go Up with the initial stronger authentication, then go down once the optimized mode kicks in.
>
> That's the scenario where only 1 end supports optimized procedures?

In the current version of the document, yes. That's an item the suggested changes are intended to address.

> > Possible ways to address these:
> >
> > For BFD optimization:
> > [...]
> > - Optimized authentication should kick in ASAP when we are in the Up state. I believe this means that we send out at least Detect Mult packets in the strong mechanism and then switch to the optimized mechanism. This bounds the amount of time when we're not running in optimized mode.
>
> Why does optimized procedures need to kick in asap? Is this in case there's an issue with the optimized procedures?

The general concern is not overly delaying the client's idea of when BFD transitions to Up. The suggested changes take us from Up to an internal "pending" state waiting for the optimized mode to kick in. We can theoretically linger there however long we like since we've signaled that this change is coming, but it's not helpful to the client to linger there longer than necessary. The suggestion above is really the lowest bound on time we can take for such a transition to ensure we can safely transition to ISAAC mode and entrain the sequence numbers for the ISAAC algorithm.

> > - BFD clients that are expecting optimized authentication SHOULD NOT convey
>
> BFD sessions (not clients)?

Session on a client. :-)

> > to their clients that the session is in the Up state until we've successfully switched over to the optimized mechanism. While this seems contrary to BFD behavior, it's no different than any of the existing "holddown" procedures clients like BGP can implement to ensure that BFD is stable for long enough before using the session.
>
> Is this in case there's an issue with the optimized procedures? If yes, do we also need some text for the case where optimized procedures fail? e.g., at a certain point we have to stick to strong auth but do we retry eventually (that could cause the session to go down if we do)?

From the client's session perspective, BFD simply is Up/Down as normal. From the protocol perspective, lingering forever waiting for the optimized mode kicking in isn't what we'd want. So, yes, we need some form of default timeout recommended for implementors.

If we repeatedly bounce the BFD session from Up to Down only at the transition to the optimized mode, we likely want to dampen that behavior. At least with the new code points we have a sense that this transition to the optimized mode is the actual problem between devices that have agreed to use that authentication type.

> > Thoughts?
>
> When transitioning from strong auth to optimized procedures, could we send both types of packets when attempting the transition? The aim being to avoid the BFD session from going down. I haven't thought this through so this may well not hold water.

For the ISAAC procedures, the only requirement is that we believe the other side of the session has seen at least one Up packet out of the expected Detect Mult packets. That's sufficient for the entraining procedure. Once we have entrained the ISAAC session, we should be able to flip in and out of the optimized mode at will.

The idea I think you're trying to convey is similar to how other protocols handle a graceful rollover for key. That's normally done by having the rollover timeframe being willing to authenticate with both old and new key, not having the side generating the packets sending it twice.

For BFD in particular, sending the same PDU with different auth types would probably play havoc with the meticulously increasing sequence number requirements. Further, there's no mechanism we have to convey that we've successfully processed the rollover.

-- Jeff
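The internal "pending" state and the timeout discussed in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `OptimizedSessionGate`, its method names, and the `switch_timeout` parameter are hypothetical, and what an implementation does once the timeout fires (stay on strong auth, bring the session down, dampen retries) is deliberately left to the caller because the thread has not settled it.

```python
from enum import Enum, auto
from typing import Optional


class ClientView(Enum):
    DOWN = auto()
    UP = auto()


class OptimizedSessionGate:
    """Hides protocol-level Up from clients until optimized mode is confirmed.

    Hypothetical sketch; names and the timeout parameter are assumptions.
    """

    def __init__(self, switch_timeout: float) -> None:
        self.switch_timeout = switch_timeout      # seconds allowed in "pending"
        self.up_since: Optional[float] = None     # when the protocol went Up
        self.optimized = False                    # switch confirmed?

    def on_protocol_up(self, now: float) -> None:
        # Session reached Up with the strong mechanism: enter "pending".
        self.up_since = now
        self.optimized = False

    def on_protocol_down(self) -> None:
        self.up_since = None
        self.optimized = False

    def on_optimized_confirmed(self) -> None:
        # Only meaningful while the protocol session is Up.
        if self.up_since is not None:
            self.optimized = True

    def timed_out(self, now: float) -> bool:
        # True when we have lingered in "pending" longer than allowed.
        return (self.up_since is not None and not self.optimized
                and now - self.up_since > self.switch_timeout)

    def client_view(self) -> ClientView:
        # Clients see Up only after the switch to optimized mode.
        return ClientView.UP if self.optimized else ClientView.DOWN
```

From the client's side nothing new is exposed: the view is still just Up or Down, which matches the point that the pending state is internal to the session.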
Re: Resolving lingering issues with BFD authentication drafts
Jeff, overall this looks to be a good way forward, it addresses the main concern I had expressed.

BFD WG, please take a look at the procedures outlined below and provide feedback.

Comments/questions inline.

On Friday, February 23, 2024, 04:32:55 PM EST, Jeffrey Haas wrote:

> Here's an attempt to provide a path to resolve the lingering issues in the authentication drafts.
>
> Core lingering issues:
> - The NULL auth method is attackable, but still potentially useful for the stability procedures.
> - The optimization procedures currently can have BFD go Up with the initial stronger authentication, then go down once the optimized mode kicks in.

That's the scenario where only 1 end supports optimized procedures?

> Right now, the text doesn't place any bounds on how long it might be until the optimized procedures are initiated once the session moves to Up. The issue here is less about bouncing the BFD session than about the impact on BFD clients.
>
> Possible ways to address these:
>
> For BFD optimization:
> - We remove no-authentication and NULL-authentication as methods for the optimized session. This leaves us solely with one defined method that provides good enough security. It also leaves us room to add other authentications in the future that have similar properties.
> - Optimized authentication should kick in ASAP when we are in the Up state. I believe this means that we send out at least Detect Mult packets in the strong mechanism and then switch to the optimized mechanism. This bounds the amount of time when we're not running in optimized mode.

Why does optimized procedures need to kick in asap? Is this in case there's an issue with the optimized procedures?

> - BFD clients that are expecting optimized authentication SHOULD NOT convey

BFD sessions (not clients)?

> to their clients that the session is in the Up state until we've successfully switched over to the optimized mechanism. While this seems contrary to BFD behavior, it's no different than any of the existing "holddown" procedures clients like BGP can implement to ensure that BFD is stable for long enough before using the session.

Is this in case there's an issue with the optimized procedures? If yes, do we also need some text for the case where optimized procedures fail? e.g., at a certain point we have to stick to strong auth but do we retry eventually (that could cause the session to go down if we do)?

> This is also not the length of time such features want. BGP BFD holddown is in the multiples of seconds time frame. I believe we want something that is within two Detection Intervals once the session is Up.
>   + It should be noted we already require sending out this number of Up packets in the strong mode for entraining ISAAC. However, I'm not sure if our procedures are clear on that point. To be audited.
> - How does a client tell that "we are expecting optimized authentication"? We define parallel authentication code points for the procedure. Today, our strong meticulous features are meticulous md5 and sha1; code points 3 and 5, respectively. We allocate two new code points, "ISAAC-optimized meticulous sha-1" and "ISAAC-optimized meticulous md5". When these code points are used, the expectation is the strong cipher is used to get the session to the Up state, and the session expects to transition to ISAAC afterwards.
>
>   Thus, we no longer have the opportunity for an implementation that doesn't support optimization to have the session half transition to up using the strong mode and fail once the switch attempts to a mode it doesn't understand.

I like it!

> - We might want to consider having the shared secret used for both strong and optimized mode. While we've had discussion that we might not want to do this, having a common shared secret means that misconfiguration stops being the operational consideration that drives the most likely reasons for failure of the transition to optimized authentication.
>   + This can be a SHOULD for the above reasons.
>   + If the operator does not want to use the same shared secret, that's still fine. It just means they're accepting the potential additional fragility.
> - The NULL auth mechanism is moved out of the optimized draft into the stability draft.
>
> For BFD stability:
> - The NULL auth method is pulled into this document.
> - The NULL auth's procedures are slightly updated such that the sequence number SHOULD NOT be used for authentication. Effectively, it transitions to a counter. This avoids the ability to use it for attacking the protocol as noted in prior discussion.
> - The NULL auth security properties are no worse at that point than no authentication.
> - Existing meticulous methods can be used as well - no change.
> - ISAAC can be used when optimized mode is in use. No change.
>   + ISAAC mode cannot be used alone. Its procedures for entraining the sequence numbers currently mean it can't be
Resolving lingering issues with BFD authentication drafts
Here's an attempt to provide a path to resolve the lingering issues in the authentication drafts.

Core lingering issues:
- The NULL auth method is attackable, but still potentially useful for the stability procedures.
- The optimization procedures currently can have BFD go Up with the initial stronger authentication, then go down once the optimized mode kicks in. Right now, the text doesn't place any bounds on how long it might be until the optimized procedures are initiated once the session moves to Up. The issue here is less about bouncing the BFD session than about the impact on BFD clients.

Possible ways to address these:

For BFD optimization:
- We remove no-authentication and NULL-authentication as methods for the optimized session. This leaves us solely with one defined method that provides good enough security. It also leaves us room to add other authentications in the future that have similar properties.
- Optimized authentication should kick in ASAP when we are in the Up state. I believe this means that we send out at least Detect Mult packets in the strong mechanism and then switch to the optimized mechanism. This bounds the amount of time when we're not running in optimized mode.
- BFD clients that are expecting optimized authentication SHOULD NOT convey to their clients that the session is in the Up state until we've successfully switched over to the optimized mechanism. While this seems contrary to BFD behavior, it's no different than any of the existing "holddown" procedures clients like BGP can implement to ensure that BFD is stable for long enough before using the session.

  This is also not the length of time such features want. BGP BFD holddown is in the multiples of seconds time frame. I believe we want something that is within two Detection Intervals once the session is Up.
  + It should be noted we already require sending out this number of Up packets in the strong mode for entraining ISAAC. However, I'm not sure if our procedures are clear on that point. To be audited.
- How does a client tell that "we are expecting optimized authentication"? We define parallel authentication code points for the procedure. Today, our strong meticulous features are meticulous md5 and sha1; code points 3 and 5, respectively. We allocate two new code points, "ISAAC-optimized meticulous sha-1" and "ISAAC-optimized meticulous md5". When these code points are used, the expectation is the strong cipher is used to get the session to the Up state, and the session expects to transition to ISAAC afterwards.

  Thus, we no longer have the opportunity for an implementation that doesn't support optimization to have the session half transition to up using the strong mode and fail once the switch attempts to a mode it doesn't understand.
- We might want to consider having the shared secret used for both strong and optimized mode. While we've had discussion that we might not want to do this, having a common shared secret means that misconfiguration stops being the operational consideration that drives the most likely reasons for failure of the transition to optimized authentication.
  + This can be a SHOULD for the above reasons.
  + If the operator does not want to use the same shared secret, that's still fine. It just means they're accepting the potential additional fragility.
- The NULL auth mechanism is moved out of the optimized draft into the stability draft.

For BFD stability:
- The NULL auth method is pulled into this document.
- The NULL auth's procedures are slightly updated such that the sequence number SHOULD NOT be used for authentication. Effectively, it transitions to a counter. This avoids the ability to use it for attacking the protocol as noted in prior discussion.
- The NULL auth security properties are no worse at that point than no authentication.
- Existing meticulous methods can be used as well - no change.
- ISAAC can be used when optimized mode is in use. No change.
  + ISAAC mode cannot be used alone. Its procedures for entraining the sequence numbers currently mean it can't be the only authentication.

Lingering cleanup:
- The IANA considerations and the YANG definitions need to be readjusted based on where we move things.

Thoughts?

-- Jeff
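The proposed code points and the "send at least Detect Mult packets in the strong mechanism, then switch" rule can be sketched as below. Only the values 3 and 5 are existing RFC 5880 assignments; the two ISAAC-optimized entries use placeholder values because nothing has been allocated, and the `TxAuthSelector` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class AuthType(Enum):
    METICULOUS_KEYED_MD5 = 3    # existing RFC 5880 code point
    METICULOUS_KEYED_SHA1 = 5   # existing RFC 5880 code point
    # Placeholders only: the proposal does not give values for these.
    ISAAC_OPT_METICULOUS_MD5 = "TBD1"
    ISAAC_OPT_METICULOUS_SHA1 = "TBD2"


OPTIMIZED = {AuthType.ISAAC_OPT_METICULOUS_MD5,
             AuthType.ISAAC_OPT_METICULOUS_SHA1}


@dataclass
class TxAuthSelector:
    """Chooses the auth mechanism for each transmitted packet (sketch)."""
    auth_type: AuthType
    detect_mult: int
    up_packets_sent_strong: int = 0

    def next_mode(self, session_up: bool) -> str:
        """Return 'strong' or 'isaac' for the next packet to send."""
        # Non-optimized code points, or a session not yet Up, always use
        # the strong mechanism and restart the count.
        if self.auth_type not in OPTIMIZED or not session_up:
            self.up_packets_sent_strong = 0
            return "strong"
        # Send Detect Mult Up packets with the strong mechanism first.
        if self.up_packets_sent_strong < self.detect_mult:
            self.up_packets_sent_strong += 1
            return "strong"
        return "isaac"
```

Because the optimized behavior is keyed off the negotiated code point, an implementation that does not understand the new code points never reaches Up with them, which is the failure mode the parallel code points are meant to remove.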