Re: Can a BFD session change its source port to facilitate auto recovery

Jeffrey Haas Thu, 23 Mar 2023 11:37:18 -0700


> On Mar 23, 2023, at 2:17 PM, Reshad Rahman 
> <[email protected]> wrote:
> 
> Hi all,
> 
> +1 to Jeff's comment on not wanting to pretend that everything is fine.
> 
> And if we're running BFD single-hop and BFDoLAG where needed, this is a 
> non-issue right?


Not quite.

In theory, if we had a full set of link tests from A..Z, including exercising 
each LAG member, one would think everything should be fine.  This is the ideal 
base case.

In practice, what's often seen is that even with full coverage of the paths, 
there are still end-to-end forwarding faults for various reasons.  In at least 
some of these cases it's because BFD is implemented in a layer that isn't 
exercising the full data path.  To pick a somewhat vendor-neutral example, 
consider BFD implemented directly on the line card but not participating in the 
layer 3 ECMP load balancer, or at the LAG level not participating in the layer 
2 equivalent.

It's for reasons like this that we have discussions about whether it makes 
sense to run single-hop BFD in addition to BFD-on-LAG covering the same link.

(It's also worth reminding the Working Group that these types of discussions 
were a motivation for the LIME Working Group we had some years ago.  It very 
much covered this space, but didn't come to successful outcomes.)

Going back to Abhinav's original question, here are my own observations:

RFC 5880 tells us that once a session is Up, we should demultiplex solely based 
on the Discriminators.  (RFC 5880, §6.3)
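
As an illustration of that rule, here is a minimal Python sketch of 
discriminator-only demultiplexing.  The packet layout follows the 24-byte 
mandatory section of the RFC 5880 Control packet; the function names, the 
session table, and the sample addresses are hypothetical and not taken from 
any real implementation.

```python
import struct

# Mandatory section of a BFD Control packet (RFC 5880, section 4.1):
# vers/diag, state/flags, detect mult, length, then five 32-bit fields.
BFD_HEADER = struct.Struct("!BBBBIIIII")

STATE_UP = 3  # RFC 5880 session state value for Up


def build_control_packet(my_disc, your_disc, state=STATE_UP):
    """Build a bare Control packet (hypothetical helper, no authentication)."""
    vers_diag = (1 << 5) | 0          # version 1, diagnostic 0
    state_flags = state << 6          # state in the top two bits, no flags set
    return BFD_HEADER.pack(
        vers_diag, state_flags, 3, BFD_HEADER.size,
        my_disc, your_disc,
        1_000_000, 1_000_000, 0,      # illustrative timer values, microseconds
    )


def demux(sessions, packet, src_addr, src_port):
    """Select a session using only the Your Discriminator field.

    src_addr and src_port are accepted but deliberately ignored, which is
    the behavior the text above describes for an established session.
    """
    fields = BFD_HEADER.unpack_from(packet)
    your_disc = fields[5]
    if your_disc == 0:
        # Initial demultiplexing needs other information (addresses,
        # interface); that case is not modeled in this sketch.
        return None
    return sessions.get(your_disc)


sessions = {0x11: "session-A"}  # keyed by our local discriminator
pkt = build_control_packet(my_disc=0x22, your_disc=0x11)
before = demux(sessions, pkt, "192.0.2.1", 49152)
after = demux(sessions, pkt, "192.0.2.1", 50000)  # peer changed source port
```

In this sketch both lookups return the same session, since the source port 
never enters the lookup.  An implementation that also keys on the layer-4 
tuple would behave differently.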

RFC 5881, which RFC 5883 builds on, tells us that we MUST NOT change the 
source port for a session.  However, it doesn't provide a lot of justification 
for the WHY of that.  Given the prior point, what is the harm?  Some 
speculation:

- Even if you MUST demux based on Discriminators, I wouldn't wager that no 
implementation looks at the full layer-4 signature as part of its procedures.  
In particular, middlebox steering may get in the way.
- It's often necessary for hardware-based BFD implementations to put in 
exceptions to rate policers to permit BFD to work.

Speculation aside, changing the source port most likely would work.

Is it a good idea?  Probably not.  

Is it a great tool to try to exercise specific legs of an ECMP?  Almost 
certainly not at high rates.  It'd also be clumsy.
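
For a sense of what steering by source port would involve, here is a toy 
Python sketch.  It assumes a router that picks an ECMP leg by taking a CRC32 
over the 5-tuple modulo the number of legs.  That hash is purely an 
assumption for illustration; real hash functions and their inputs are 
vendor-specific.  The port range is the one RFC 5881 requires for the source 
port, and 4784 is the RFC 5883 multihop destination port.  The function names 
are hypothetical.

```python
import ipaddress
import struct
import zlib

BFD_MULTIHOP_PORT = 4784              # RFC 5883 destination port
SRC_PORT_RANGE = range(49152, 65536)  # source port range required by RFC 5881


def ecmp_leg(src_ip, dst_ip, src_port, dst_port, n_legs, proto=17):
    """Toy ECMP selection: CRC32 over the 5-tuple, modulo the leg count.

    This stands in for whatever hash a real forwarding plane uses.
    """
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!BHH", proto, src_port, dst_port)
    )
    return zlib.crc32(key) % n_legs


def ports_per_leg(src_ip, dst_ip, n_legs, dst_port=BFD_MULTIHOP_PORT):
    """Sweep source ports and record the first one that lands on each leg."""
    found = {}
    for port in SRC_PORT_RANGE:
        leg = ecmp_leg(src_ip, dst_ip, port, dst_port, n_legs)
        found.setdefault(leg, port)
        if len(found) == n_legs:
            break
    return found


legs = ports_per_leg("192.0.2.1", "198.51.100.7", n_legs=4)
```

The sender can only predict the result if it knows the exact hash and its 
inputs at every hop, which is part of why this approach is clumsy.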

Could you do this with some level of success?  Probably.

Would I want to support debugging issues with this as a vendor?  No.

-- Jeff
