Hi all,
+1 to Jeff's comment on not wanting to pretend that everything is fine.
And if we're running single-hop BFD and BFDoLAG where needed, this is a 
non-issue, right?
Regards, Reshad (no hat).
    On Thursday, March 23, 2023, 02:27:21 AM EDT, Jeff Tantsura 
<[email protected]> wrote:  
 
 Abhinav,
Let’s clarify a couple of points. What you are trying to do is change the 
entropy in order to change the local hashing outcome. However, for hashing to 
even be relevant there has to be either ECMP or LAG in the path to the 
destination; otherwise the shortest path will be used regardless. So, 
statistically, some of the flows between a given pair of endpoints (5-tuple) 
will be traversing the (partially) broken link. Would you really like BFD to 
“pretend” that everything is just fine?

Moreover, in case of congestion, most applications by far won’t change their 
ports but will have their TX rate reduced. There’s work done by Tom Herbert 
for IPv6/TCP (kernel patch upstreamed a few years ago), presented in RTGWG 
pre-Covid, that changes the flow label value on RTO (some might or might not 
include the flow label in hashing). This is strongly not recommended for use 
outside of a tightly controlled homogeneous environment (think within a DC).

Outside of what the BFD spec tells us (don’t), the above should provide enough 
motivation not to do this.
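
As a rough illustration of the hashing point above, here is a minimal Python sketch of how a 5-tuple hash might select one of several ECMP/LAG members. The hash function (SHA-256 over the packed tuple), the function names, the link labels and the documentation addresses are all assumptions made for illustration; real devices use vendor-specific hashes and may include other fields. It only shows that the source port is one input to the choice, and that across many source ports some flows land on every member, including a lossy one.

```python
import hashlib
import ipaddress
import struct


def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
    """Pick one next hop from a 5-tuple hash (illustrative, not a real router hash)."""
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!BHH", proto, src_port, dst_port)
    )
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


def spread(src_ip, dst_ip, dst_port, next_hops, src_ports):
    """Count how many source ports map to each next hop (UDP, protocol 17)."""
    counts = {nh: 0 for nh in next_hops}
    for port in src_ports:
        nh = ecmp_next_hop(src_ip, dst_ip, 17, port, dst_port, next_hops)
        counts[nh] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical two-member ECMP group; one member is partially lossy.
    links = ["link-A", "link-B (lossy)"]
    # 4784 is the multi-hop BFD destination port; the source range is from RFC 5881.
    print(spread("192.0.2.1", "198.51.100.1", 4784, links, range(49152, 65536)))
```

Under these assumptions, moving the BFD session to a different source port may move BFD onto a healthy member, while a share of application flows between the same endpoints still hashes onto the lossy one.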

Cheers,
Jeff

On Mar 23, 2023, at 05:44, Abhinav Srivastava <[email protected]> wrote:



Multi-hop BFD would be the mechanism that detects the failure on the path it 
happens to be using for the session. I wasn't thinking of another mechanism. 
Detection timer expiry would be the trigger for recovery, which could be 
augmented with a few other possible criteria, such as how long the session 
hasn't been able to come back up, or prolonged flapping.
Thanks,
Abhinav
On Wed, 22 Mar, 2023, 3:05 pm Greg Mirsky, <[email protected]> wrote:

Hi Abhinav,
thank you for presenting an interesting scenario for discussion. I 
have several questions to better understand it:
   - How is the network failure that triggers the recovery process detected?
   - If the failure detection mechanism is not multi-hop BFD, what is the 
relationship between the detection intervals of that mechanism and the 
multi-hop BFD session?
Regards,
Greg
On Wed, Mar 22, 2023 at 4:36 PM Abhinav Srivastava <[email protected]> wrote:


Hi all,

 

I need clarification on whether the source port can be changed for a BFD 
session in the case of multi-hop BFD. The ability to change the BFD source 
port when the BFD session goes down helps the session recover if it's stuck on 
a network path where there is some intermittent but significant packet loss.

 

In such cases, normally without BFD, end-to-end application traffic would 
eventually settle on a good path, as applications typically change their 
source port after experiencing disconnections or failures. But if BFD is being 
used to monitor some part of a path that is experiencing significant but not 
100% packet loss, the next-hop list of the associated static route or the 
associated BGP sessions will keep flapping, as BFD packets stay pinned to that 
partially lossy path (until the BFD session is deleted and recreated by admin 
action). This may also hinder the typical application recovery strategy of 
changing the source port on failure.

 

The ability to dynamically change the BFD source port can help BFD recover in 
such cases. Is this something that is allowed as per the RFC? RFC 5881, 
Section 4 (for the single-hop case) states:

“The source port MUST be in the range 49152 through 65535. The same UDP source 
port number MUST be used for all BFD Control packets associated with a 
particular session”
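
The quoted rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, the port is chosen at random, and nothing here reflects any particular BFD implementation. It picks a source port in the 49152 through 65535 range once per session and reuses it for every Control packet; 3784 is the single-hop BFD Control destination port from RFC 5881.

```python
import random

# Source port range quoted from RFC 5881, Section 4.
BFD_SRC_PORT_MIN = 49152
BFD_SRC_PORT_MAX = 65535
# UDP destination port for single-hop BFD Control packets (RFC 5881).
BFD_SINGLE_HOP_DST_PORT = 3784


def is_valid_bfd_source_port(port):
    """Return True if the port lies in the range required by the quoted text."""
    return BFD_SRC_PORT_MIN <= port <= BFD_SRC_PORT_MAX


class BfdSessionSketch:
    """Hypothetical session object: the source port is fixed at creation."""

    def __init__(self, rng=None):
        rng = rng or random.Random()
        self._src_port = rng.randint(BFD_SRC_PORT_MIN, BFD_SRC_PORT_MAX)

    @property
    def src_port(self):
        return self._src_port

    def control_packet_ports(self, dst_port=BFD_SINGLE_HOP_DST_PORT):
        """Return (source, destination) ports; the source never changes."""
        return (self._src_port, dst_port)


if __name__ == "__main__":
    session = BfdSessionSketch()
    print(session.control_packet_ports())
    print(session.control_packet_ports())  # same source port both times
```

In this sketch the only way to get a different source port is to create a new session object, which mirrors the "delete and recreate" workaround mentioned above.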

 

Thanks

Abhinav



  
