Re: [EXTERNAL] Can a BFD session change its source port to facilitate auto recovery

xiao.min2 Mon, 27 Mar 2023 08:06:52 -0700

Hi Sasha,






Please see inline...



Original



From: Alexander Vainshtein <[email protected]>
To: Jeff Tantsura <[email protected]>; Xiao Min (肖敏10093570);
Cc: Abhinav Srivastava <[email protected]>;[email protected] <[email protected]>;
Date: 2023-03-27 08:55
Subject: RE: [EXTERNAL] Can a BFD session change its source port to facilitate 
auto recovery




Jeff, Xiao and all,


FWIW I concur with Jeff.


[XM]>>> Thank you for the feedback. I might be wrong; I am just trying to make 
things clearer through discussion.






It is not clear to me what exactly the reference to BFD for SR Policies means.


E.g., if Seamless BFD (RFC 7880 and RFC 7881) is used, the reflected packets 
are sent as native IPv4 or IPv6 packets and can encounter problems that are not 
related to the state of the monitored candidate path of the SR Policy.


[XM]>>> Please refer to 
https://datatracker.ietf.org/doc/draft-liu-spring-bfd-srv6-policy-encap/. 






Best Regards,


Xiao Min






My 2c,


Sasha



 



From: Jeff Tantsura <[email protected]> 
 Sent: Monday, March 27, 2023 1:29 AM
 To: Xiao Min <[email protected]>
 Cc: Abhinav Srivastava <[email protected]>; Alexander Vainshtein 
<[email protected]>; [email protected]
 Subject: Re: [EXTERNAL] Can a BFD session change its source port to facilitate 
auto recovery




 


Hi Xiao,


 


please see inline



 
 


On Mar 24, 2023, at 5:43 PM, <[email protected]> <[email protected]> 
wrote:



 


Jeff,


 


Please see inline...


Original



From: Jeff Tantsura <[email protected]>
To: Xiao Min (肖敏10093570);
Cc: [email protected] <[email protected]>;[email protected] 
<[email protected]>;[email protected] <[email protected]>;
Date: 2023-03-24 16:48
Subject: Re: [EXTERNAL] Re: Can a BFD session change its source port to 
facilitate auto recovery




That’s not going to fly. The number of ECMP paths in today’s networks could be 
anywhere between 2 and 500+. How many of these would you exercise, and how 
would you know that you have covered all of them?


[XM]>>> The number of links/LAGs seems much higher than the number of ECMP 
paths. If I otherwise have to run SH BFD on each link/LAG, why not try running 
MH BFD on each ECMP path? :-) As to the coverage, BFD+IOAM may help, because 
IOAM can tell you the path a BFD packet really takes.









[jeff] The number of p2p connections between 2 directly attached IP end-points 
is rarely larger than 32 (either LAG or ECMP). SH BFD sessions are distributed 
across the path traversed, and coherency between the IP connectivity matrix and 
the BFD sessions between any given pair of directly connected IP end-points can 
easily be guaranteed. End2end (MH BFD) is between non-directly attached 
end-points, is subject to network topology and routing, and has to be 
re-evaluated on any change.



INT doesn’t really help here: hashing decisions are local, and any change 
(local or global) might change the hashing results, unless you build a full 
mesh of source-routed paths… but then, why use BFD at all? You could use INT 
alone instead; take a look at the HPCC draft.
 
 



The role of MH BFD is to verify reachability between 2 non directly connected 
IP end-points, not to monitor every path available.


[XM]>>> IMHO BFD for SR Policy does care about the path, and some SP's networks 
require bidirectional path consistency while employing BFD.









[jeff] How did we get to SR here? If you have a strict source-routed path, you 
only need to validate that path; if it is loose, however, the same issues apply.
 
 



 


Best Regards,


Xiao Min


 


As a viable solution, run SH BFD on each link/LAG and MH BFD end2end, and make 
sure your timers are aligned and do not interact with each other in funny ways.


 


Cheers,


Jeff





 
 


On Mar 24, 2023, at 09:26, [email protected] wrote:






Hi Abhinav,


 


When I came across your problem, the first idea that came to my mind was not to 
change the source port for a BFD session, but to run multiple BFD sessions 
between the two peers, using each BFD session to monitor a respective ECMP 
path, so that the application would not be declared failed unless all the BFD 
sessions go down.
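
A minimal sketch of that aggregation rule follows, in Python. The names `BfdSession`, `State`, and `application_reachable` are hypothetical and not taken from any BFD implementation, and the state machine is reduced to Up/Down for brevity. The sketch assumes each session keeps its own fixed source port; as noted elsewhere in this thread, distinct source ports do not by themselves guarantee that the sessions land on distinct ECMP paths.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Simplified: real BFD also has AdminDown and Init states.
    DOWN = "Down"
    UP = "Up"


@dataclass
class BfdSession:
    source_port: int          # fixed for the lifetime of the session
    state: State = State.DOWN


def application_reachable(sessions):
    """The peer is treated as reachable while at least one session is Up."""
    return any(s.state is State.UP for s in sessions)


# Three parallel sessions between the same pair of peers (illustrative ports).
sessions = [BfdSession(49152, State.UP),
            BfdSession(49153, State.UP),
            BfdSession(49154, State.UP)]

sessions[0].state = State.DOWN           # one path fails
one_down = application_reachable(sessions)

for s in sessions:                       # every path fails
    s.state = State.DOWN
all_down = application_reachable(sessions)
```

Here `one_down` stays true because two sessions are still Up, while `all_down` is false, which is the only point at which the application would be told about a failure.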


 


Best Regards,


Xiao Min


 









From: Abhinav Srivastava <[email protected]>
To: Alexander Vainshtein <[email protected]>;
Cc: [email protected] <[email protected]>;
Date: 2023-03-23 22:27
Subject: Re: [EXTERNAL] Re: Can a BFD session change its source port to 
facilitate auto recovery




Agree that deletion and recreation (possibly automatic) by the associated 
protocol is a good alternative to inbuilt BFD recovery. 


 


Thanks



Abhinav





 


On Thu, 23 Mar, 2023, 3:08 am Alexander Vainshtein, 
<[email protected]> wrote:



Abhinav, Jeff and all,


FWIW I concur with Jeff.


 


In my experience, MH IP BFD sessions are typically used to monitor peering 
between iBGP neighbors, and when the MH IP BFD session goes down, BGP treats 
this as if its own session has gone down and deletes the MH IP BFD session in 
question.


 


I.e., fast recovery of such a session will not happen until BGP re-creates it.


 


Regards,


Sasha



 



From: Rtg-bfd <[email protected]> On Behalf Of Jeff Tantsura
 Sent: Thursday, March 23, 2023 8:27 AM
 To: Abhinav Srivastava <[email protected]>
 Cc: [email protected]
 Subject: [EXTERNAL] Re: Can a BFD session change its source port to facilitate 
auto recovery




 


Abhinav,


 


Let’s clarify a couple of points.



What you are trying to do is to change entropy in order to change the local 
hashing outcome. However, for hashing to even be relevant there has to be 
either ECMP or LAG in the path to the destination; otherwise the shortest path 
will be used regardless. So, statistically, some of the flows between a given 
pair of end points (5 tuple) will be traversing the (partially) broken link. 
Would you really like BFD to “pretend” that everything is just fine?
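
To make the hashing point concrete, here is a toy sketch in Python. The function `pick_path` and the SHA-256 based hash are illustrative assumptions only; real routers use vendor-specific hash functions and inputs. The addresses are documentation prefixes, 4784 is the UDP destination port for multi-hop BFD (RFC 5883), and the source ports come from the range required by RFC 5881.

```python
import hashlib


def pick_path(src_ip, dst_ip, proto, src_port, dst_port, n_paths):
    """Map a 5-tuple to one of n_paths next hops (toy hash, not a vendor's)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


N_PATHS = 4  # assumed number of equal-cost paths

# The same 5-tuple with 64 different source ports: each port is pinned to
# exactly one path, but the ports as a group spread over the paths.
paths_by_port = {
    port: pick_path("192.0.2.1", "198.51.100.1", 17, port, 4784, N_PATHS)
    for port in range(49152, 49152 + 64)
}

for port in (49152, 49153, 49154):
    print(port, "->", paths_by_port[port])
```

A session that keeps one source port always takes the same path, so moving it to another port only moves the probe. Other flows between the same end points can still hash onto the lossy link.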



Moreover, in case of congestion, the vast majority of applications won’t change 
their ports but will have their TX rate reduced.



There’s work done by Tom Herbert for IPv6/TCP (a kernel patch upstreamed a few 
years ago, which had been presented in RTGWG pre-Covid) that changes the flow 
label value on RTO (a value that some might or might not include in hashing). 
Using it is strongly not recommended outside of a tightly controlled 
homogeneous environment (think within a DC).



Beyond what the BFD spec tells us (don’t), the above should provide enough 
motivation not to do this.


Cheers,


Jeff




 


On Mar 23, 2023, at 05:44, Abhinav Srivastava <[email protected]> wrote:






Multi-hop BFD would be the mechanism that detects the failure on the path it 
happens to be using for the session. I wasn't thinking of another mechanism.  
Detection timer expiry would be the trigger for recovery, which could be 
augmented with a few other possible criteria, like how long the session hasn't 
been able to come back up, or prolonged flapping. 



 


Thanks



Abhinav



 


On Wed, 22 Mar, 2023, 3:05 pm Greg Mirsky, <[email protected]> wrote:



Hi Abhinav,


thank you for presenting an interesting scenario for a discussion. I have 
several questions to better understand it:



·       How is the network failure that triggers the recovery process detected?


·       If the failure detection mechanism is not multi-hop BFD, what is the 
relationship between the detection intervals of that mechanism and the 
multi-hop BFD session?


Regards,




Greg




 


On Wed, Mar 22, 2023 at 4:36 PM Abhinav Srivastava <[email protected]> wrote:



Hi all,


 


I need clarification on whether the source port can be changed for a BFD 
session in the case of multi-hop BFD.   The ability to change the BFD source 
port when a BFD session goes down helps the session recover if it is stuck on a 
network path where there is some intermittent but significant packet loss.


 


In such cases, normally without BFD, end-to-end application traffic would 
eventually settle on a good path, as applications typically change source port 
after experiencing disconnections or failures.  But if BFD is being used to 
monitor some part of a path that is experiencing significant but not 100% 
packet loss, it will cause the next-hop list of the associated static route, or 
the associated BGP sessions, to flap indefinitely, as BFD packets would be 
stuck on that partially lossy path forever (until the BFD session is deleted 
and recreated by admin action).  This may also hinder the typical application 
recovery strategy of changing source port on failure.


 


The ability to dynamically change the BFD source port can help BFD recover in 
such cases.  Is this something that is allowed as per the RFC?  RFC 5881, 
Section 4 (for the single-hop case) states that:


“The source port MUST be in the range 49152 through 65535. The same UDP source 
port number MUST be used for all BFD Control packets associated with a 
particular session.”
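
The two MUSTs quoted above can be expressed as a small check, sketched here in Python. `SessionPortChecker` and its `session_id` key are hypothetical names for illustration (an implementation might key on the local discriminator instead); nothing here is taken from an actual BFD stack.

```python
BFD_SRC_PORT_MIN = 49152
BFD_SRC_PORT_MAX = 65535


class SessionPortChecker:
    """Tracks the first source port seen per session and flags deviations."""

    def __init__(self):
        self._port_by_session = {}

    def check(self, session_id, src_port):
        # Rule 1: the source port must lie in 49152..65535.
        if not BFD_SRC_PORT_MIN <= src_port <= BFD_SRC_PORT_MAX:
            return False
        # Rule 2: every Control packet of a session must reuse the same port.
        first_port = self._port_by_session.setdefault(session_id, src_port)
        return first_port == src_port


checker = SessionPortChecker()
ok_first = checker.check("session-1", 49152)      # first packet, valid port
ok_repeat = checker.check("session-1", 49152)     # same port again
bad_change = checker.check("session-1", 49153)    # port changed mid-session
bad_range = checker.check("session-2", 1024)      # outside the allowed range
```

Under this reading, a session that switches to a new source port while it still exists fails the second rule, which is why the thread discusses deleting and re-creating the session instead.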


 


Thanks


Abhinav











 Notice: This e-mail together with any attachments may contain information of 
Ribbon Communications Inc. and its Affiliates that is confidential and/or 
proprietary for the sole use of the intended recipient. Any review, disclosure, 
reliance or distribution by others or forwarding without express permission is 
strictly prohibited. If you are not the intended recipient, please notify the 
sender immediately and then delete all copies, including any attachments.










 





 



