Re: [Lsr] BFD aspects

Aijun Wang Mon, 29 Nov 2021 19:07:35 -0800

Hi, Greg:

Even if the BFD auto-configuration extensions are standardized and 
implemented, won’t the network be filled with detection packets instead of 
user packets?

For the PUA/PULSE solution, the mentioned LSA will only be originated when the 
node status changes from “UP” to “DOWN”, but BFD packets will be sent 
continuously while these PEs are active.

Which one is more efficient?
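
The comparison can be made concrete with a rough sketch. It assumes BFD asynchronous mode, where both ends of a session transmit control packets periodically, and it assumes one advertisement per UP-to-DOWN transition for the event-driven case. The function names, the session count, and the interval are hypothetical and chosen only for illustration.

```python
def bfd_packets_per_second(sessions: int, tx_interval_s: float) -> float:
    """Steady-state BFD control packets per second across all sessions.

    Assumes asynchronous mode: each of the two endpoints of a session
    sends one control packet every tx_interval_s seconds.
    """
    return 2 * sessions / tx_interval_s


def event_driven_advertisements(node_down_events: int) -> int:
    """Advertisements originated by an event-driven scheme.

    Assumption: exactly one advertisement per UP -> DOWN transition,
    and nothing is sent while every node stays up.
    """
    return node_down_events


# Hypothetical inputs: 1,000 sessions, 1-second transmit interval.
steady_state = bfd_packets_per_second(sessions=1_000, tx_interval_s=1.0)
on_failure = event_driven_advertisements(node_down_events=1)
print(f"BFD steady state: {steady_state:.0f} packets/s")
print(f"Event-driven: {on_failure} advertisement(s) per failure")
```

The sketch only counts packets. It says nothing about detection time or about what each mechanism actually verifies, which is the other half of the argument in this thread.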

Certainly, we will consider massive-failure situations, even though they occur 
only in very rare circumstances.

Best Regards

Aijun Wang

China Telecom

From: Greg Mirsky <[email protected]> 
Sent: Tuesday, November 30, 2021 10:47 AM
To: Aijun Wang <[email protected]>
Cc: lsr <[email protected]>; Gyan Mishra <[email protected]>; Robert Raszuk 
<[email protected]>
Subject: Re: [Lsr] BFD aspects

Hi Aijun,

thank you for confirming that it is not the conclusion one can arrive at based 
on my discussion with Robert. Secondly, I wouldn't characterize the problem you 
describe as a scaling issue with using multi-hop BFD to monitor path continuity 
in the underlay network. In my opinion, it is an operational overhead that can 
be addressed by an intelligent management plane or a few extensions in the 
control plane that sets up the overlay. Since the management plane is usually a 
proprietary solution, I invite anyone interested to work on BFD 
auto-configuration extensions in the control plane. I would much appreciate 
references to use cases that can benefit from such extensions.

Regards,

Greg

On Mon, Nov 29, 2021 at 6:26 PM Aijun Wang <[email protected]> wrote:

Hi, Greg:

Firstly, regardless of which method is used for the multihop BFD approach, 
there is certainly configuration overhead if you imagine there are 10,000 PEs, 
which Tony often raises as an example.

Wouldn’t you have to configure each pair of them to detect PE-PE connectivity?

It is obviously not scalable.
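
The pairwise concern can be put into numbers with a minimal sketch. It assumes one multi-hop BFD session per distinct PE pair, that is, a full mesh. Whether a real deployment would need every pair is an assumption here, not something established in this thread, and the function names are hypothetical.

```python
def full_mesh_sessions(num_pes: int) -> int:
    """Distinct PE-PE pairs, assuming one BFD session per pair."""
    return num_pes * (num_pes - 1) // 2


def sessions_per_pe(num_pes: int) -> int:
    """Sessions each PE terminates in a full mesh (one per other PE)."""
    return num_pes - 1


for n in (100, 1_000, 10_000):
    print(f"{n:>6} PEs: {full_mesh_sessions(n):>10} sessions in total, "
          f"{sessions_per_pe(n):>5} per PE")
```

The total grows quadratically with the number of PEs, while the per-PE count grows only linearly. Whether that amounts to a scaling limit or to configuration overhead that can be automated is the point under debate.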

Best Regards

Aijun Wang

China Telecom

From: Greg Mirsky <[email protected]>
Sent: Tuesday, November 30, 2021 10:18 AM
To: Aijun Wang <[email protected]>
Cc: Gyan Mishra <[email protected]>; Robert Raszuk <[email protected]>; lsr 
<[email protected]>
Subject: Re: [Lsr] BFD aspects

Hi Aijun,

could you please elaborate on how you see that this discussion leads to the 
"BFD based detection for the mentioned problem is not [...] scalable(among 
PEs)" conclusion? I hope that nothing I've said or suggested led you to this 
conclusion. Personally, I believe that BFD-based PE-PE monitoring is the best 
technical solution. I understand that an operator may be dissatisfied with the 
additional configuration of the BFD session. As noted, I believe that can be 
addressed in the management plane or by minor extensions in the control plane 
(BGP or not). If a particular implementation (or a combination of the 
implementation and HW) has a scaling challenge with multi-hop BFD, then that 
may not be sufficient technical justification for a somewhat controversial 
proposal.

Regards,

Greg

On Mon, Nov 29, 2021 at 5:17 PM Aijun Wang <[email protected]> wrote:

From the discussion, I think we can reach the conclusion that BFD based 
detection for the mentioned problem is not reliable (between PE/RR) and 
scalable(among PEs).

The same goes for the BGP-based solution.

So let’s focus on how to implement it within the IGP.  Thanks for Greg’s analysis.

And one supplement to Robert’s comments: the RR is not always located within 
the same area as the PEs, so it can’t learn immediately that a PE node is down 
when summarization is configured between areas.

Best Regards

Aijun Wang

China Telecom

From: [email protected] <[email protected]> On Behalf Of Gyan Mishra
Sent: Tuesday, November 30, 2021 8:44 AM
To: Robert Raszuk <[email protected]>
Cc: Greg Mirsky <[email protected]>; lsr <[email protected]>
Subject: Re: [Lsr] BFD aspects

Robert 

On Mon, Nov 29, 2021 at 7:35 PM Robert Raszuk <[email protected]> wrote:

Hi Greg,

If BFD would have autodiscovery built in, that would indeed be the ultimate 
solution. Of course folks will worry about scaling and number of BFD sessions 
to be run PE-PE. 

GIM>> I sense that it is not "BFD autodiscovery" but an advertisement of BFD 
multi-hop system readiness to the particular PE. That, as I think of it, can be 
done in a control or management plane.

Agreed. 

But if BFD between all PEs is an option, why would RR to PE in the local area 
not be a viable solution?

GIM>>Because, in the case of PE-PE, BFD control packets will be fate-sharing 
with data packets. But the path between RR and PE might not be used for 
carrying data packets at all.

100%. But that was accounted for. The reason is that you have at least two RRs 
in an area. The point of BFD was to detect that the PE went down.

Gyan> What Greg is alluding to is a very good point to consider: the RR in 
many operator networks sits in the “control plane” path, which is separate 
from the data plane path.  So the RR has no knowledge of the E2E forwarding 
plane path between the PEs, as it sits outside the forwarding plane path.  
That being said, the PE to RR path is disjoint from the PE-PE path, so from 
the PE-RR POV the RR may think the PE is up or down, hence the false positive 
or negative. That would be the case regardless of how many RRs are deployed.

You are absolutely right that it may report RR disconnect from the network 
while PE is up and data plane from remote PEs can reach it. That is why we have 
more than one RR. 

As far as fate sharing PE-PE BFD with real user data - I think it is not always 
the case. But this is completely separate discussion :) 

Also please keep in mind that PE going down can be learned by RRs by listening 
to the IGP. No BFD needed. 

Both would be multihop, both would be subject to all transit failures etc ... 

GIM>> I think that there's a difference in the impact a path failure has on 
the data traffic. Monitoring the PE-PE path in the underlay, using the same 
encapsulation as data traffic, is representative of the data experience. A 
failure of the PE-RR path, in my understanding, may not be representative at 
all. A BFD session between RR and PE may fail while the PE is absolutely 
functional from the service PoV.

Please keep in mind that this entire discussion is not about data plane failure 
end to end :)  Yes, it's pretty sad. This entire debate is about how to 
indicate domain-wide that the IGP component on a PE went down.

No one considers data plane liveness or even, as you observed, data plane 
encapsulation congruence. Clearly this is not a true OAM discussion.

On the other hand, PE might be disconnected from the service while the BFD 
session to RR is in the Up state.

Not likely, if you keep in mind that to trigger any remote action such a 
failure would have to happen to all RRs.
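
That rule can be sketched as follows. The sketch assumes each RR keeps its own BFD session to the PE and that a remote action is triggered only when every one of those sessions is down. The function name and the RR labels are hypothetical.

```python
from typing import Mapping


def pe_declared_down(session_up_by_rr: Mapping[str, bool]) -> bool:
    """Return True only if every RR reports its BFD session to the PE as down.

    An empty mapping (no RR is monitoring the PE) never declares the PE down.
    """
    return bool(session_up_by_rr) and not any(session_up_by_rr.values())


# One RR loses its own path to the PE: no action is triggered.
print(pe_declared_down({"rr1": False, "rr2": True}))
# Both RRs lose their sessions: the PE is treated as down.
print(pe_declared_down({"rr1": False, "rr2": False}))
```

The rule guards against a single RR being cut off from the PE. It does not address the opposite case raised above, where the RR sessions stay up while the PE is disconnected from the service.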

Thx a lot,
R.

_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr

-- 

 <http://www.verizon.com/> 

Gyan Mishra

Network Solutions Architect 

Email [email protected]

M 301 502-1347
