Re: [Lsr] BFD aspects

Greg Mirsky Mon, 29 Nov 2021 19:09:23 -0800

Hi Gyan,
thank you for pointing out the confusion I may have caused by not being
clear about which multi-hop BFD option I propose to advance and support
with auto-configuration. It is the PE-PE option, not the PE-RR option (in
the terms of Robert's original email). I see multi-hop BFD between PEs as
the natural solution to the problem of achieving faster service (overlay)
convergence.
What do you think?


Regards,
Greg

On Mon, Nov 29, 2021 at 6:57 PM Gyan Mishra <[email protected]> wrote:

> Hi Greg
>
> Is there any advantage for operators in running multi-hop BFD for the PE-RR
> BGP client sessions versus single-hop BFD for IGP P2P clients, which is
> what is typically done? Also, are there any issues with running both
> simultaneously, or any benefit in doing so?
>
> Many Thanks
>
> Gyan
>
> On Mon, Nov 29, 2021 at 9:47 PM Greg Mirsky <[email protected]> wrote:
>
>> Hi Aijun,
>> thank you for confirming that it is not the conclusion one can arrive at
>> based on my discussion with Robert. Secondly, I wouldn't characterize the
>> problem you describe as a scaling issue with using multi-hop BFD to
>> monitor path continuity in the underlay network. In my opinion, it is an
>> operational overhead that can be addressed by an intelligent management
>> plane or a few extensions in the control plane that sets up the overlay.
>> Since the management plane is usually a proprietary solution, I invite
>> anyone interested to work on BFD auto-configuration extensions in the
>> control plane. I would much appreciate references to the use cases that
>> can benefit from such extensions.
>>
>> Regards,
>> Greg
>>
>> On Mon, Nov 29, 2021 at 6:26 PM Aijun Wang <[email protected]>
>> wrote:
>>
>>> Hi, Greg:
>>>
>>>
>>>
>>> Firstly, regardless of which method is used for the multihop BFD
>>> approach, there is certainly configuration overhead if you imagine there
>>> are 10,000 PEs, as Tony has often raised as an example.
>>>
>>> Wouldn’t you have to configure each pair of them to detect the PE-PE
>>> connection?
>>>
>>> It is obviously not scalable.
>>>
>>>
>>>
>>>
>>>
>>> Best Regards
>>>
>>>
>>>
>>> Aijun Wang
>>>
>>> China Telecom
>>>
>>>
>>>
>>> *From:* Greg Mirsky <[email protected]>
>>> *Sent:* Tuesday, November 30, 2021 10:18 AM
>>> *To:* Aijun Wang <[email protected]>
>>> *Cc:* Gyan Mishra <[email protected]>; Robert Raszuk <
>>> [email protected]>; lsr <[email protected]>
>>> *Subject:* Re: [Lsr] BFD aspects
>>>
>>>
>>>
>>> Hi Aijun,
>>>
>>> could you please elaborate on how you see this discussion leading to
>>> the "BFD based detection for the mentioned problem is not [...]
>>> scalable (among PEs)" conclusion? I hope that nothing I've said or
>>> suggested led you to this conclusion. Personally, I believe that BFD-based
>>> PE-PE detection is the best technical solution. I understand that an
>>> operator may be dissatisfied with the additional configuration of the BFD
>>> session. As noted, I believe that can be addressed in the management plane
>>> or by minor extensions in the control plane (BGP or not). If a particular
>>> implementation (or a combination of the implementation and HW) has a
>>> scaling challenge with multi-hop BFD, then that may not be sufficient
>>> technical justification for a somewhat controversial proposal.
>>>
>>>
>>>
>>> Regards,
>>>
>>> Greg
>>>
>>>
>>>
>>> On Mon, Nov 29, 2021 at 5:17 PM Aijun Wang <[email protected]>
>>> wrote:
>>>
>>> From the discussion, I think we can conclude that BFD based
>>> detection for the mentioned problem is not reliable (between PE/RR) and
>>> not scalable (among PEs).
>>>
>>> The same holds for the BGP based solution.
>>>
>>>
>>>
>>> So let’s focus on how to implement it within the IGP. Thanks to Greg for
>>> his analysis.
>>>
>>> One addition to Robert’s comments: the RR is not always located
>>> within the same area as the PEs, so it can’t learn immediately that a PE
>>> node is down when summarization is configured between areas.
>>>
>>>
>>>
>>> Best Regards
>>>
>>>
>>>
>>> Aijun Wang
>>>
>>> China Telecom
>>>
>>>
>>>
>>> *From:* [email protected] <[email protected]> *On Behalf Of *Gyan
>>> Mishra
>>> *Sent:* Tuesday, November 30, 2021 8:44 AM
>>> *To:* Robert Raszuk <[email protected]>
>>> *Cc:* Greg Mirsky <[email protected]>; lsr <[email protected]>
>>> *Subject:* Re: [Lsr] BFD aspects
>>>
>>>
>>>
>>>
>>>
>>> Robert
>>>
>>>
>>>
>>> On Mon, Nov 29, 2021 at 7:35 PM Robert Raszuk <[email protected]> wrote:
>>>
>>> Hi Greg,
>>>
>>>
>>>
>>> If BFD had autodiscovery built in, that would indeed be the
>>> ultimate solution. Of course folks will worry about scaling and the number
>>> of BFD sessions to be run PE-PE.
>>>
>>> GIM>> I sense that it is not "BFD autodiscovery" but an advertisement of
>>> BFD multi-hop system readiness to the particular PE. That, as I think of
>>> it, can be done in a control or management plane.
>>>
>>>
>>>
>>> Agreed.
>>>
>>>
>>>
>>> But if BFD between all PEs is an option, why would RR to PE in the local
>>> area not be a viable solution?
>>>
>>>
>>>
>>> GIM>>Because, in the case of PE-PE, BFD control packets will be
>>> fate-sharing with data packets. But the path between RR and PE might not be
>>> used for carrying data packets at all.
>>>
>>>
>>>
>>> 100%. But that was accounted for, the reason being that you have at least
>>> two RRs in an area. The point of BFD was to detect that a PE went down.
>>>
>>>
>>>
>>> Gyan> What Greg is alluding to is a very good point to consider: in many
>>> operator networks the RR sits in the “control plane” path, which is
>>> separate from the data plane path. So the RR has no knowledge of the E2E
>>> forwarding plane path between the PEs, as it sits outside that path. The
>>> PE-RR path is therefore disjoint from the PE-PE path, so from the RR's
>>> POV the PE may wrongly appear to be up or down, hence the false positive
>>> or negative. That would be the case regardless of how many RRs are
>>> deployed.
>>>
>>>
>>>
>>> You are absolutely right that it may report RR disconnect from the
>>> network while PE is up and data plane from remote PEs can reach it. That is
>>> why we have more than one RR.
>>>
>>>
>>>
>>> As far as PE-PE BFD fate sharing with real user data - I think it is not
>>> always the case. But this is a completely separate discussion :)
>>>
>>>
>>>
>>> Also please keep in mind that PE going down can be learned by RRs by
>>> listening to the IGP. No BFD needed.
>>>
>>>
>>>
>>> Both would be multihop, both would be subject to all transit failures
>>> etc ...
>>>
>>> GIM>> I think that there's a difference in the impact a path failure
>>> has on the data traffic. Monitoring the PE-PE path in the underlay, using
>>> the same encapsulation as data traffic, is representative of the data
>>> experience. A failure of the PE-RR path, in my understanding, may not be
>>> representative at all. A BFD session between RR and PE may fail while the
>>> PE is absolutely functional from the service PoV.
>>>
>>>
>>>
>>> Please keep in mind that this entire discussion is not about data plane
>>> failure end to end :)  Yes, it's pretty sad. This entire debate is about
>>> indicating domain-wide that the IGP component on a PE went down.
>>>
>>>
>>>
>>> No one considers data plane liveness or even, as you observed, data plane
>>> encapsulation congruence. Clearly this is not a true OAM discussion.
>>>
>>>
>>>
>>> On the other hand, PE might be disconnected from the service while the
>>> BFD session to RR is in the Up state.
>>>
>>>
>>>
>>> Not likely if you keep in mind that to trigger any remote action such a
>>> failure would have to happen to all RRs.
>>>
>>>
>>>
>>> Thx a lot,
>>> R.
>>>
>>>
>>>
>>> _______________________________________________
>>> Lsr mailing list
>>> [email protected]
>>> https://www.ietf.org/mailman/listinfo/lsr
>>>
>>> --
>>>
>>> <http://www.verizon.com/>
>>>
>>> *Gyan Mishra*
>>>
>>> *Network Solutions Architect *
>>>
>>> *Email [email protected] <[email protected]>*
>>>
>>> *M 301 502-1347*
>>>
>>>
>>>
>>> _______________________________________________
>>> Lsr mailing list
>>> [email protected]
>>> https://www.ietf.org/mailman/listinfo/lsr
>>>
>> --
>
> <http://www.verizon.com/>
>
> *Gyan Mishra*
>
> *Network Solutions Architect*
>
> *Email [email protected] <[email protected]>*
>
>
>
> *M 301 502-1347*
>
>

_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
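
The scaling question raised in the thread (a BFD session for each pair
among 10,000 PEs) can be put in numbers. The following is only a rough
sketch, assuming every PE pair runs one multi-hop BFD session and each
session is configured on both ends. The function name and the 300 ms
transmit interval are hypothetical and chosen for illustration; the packet
rate is nominal and ignores BFD transmit jitter.

```python
def full_mesh_bfd(num_pes: int, tx_interval_ms: float = 300.0) -> dict:
    """Counts for a full mesh of PE-PE multi-hop BFD sessions.

    Assumptions (not from the thread): one session per PE pair, one
    configuration entry per session end, and a hypothetical transmit
    interval of tx_interval_ms on every session.
    """
    if num_pes < 2:
        raise ValueError("need at least two PEs to form a session")
    if tx_interval_ms <= 0:
        raise ValueError("transmit interval must be positive")

    # Each PE peers with every other PE.
    sessions_per_pe = num_pes - 1
    # Each session is shared by two PEs, so halve the per-PE total.
    total_sessions = num_pes * sessions_per_pe // 2
    # Manual configuration touches both ends of every session.
    config_entries = num_pes * sessions_per_pe
    # Nominal control packets sent per second by a single PE.
    tx_pps_per_pe = sessions_per_pe * 1000.0 / tx_interval_ms

    return {
        "sessions_per_pe": sessions_per_pe,
        "total_sessions": total_sessions,
        "config_entries": config_entries,
        "tx_pps_per_pe": tx_pps_per_pe,
    }


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, full_mesh_bfd(n))
```

For 10,000 PEs this gives 9,999 sessions per PE and 49,995,000 sessions in
total. Whether that is a scaling limit or a configuration overhead that
auto-configuration can remove is exactly the point debated above.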
