Re: [Lsr] Temporary addition of links to flooding topology in dynamic flooding

2019-03-11 Thread Jeff Tantsura
+1 Les.

In general, LFA is meaningless in ECMP cases (any ECMP member is loop-free by 
definition), so the commonly used technique is fast-rehash: in case of 
failure, all the flows that would use the link in question are rehashed over 
the other links in the bundle, and that is done in HW.
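To make the fast-rehash idea concrete, here is a minimal sketch in Python (an illustrative model only; the hash function, flow tuples, and link names are assumptions, and real implementations do this in hardware tables):

```python
# Illustrative model of ECMP fast-rehash over a link bundle (not a real
# hardware pipeline; hash function and flow 5-tuples are assumptions).
import hashlib

def pick_link(flow, links):
    """Deterministically hash a flow onto one active bundle member."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]

def fast_rehash(flows, links, failed):
    """Only flows that used the failed link are rehashed over survivors."""
    survivors = [l for l in links if l != failed]
    placement = {}
    for f in flows:
        old = pick_link(f, links)
        # Flows not on the failed member keep their link; that is what
        # makes the repair cheap and local.
        placement[f] = pick_link(f, survivors) if old == failed else old
    return placement
```

Flows that did not use the failed member keep their placement, which is why no loop-free computation is needed for the bundle case.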

Regards,
Jeff

> On Mar 11, 2019, at 21:28, Les Ginsberg (ginsberg)  wrote:
> 
> Robert –
>  
> I don’t think the word “random” is applicable.
>  
> Section 6.7.11 states (emphasis added):
>  
> “In the unlikely event of multiple failures on the flooding topology,
>it may become partitioned.  The nodes that remain active on the edges
>of the flooding topology partitions will recognize this and will try
>to repair the flooding topology locally by enabling temporary
>flooding towards the nodes that they consider disconnected from the
>flooding topology until a new flooding topology becomes connected
>again.”
>  
> This isn’t a case of every node in the network trying to decide how to repair 
> the partition. It is only the nodes at the edge(s) of the partition doing so. 
> I do not see this as “random”.
>  
> What is being debated on the list is not related to randomness – it is the 
> degree of temporary flooding along the continuum from “minimal” (one 
> additional edge) to “maximal” (all edges to nodes which are seen as currently 
> disconnected). The former risks longer convergence – the latter risks 
> temporary flooding storms. But neither approach is random. Once the failures 
> are known, the set of candidates is predictable.
>  
> The concept of LFA also isn’t applicable here.  LFA (if we use the term in 
> this case to mean a precalculated set of temporary flooding edges) is useful 
> when it can be preinstalled in the forwarding plane, allowing a node to 
> eliminate waiting for control plane intervention when a local failure is 
> detected.
> But LSP/LSA flooding is always done by the control plane – so having a 
> precalculated LFA wouldn’t produce a faster response time. If you are going 
> to suggest that the calculation required to determine a flooding topology 
> partition is itself costly I think this is not supported by current SPF 
> calculation times. In addition, given temporary flooding is normally only 
> required in the event of multiple failures, the combinations required to be 
> supported in order to have a useful set of pre-calculated temporary flooding 
> edges becomes quite large – which makes such an approach impractical.
>  
>Les
>  
>  
> From: Lsr  On Behalf Of Robert Raszuk
> Sent: Monday, March 11, 2019 2:28 PM
> To: lsr@ietf.org
> Subject: [Lsr] Temporary addition of links to flooding topology in dynamic 
> flooding
>  
> Hi,
>  
> As of now, in the event of a failure of any FT-enabled link, additional 
> links are added in a more or less random fashion by the nodes directly 
> connected to the failed links. 
>  
> With hundreds of links on such nodes, and the advisable rate limiting of 
> such link additions, it seems that repair of the FT may take some time. 
>  
> In order to reduce such a time interval, a better-than-random addition of 
> remaining links seems advisable. How about we hint participating nodes to 
> execute, purely in the control plane of the FT, an LFA algorithm for the 
> possible future event of an active link failure, and use the results of the 
> LFA computation to prioritize the links which will be the first temporary 
> additions upon active flooding link failures? 
>  
> Such optimization is local and optional and does not require any changes to 
> proposed protocol signalling. 
>  
> Therefore, how about just a one-sentence addition to section 6.7.1 of 
> draft-ietf-lsr-dynamic-flooding:
>  
> Temporary additions of links to the flooding topology could be better 
> informed if a given node runs a pure control-plane LFA ahead of any FT 
> failure on active FT links, completely detached from potential LFA runs for 
> the data-plane topology. 
>  
> Kind regards,
> R.
>  
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Temporary addition of links to flooding topology in dynamic flooding

2019-03-11 Thread Les Ginsberg (ginsberg)
Robert –

I don’t think the word “random” is applicable.

Section 6.7.11 states (emphasis added):

“In the unlikely event of multiple failures on the flooding topology,
   it may become partitioned.  The nodes that remain active on the edges
   of the flooding topology partitions will recognize this and will try
   to repair the flooding topology locally by enabling temporary
   flooding towards the nodes that they consider disconnected from the
   flooding topology until a new flooding topology becomes connected
   again.”

This isn’t a case of every node in the network trying to decide how to repair 
the partition. It is only the nodes at the edge(s) of the partition doing so. I 
do not see this as “random”.

What is being debated on the list is not related to randomness – it is the 
degree of temporary flooding along the continuum from “minimal” (one additional 
edge) to “maximal” (all edges to nodes which are seen as currently 
disconnected). The former risks longer convergence – the latter risks temporary 
flooding storms. But neither approach is random. Once the failures are known, 
the set of candidates is predictable.
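The minimal/maximal continuum is easy to sketch (illustrative only; the data structures and the deterministic ordering are assumptions, not text from the draft, but the ordering shows how the candidate set is predictable once the failures are known):

```python
# Sketch of the "minimal" vs "maximal" temporary-flooding choices.
# local_links: neighbor -> local interface (assumed representation).
# disconnected: neighbors believed cut off from the flooding topology.

def candidate_edges(local_links, disconnected):
    """Local edges toward neighbors believed cut off from the FT."""
    return [(nbr, link) for nbr, link in sorted(local_links.items())
            if nbr in disconnected]

def minimal_repair(local_links, disconnected):
    """One additional edge: least extra flooding, slower convergence."""
    return candidate_edges(local_links, disconnected)[:1]

def maximal_repair(local_links, disconnected):
    """All edges toward disconnected nodes: fast, risks flooding storms."""
    return candidate_edges(local_links, disconnected)
```

Any policy in between simply takes a longer prefix of the same deterministic candidate list.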

The concept of LFA also isn’t applicable here.  LFA (if we use the term in this 
case to mean a precalculated set of temporary flooding edges) is useful when it 
can be preinstalled in the forwarding plane, allowing a node to eliminate 
waiting for control plane intervention when a local failure is detected.
But LSP/LSA flooding is always done by the control plane – so having a 
precalculated LFA wouldn’t produce a faster response time. If you are going to 
suggest that the calculation required to determine a flooding topology 
partition is itself costly I think this is not supported by current SPF 
calculation times. In addition, given temporary flooding is normally only 
required in the event of multiple failures, the combinations required to be 
supported in order to have a useful set of pre-calculated temporary flooding 
edges becomes quite large – which makes such an approach impractical.
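A quick back-of-the-envelope supports the combinatorial point (the FT edge count of 200 is a hypothetical figure, not taken from the draft):

```python
# Back-of-the-envelope: scenarios a node would have to precompute to hold
# useful pre-calculated temporary flooding edges for multi-failure cases.
from math import comb

ft_edges = 200                         # hypothetical flooding-topology size
double_failures = comb(ft_edges, 2)    # 19,900 scenarios
triple_failures = comb(ft_edges, 3)    # 1,313,400 scenarios
```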

   Les


From: Lsr  On Behalf Of Robert Raszuk
Sent: Monday, March 11, 2019 2:28 PM
To: lsr@ietf.org
Subject: [Lsr] Temporary addition of links to flooding topology in dynamic 
flooding

Hi,

As of now, in the event of a failure of any FT-enabled link, additional links 
are added in a more or less random fashion by the nodes directly connected to 
the failed links.

With hundreds of links on such nodes, and the advisable rate limiting of such 
link additions, it seems that repair of the FT may take some time.

In order to reduce such a time interval, a better-than-random addition of 
remaining links seems advisable. How about we hint participating nodes to 
execute, purely in the control plane of the FT, an LFA algorithm for the 
possible future event of an active link failure, and use the results of the 
LFA computation to prioritize the links which will be the first temporary 
additions upon active flooding link failures?

Such optimization is local and optional and does not require any changes to 
proposed protocol signalling.

Therefore, how about just a one-sentence addition to section 6.7.1 of 
draft-ietf-lsr-dynamic-flooding:

Temporary additions of links to the flooding topology could be better informed 
if a given node runs a pure control-plane LFA ahead of any FT failure on 
active FT links, completely detached from potential LFA runs for the 
data-plane topology.

Kind regards,
R.

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] Multiple failures in Dynamic Flooding

2019-03-11 Thread tony . li

Hi Huaimo,



> In summary for multiple failures, two issues below in 
> draft-li-lsr-dynamic-flooding are discussed:
> 1)  how to determine the current flooding topology is split; and
> 2)  how to repair/connect the flooding topology split.
> For the first issue, the discussions are still going on.
> For the second issue, repairing/connecting the flooding topology split 
> through Hello protocol extensions does not work.  When a “backup 
> path”/connection of multiple hops is needed to connect/repair the flooding 
> topology split, Hello cannot go beyond one hop, and thus cannot repair the 
> flooding topology split in this case.


You do not try to repair things remotely, they are always repaired locally.  If 
there are multiple failures in the flooding topology and it is partitioned, 
then it follows that there are multiple remaining connected components of the 
flooding topology.  Nodes that are adjacent to the failures will update their 
LSPs and flood them throughout their connected component.  Each component will 
see at least two link failures if there is a partition of the FT and each node 
in the component can detect that the FT has partitioned.  Each node is then 
capable of enabling temporary flooding on one or more links that will traverse 
the partition, thereby restoring a functioning FT.  The Area Leader then 
recomputes and redistributes the revised FT.

To put it yet another way, repair is fully distributed.  You should like that.  
:-)
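The local detection step can be sketched as follows (an illustrative model; the graph encoding and function names are assumptions, not draft text): prune the failed links from the flooding topology and check whether this node's connected component still reaches every FT node.

```python
# Sketch of local FT-partition detection.
from collections import deque

def component(edges, start):
    """Connected component containing `start` (BFS over undirected edges)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen

def ft_partitioned(ft_edges, failed, all_nodes, self_id):
    """True if the FT, minus the failed links, no longer connects all nodes."""
    live = [e for e in ft_edges
            if e not in failed and (e[1], e[0]) not in failed]
    return component(live, self_id) != set(all_nodes)
```

Each node runs this purely on its local copy of the FT and the link changes it has learned, so no new signalling is needed to detect the partition.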


> >We are not requiring it, but a system could also do a more extensive 
> >computation and compare the links between itself and the neighbor
> >by tracing the path in the FT and then confirming that each link is up in 
> >the LSDB.
>  
> It normally takes a long time such as more than ten minutes to age out and 
> remove an LSP/LSA for the neighbor from the LSDB even though the neighbor is 
> disconnected physically.
> How can you decide quickly in tens of milliseconds that the flooding topology 
> is disconnected?


You do not wait for LSP/LSA removal.  You look for link changes in the LSPs 
that you do get, or local link changes.


> >As we have discussed, this is not a solution. In fact, this is more 
> >dangerous than anything else that has been proposed and
> >seems highly likely to trigger a cascade failure. You are enabling full 
> >flooding for many nodes.  In dense topologies, even
> >a radius of 3 is very high.  For example, in a LS topology, a radius of 3 is 
> >sufficient to enable full flooding throughout the
> >entire topology. If that were stable, we would not need Dynamic Flooding at 
> >all.
>  
> This full flooding is enabled only for a very short time.


All it takes is enabling it at sufficient density to create a cascade failure.  
Milliseconds are sufficient for a collapse.


> How do you get that this is more dangerous than anything else and seems 
> highly likely to trigger a cascade failure? Can you give some explanations in 
> details?


Again, we do not have absolute metrics on what triggers a cascade failure 
today.  We have several data points of several different implementations at 
different points in time.  We know that in the early ‘90s, a full mesh of 20 
neighbors running L1L2 was sufficient.  Obviously things have changed somewhat, 
but even more modern implementations have had problems.  This is why the MSDC 
went to BGP.

As a result, we need to be very conservative about what flooding we temporarily 
enable.  We do not want to walk anywhere near the cliff, as the cascade failure 
is fatal to the network.

Tony




Re: [Lsr] Multiple failures in Dynamic Flooding

2019-03-11 Thread Peter Psenak

Hi Huaimo,

On 11/03/2019 18:08, Huaimo Chen wrote:

> Hi Tony,
>
> In summary for multiple failures, two issues below in
> draft-li-lsr-dynamic-flooding are discussed:
>
> 1)  how to determine the current flooding topology is split; and

there is no need to do that. The recovery mechanism will repair the
split topology if there is a way to do that.

> 2)  how to repair/connect the flooding topology split.

6.7.11.  Recovery from Multiple Failures

   "The nodes that remain active on the edges
   of the flooding topology partitions will recognize this and will try
   to repair the flooding topology locally by enabling temporary
   flooding towards the nodes that they consider disconnected from the
   flooding topology until a new flooding topology becomes connected
   again."

> For the first issue, the discussions are still going on.
>
> For the second issue, repairing/connecting the flooding topology split
> through Hello protocol extensions does not work.  When a “backup
> path”/connection of multiple hops is needed to connect/repair the
> flooding topology split, Hello cannot go beyond one hop, and thus
> cannot repair the flooding topology split in this case.

there is no need to send anything multi-hop.

thanks,
Peter






> From: Tony Li [mailto:tony1ath...@gmail.com] On Behalf Of tony...@tony.li
> Sent: Wednesday, March 6, 2019 10:45 AM
> To: Huaimo Chen 
> Cc: Christian Hopps ; lsr@ietf.org; lsr-cha...@ietf.org; lsr-...@ietf.org
> Subject: Multiple failures in Dynamic Flooding
>
> > Hi Huaimo,
> >
> > I’m sorry that you don’t find it useful. Determining the split is
> > trivial: when you receive an IIH, it has the system ID of the other
> > system in it. If that other system is not currently part of the
> > flooding topology, then it is quite clear that it is disconnected
> > from the flooding topology. Repairing the split is done by enabling
> > temporary flooding on the new link.
>
> When an adjacency between two nodes is up, the Hello packets exchanged
> between them will not change the node/system IDs in them. How do you
> determine that the other system is not currently part of the flooding
> topology?
>
> > The IIH includes the system ID.  See ISO 10589 v2, section 9.7,
> > field “source Id”.  The local system will have a copy of the
> > flooding topology and can easily see if the neighbor was present as
> > of the last FT computation.  If not, then it should be added (modulo
> > rate limiting).  The local system can also examine its own LSDB.  If
> > there is no LSP for the neighbor, then it would seem highly likely
> > that there is a disconnect and the neighbor should again be added
> > (modulo rate limiting).
> >
> > We are not requiring it, but a system could also do a more extensive
> > computation and compare the links between itself and the neighbor by
> > tracing the path in the FT and then confirming that each link is up
> > in the LSDB.
>
> It normally takes a long time, such as more than ten minutes, to age
> out and remove an LSP/LSA for the neighbor from the LSDB even though
> the neighbor is disconnected physically.
>
> How can you decide quickly, in tens of milliseconds, that the flooding
> topology is disconnected?
>
> > There is an issue here that we have not yet resolved, which is the
> > rate at which new links should be temporarily added to the flooding
> > topology.  Some believe that adding any new link is the correct
> > thing to do as it minimizes the recovery time.  Others feel that
> > enabling too many links could cause a flooding collapse, so link
> > addition should be highly constrained.  We are still discussing this
> > and invite the WG’s opinions.





> The issue is resolved by the solutions in
> draft-cc-lsr-flooding-reduction.  One solution is below, where the
> given distance can be adjusted/configured.  If we want every node to
> flood on all its links, we set the given distance to a big number.  If
> we want the nodes within 2 hops of a failure to flood on all their
> links, we set the given distance to 2.


>    “In one way, when two or more failures on the current flooding
>    topology occur almost in the same time, each of the nodes within a
>    given distance (such as 3 hops) to a failure point, floods the link
>    state (LS) that it receives to all the links (except for the one
>    from which the LS is received) until a new flooding topology is
>    built.”
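The quoted distance-based rule can be sketched as follows (illustrative only; the adjacency encoding and parameter names are assumptions, not text from draft-cc-lsr-flooding-reduction):

```python
# Sketch of the distance-based rule: a node floods on all its links
# while it is within `k` hops of a failure point.
from collections import deque

def hops_from(adj, sources):
    """BFS distance from the nearest failure point to every node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def full_flood_nodes(adj, failure_points, k):
    """Nodes that would temporarily flood on all their links."""
    dist = hops_from(adj, failure_points)
    return {n for n, d in dist.items() if d <= k}
```

Note how quickly this set grows with `k` in a dense graph, which is the crux of the disagreement that follows.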






> > As we have discussed, this is not a solution.  In fact, this is
> > more dangerous than anything else that has been proposed and seems
> > highly likely to trigger a cascade failure.  You are enabling full
> > flooding for many nodes.  In dense topologies, even a radius of 3
> > is very high.  For example, in a LS topology, a radius of 3 is
> > sufficient to enable full flooding throughout the entire topology.
> > If that were stable, we would not need Dynamic Flooding at all.
>
> This full flooding is enabled only for a very short time.
>
> How do you get that this is more dangerous than anything else and
> seems highly likely to trigger a cascade failure?  Can you give some
> explanations in 

Re: [Lsr] draft-ketant-idr-bgp-ls-bgp-only-fabric

2019-03-11 Thread Ketan Talaulikar (ketant)
Hi Robert,

Please check inline below.

From: Robert Raszuk 
Sent: 10 March 2019 21:40
To: Ketan Talaulikar (ketant) 
Cc: idr@ietf.org; lsr@ietf.org
Subject: draft-ketant-idr-bgp-ls-bgp-only-fabric

Hi Ketan,

I have read your proposal of defining topology flooding in BGP with interest.

It seems like a pretty brilliant twist to take pieces defined in other 
documents, originally intended for sending IGP information (LSDB or TED) over 
a BGP extension, and now use all of those without an IGP at all :).
[KT] RFC7752 always had the “direct” and “static” protocols – so it was not 
just IGPs. We extended it to include the BGP protocol, describing the peering 
topology for the EPE use-case, with draft-ietf-idr-bgpls-segment-routing-epe

But I have just one question here perhaps to the WG or ADs.

Almost all of the normative references used in this draft clearly state that 
they were defined for carrying information present in the ISIS and OSPF 
protocols, essentially as a courtesy of transporting it over TCP in a BGP 
envelope between the network and a controller.

Can we now just "reuse" verbatim all of those defined codepoints, as well as 
redefine the use of the BGP-LS SAFI as a new link-state p2p network topology 
transport, just like that?
[KT] The BGP-LS SAFI is still used to report link-state information. This draft 
describes how that can be done even when no IGP is running and we are instead 
running the BGP protocol (in the RFC7938 hop-by-hop routing design). Each 
router here sets up a BGP-LS session with a controller or a centralized BGP 
speaker to report the router’s own node properties, links, prefixes, etc. The 
objective is to build a link-state topology at the controller for specific 
use-cases like topology discovery and TE with SR, as described in Sec 6 of the 
draft.

At minimum, I would like to see some analysis included in this draft comparing 
running a native link-state protocol (possibly with the dynamic flooding 
optimization) against running BGP as the only routing protocol with BGP as the 
topology discovery transport, before we proceed further on this document.
[KT] I am not sure I see the contrast here. Personally, I support the dynamic 
flooding optimization work in LSR. There are already DC networks deployed with 
the RFC7938 design. All this draft introduces is topology discovery and the 
other use-cases on top of that design.

Thanks,
Ketan

Kind regards,
Robert.

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr