Hi Huaimo,

> > > 1) There is no concrete procedure/method for fault tolerance
> > > to multiple failures. When multiple failures happen and split the
> > > flooding topology, the convergence time will be increased
> > > significantly without fault tolerance. The longer the convergence
> > > time, the more the traffic lose.
> >
> > there is a solution for multiple failures - see section 6.7.11.
>
> Section 6.7.11 just briefly mentions that the edges of split parts will
> determine and repair the split after the split of the flooding topology
> happens. However, there is not any details or description on how to determine
> or repair the split. This is not useful for implementers.

I’m sorry that you don’t find it useful.

Determining the split is trivial: when you receive an IIH, it has the system ID of the other system in it. If that other system is not currently part of the flooding topology, then it is quite clear that it is disconnected from the flooding topology.

Repairing the split is done by enabling temporary flooding on the new link.

There is an issue here that we have not yet resolved, which is the rate at which new links should be temporarily added to the flooding topology. Some believe that adding any new link is the correct thing to do as it minimizes the recovery time. Others feel that enabling too many links could cause a flooding collapse, so link addition should be highly constrained. We are still discussing this and invite the WG’s opinions.

> > > 2) The extensions to Hello protocols for enabling “temporary
> > > flooding” over a new link is not needed.
> >
> > not if you do flooding on every link that comes up. If you want to be
> > smarter, then you need to
> > selectively enable flooding only under specific conditions and that must be
> > done from both sides of
> > the new link.
>
> There are only a limited number of conditions (or cases). In each
> condition/case, it is deterministic whether we need to enable “temporary
> flooding” for a new link when it is up. Thus there is no need for any
> extensions to Hello protocols for enabling “temporary flooding” on a new link.

We know of only two cases: (1) the neighbor is not part of the flooding topology and we feel that we can add more temporary flooding. (2) The neighbor is not part of the flooding topology and we cannot add more temporary flooding. Obviously, in the case where we want to add temporary flooding, that TLV is needed in the IIH.

> For example, suppose that we have a current flooding topology containing all
> live nodes in an area, when a new link comes up, we may just have two
> conditions/cases.
> One condition/case is that the new link is attached to a
> new node not on the current flooding topology. In this condition/case, the
> new link needs to be enabled for “temporary flooding” after it is up.

Agreed, which is why we need the TLV.

> The other condition/case is that the new link is attached to nodes on the
> current flooding topology. In this condition/case, there is no need to enable
> “temporary flooding” on the link.

Agreed. Note that there are some additional corner cases. Since the two neighbors may not have the exact same information, one may consider the other to be on the flooding topology when in fact it is not. This might happen in the case of a node reboot. The IIH TLV gives us an explicit way of signaling, rather than simply guessing and sometimes getting it wrong.

> > > 3) The extensions to Hello protocols for requesting/signaling
> > > “temporary flooding” for a connection does not work.
> >
> > sorry, but if you see a problem, please provide details, saying above is
> > simply unproductive.
>
> “The nodes … will try to repair the flooding topology locally by enabling
> temporary flooding towards the nodes that they consider disconnected from the
> flooding topology ...”
>
> The above quoted text is from draft-li-lsr-dynamic-flooding-02, where
> “enabling temporary flooding towards the nodes” is to request/signal
> “temporary flooding” for a connection to connect partitioned/disconnected
> flooding topology into one through the extensions to Hello protocols
> described in draft-li-lsr-dynamic-flooding-02. Right?
>
> The extensions to Hello protocols for requesting/signaling “temporary
> flooding” for a connection to connect partitioned/disconnected flooding
> topology into one does not work since the connection may have two or more
> hops and a Hello packet may get lost.

All adjacencies are a single hop in both IS-IS and OSPF.

Yes, Hello packets may be lost.
Fortunately, they are periodically transmitted, so the next transmission will also contain the TLV. If IIHs are getting lost at a significant rate, then the adjacency will not (and should not) come up. Thus, the request for temporary flooding will propagate to the neighbor in all cases that matter.

> It is not convenient for a user/operator to configure on an area leader since
> the leader is dynamically selected. How do you address this?

No configuration is required. The election algorithm selects the area leader. The rules are in the draft. An implementation may have a default priority and a default algorithm setting, so no configuration is mandatory. If the operator desires a specific node to become area leader, then configuration may be required to adjust the priority.

FWIW, we have this already working in our implementation. It Just Works.

> After the user/operator does some configurations on the (designated) leader,
> will the backup leader takes over the configurations after the designated
> leader is down?

There is no need for a backup leader. If the area leader is partitioned from the topology, then leader election is repeated, resulting in a new leader. Again, no configuration is required.

Tony
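
For readers who want the temporary-flooding decision discussed in this message in concrete form, here is a minimal Python sketch. It is not taken from the draft or from any implementation. The class `Node`, the method `on_iih`, the flag `neighbor_requests_flooding`, and the limit `max_temporary_links` are all hypothetical names, and the fixed limit merely stands in for the still-open question of how fast temporary links should be added. Treating an explicit request from the neighbor as proof that it is disconnected is also an assumption, meant to illustrate the reboot corner case.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical local state; none of these names come from the draft."""

    # System IDs that this node believes are on the flooding topology.
    flooding_topology: set[str]
    # Assumed local cap on temporary links (the thread leaves the rate open).
    max_temporary_links: int = 2
    # Neighbors on which temporary flooding is currently enabled.
    temporary_links: set[str] = field(default_factory=set)

    def on_iih(self, neighbor_id: str,
               neighbor_requests_flooding: bool = False) -> bool:
        """Return True if our next IIH should request temporary flooding."""
        if neighbor_id in self.temporary_links:
            # Already enabled; IIHs are periodic, so keep signaling.
            return True

        # The neighbor counts as disconnected if we do not see it on the
        # flooding topology, or (assumption) if it explicitly asks, which
        # covers the case where our view is stale, e.g. after its reboot.
        disconnected = (neighbor_id not in self.flooding_topology
                        or neighbor_requests_flooding)
        if not disconnected:
            return False  # Neighbor is on the topology: nothing to repair.

        if len(self.temporary_links) >= self.max_temporary_links:
            return False  # Case (2): we cannot add more temporary flooding.

        self.temporary_links.add(neighbor_id)
        return True  # Case (1): enable temporary flooding on this link.
```

With a limit of one, a node that knows only A and B on the flooding topology would enable temporary flooding toward a new neighbor C and decline it toward a later neighbor D until the limit allows more.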
