Hi Tony,

> I would definitely like to understand it. To my mind, link/node failures
> have at least an area-wide impact. The scope of a congestion event and
> failures are extremely similar to my mind.

It is 100% correct that link/node failures have an area-wide impact.
But I am not talking about impact; I am talking about the locality of
the trigger. I think you can agree that a link/node failure will be
detected only by the directly attached PLRs. My point is that
congestion, on the other hand, is caused by traffic flows which are not
sourced at P nodes in the middle of the network. They usually enter the
network at the edge (ingress) and go to the other side (egress).

> As discussed in the presentation and in the draft, when a prefix is
> activated, TTE shifts the backup paths to be in ECMP with the primary path.
> This shifts a portion of the traffic for the affected prefix onto the
> bypass path.

Ah, OK, so you call a backup path an ECMP path with the primary path (or
primary ECMP paths). I would rather call them bypass paths, but no issue
here.

> Yes, we are only discussing unicast. Congestion can be a local phenomenon.
> 101Gb of traffic funneled into a 100Gb link that drops the excess 1Gb of
> traffic will effectively ‘protect’ the downstream 100Gb links.

Do you plan to support PE-CE link congestion by using PE-CE protection
in the case of multihomed customer sites?

> Exactly. Since repair requires that you provision bandwidth for failures,
> their assumption is that bypass links are not completely congested. If all
> of your links are totally saturated, then FRR does not help at all and
> neither does TTE. That’s not a use case that’s interesting to address.

My point is not about capacity planning during network design. We are
already talking about the case where our provisioning assumptions no
longer hold. So if the primary link gets congested, there is zero
assurance that the backup links have room for the excess traffic at the
time of bypass activation. Also worth considering is the case where two
links that protect each other both get congested: when the bypass is
activated on both, the end result is still congestion on both links,
this time caused by symmetric bypass activation.

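To put hypothetical numbers on the symmetric case, here is a minimal sketch in Python. The function name, the 100 Gb link size, and the 5 Gb shift are all made up for illustration; nothing here is taken from the draft.

```python
def loads_after_bypass(load_a, load_b, shift_a, shift_b):
    """Loads (Gb) on links A and B after each shifts traffic onto the other.

    shift_a is the traffic A's node moves onto B; shift_b is the reverse.
    All values are hypothetical and only illustrate the arithmetic.
    """
    new_a = load_a - shift_a + shift_b
    new_b = load_b - shift_b + shift_a
    return new_a, new_b


# Both 100 Gb links carry 101 Gb and each node autonomously shifts 5 Gb.
print(loads_after_bypass(101, 101, 5, 5))   # (101, 101): both still congested
# Only A is congested and shifts 5 Gb onto a B that has headroom.
print(loads_after_bypass(101, 90, 5, 0))    # (96, 95): congestion relieved
```

In the first call the total demand (202 Gb) exceeds the combined capacity (200 Gb), so no amount of mutual shifting can help in this simple model.
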
And the point is that, to avoid this, activation or deactivation should
come from the controller rather than being an autonomous event executed
on each node.

> Please recall that there are two thresholds: high and low. Activating a
> prefix may shift some traffic to the bypass. Typically, we would expect
> that after a few prefixes are shifted, enough load would be shed so that
> utilization lies between the low and high thresholds.
>
> TTE is iterative and continuous: if flow selection does not alleviate
> congestion, more flows will be selected in the next iteration. Similarly,
> if flow selection overshoots, it will self-correct by deactivating prefixes
> until utilization lies between thresholds.

Prefix or flow? Are you just modifying the rewrite for one or more
prefixes, or applying additional flow ACLs to select what goes into the
bypass?

If you are doing this based only on the destination prefix, then I could
see how it could work for pure Internet transit where no encapsulation
is used in the network. But such networks (even for pure ISPs) are, in
the vast majority, history. For various reasons most networks use PE-PE
encapsulation of some sort.

Take MPLS (LDP or SR). The node will be receiving packets with label L;
the prefix on the packet plays no role in forwarding here. So you need
quite a deep ACL to look beyond the MPLS header (even without MNA) to
recognize the flow. Needless to say, you need pretty powerful local
sFlow capabilities to recognize those flows in the first place.

If not, then you are likely to shift not the excess 1 Gb of the 101 Gb
demand, but perhaps 30-60 Gb.

Kind regards,
Robert
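
As a postscript, the granularity concern can be sketched with a toy model of the two-threshold loop. Everything here is an assumption for illustration: the function name, the thresholds (0.98 and 0.60), the 50% split (one bypass in ECMP with one primary), and activating units in list order. It is not the selection logic from the draft.

```python
def shed_load(unit_loads, capacity, high, low, split=0.5, max_iters=50):
    """Activate/deactivate units until utilization lies within [low, high].

    unit_loads: Gb carried per forwarding unit (a dst prefix or an MPLS label).
    split: fraction of an activated unit's traffic moved to the bypass
           (0.5 assumes one bypass path in ECMP with one primary path).
    Returns (Gb left on the primary link, Gb shifted to the bypass).
    """
    active = []                              # activated units, in order
    inactive = list(range(len(unit_loads)))  # units not yet activated
    total = sum(unit_loads)
    for _ in range(max_iters):
        shifted = sum(unit_loads[i] * split for i in active)
        util = (total - shifted) / capacity
        if util > high and inactive:
            active.append(inactive.pop(0))       # shed more load
        elif util < low and active:
            inactive.insert(0, active.pop())     # overshoot: undo last one
        else:
            break
    shifted = sum(unit_loads[i] * split for i in active)
    return total - shifted, shifted


# Fine-grained units (fifty 2 Gb prefixes plus one 1 Gb prefix, 101 Gb total).
print(shed_load([2.0] * 50 + [1.0], 100, 0.98, 0.60))   # (98.0, 3.0)
# Coarse units (one label carrying 80 Gb, another carrying 21 Gb).
print(shed_load([80.0, 21.0], 100, 0.98, 0.60))         # (61.0, 40.0)
```

With fine-grained units the loop sheds about the excess; with one large label it moves 40 Gb to relieve a 1 Gb overload. In this model, a low threshold above 0.61 would make the coarse case alternate between activating and deactivating the 80 Gb unit.
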
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
