Hi Robert,

My responses are marked with <cb>.
The former is a local event and link/node failures are isolated events. Congestion is not. If congestion happens on a core link of the node it is very likely it happened on many nodes at the same time which were unfortunate to sit on the path of subject flows. Because of this observation the network wide effect of the former cannot be compared 1:1 with effect of the latter.

<cb> Can you point us to the source of this assumption? The solution we propose would react to multiple congestion threshold crossings across a network. We did not describe such a scenario in the draft nor its implications. Maybe it would be good to add a discussion on the topic. Is that your suggestion?

That is actually one question I forgot to ask .. when or based on what network event do you "deactivate" TTE ?

<cb> It’s described in the draft (or should be). In summary, when the cumulative byte rate across the newly formed ECMP falls below the low threshold for some consecutive number of traffic samples, TTE-activated prefixes are deactivated according to the prefix selection technique.

It was to say that observation of "congestion" should consider configured QoS queues not bits sent out the interface

<cb> We can add that point to the draft. The solution does not prevent such a scenario & it has been discussed among the authors/co-authors already.

--Colby

From: Robert Raszuk <[email protected]>
Date: Friday, March 31, 2023 at 6:58 PM
To: Tony Li <[email protected]>
Cc: Colby Barth <[email protected]>, RTGWG <[email protected]>
Subject: Re: TTE

Hi Tony,

> - Bypass paths may be already saturated with traffic causing even further
> traffic oscillations

This is true, but then that implies that the network is not prepared to handle link failure of the protected link. If the network is under-engineered to begin with, this feature will not magically improve things. Capacity is a zero-sum game and this feature assumes that there is adequate capacity.
Not really. First let's observe that most networks are engineered to handle a single failure of a node or a link. In the vast majority of cases they are not engineered to handle multiple simultaneous failures properly. Of course it also depends on the locality of the multiple failures.

But there is a much more important point to be stated with respect to handling FRR as a result of link or remote node failure vs triggering FRR based on the link congestion threshold being crossed.

The former is a local event and link/node failures are isolated events. Congestion is not. If congestion happens on a core link of the node it is very likely it happened on many nodes at the same time which were unfortunate to sit on the path of subject flows. Because of this observation the network wide effect of the former cannot be compared 1:1 with effect of the latter.

As stated, TTE is meant to be used in conjunction with classical TE operating on a much longer time scale. If classical TE corrects the overload situation (which itself will require path changes and impact end-to-end protocols), then TTE will deactivate prefixes and labels and return traffic to the primary path.

That is actually one question I forgot to ask .. when or based on what network event do you "deactivate" TTE ?

See again, with link or node failure the trigger is local and you know when the subject link/node which failed is back on the graph. Here you do not know by any specific network event. You could keep measuring the traffic volume which goes into the protect path and keep comparing whether it has decreased enough for protection to be removed, keeping the history of the size of such flows in the past. But this is easier said than done.

Some people actually like delivering their best effort traffic.

Again, the point was not to say that best effort traffic is not important. It was to say that observation of "congestion" should consider configured QoS queues, not bits sent out the interface.
If I have massive congestion I may really want to "protect" priority flows and only trigger it when the priority class gets full.

- - -

Now I am sure creative folks will go a step further and ask to "protect" in such a way based on increased delay or loss on a link (with additional measurements). And honestly such new triggers would be safer than the congestion trigger, as those again are localized and are not chained by their nature across many nodes.

Kind regards,
Robert
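
For reference, the deactivation rule summarized in the <cb> reply above (byte rate below the low threshold for some consecutive number of traffic samples) can be sketched as simple hysteresis. This is only an illustrative sketch and not the draft's algorithm: the class name `TteMonitor`, its parameters, and the rule of activating on a single sample above the high threshold are all assumptions, and the per-prefix selection technique is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TteMonitor:
    """Hypothetical hysteresis monitor; names and units are assumptions."""
    high_threshold: float      # assumed activation threshold (bytes/s)
    low_threshold: float       # low threshold used for deactivation (bytes/s)
    samples_required: int      # consecutive low samples needed to deactivate
    active: bool = False
    below_count: int = 0

    def observe(self, byte_rate: float) -> bool:
        """Feed one traffic sample; return whether TTE is active afterwards."""
        if not self.active:
            # Assumption: a single sample above the high threshold activates.
            if byte_rate > self.high_threshold:
                self.active = True
                self.below_count = 0
        else:
            # Count consecutive samples below the low threshold;
            # any sample at or above it resets the count.
            if byte_rate < self.low_threshold:
                self.below_count += 1
            else:
                self.below_count = 0
            if self.below_count >= self.samples_required:
                self.active = False
                self.below_count = 0
        return self.active


monitor = TteMonitor(high_threshold=100.0, low_threshold=60.0, samples_required=3)
states = [monitor.observe(r) for r in [50, 120, 55, 50, 70, 50, 50, 50]]
print(states)
```

In this sketch the sample of 70 resets the count, so deactivation only happens after three further consecutive samples below the low threshold. The gap between the two thresholds is what damps the oscillation concern raised earlier in the thread.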
_______________________________________________ rtgwg mailing list [email protected] https://www.ietf.org/mailman/listinfo/rtgwg
