Gyan, Thanks for your interest. Sorry for replying late on this. Thanks Arie and Jeff for your clarifications.
We have not changed the definition of the Link Bandwidth Ext. Community (0x4004) or its definition as non-transitive. As Arie mentioned previously, we are actually originating it at the AS boundary. BTW, the cumulative link-bandwidth feature first went into XR in 6.1.1. We can incorporate that in the addendum, as well as get similar information from other vendors.

Thanks,
--Satya

From: Gyan Mishra <[email protected]>
Date: Saturday, July 3, 2021 at 8:34 AM
To: Jeff Tantsura <[email protected]>
Cc: Arie Vayner <[email protected]>, Robert Raszuk <[email protected]>, "Satya Mohanty (satyamoh)" <[email protected]>, "UTTARO, JAMES" <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [bess] Request discussion on Cumulative Link Bandwidth Draft

Thanks Jeff for the clarification.

Thanks
Gyan

On Fri, Jul 2, 2021 at 6:05 PM Jeff Tantsura <[email protected]> wrote:

Gyan,

you are mixing use of the BW community as such with cumulative propagation (the theme of the draft). The original community is defined in the "Non-Transitive Two-Octet AS-Specific Extended Community Sub-Types" IANA section and that has to be changed to allow eBGP use cases. Aggregation is a very useful feature when the prefix with the community attached is traversing the fabric and is being regenerated and potentially transformed at every hop traversed. The alternative with add-path and potential path explosion (not to mention the operational complexity of add-path and bugs in the implementations) is a rather unattractive solution for DC fabrics.
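For readers following the 0x4004 and transitivity discussion, here is a minimal sketch of how the community could be packed on the wire. It assumes the layout in draft-ietf-idr-link-bandwidth as I read it: type 0x40 (non-transitive two-octet AS-specific), sub-type 0x04, a two-octet AS number, and the bandwidth as a 4-octet IEEE float in bytes per second. The function names are hypothetical and do not come from any vendor implementation.

```python
import struct

# Assumed constants: 0x40 is the non-transitive two-octet AS-specific type;
# 0x00 would be the transitive variant of the same type.
TYPE_NON_TRANSITIVE_2OCTET_AS = 0x40
SUBTYPE_LINK_BANDWIDTH = 0x04


def encode_link_bandwidth(asn: int, bytes_per_second: float) -> bytes:
    """Pack an 8-octet Link Bandwidth extended community (hypothetical helper)."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("the two-octet AS field cannot hold this AS number")
    # Network byte order: type, sub-type, 2-octet AS, 4-octet IEEE float.
    return struct.pack(
        "!BBHf",
        TYPE_NON_TRANSITIVE_2OCTET_AS,
        SUBTYPE_LINK_BANDWIDTH,
        asn,
        bytes_per_second,
    )


def decode_link_bandwidth(raw: bytes) -> dict:
    """Unpack the 8 octets and report the transitivity bit (hypothetical helper)."""
    type_octet, subtype, asn, bandwidth = struct.unpack("!BBHf", raw)
    return {
        # In the extended community type octet, bit 0x40 set means non-transitive.
        "transitive": (type_octet & 0x40) == 0,
        "subtype": subtype,
        "asn": asn,
        "bytes_per_second": bandwidth,
    }
```

For example, a 10 Gb/s link is 1.25e9 bytes per second, so `encode_link_bandwidth(65001, 1.25e9)` yields 8 octets beginning with 0x40 0x04. Note that the AS field is only two octets wide, which is where the 2-byte versus 4-byte AS question raised in this thread comes from.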
Cheers,
Jeff

On Fri, Jul 2, 2021 at 12:17 PM Gyan Mishra <[email protected]> wrote:

Hi Satya

For EBGP DCs with multi-stage Clos I understand this can be used; however, Cisco, Juniper, Nokia, Arista, and maybe other vendor implementations seem to combine the non-transitive link bw extended community and the transitive cumulative link bw extended community into a single feature that works for UCMP intra-AS and inter-AS. Please confirm.

These two drafts seem to be combined by implementations into a single UCMP feature for both eBGP and iBGP.

https://datatracker.ietf.org/doc/html/draft-ietf-idr-link-bandwidth-07
https://datatracker.ietf.org/doc/html/draft-mohanty-bess-ebgp-dmz-03

Cisco:
https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-17/irg-xe-17-book/bgp-link-bandwidth.html
https://community.cisco.com/t5/service-providers-documents/asr9000-xr-understanding-unequal-cost-multipath-ucmp-dmz-link/ta-p/3138853

Juniper:
https://www.juniper.net/documentation/en_US/junos/topics/concept/bgp-multipath-unequal-understanding.html

Nokia:
https://documentation.nokia.com/html/0_add-h-f/93-0074-HTML/7750_SR_OS_Routing_Protols_Guide/bgp.html

Arista:
https://www.arista.com/en/um-eos/eos-border-gateway-protocol-bgp#xx1418621

Kind Regards
Gyan

On Wed, May 26, 2021 at 3:21 PM Satya Mohanty (satyamoh) <[email protected]> wrote:

Hi Jim,

No, they do not. This draft under discussion is a way to aggregate the link bandwidth in EBGP DCs and advertise it upstream. It works well in multi-stage Clos topology fabrics. Traffic is demultiplexed (multi-path load balanced) when it arrives at a node of each stage (unless the sink). The draft you are mentioning, https://tools.ietf.org/id/draft-ietf-bess-evpn-unequal-lb-06.html, is really a way to communicate the link-bandwidth across EBGP boundaries.
It is mostly geared from an Inter-AS Option C viewpoint (next-hop unchanged), although it also applies to Option B deployments (next-hop-self). There is no notion of aggregating bandwidth here.

HTH.

Best Regards,
--Satya

From: "UTTARO, JAMES" <[email protected]>
Date: Wednesday, May 26, 2021 at 5:38 AM
To: Gyan Mishra <[email protected]>, Robert Raszuk <[email protected]>
Cc: Jeff Tantsura <[email protected]>, Arie Vayner <[email protected]>, "[email protected]" <[email protected]>, "Satya Mohanty (satyamoh)" <[email protected]>
Subject: RE: [bess] Request discussion on Cumulative Link Bandwidth Draft

Does this work and Weighted Multi-Path Procedures for EVPN Multi-homing address the same field of use?

Thanks,
Jim Uttaro

From: BESS <[email protected]> On Behalf Of Gyan Mishra
Sent: Wednesday, May 26, 2021 12:57 AM
To: Robert Raszuk <[email protected]>
Cc: Jeff Tantsura <[email protected]>; Arie Vayner <[email protected]>; [email protected]; Satya Mohanty (satyamoh) <[email protected]>
Subject: Re: [bess] Request discussion on Cumulative Link Bandwidth Draft

Across the DC space in general, most providers use NVO3 and VXLAN source port entropy (L2/L3/L4 hash), which provides per packet uniform 50/50 load balancing at the L2 VNI overlay layer, which translates into underlay load balancing of flows and thus no polarization.
Across the DC space, speaking from an operator's perspective, under-floor fiber is not at a premium compared to 100G facilities costs, so the net addition of bandwidth can be done fairly quickly; you can stay ahead of the congestion curve and be proactive, rather than reactively upgrading bandwidth piecemeal here and there, ad hoc. Cases may still arise where, even if you have the fiber infrastructure available, it's not easy to upgrade and flash every link simultaneously in the DC in one or multiple maintenance windows, so you could still be left with some uneven bandwidth across the DC that could utilize this feature.

The DC also comes into play for PE-CE "WAN links", another use case for unequal cost load balancing using the regenerated cumulative link bandwidth community. I think the use case is where the iBGP core P-P links or the eBGP PE-PE inter-AS links are all WAN links, where link upgrades tend not to get done uniformly in unison; in those cases both the iBGP link bandwidth community and the eBGP cumulative regenerated link bandwidth community can be heavily utilized for unequal cost load balancing. Across the core as well, it is hard to flash all links, even under-floor fiber, to the same bandwidth all at once, so you are left with the requirement for unequal cost load balancing.

As operators upgrade their DC as well as core infrastructure to an IPv6 forwarding plane in the move towards SRv6, they can now take advantage of flow label entropy for stateless uniform load balancing and elimination of polarization. However, the WAN link upgrades of core and DC PE-CE still exist and thus may be done piecemeal, so both of the drafts are an extremely helpful tool for operators that very much need the unequal cost load balancing capability.

I support both drafts.

Have most vendors implemented this to support both 2 byte and 4 byte AS extended communities? The drafts state 2 byte AS support.
Thanks
Gyan

On Tue, May 25, 2021 at 7:00 PM Robert Raszuk <[email protected]> wrote:

Hi Arie,

Draft draft-ietf-idr-link-bandwidth talks about advertising towards IBGP. It does not talk about advertising over EBGP.

While I do support your use case, I think it would be much cleaner to just ask for a new ext. community type. The reason is that, as you illustrate, you may want to accumulate a BGP path's bw across a few EBGP hops in the DC. Today there is no way to do so unless you want to completely hijack the current lb ext community.

Also I see an analogy here to the AIGP RFC, although it clearly fits rather poorly for those who use BGP as IGP :).

Best,
R.

On Wed, May 26, 2021 at 12:22 AM Arie Vayner <[email protected]> wrote:

Jeff,

Actually, the way this draft is written, and how the implementations I'm aware of are implemented, this is not really a transitive community. It is a new community that is being generated on the AS boundary. The community value is not carried over, but is calculated based on a cumulative value of other received communities, and then advertised as a new value across the AS boundary.

Tnx,
Arie

On Tue, May 25, 2021 at 12:55 PM Jeff Tantsura <[email protected]> wrote:

Hi,

I support adoption of the draft as Informational. Please note that the request to change the transitivity characteristics of the community is made in another draft.

Gyan - please note, while pretty much every vendor has implemented the community and relevant data-plane constructs, the initial draft defines the community as non-transitive; some vendors have followed that, while some others have implemented it as transitive (to support the obvious use case - eBGP in DC).
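Arie's description above (the value is not carried over but recomputed from the received communities and then originated at the AS boundary) can be sketched roughly as follows. This is only an illustration, assuming the cumulative value is the plain sum of the bandwidths on the multipaths selected for a prefix; the class and function names are hypothetical and not taken from the draft or any implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ReceivedPath:
    """One multipath for a prefix, as seen by the regenerating router (hypothetical)."""
    next_hop: str
    link_bw: float  # bytes per second, taken from the received community


def cumulative_link_bandwidth(multipaths: Iterable[ReceivedPath]) -> float:
    """Assumption: the cumulative value is the sum over the selected multipaths."""
    return sum(path.link_bw for path in multipaths)


def regenerate_at_as_boundary(
    local_as: int, multipaths: Iterable[ReceivedPath]
) -> Tuple[int, float]:
    """Build a new (AS, bandwidth) pair to advertise upstream.

    The received communities are not forwarded; a fresh value is originated
    with the local AS as the administrator, which is why the non-transitive
    definition does not have to change in this model.
    """
    return (local_as, cumulative_link_bandwidth(multipaths))
```

With two downstream paths of 100.0 and 40.0 bytes per second, `regenerate_at_as_boundary(65010, paths)` gives `(65010, 140.0)`, and the next stage of the fabric repeats the same step on what it receives.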
Cheers,
Jeff

On May 22, 2021, 8:38 AM -0700, Satya Mohanty (satyamoh) <[email protected]>, wrote:

Hi all,

On behalf of all the authors, we request a discussion of the draft https://datatracker.ietf.org/doc/html/draft-mohanty-bess-ebgp-dmz-03 and subsequent WG adoption. This draft extends the usage of the DMZ link bandwidth to scenarios where the cumulative link bandwidth needs to be advertised to a BGP speaker. Additionally, there is provision to send the link bandwidth extended community to EBGP speakers via configurable knobs. Please refer to sections 3 and 4 for the use cases.

This feature has multiple-vendor implementations and has been deployed by several customers in their networks.

Best Regards,
--Satya

--
Gyan Mishra
Network Solutions Architect
Email [email protected]
M 301 502-1347
_______________________________________________ BESS mailing list [email protected] https://www.ietf.org/mailman/listinfo/bess
