Jeff,

End-points do not need to participate in the link-state topology. They are merely leaves, even if all endpoints are L3.
And I think 8K or even 256K routes would be supported today by the cheapest nodes from any vendor :)

Cheers,
R.

On Tue, Nov 28, 2023 at 12:32 AM Jeff Tantsura <[email protected]> wrote:

> Robert,
>
> In context of LLM (10% of that for DLRM) training clusters, towards
> 2024/25 we would be looking at up to 8K end-points in a 3 stage leaf-spine
> fabric and up to 64-256K in 5 stages.
> Virtualization and how it is instantiated might significantly change
> amount/distribution of state in underlay/overlay.
>
> Obviously, these are hyperscale size deployments, many will be running
> 10-30 switch fabrics, where anything could work.
> BGP seems to work just fine, some data plane signaling could be used as a
> near real time augmentation to “slow but stable” control plane.
>
> Cheers,
> Jeff
>
> On Nov 26, 2023, at 14:30, Robert Raszuk <[email protected]> wrote:
>
> Hey Jeff,
>
> Could you be so kind as to define the term "scaled-out leaf-spine fabrics"?
>
> Specifically, folks watching us here would highly appreciate it if we state
> max target nodes with vanilla ISIS and max target nodes when your ISIS
> implementation supports draft-ietf-lsr-dynamic-flooding
> <https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding>
>
> While I am a BGP person I feel pretty strongly that BGP is not the best fit
> for the vast majority of DC fabrics in use today.
>
> Cheers,
> Robert
>
> On Sun, Nov 26, 2023 at 10:49 PM Jeff Tantsura <[email protected]>
> wrote:
>
>> I agree with all aforementioned comments.
>>
>> Wrt AI/ML networking - if a controller is used, what is required is link
>> state exposure northbound and not link state protocol in the fabric.
>> (I could argue for RIFT though ;-))
>> I’d urge you to take a look at Meta’s deployment in their ML clusters
>> (publicly available) - they use BGP as the routing protocol to exchange
>> reachability (and build ECMP sets) and provide a backup if a controller
>> computed next hop goes away/before a new one has been computed.
>> Open R is used northbound to expose the topology (in exactly the same way,
>> BGP-LS could be used).
>>
>> To summarize: an LS protocol brings no additional value in scaled-out
>> leaf-spine fabrics, without significant modifications - it doesn’t work in
>> irregular topologies such as DF, etc.
>> Existing proposals (there are shipping implementations and experience in
>> operating them) have proven their relative value in suitable deployments.
>>
>> Cheers,
>> Jeff
>>
>> > On Nov 26, 2023, at 12:20, Acee Lindem <[email protected]> wrote:
>> >
>> > Speaking as WG member:
>> >
>> > I agree. The whole Data Center IGP flooding discussion went on years
>> > ago and the simplistic enhancement proposed in the subject draft is
>> > neither relevant nor useful now.
>> >
>> > Thanks,
>> > Acee
>> >
>> >> On Nov 24, 2023, at 11:55 PM, Les Ginsberg (ginsberg)
>> >> <ginsberg=[email protected]> wrote:
>> >>
>> >> Xiaohu –
>> >> I also point out that there are at least two existing drafts which
>> >> specifically address IS-IS flooding reduction in CLOS networks and do so
>> >> in greater detail and with more robustness than what is in your draft:
>> >> https://datatracker.ietf.org/doc/draft-ietf-lsr-distoptflood/
>> >> https://datatracker.ietf.org/doc/draft-ietf-lsr-isis-spine-leaf-ext/
>> >> I do not see a need for yet another draft specifically aimed at CLOS
>> >> networks.
>> >> Note that work on draft-ietf-lsr-isis-spine-leaf-ext was suspended due
>> >> to lack of interest in deploying an IGP solution in CLOS networks.
>> >> You are suggesting in draft-xu-lsr-fare that AI is going to change
>> >> this.
>> >> Well, maybe, but if so I think we should return to the solutions
>> >> already available and prioritize work on them.
>> >> Les
>> >>
>> >> From: Lsr <[email protected]> On Behalf Of Tony Li
>> >> Sent: Thursday, November 23, 2023 8:39 AM
>> >> To: [email protected]
>> >> Cc: [email protected]
>> >> Subject: Re: [Lsr] New Version Notification for
>> >> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >>
>> >> Hi,
>> >> What you’re proposing is already described in IS-IS Mesh Groups
>> >> (https://www.rfc-editor.org/rfc/rfc2973.html) and improved upon in
>> >> Dynamic Flooding
>> >> (https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding).
>> >> Regards,
>> >> Tony
>> >>
>> >> On Nov 23, 2023, at 8:29 AM, [email protected] wrote:
>> >> Hi all,
>> >> Any comments or suggestions are welcome.
>> >> Best regards,
>> >> Xiaohu
>> >>
>> >> From: [email protected] <[email protected]>
>> >> Date: Wednesday, November 22, 2023, 11:37
>> >> To: Xiaohu Xu <[email protected]>
>> >> Subject: New Version Notification for
>> >> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >>
>> >> A new version of Internet-Draft
>> >> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> has been successfully submitted by Xiaohu Xu and posted to the
>> >> IETF repository.
>> >>
>> >> Name: draft-xu-lsr-flooding-reduction-in-clos
>> >> Revision: 01
>> >> Title: Flooding Reduction in CLOS Networks
>> >> Date: 2023-11-22
>> >> Group: Individual Submission
>> >> Pages: 6
>> >> URL: https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> Status: https://datatracker.ietf.org/doc/draft-xu-lsr-flooding-reduction-in-clos/
>> >> HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-flooding-reduction-in-clos
>> >> Diff: https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-flooding-reduction-in-clos-01
>> >>
>> >> Abstract:
>> >>
>> >> In a CLOS topology, an OSPF (or ISIS) router may receive identical
>> >> copies of an LSA (or LSP) from multiple OSPF (or ISIS) neighbors.
>> >> Moreover, two OSPF (or ISIS) neighbors may exchange the same LSA (or
>> >> LSP) simultaneously. This results in unnecessary flooding of
>> >> link-state information, which wastes the precious resources of OSPF (or
>> >> ISIS) routers. Therefore, this document proposes extensions to OSPF
>> >> (or ISIS) to reduce this flooding within CLOS networks. The
>> >> reduction of OSPF (or ISIS) flooding is highly beneficial for
>> >> improving the scalability of CLOS networks.
>> >>
>> >> The IETF Secretariat
>> >>
>> >> _______________________________________________
>> >> Lsr mailing list
>> >> [email protected]
>> >> https://www.ietf.org/mailman/listinfo/lsr
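
For readers following the thread: the quoted abstract says a router in a CLOS fabric may receive identical copies of the same LSA/LSP from several neighbors. The sketch below is a rough illustration of that effect and is not taken from any of the drafts discussed. It assumes a two-tier leaf-spine fabric where every leaf connects to every spine, and a simplified flooding rule in which a node, on first receipt, forwards to every neighbor except the one it first heard from. The function name and parameters are hypothetical, and real IS-IS/OSPF implementations suppress some of these copies, so treat the numbers as an upper bound for this toy model only.

```python
from collections import deque


def flood_leaf_spine(num_leaves, num_spines):
    """Count transmissions and duplicate receipts for one flooded LSP.

    Toy model (an assumption, not any draft's algorithm): full bipartite
    leaf-spine fabric; a node forwards only on first receipt, to all
    neighbors except the first sender.
    """
    leaves = [f"leaf{i}" for i in range(num_leaves)]
    spines = [f"spine{i}" for i in range(num_spines)]
    neighbors = {leaf: list(spines) for leaf in leaves}
    neighbors.update({spine: list(leaves) for spine in spines})

    origin = leaves[0]
    seen = {origin}
    # Each queue entry is one transmission: (sender, receiver).
    queue = deque((origin, nbr) for nbr in neighbors[origin])
    sent = 0
    duplicates = 0
    while queue:
        sender, receiver = queue.popleft()
        sent += 1
        if receiver in seen:
            duplicates += 1  # identical copy, nothing new learned
            continue
        seen.add(receiver)
        for nbr in neighbors[receiver]:
            if nbr != sender:
                queue.append((receiver, nbr))
    return sent, duplicates


if __name__ == "__main__":
    for n_leaves, n_spines in [(4, 2), (8, 4), (32, 8)]:
        sent, dup = flood_leaf_spine(n_leaves, n_spines)
        print(f"{n_leaves} leaves x {n_spines} spines: "
              f"{sent} transmissions, {dup} duplicates")
```

In this model the duplicate count works out to 2 * (leaves - 1) * (spines - 1), so it grows with the product of the two tiers while the number of useful receipts grows only with their sum. That is the overhead which mesh groups, dynamic flooding, and the CLOS-specific drafts mentioned above each try to reduce in their own way.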
_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
