Jeff,

End-points do not need to participate in link state topology. Those are
merely leaves even if all endpoints are L3.

And 8K or even 256K routes would, I think, be supported today by the cheapest
nodes from any vendor :)

Cheers,
R.





On Tue, Nov 28, 2023 at 12:32 AM Jeff Tantsura <[email protected]>
wrote:

> Robert,
>
> In the context of LLM (10% of that for DLRM) training clusters, towards
> 2024/25 we would be looking at up to 8K end-points in a 3-stage leaf-spine
> fabric and up to 64-256K in 5 stages.
> Virtualization and how it is instantiated might significantly change
> the amount/distribution of state in underlay/overlay.
>
> Obviously, these are hyperscale-size deployments; many will be running
> 10-30 switch fabrics, where anything could work.
> BGP seems to work just fine, some data plane signaling could be used as a
> near real time augmentation to “slow but stable” control plane.
>
> Cheers,
> Jeff
>
> On Nov 26, 2023, at 14:30, Robert Raszuk <[email protected]> wrote:
>
>
> Hey Jeff,
>
> Could you be so kind as to define the term "scaled-out leaf-spine fabrics"?
>
> Specifically folks watching us here would highly appreciate if we state
> max target nodes with vanilla ISIS and max target nodes when your ISIS
> implementation supports draft-ietf-lsr-dynamic-flooding
> <https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding>
>
> While I am a BGP person, I feel pretty strongly that BGP is not the best fit
> for the vast majority of DC fabrics in use today.
>
> Cheers,
> Robert
>
>
> On Sun, Nov 26, 2023 at 10:49 PM Jeff Tantsura <[email protected]>
> wrote:
>
>> I agree with all aforementioned comments.
>>
>> Wrt AI/ML networking - if a controller is used, what is required is link
>> state exposure northbound and not a link state protocol in the fabric. (I
>> could argue for RIFT though ;-))
>> I’d urge you to take a look at Meta’s deployment in their ML clusters
>> (publicly available) - they use BGP as the routing protocol to exchange
>> reachability (and build ECMP sets) and to provide a backup if the
>> controller-computed next hop goes away or before a new one has been computed.
>> Open R is used northbound to expose the topology (BGP-LS could be used in
>> exactly the same way).
>>
>> To summarize: an LS protocol brings no additional value in scaled-out
>> leaf-spine fabrics, without significant modifications - it doesn’t work in
>> irregular topologies such as DF, etc.
>> Existing proposals, for which there are shipping implementations and
>> operational experience, have proven their relative value in suitable
>> deployments.
>>
>> Cheers,
>> Jeff
>>
>> > On Nov 26, 2023, at 12:20, Acee Lindem <[email protected]> wrote:
>> >
>> > Speaking as WG member:
>> >
>> > I agree. The whole Data Center IGP flooding discussion went on years
>> ago and the simplistic enhancement proposed in the subject draft is neither
>> relevant nor useful now.
>> >
>> > Thanks,
>> > Acee
>> >
>> >> On Nov 24, 2023, at 11:55 PM, Les Ginsberg (ginsberg) <ginsberg=
>> [email protected]> wrote:
>> >>
>> >> Xiaohu –
>> >> I also point out that there are at least two existing drafts which
>> specifically address IS-IS flooding reduction in CLOS networks and do so in
>> greater detail and with more robustness than what is in your draft:
>> >> https://datatracker.ietf.org/doc/draft-ietf-lsr-distoptflood/
>> >> https://datatracker.ietf.org/doc/draft-ietf-lsr-isis-spine-leaf-ext/
>> >> I do not see a need for yet another draft specifically aimed at CLOS
>> networks.
>> >> Note that work on draft-ietf-lsr-isis-spine-leaf-ext was suspended due
>> to lack of interest in deploying an IGP solution in CLOS networks.
>> >> You are suggesting in draft-xu-lsr-fare that AI is going to change
>> this. Well, maybe, but if so I think we should return to the solutions
>> already available and prioritize work on them.
>> >>    Les
>> >>  From: Lsr <[email protected]> On Behalf Of Tony Li
>> >> Sent: Thursday, November 23, 2023 8:39 AM
>> >> To: [email protected]
>> >> Cc: [email protected]
>> >> Subject: Re: [Lsr] New Version Notification for
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> Hi,
>> >> What you’re proposing is already described in IS-IS Mesh Groups (
>> https://www.rfc-editor.org/rfc/rfc2973.html) and improved upon in
>> Dynamic Flooding (
>> https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding).
>> >> Regards,
>> >> Tony
>> >>
>> >>
>> >> On Nov 23, 2023, at 8:29 AM, [email protected] wrote:
>> >> Hi all,
>> >> Any comments or suggestions are welcome.
>> >> Best regards,
>> >> Xiaohu
>> >> From: [email protected] <[email protected]>
>> >> Date: Wednesday, November 22, 2023 11:37
>> >> To: Xiaohu Xu <[email protected]>
>> >> Subject: New Version Notification for
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> A new version of Internet-Draft
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> has been successfully submitted by Xiaohu Xu and posted to the
>> >> IETF repository.
>> >>
>> >> Name:     draft-xu-lsr-flooding-reduction-in-clos
>> >> Revision: 01
>> >> Title:    Flooding Reduction in CLOS Networks
>> >> Date:     2023-11-22
>> >> Group:    Individual Submission
>> >> Pages:    6
>> >> URL:
>> https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> >> Status:
>> https://datatracker.ietf.org/doc/draft-xu-lsr-flooding-reduction-in-clos/
>> >> HTMLized:
>> https://datatracker.ietf.org/doc/html/draft-xu-lsr-flooding-reduction-in-clos
>> >> Diff:
>> https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-flooding-reduction-in-clos-01
>> >>
>> >> Abstract:
>> >>
>> >>   In a CLOS topology, an OSPF (or ISIS) router may receive identical
>> >>   copies of an LSA (or LSP) from multiple OSPF (or ISIS) neighbors.
>> >>   Moreover, two OSPF (or ISIS) neighbors may exchange the same LSA (or
>> >>   LSP) simultaneously.  This results in unnecessary flooding of link-
>> >>   state information, which wastes the precious resources of OSPF (or
>> >>   ISIS) routers.  Therefore, this document proposes extensions to OSPF
>> >>   (or ISIS) to reduce this flooding within CLOS networks.  The
>> >>   reduction of OSPF (or ISIS) flooding is highly beneficial for
>> >>   improving the scalability of CLOS networks.
>> >>
>> >>
>> >>
>> >> The IETF Secretariat
>> >>
>> >> _______________________________________________
>> >> Lsr mailing list
>> >> [email protected]
>> >> https://www.ietf.org/mailman/listinfo/lsr
>> >
>> >
>>
>>
>
