[Int-area] Re: Regarding the draft: Scale-Up Network Header (SUNH)

Tom Herbert Thu, 08 Jan 2026 15:25:36 -0800

On Thu, Jan 8, 2026 at 2:56 PM Haoyu Song <[email protected]> wrote:
>
> Hi Tom,
>
> I came across your draft “Scale-Up Network Header (SUNH)” and found it very 
> interesting and timely. It resonated well with a draft I wrote a while ago 
> (https://datatracker.ietf.org/doc/html/draft-song-ship-edge-05) although that 
> one addressed a more general scope.
>


Hi Haoyu,

Thanks for your comments!

> Here are some comments and thoughts I’d like to share.
>
> “Some traffic patterns may have a majority of small packets, like for KV 
> cache in AI, where packet sizes may commonly be 256 bytes or less.”
>
> As far as I know, the token KV cache is pretty big and the packet size is
> limited by the MTU. The small packets in a scale-up network are usually for
> control plane signaling and synchronization, or for small memory-semantic
> transactions. Anyway, I agree that header overhead is a big concern in AIDCN
> and that the current SUE solution is flawed.

Yes, but 256 bytes seems to be the latest "practical" minimum packet
size we need to support. At least people aren't talking about getting
line rate with 64-byte packets, as was all the rage a few
years ago! :-)
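
To put rough numbers on why header size matters at this packet size, here
is a minimal sketch. The Ethernet, IPv4, IPv6 and UDP sizes are the
standard ones; COMPACT_HEADER is a hypothetical value chosen only for
illustration and is not taken from the SUNH draft. Treating 256 bytes as
the payload (rather than the whole packet) is also an assumption.

```python
# Fixed per-frame Ethernet costs on the wire, in bytes.
PREAMBLE_SFD = 8
ETH_HEADER = 14
FCS = 4
INTERFRAME_GAP = 12
ETH_WIRE_OVERHEAD = PREAMBLE_SFD + ETH_HEADER + FCS + INTERFRAME_GAP

IPV4_HEADER = 20     # no options
IPV6_HEADER = 40     # no extension headers
UDP_HEADER = 8
COMPACT_HEADER = 8   # HYPOTHETICAL size, not from the SUNH draft


def wire_efficiency(payload_bytes, header_bytes):
    """Fraction of bytes on the wire that carry payload."""
    total = ETH_WIRE_OVERHEAD + header_bytes + payload_bytes
    return payload_bytes / total


def packets_per_second(line_rate_bps, payload_bytes, header_bytes):
    """Packet rate needed to fill the link at the given payload size."""
    total_bits = 8 * (ETH_WIRE_OVERHEAD + header_bytes + payload_bytes)
    return line_rate_bps / total_bits


if __name__ == "__main__":
    cases = [
        ("IPv4+UDP", IPV4_HEADER + UDP_HEADER),
        ("IPv6+UDP", IPV6_HEADER + UDP_HEADER),
        ("compact (hypothetical)", COMPACT_HEADER),
    ]
    for name, hdr in cases:
        eff = wire_efficiency(256, hdr)
        print(f"{name}: {eff:.1%} of wire bytes are payload")
```

The same arithmetic shows why 64-byte line rate was such a hard target:
the fixed per-frame cost stays constant while the payload shrinks.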

>
> There are no dedicated scale-up network NICs available. The scale-up network
> interface is usually supposed to be integrated into the GPU die. When an
> Ethernet interface is used, the situation might change. But a GPU cannot
> afford to have two PCIe interfaces to connect two separate NICs, so most
> likely the scale-up and scale-out networks will converge and share the
> same NIC. If that’s true, compatibility and the ability to interoperate with
> standard IP protocols become a necessity.

Yes. A lot of this has to do with the topologies. In our design, GPUs
in a single system connect to a memory fabric, and the scale-up network
would probably be useful for intra-rack connectivity.

>
> I think the 16-bit SUNH address is too long for now, and probably too short
> in the future if the scale-up and scale-out networks converge. So it’s
> better to maintain flexibility. The SHIP draft I mentioned earlier provides a
> flexible scheme and allows the gateway switches to translate the
> header-compressed packets into normal IPv4/v6 packets so that inter-DC traffic
> can be seamlessly supported. With this, the routing header (i.e., compressed
> SRv6) can also be supported, making the scheme flexible enough to support
> SR (in other research, I found that capability very useful in certain
> DCN topologies).

It's a tradeoff (of course any address size we choose is a tradeoff
:-) ). While flexibility is nice, it comes at the expense of
complexity. Supporting multiple address lengths in the same protocol
is complex on both switches and hosts. There's a lesson to be learned
from IPv6. While IPv6 still has an IP version number, the fact that
IPv6 has its own EtherType makes the IP version number redundant and
unnecessary; in practical terms IPv6 is a distinct protocol from
IPv4, not just a different version. In SUNH we can apply the lesson by
eschewing things like version numbers and variable-length headers. If
16-bit addresses prove to be the wrong choice, then we can just
spin a new protocol with a different address size and its own
EtherType.
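
As a rough illustration of the EtherType point, here is a minimal sketch
of a receiver that picks the protocol from the EtherType alone and never
looks at a version field. 0x0800 and 0x86DD are the real IPv4 and IPv6
EtherTypes; 0x88B5 is an IEEE local experimental EtherType used here only
as a placeholder. The fixed 16-bit destination/source layout is a
hypothetical stand-in, not the SUNH wire format, and untagged frames (no
VLAN tag) are assumed.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_PLACEHOLDER = 0x88B5  # local experimental; stand-in only

# HYPOTHETICAL fixed layout: 16-bit destination, then 16-bit source.
FIXED16_ADDRS = struct.Struct("!HH")

HANDLERS = {
    ETHERTYPE_IPV4: "ipv4",
    ETHERTYPE_IPV6: "ipv6",
    ETHERTYPE_PLACEHOLDER: "fixed16",
}


def classify(frame):
    """Choose a protocol from the EtherType only (untagged frames)."""
    if len(frame) < ETH_HEADER_LEN:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    return HANDLERS.get(ethertype, "unknown")


def parse_fixed16(frame):
    """Read the two fixed-size addresses at a fixed offset.

    No version check and no length field are needed, because the
    EtherType already identifies one protocol with one layout.
    """
    dst, src = FIXED16_ADDRS.unpack_from(frame, ETH_HEADER_LEN)
    return dst, src
```

In this sketch a different address size would simply be another entry in
HANDLERS with its own parser, rather than a branch inside an existing one.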

As for SRv6 used with SUNH, I'm personally ambivalent. The design of
the protocol allows for SRv6 headers, but I suspect that most use
cases, like scale-up networking, tend to involve flat networks, so SR
might not be very interesting.

>
> I think this is an area where the IETF has an opportunity to contribute to
> AI networking, and I’m looking forward to contributing to it.

Great!

Tom

>
> Best regards,
>
> Haoyu
>

