On 27-Sep-24 08:42, Templin (US), Fred L wrote:
Hi Brian,
-----Original Message-----
From: Brian E Carpenter <[email protected]>
Sent: Thursday, September 26, 2024 1:22 PM
To: Templin (US), Fred L <[email protected]>; Tom Herbert
<[email protected]>; Tim Chown <[email protected]>
Cc: Internet Area <[email protected]>; IPv6 List <[email protected]>
Subject: Re: [Int-area] Re: IP Parcels and Advanced Jumbos (AJs)
On 27-Sep-24 05:56, Templin (US), Fred L wrote:
Hi Tom,
I would like to gently suggest a new terminology. Rather than calling them "the
multi-segment buffers managed by GSO and GRO", can we begin calling them
"parcel buffers" or simply "parcels"? Not suggesting this in a self-serving
manner - I just think it is a more concise yet more descriptive terminology.
But that isn't the same thing. RFC2675 jumbograms are single datagrams. They
were originally intended for use over HIPPI, i.e. internally to data centres
as they existed 25 years ago, so the usage that Tom reported seems close to
what they were designed for.
I was not referring to Tom's reference to jumbograms; I was referring to "the
multi-segment buffers managed by GSO and GRO".
Tom, is there a full description of this usage?
GSO and GRO are fully described in "IP Parcels and Advanced Jumbos". The
multi-segment buffers managed by GSO and GRO are one and the same as parcels
before the IP and upper-layer protocol headers are appended: GSO is one and
the same as parcel packetization, and GRO is one and the same as parcel
restoration.
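[Editor's note: Fred's equivalence claim can be pictured with a tiny sketch.
The names `packetize` and `restore` are illustrative assumptions, not taken
from the drafts or from any kernel API.]

```python
# Sketch of the claimed equivalence: "parcel packetization" behaves like
# GSO (split one multi-segment buffer into wire-sized packets) and
# "parcel restoration" behaves like GRO (coalesce them back).

def packetize(parcel: bytes, mss: int) -> list[bytes]:
    """GSO-style: split a large buffer into segments of at most mss bytes."""
    return [parcel[i:i + mss] for i in range(0, len(parcel), mss)]

def restore(segments: list[bytes]) -> bytes:
    """GRO-style: coalesce in-order segments back into one buffer."""
    return b"".join(segments)

parcel = bytes(range(256)) * 100    # a 25600-byte "parcel buffer"
segments = packetize(parcel, 1440)  # e.g. 1500-byte MTU minus headers
assert restore(segments) == parcel  # lossless round trip
```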
To my understanding, "big TCP" and "big UDP" use the jumbogram construct to
ferry large parcel buffers internally - not to transmit large packets over large MTU links.
Indeed. But if sendmsg() and recvmsg() can and do generate RFC2675 packets, it means that
any discussion of obsoleting RFC2675 should be off the table. And, to quibble about
terminology, such a packet (even if it's never sent on an actual wire) is a single
payload from end to end, so I wouldn't use "parcel" to describe it.
Brian
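[Editor's note: for readers without RFC 2675 at hand, the jumbogram construct
that "big TCP"/"big UDP" reuse is just a Hop-by-Hop option carrying a 32-bit
payload length. A minimal sketch follows; the helper name `jumbo_hbh` is
hypothetical.]

```python
import struct

def jumbo_hbh(next_header: int, jumbo_len: int) -> bytes:
    """Build the 8-byte Hop-by-Hop Options header carrying the RFC 2675
    Jumbo Payload option. Per RFC 2675, the enclosing IPv6 header must set
    its 16-bit Payload Length field to 0 when this option is present."""
    assert 65535 < jumbo_len <= 0xFFFFFFFF  # must exceed the 16-bit maximum
    # Next Header | Hdr Ext Len (0 => 8 bytes) | Opt Type 0xC2 | Opt Len 4 | 32-bit length
    return struct.pack("!BBBBI", next_header, 0, 0xC2, 4, jumbo_len)

hbh = jumbo_hbh(next_header=6, jumbo_len=100_000)  # e.g. a 100 KB TCP payload
assert len(hbh) == 8 and hbh[2] == 0xC2
```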
Thank you - Fred
Regards
Brian
Thank you - Fred
-----Original Message-----
From: Tom Herbert <[email protected]>
Sent: Thursday, September 26, 2024 10:15 AM
To: Tim Chown <[email protected]>
Cc: Paul Vixie <[email protected]>; Templin (US), Fred L <[email protected]>;
Internet Area <[email protected]>; IPv6 List
<[email protected]>
Subject: Re: [Int-area] Re: IP Parcels and Advanced Jumbos (AJs)
On Thu, Sep 26, 2024 at 9:03 AM Tim Chown
<[email protected]> wrote:
Hi,
From: Paul Vixie <[email protected]>
Date: Tuesday, 24 September 2024 at 20:59
To: Templin (US), Fred L <[email protected]>, Internet Area
<[email protected]>, IPv6 List <[email protected]>
Subject: [Int-area] Re: IP Parcels and Advanced Jumbos (AJs)
Something like this is long needed and will become badly needed. Every 10X of
speed increase since 10mbit/sec has gone straight to PPS, whereas the speed
increase from 3mbit/sec to 10mbit/sec was shared between PPS and MTU.
If every 10X had been shared between PPS and MTU, say sqrt(10) for each, our
MTU would be well over 64K by now and our PPS wouldn't require dedicated NPU
hardware to source, sink, and ferry those packets at link saturation levels.
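[Editor's note: Paul's counterfactual can be worked out numerically. The
1500-byte baseline MTU and the 100 Gb/s endpoint are assumptions for the
sake of the arithmetic, not figures from the email.]

```python
# From 10 Mb/s to 100 Gb/s is four 10X steps. Splitting each 10X evenly
# (sqrt(10) to PPS, sqrt(10) to MTU) multiplies the MTU by sqrt(10)**4.
steps = 4                       # 10M -> 100M -> 1G -> 10G -> 100G
mtu_factor = 10 ** (steps / 2)  # sqrt(10) per step
assert round(mtu_factor) == 100
assert 1500 * mtu_factor > 64 * 1024  # well over 64K, as Paul says
```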
Every attempt at PMTUD so far has failed but that's not an excuse to stop
trying.
I think that depends on the deployment scenario and environment. In R&E
networking the adoption of 9000 MTU for large scale wide area data transfers
has grown, in particular by dozens of sites worldwide that take part in the
CERN experiments. CERN did a site survey recently, for which I could dig out
the results.
The sites running 9000 MTU are interoperating with the sites still at 1500,
which is an indication that PMTUD is working well enough. The large majority
of CERN traffic is IPv6, so for that there’s no fragmentation on path.
Tim,
That's also happening in some datacenters. I believe Google is using a 9K MTU
internally as it makes zero copy on hosts feasible (two 4K pages per packet).
Interestingly, there's also increasing use of RFC2675 jumbograms; they're not
sent on the wire but are used internally for GSO and GRO of
greater-than-64K packets.
Tom
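[Editor's note: the reason RFC 2675 shows up in >64K GSO/GRO paths comes down
to field widths, sketched below. The function name `needs_jumbo` is
illustrative.]

```python
# The plain IPv6 Payload Length field is 16 bits, so the largest
# representable payload is 65535 bytes. A coalesced GSO/GRO buffer bigger
# than that needs the 32-bit Jumbo Payload length of RFC 2675 instead.
PLAIN_MAX = 0xFFFF  # 65535: ceiling of the 16-bit IPv6 Payload Length

def needs_jumbo(payload_len: int) -> bool:
    """True when a payload can only be described by an RFC 2675 jumbogram."""
    return payload_len > PLAIN_MAX

assert not needs_jumbo(65535)  # still fits the 16-bit field
assert needs_jumbo(65536)      # one byte over: jumbogram territory
assert needs_jumbo(45 * 1460)  # e.g. 45 coalesced 1460-byte TCP segments
```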
The use case is somewhat constrained in that it’s only the parts of the campus
with the storage, the campus paths to the edge, and the intervening R&E
backbones that need to be configured. But with correct ICMPv6 filtering, it
seems robust.
Best wishes,
Tim
Thanks for driving this Fred.
p vixie
On Sep 24, 2024 14:39, "Templin (US), Fred L"
<[email protected]> wrote:
It has been a while since I have posted about this, and there are some updates
to highlight.
See below for the IPv6 and IPv4 versions of “IP Parcels and Advanced Jumbos
(AJs)”:
https://datatracker.ietf.org/doc/draft-templin-6man-parcels2/
https://datatracker.ietf.org/doc/draft-templin-intarea-parcels2/
The documents acknowledge that parcels are analogous to Generic
Segment/Receive Offload (GSO/GRO) but taken to the ultimate aspiration of
encapsulating multi-segment buffers in {TCP/UDP}/IP headers for transmission
over parcel-capable network paths. They further give a name to the
multi-segment buffers used by GSO/GRO, suggesting that they be called
“parcel buffers” or simply “parcels”.
AJs are simply single-segment parcels that can range in size from very small
to very large. They differ from ordinary jumbograms in several important ways,
most notably in terms of integrity verification and error correction. They
also suggest a new link service model that defers integrity checks to the end
systems, where bad data can be discarded while good data can be accepted as an
end-to-end function, reducing retransmissions.
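[Editor's note: the deferred-integrity link model can be sketched as follows.
The checksum choice (a simple RFC 1071-style 16-bit ones'-complement sum) and
the function names are illustrative assumptions, not taken from the drafts.]

```python
# Links forward segments without verifying checksums; the *end system*
# checks each segment, keeping good ones and discarding bad ones, so only
# the damaged segments need retransmission.

def cksum16(data: bytes) -> int:
    """RFC 1071-style ones'-complement sum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF

def accept_good_segments(segments, sums):
    """End-system check: keep only segments whose checksum verifies."""
    return [seg for seg, s in zip(segments, sums) if cksum16(seg) == s]

good = b"hello, parcel"
bad = b"hellX, parcel"                 # one byte corrupted in flight
sums = [cksum16(good), cksum16(good)]  # sums computed before corruption
kept = accept_good_segments([good, bad], sums)
assert kept == [good]                  # bad segment dropped, good one kept
```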
Together, these documents cover all possible packet sizes and configurations
that may be necessary both in the near term and for the foreseeable future
for maximizing Internetworking performance. Comments on the list(s) are
welcome.
Fred Templin
_______________________________________________
Int-area mailing list -- [email protected]
To unsubscribe send an email to [email protected]