Re: [Int-area] [EXTERNAL] Re: IP Parcels improves performance for end systems
Tom,

> -Original Message-
> From: Tom Herbert [mailto:t...@herbertland.com]
> Sent: Tuesday, March 22, 2022 1:47 PM
> To: Robinson, Herbie
> Cc: Templin (US), Fred L ; int-area@ietf.org
> Subject: Re: [Int-area] [EXTERNAL] Re: IP Parcels improves performance
> for end systems
>
> On Tue, Mar 22, 2022 at 10:40 AM Robinson, Herbie wrote:
> >
> > > The nice thing about TSO/GSO/GRO is that they don't require any
> > > changes to the protocol, as they are just implementation
> > > techniques. Also, they're one-sided optimizations, meaning for
> > > instance that TSO can be used at the sender without requiring GRO
> > > to be used at the receiver. My understanding is that IP parcels
> > > requires a new protocol that would need to be implemented on both
> > > endpoints and possibly in some routers. Do you have data that
> > > shows the benefits of IP Parcels in light of these requirements?
> >
> > The various segmentation offload optimizations done by NICs have a
> > number of drawbacks that the parcel scheme does not have:
> >
> > 1. They are still limited to 65K bytes.
>
> That's true, however we do start to get diminishing returns above 64K
> (this is why I asked for real data; the existing offloads have already
> solved the bulk of the problem AFAICT). The primary problem with these
> optimizations is that they are opportunistic, e.g., we can only send
> more than 64K on a TCP connection if that much CWND is available. In
> the worst case, e.g., when TCP is in slow start, we may only be sending
> a few segments, so such optimizations don't help and can actually hurt
> if those deploying them under-provision for the worst case in which the
> optimizations can't be used.

I can most definitely show that some ULPs that use UDP get much better
single-segment (i.e., singleton IP Parcel) performance in proportion to
the segment size. With NFS over UDP many decades ago, it was plainly
evident that 8KB segments outperformed smaller segment sizes even though
IP fragmentation was used. With LTP over UDP today, I can show that
performance is directly related to segment size up to the maximum
segment size of ~64KB - again, with heavy reliance on IP fragmentation.
And, the performance for a singleton IP Parcel with segment size of
~64KB was much better than for an N-segment IP Parcel with smaller
segment size (e.g., ~9KB, ~16KB, etc.). The performance curves suggest
that, with adequate hardware, multi-segment IP Parcels with larger
segment sizes would give the best performance, i.e., even if the Parcel
exceeds 64KB by a little or even a lot.

Fred

> > 2. The number of streams that can be optimized is significantly
> > limited by memory in the NIC and usually by the size of the receive
> > queue. In other words, it works some of the time, but not all of
> > the time.
>
> I don't see how IP parcels are any different in this regard: it would
> similarly be reassembling segments into larger segments, which requires
> memory if the NIC is doing it. There might be an argument that this is
> a standard means to perform intelligent fragmentation, but again that
> needs to be measured against the complexity and feasibility of
> deploying a new on-the-wire protocol.
>
> > 3. The NIC is responsible for checksum computation and validation.
> > If one is operating in a fault tolerant environment, one wants the
> > checksum computation/checking done in the fault tolerant domain
> > (i.e., the host, not the NIC).
>
> We still need checksum offload with IP parcels. For instance, if a
> parcel contains 10 TCP segments we'd need the device to compute and
> set 10 checksums. I'm not sure how this could work in IP parcels other
> than using techniques similar to those for TSO.
>
> Tom

___
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
Re: [Int-area] [EXTERNAL] Re: IP Parcels improves performance for end systems
On Tue, Mar 22, 2022 at 10:40 AM Robinson, Herbie wrote:
>
> > The nice thing about TSO/GSO/GRO is that they don't require any
> > changes to the protocol, as they are just implementation
> > techniques. Also, they're one-sided optimizations, meaning for
> > instance that TSO can be used at the sender without requiring GRO
> > to be used at the receiver. My understanding is that IP parcels
> > requires a new protocol that would need to be implemented on both
> > endpoints and possibly in some routers. Do you have data that
> > shows the benefits of IP Parcels in light of these requirements?
>
> The various segmentation offload optimizations done by NICs have a
> number of drawbacks that the parcel scheme does not have:
>
> 1. They are still limited to 65K bytes.

That's true, however we do start to get diminishing returns above 64K
(this is why I asked for real data; the existing offloads have already
solved the bulk of the problem AFAICT). The primary problem with these
optimizations is that they are opportunistic, e.g., we can only send
more than 64K on a TCP connection if that much CWND is available. In
the worst case, e.g., when TCP is in slow start, we may only be sending
a few segments, so such optimizations don't help and can actually hurt
if those deploying them under-provision for the worst case in which the
optimizations can't be used.

> 2. The number of streams that can be optimized is significantly
> limited by memory in the NIC and usually by the size of the receive
> queue. In other words, it works some of the time, but not all of the
> time.

I don't see how IP parcels are any different in this regard: it would
similarly be reassembling segments into larger segments, which requires
memory if the NIC is doing it. There might be an argument that this is
a standard means to perform intelligent fragmentation, but again that
needs to be measured against the complexity and feasibility of
deploying a new on-the-wire protocol.

> 3. The NIC is responsible for checksum computation and validation. If
> one is operating in a fault tolerant environment, one wants the
> checksum computation/checking done in the fault tolerant domain
> (i.e., the host, not the NIC).

We still need checksum offload with IP parcels. For instance, if a
parcel contains 10 TCP segments we'd need the device to compute and set
10 checksums. I'm not sure how this could work in IP parcels other than
using techniques similar to those for TSO.

Tom
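To make the checksum point in the message above concrete: a parcel carrying N transport segments needs N independent checksum results, whether the host or the NIC computes them. The following is only a minimal sketch, assuming the 16-bit one's-complement checksum described in RFC 1071 and ignoring the TCP pseudo-header and header fields that a real stack would include. The function names and the 10 x 1000-byte payload are hypothetical and are not taken from the IP Parcels draft.

```python
from typing import List


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def per_segment_checksums(segments: List[bytes]) -> List[int]:
    """One checksum per segment: a parcel of N segments needs N results."""
    return [internet_checksum(seg) for seg in segments]


# Hypothetical parcel payload: 10 segments of 1000 bytes each.
segments = [bytes([i]) * 1000 for i in range(10)]
checksums = per_segment_checksums(segments)
print(len(checksums))  # 10
```

The per-segment loop is the work that would have to run somewhere for every parcel; the sketch says nothing about where (host or device) it is best placed.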
Re: [Int-area] IP Parcels improves performance for end systems
Thanks for these thoughts, Herbie, and see below for follow-up:

Fred

> -Original Message-
> From: Robinson, Herbie [mailto:herbie.robin...@stratus.com]
> Sent: Tuesday, March 22, 2022 12:58 PM
> To: Templin (US), Fred L ; int-area@ietf.org
> Subject: RE: IP Parcels improves performance for end systems
>
> I think this is a really good idea; although, it could be a bridge too
> far. It certainly won’t get implemented quickly. I still remember
> implementing path MTU discovery thinking about the big performance win
> I was going to get in IPv6 from 9K packets (vs 1500 byte packets) only
> to discover that certain router vendors had interpreted one of the RFCs
> to mean they didn’t have to send packet-too-big messages if the packet
> in question was larger than 1500 bytes (because one of the RFCs stated
> that IPv6 implementations only have to support 1500 bytes). Yes, the
> RFCs could be interpreted that way, but that interpretation clearly
> violates the intent (to make path MTU reliable). The real world upshot
> of that is that anything over 1500 bytes is still a black hole and
> using path MTU can really only get you a whopping 300 bytes (from 1200
> to 1500).

Yes, the limitations of traditional path MTU discovery have been well
understood for a long time, and they have motivated active probing
schemes such as RFC4821 and RFC8899. But even these tools alone are not
likely to find MTUs in excess of ~9KB in the near future - maybe not
even in our lifetimes. But, fortunately, we can do much better than that
with IP Parcels in the very near term.

> Not trying to be negative, but 4M parcels are really looking to the
> future and that's only a good thing if we realize what we are doing.

Fortunately, we know what we are doing. IP Parcels can be supported in
all size ranges in the very near future and without changing out any
hardware, including links, bridges, switches and routers. They can be
supported in the near term by establishing an Adaptation Layer for the
Internet using AERO/OMNI. All that is required is for Parcel-capable end
systems to configure an OMNI interface as point of connection to a
virtual link spanning the network. The interface provides entry into an
adaptation layer below the IP layer but above the data link layer. The
adaptation layer uses encapsulation and two levels of fragmentation so
that even the largest possible parcel (~4MB) can be conveyed across the
virtual link with no change to any underlying network gear.

In my talk, perhaps I spent too much time trying to explain the
scenarios for discovering and using Parcel-capable links, when in fact,
if the OMNI virtual link is extended nearly to the edges of the network,
then the only need for Parcel-capable (physical) links would be at the
very extreme edges. And, no middleboxes would need to change.

> Getting into the details:
>
> Using "segment" as the term to describe the individual parcel contents
> is probably not a good idea, because TCP has segments but most other
> ULPs do not.

The term "segment" is also used by the Licklider Transmission Protocol
in the same way as for TCP, and I assume QUIC also has a ULP unit of
data that becomes the retransmission unit in case of loss. Perhaps I can
add a definition to the document that defines "segment" as "the ULP data
unit that becomes the retransmission unit in case of loss"?

> I don't completely follow the description as to exactly how one forms
> a parcel. For example, does each segment include an IPvX header? I
> think I read no, but it could be a little clearer.

This is a fair point, and the text can certainly be made clearer. As you
surmised, there is only a single IPvX header and J ULP segments, with J
ranging from 1 to 64.

> I think that requiring all the segments to be exactly the same size
> (except the last one) is a problem. It's definitely a problem for UDP.
> Even with TCP, it becomes difficult to use exactly N bytes in a packet
> -- it involves a very fragile dance between the ULP and the IP layer to
> communicate the exact size of the options being used. Things like IPSec
> make it impossible to use some MTU values and one needs to go a little
> under (so does fragmentation; although, that doesn't apply here). Given
> how difficult and fragile it is for an implementation to completely
> fill up to the MTU (or "L" in the context of your document), a
> reasonable design choice would be to make worst-case assumptions when
> telling the ULP how many bytes it has to work with.

The requirement of all segments being exactly the same size except
possibly the final one was borrowed from GSO/GRO and is practical for
all applications that are capable of using GSO/GRO. The same size
requirement means that segments can be re-ordered in flight between the
source and destination (i.e., if any parcel re-packaging occurs in the
path) and then put back together in a slightly different order than
their original packaging. And,
Re: [Int-area] IP Parcels improves performance for end systems
I think this is a really good idea; although, it could be a bridge too
far. It certainly won’t get implemented quickly. I still remember
implementing path MTU discovery thinking about the big performance win I
was going to get in IPv6 from 9K packets (vs 1500 byte packets) only to
discover that certain router vendors had interpreted one of the RFCs to
mean they didn’t have to send packet-too-big messages if the packet in
question was larger than 1500 bytes (because one of the RFCs stated that
IPv6 implementations only have to support 1500 bytes). Yes, the RFCs
could be interpreted that way, but that interpretation clearly violates
the intent (to make path MTU reliable). The real world upshot of that is
that anything over 1500 bytes is still a black hole and using path MTU
can really only get you a whopping 300 bytes (from 1200 to 1500).

Not trying to be negative, but 4M parcels are really looking to the
future and that's only a good thing if we realize what we are doing.

Getting into the details:

Using "segment" as the term to describe the individual parcel contents
is probably not a good idea, because TCP has segments but most other
ULPs do not.

I don't completely follow the description as to exactly how one forms a
parcel. For example, does each segment include an IPvX header? I think I
read no, but it could be a little clearer.

I think that requiring all the segments to be exactly the same size
(except the last one) is a problem. It's definitely a problem for UDP.
Even with TCP, it becomes difficult to use exactly N bytes in a packet
-- it involves a very fragile dance between the ULP and the IP layer to
communicate the exact size of the options being used. Things like IPSec
make it impossible to use some MTU values and one needs to go a little
under (so does fragmentation; although, that doesn't apply here). Given
how difficult and fragile it is for an implementation to completely fill
up to the MTU (or "L" in the context of your document), a reasonable
design choice would be to make worst-case assumptions when telling the
ULP how many bytes it has to work with.

If a packet is longer than 1500 bytes, some routers will not return
ICMPv6 messages. You shouldn't depend on that for performance.
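The segment-size rule questioned in the message above (all segments the same size, except possibly the last) can be sketched as follows. This is a rough illustration under stated assumptions, not the packaging procedure from the draft: the helper names are hypothetical, the limit of 64 segments is taken from the "J ranging from 1 to 64" remark elsewhere in this thread, and transport headers are ignored.

```python
from typing import List

MAX_SEGMENTS = 64  # "J ranging from 1 to 64", as stated in the thread


def pack_segments(data: bytes, seg_size: int) -> List[bytes]:
    """Split a ULP buffer into equal-size segments; only the last may be shorter."""
    if seg_size <= 0:
        raise ValueError("segment size must be positive")
    segments = [data[i:i + seg_size] for i in range(0, len(data), seg_size)]
    if not 1 <= len(segments) <= MAX_SEGMENTS:
        raise ValueError("a parcel carries between 1 and 64 segments")
    return segments


def follows_size_rule(segments: List[bytes]) -> bool:
    """True if every segment but the last has the same size and the last is no larger."""
    if not segments:
        return False
    first = len(segments[0])
    return (all(len(s) == first for s in segments[:-1])
            and 0 < len(segments[-1]) <= first)


segments = pack_segments(b"x" * 2500, 1000)
print([len(s) for s in segments])  # [1000, 1000, 500]
```

The sketch makes the concern visible: the sender has to commit to one `seg_size` up front, so any uncertainty about option or IPsec overhead pushes that value toward a worst-case estimate.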
Re: [Int-area] [EXTERNAL] Re: IP Parcels improves performance for end systems
> The nice thing about TSO/GSO/GRO is that they don't require any
> changes to the protocol, as they are just implementation techniques.
> Also, they're one-sided optimizations, meaning for instance that TSO
> can be used at the sender without requiring GRO to be used at the
> receiver. My understanding is that IP parcels requires a new protocol
> that would need to be implemented on both endpoints and possibly in
> some routers. Do you have data that shows the benefits of IP Parcels
> in light of these requirements?

The various segmentation offload optimizations done by NICs have a
number of drawbacks that the parcel scheme does not have:

1. They are still limited to 65K bytes.

2. The number of streams that can be optimized is significantly limited
   by memory in the NIC and usually by the size of the receive queue. In
   other words, it works some of the time, but not all of the time.

3. The NIC is responsible for checksum computation and validation. If
   one is operating in a fault tolerant environment, one wants the
   checksum computation/checking done in the fault tolerant domain
   (i.e., the host, not the NIC).
Re: [Int-area] IP Parcels improves performance for end systems
Tom, see below:

> -Original Message-
> From: Tom Herbert [mailto:t...@herbertland.com]
> Sent: Tuesday, March 22, 2022 10:00 AM
> To: Templin (US), Fred L
> Cc: Eggert, Lars ; int-area@ietf.org
> Subject: Re: [Int-area] IP Parcels improves performance for end systems
>
> On Tue, Mar 22, 2022 at 7:42 AM Templin (US), Fred L wrote:
> >
> > Lars, I did a poor job of answering your question. One of the most
> > important aspects of IP Parcels in relation to TSO and GSO/GRO is
> > that transports get to use a full 4MB buffer instead of the 64KB
> > limit in current practices. This is possible due to the IP Parcel
> > jumbo payload option encapsulation, which provides a 32-bit length
> > field instead of just a 16-bit one. By allowing the transport to
> > present the IP layer with a buffer of up to 4MB, it reduces the
> > overhead, minimizes system calls and interrupts, etc.
> >
> > So, yes, IP Parcels is very much about improving the performance for
> > end systems in comparison with current practice (GSO/GRO and TSO).
>
> Hi Fred,
>
> The nice thing about TSO/GSO/GRO is that they don't require any
> changes to the protocol, as they are just implementation techniques.
> Also, they're one-sided optimizations, meaning for instance that TSO
> can be used at the sender without requiring GRO to be used at the
> receiver. My understanding is that IP parcels requires a new protocol
> that would need to be implemented on both endpoints and possibly in
> some routers.

It is not entirely true that the protocol needs to be implemented on
both endpoints. Sources that send IP Parcels send them into a
Parcel-capable path which ends at either the final destination or a
router for which the next hop is not Parcel-capable. If the
Parcel-capable path extends all the way to the final destination, then
the Parcel is delivered to the destination, which knows how to deal with
it. If the Parcel-capable path ends at a router somewhere in the middle,
the router opens the Parcel and sends each enclosed segment as an
independent IP packet. The final destination is then free to apply GRO
to the incoming IP packets even if it does not understand Parcels.

IP Parcels is about efficient shipping and handling just like the major
online retailer service model I described during the talk. The goal is
to deliver the fewest and largest possible parcels to the final
destination rather than delivering lots of small IP packets. It is good
for both the network and the end systems. If this were not true, then
Amazon would send the consumer 50 small boxes with 1 item each instead
of 1 larger box with all 50 items inside. And, we all know what they
would choose to do.

> Do you have data that shows the benefits of IP Parcels in light of
> these requirements?

I have data that shows that GSO/GRO is good for packaging sizes up to
64KB even if the enclosed segments will require IP fragmentation upon
transmission. The data implies that even larger packaging sizes (up to a
maximum of 4MB) would be better still.

Fred

> Thanks,
> Tom
>
> > Thanks - Fred
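The forwarding behavior described in the reply above (a router at the end of a Parcel-capable path opens the Parcel and forwards each segment as its own IP packet, after which the destination may coalesce them) can be sketched roughly as below. Everything here is a simplification for illustration: `Packet`, `open_parcel` and `coalesce` are hypothetical names, the addresses come from the IPv6 documentation prefix, and the `index` field merely stands in for whatever a real transport uses to order data (for TCP, sequence numbers). Real GRO also matches headers and flows, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Packet:
    src: str
    dst: str
    index: int      # position of the segment within the original parcel
    payload: bytes


def open_parcel(src: str, dst: str, segments: List[bytes]) -> List[Packet]:
    """Router side: emit one independent packet per enclosed segment."""
    return [Packet(src, dst, i, seg) for i, seg in enumerate(segments)]


def coalesce(packets: Iterable[Packet]) -> bytes:
    """Receiver side: a stand-in for GRO-style merging of in-order payloads."""
    ordered = sorted(packets, key=lambda p: p.index)
    return b"".join(p.payload for p in ordered)


segments = [b"a" * 4, b"b" * 4, b"c" * 2]
packets = open_parcel("2001:db8::1", "2001:db8::2", segments)
print(len(packets))                 # 3
print(coalesce(reversed(packets)))  # b'aaaabbbbcc'
```

Note that the receiver in this sketch never sees a parcel, only ordinary packets, which is the sense in which the destination does not need to understand Parcels.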
Re: [Int-area] IP Parcels improves performance for end systems
On Tue, Mar 22, 2022 at 7:42 AM Templin (US), Fred L wrote:
>
> Lars, I did a poor job of answering your question. One of the most
> important aspects of IP Parcels in relation to TSO and GSO/GRO is that
> transports get to use a full 4MB buffer instead of the 64KB limit in
> current practices. This is possible due to the IP Parcel jumbo payload
> option encapsulation, which provides a 32-bit length field instead of
> just a 16-bit one. By allowing the transport to present the IP layer
> with a buffer of up to 4MB, it reduces the overhead, minimizes system
> calls and interrupts, etc.
>
> So, yes, IP Parcels is very much about improving the performance for
> end systems in comparison with current practice (GSO/GRO and TSO).

Hi Fred,

The nice thing about TSO/GSO/GRO is that they don't require any changes
to the protocol, as they are just implementation techniques. Also,
they're one-sided optimizations, meaning for instance that TSO can be
used at the sender without requiring GRO to be used at the receiver. My
understanding is that IP parcels requires a new protocol that would need
to be implemented on both endpoints and possibly in some routers. Do you
have data that shows the benefits of IP Parcels in light of these
requirements?

Thanks,
Tom

> Thanks - Fred
[Int-area] IP Parcels improves performance for end systems
Lars, I did a poor job of answering your question. One of the most
important aspects of IP Parcels in relation to TSO and GSO/GRO is that
transports get to use a full 4MB buffer instead of the 64KB limit in
current practices. This is possible due to the IP Parcel jumbo payload
option encapsulation, which provides a 32-bit length field instead of
just a 16-bit one. By allowing the transport to present the IP layer
with a buffer of up to 4MB, it reduces the overhead, minimizes system
calls and interrupts, etc.

So, yes, IP Parcels is very much about improving the performance for end
systems in comparison with current practice (GSO/GRO and TSO).

Thanks - Fred
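The size figures in the message above can be checked with a little arithmetic. The sketch below is illustrative only. It assumes that the ~4MB ceiling comes from 64 segments (the "J ranging from 1 to 64" stated elsewhere in this thread) of at most 65535 bytes each; the thread itself only says "~64KB" per segment, so that exact value and all constant and function names are assumptions. A 32-bit length field could in principle describe far more than 4MB, so the field width is what makes the larger buffer possible, not what sets the limit.

```python
MAX_16BIT_LENGTH = 2**16 - 1   # largest value a 16-bit length field can carry
MAX_32BIT_LENGTH = 2**32 - 1   # largest value a 32-bit length field can carry
MAX_SEGMENTS = 64              # from the thread: J ranges from 1 to 64
MAX_SEGMENT_SIZE = 65535       # assumption: the thread only says "~64KB"


def max_parcel_payload(segments: int = MAX_SEGMENTS,
                       segment_size: int = MAX_SEGMENT_SIZE) -> int:
    """Upper bound on the ULP bytes carried by one parcel under the assumptions above."""
    return segments * segment_size


def handoffs(total_bytes: int, buffer_limit: int) -> int:
    """How many buffers the transport must hand to the IP layer (ceiling division)."""
    return -(-total_bytes // buffer_limit)


largest = max_parcel_payload()
print(largest)                              # 4194240, i.e. roughly 4MB
print(largest > MAX_16BIT_LENGTH)           # True: does not fit a 16-bit length
print(largest <= MAX_32BIT_LENGTH)          # True: fits a 32-bit length
print(handoffs(largest, MAX_SEGMENT_SIZE))  # 64 handoffs with a ~64KB limit
print(handoffs(largest, largest))           # 1 handoff with a ~4MB buffer
```

The last two lines are the "fewer system calls and interrupts" argument in miniature: the same data crosses the transport/IP boundary once instead of 64 times.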
[Int-area] Note taker / jabber scribe for IntArea session @ IETF 113
Hi,

As usual, we will need a few people to help us with the notes and jabber
to run the meeting. Please send us a note if you can help us out.

Best,
Juan-Carlos & Wassim