>> The end-to-end argument applies:  Ultimately, there needs to be resequencing 
>> at the end anyway, so any reordering in the network would be a performance 
>> optimization.  It turns out that keeping packets lying around in some buffer 
>> somewhere in the network just to do resequencing before they exit an L2 
>> domain (or a tunnel) is a pessimization, not an optimization.
> 
>       I do not buy the end-to-end argument here, because in the extreme, why 
> do ARQ on individual links at all?  We could just leave it to the end-points 
> to do the ARQ, as TCP does anyway.

The optimization is that the retransmission on a single link (or within a path 
segment, which is what I’m interested in) does not need to span the entire 
end-to-end path.  That is strictly better than an end-to-end retransmission.  
Also, a local segment may allow faster recovery because it does not incur the 
entire e2e latency, which allows for strictly better latency.  So, yes, there 
are significant optimizations in doing local retransmissions, but there are 
also interesting interactions with end-to-end retransmission that need to be 
taken care of.  This has been known for a long time; see, e.g., 
https://tools.ietf.org/html/rfc3819#section-8, which documents things that 
were considered well known in the early 2000s.
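
To put rough numbers on this (a back-of-the-envelope sketch in Python; the 
RTT values and the detection time are assumptions picked for illustration, 
not measurements):

  link_rtt_ms = 4    # assumed RTT across the single lossy link
  e2e_rtt_ms = 80    # assumed RTT across the entire path

  def recovery_delay_ms(rtt_ms, detect_ms=1.0):
      # One retransmission costs roughly the time to detect the loss
      # plus one round trip over the segment that performs the ARQ.
      return detect_ms + rtt_ms

  print(recovery_delay_ms(link_rtt_ms))  # ~5 ms with link-local ARQ
  print(recovery_delay_ms(e2e_rtt_ms))   # ~81 ms with e2e ARQ only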

> The point is that transport-ARQ allows the use of link technologies that 
> otherwise would not be acceptable at all.  So doing ARQ on the individual 
> links already indicates that some things are more efficient when not done 
> only e2e.

Obviously.

> I just happen to think that re-ordering falls into the same category, at 
> least for users stuck behind a slow link as is typical at the edge of the 
> internet.

Resequencing (which is the term I prefer for putting things back in sequence 
again, after they have been reordered) requires storing packets that arrived 
ahead of packets sent before them.  This is strictly suboptimal if those 
packets could be delivered instead (in contrast, it *is* a good idea to 
resequence packets that are in a queue waiting for a transmission 
opportunity; see the sketch below).  So *requiring*(*) local path segments to 
resequence is strictly suboptimal.

(*) even if this is not a strict requirement, but just a statement of the form 
“the transport will be much more efficient if you deliver in order”.
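
To make the distinction concrete, here is a minimal Python sketch of the 
in-queue variant, assuming we had a sequence number in the packets to sort on 
(class and method names are mine, for illustration only):

  import heapq

  class InQueueResequencer:
      def __init__(self):
          self._heap = []    # (seq, packet), kept sorted while queued

      def enqueue(self, seq, packet):
          heapq.heappush(self._heap, (seq, packet))

      def dequeue(self):
          # When the port is free, send the lowest-sequence packet we
          # *have*; never hold the port idle to fill a sequence gap.
          if self._heap:
              return heapq.heappop(self._heap)[1]
          return None        # idle only when the queue is empty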

> To put numbers to my example, assume I am on a 1/1 Mbps link and I get TCP 
> data at a 1 Mbps rate in MTU-1500 packets (I am going to keep the numbers 
> approximate), and I get a burst of say 10 packets containing say 10 
> individual messages for my application, telling the position of say an 
> object in 3d space
> 
> each packet is going to "hog" the link for: 1000 ms/s * (1500 * 8 b/packet ) 
> / (1000 * 1000 b/s)  = 12 ms
> So I get access to messages/new positions every 12 ms and I can display this 
> smoothly

That is already broken by design.  If you are not accounting for latency 
variation (“jitter”), you won’t be able to deal with it.  Your example also 
makes sure it does not work well by being based on 100 % utilization.
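
Accounting for jitter at the receiver is not hard, by the way.  A minimal 
sketch of a fixed playout schedule (the 12 ms cadence is from your example; 
the size of the jitter allowance is an assumption of mine):

  interval_ms = 12          # per-message serialization time, from your example
  jitter_allowance_ms = 36  # assumed: absorbs up to 3 displaced packets

  def playout_time_ms(first_arrival_ms, msg_index):
      # Render message i at a fixed slot; early arrivals wait in the
      # buffer, and arrivals late by up to the allowance still make
      # their slot.
      return first_arrival_ms + jitter_allowance_ms + msg_index * interval_ms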

> Now if the first packet gets re-ordered to be last, I either drop that packet

…which is another nice function the network could do for you before expending 
further resources on useless delivery; see e.g. draft-ietf-6lo-deadline-time 
for one way to do this.

> and accept a 12 ms gap, or, if that is not an option, I get to wait 9*12 = 
> 108 ms before positions can be updated; that IMHO shows why re-ordering is 
> terrible even if TCP were more tolerant. 

You are assuming that the network can magically resequence a packet into place 
that it does not have.

Now I do understand that forwarding an out-of-order packet will block the 
output port for the time needed to serialize it.  So if you get it right before 
what would have been an in-order packet, the latter incurs additional latency.  
Note that this requires a bottleneck configuration, i.e., packets to be 
forwarded arrive faster than they can be serialized out.  Don’t do bottlenecks 
if you want ultra-low latency.  (And don’t do links where you need to 
retransmit, either.)
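
Reproducing the numbers from your example (same approximations):

  link_bps = 1_000_000       # 1 Mb/s link
  packet_bits = 1500 * 8     # MTU-1500 packet

  serialize_ms = 1000 * packet_bits / link_bps
  print(serialize_ms)        # 12.0 ms to serialize one packet

  # If the first packet of a 10-packet burst arrives last, in-order
  # delivery can only resume after the other 9 have been serialized:
  print(9 * serialize_ms)    # 108.0 ms worst-case head-of-line wait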

> Especially in the context of L4S something like this seems to be totally 
> unacceptable if ultra-low latency is supposed to be anything more than 
> marketing. 

Dropping packets that can’t be used anyway is strictly better than delivering 
them.
But apart from that, forwarding the packets that I do have is strictly better 
for low latency than leaving the output port idle, waiting for 
earlier-in-sequence packets so that everything can be sent out in order.
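
Both policies fit in a few lines.  A sketch, assuming packets carry a 
delivery deadline in the spirit of draft-ietf-6lo-deadline-time (the field 
name is made up):

  def next_packet_to_send(queue, now_ms):
      # Forward whatever we have; drop what can no longer be used.
      while queue:
          pkt = queue.pop(0)
          if pkt["deadline_ms"] < now_ms:
              continue       # expired: dropping beats delivering
          return pkt         # and never idle the port to wait for an
                             # earlier-in-sequence packet
      return None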

>> For three decades now, we have acted as if there is no cost for in-order 
>> delivery from L2 — not because that is true, but because deployed transport 
>> protocol implementations were built and tested with simple links that don’t 
>> reorder.  
> 
>       Well, that is similar to the argument for performing non-aligned loads 
> fast in hardware: yes, this comes with a considerable cost in complexity, 
> and it is harder to make this go fast than just allowing aligned loads and 
> fixing up unaligned loads by trapping to software; but from a user 
> perspective, the fast hardware beats the fickle “only aligned loads go 
> fast” approach any old day.

CPUs have an abundance of transistors to throw at this problem, so support 
for unaligned loads has become standard practice on CPUs that can afford it.
I’m not sure this argument transfers, because this is not about transistors 
(except maybe when we talk about in-queue resequencing, which would be a nice 
feature if we had information in the packets to allow it).

>> Techniques for ECMP (equal-cost multi-path) have been developed that appease 
>> that illusion, but they actually also are pessimizations at least in some 
>> cases.
> 
>       Sure, but if I understand correctly, this is partly due to the fact 
> that transport people opted not to do the re-sorting on a flow-by-flow 
> basis; that would solve the blocking issue from the transport perspective.  
> Sure, the affected flow would still suffer from some increased delay, but as 
> I tried to show above, that might still be smaller than the delay incurred 
> by doing the re-sorting after the bottleneck link.  What is wrong with my 
> analysis?

Transport people have no control over what is happening in the network, so 
maybe I don’t understand the argument.

>> The question at hand is whether we can make the move back to end-to-end 
>> resequencing techniques that work well,
> 
>       But we cannot; we can make TCP more robust, but what I predict is that 
> if RACK allows for a 100 ms delay, transports will take this as the new goal 
> and will keep pushing against that limit; and all in the name of bandwidth 
> over latency.

Where does this number come from?  100 ms is pretty long as a reordering 
maximum for most paths outside of satellite links.  Instead, you would do 
something based on an RTT estimate.
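
A sketch of what that could look like, in the spirit of RACK, which derives 
its reordering window from an RTT estimate (it starts at a quarter of the 
minimum RTT; this is an illustration, not the full algorithm):

  def reordering_window_ms(min_rtt_ms, factor=0.25):
      # Tolerate reordering for a fraction of the measured minimum
      # RTT instead of a fixed 100 ms.
      return factor * min_rtt_ms

  print(reordering_window_ms(20))    # 5.0 ms on a 20 ms path
  print(reordering_window_ms(600))   # 150.0 ms on a satellite-like path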

>> at least within some limits that we still have to find.
>> That probably requires some evolution at the end-to-end transport 
>> implementation layer.  We are in a better position to make that happen than 
>> we have been for a long time.
> 
>       Probably true, but also not very attractive from an end-user 
> perspective… unless this allows transport innovations that enable massively 
> more bandwidth at a smallish latency cost.

The argument against in-network resequencing is mostly a latency argument (but, 
as a second order effect, that reduced latency may also allow more throughput), 
so, again, I don’t quite understand.

Grüße, Carsten
