On May 15, 2011, at 11:49 PM, Fred Baker wrote:

> 
> On May 15, 2011, at 11:28 AM, Jonathan Morton wrote:
>> The fundamental thing is that the sender must be able to know when sent 
>> frames can be flushed from the buffer because they don't need to be 
>> retransmitted.  So if there's a NACK, there must also be an ACK - at which 
>> point the ACK serves the purpose of the NACK, as it does in TCP.  The only 
>> alternative is a wall-time TTL, which is doable on single hops but requires 
>> careful design.
> 
> To a point. NORM holds a frame for possible retransmission for a stated 
> period of time, and if retransmission isn't requested in that interval 
> forgets it. So the ack isn't actually necessary; what is necessary is that 
> the retention interval be long enough that a nack has a high probability of 
> succeeding in getting the message through.

Okay, so because it can fall back to TCP's retransmit, the retention 
requirements can be relaxed.
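To make the NORM-style retention idea concrete, here is a minimal sketch of a sender-side buffer that holds frames for a fixed wall-time interval and forgets them afterwards. The class and method names are mine, purely for illustration, not anything from the NORM spec:

```python
import time

class RetentionBuffer:
    """Sketch of a NORM-style sender buffer: frames are retained for a
    fixed wall-time interval and then forgotten, so no ACK is needed --
    only a NACK arriving within the interval can be served."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.frames = {}  # seq -> (payload, deadline)

    def sent(self, seq, payload, now=None):
        now = time.monotonic() if now is None else now
        self.frames[seq] = (payload, now + self.retention_s)

    def expire(self, now=None):
        # Forget frames whose retention interval has elapsed.
        now = time.monotonic() if now is None else now
        self.frames = {s: v for s, v in self.frames.items() if v[1] > now}

    def nack(self, seq, now=None):
        # Return the payload for retransmission, or None if it has
        # already been forgotten (the NACK came too late).
        now = time.monotonic() if now is None else now
        self.expire(now)
        entry = self.frames.get(seq)
        return entry[0] if entry else None
```

The point of the design is visible in `nack`: a late NACK simply gets nothing, and recovery falls back to the end-to-end (TCP) retransmission.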

>> ...recent versions of Ethernet *do* support a throttling feedback mechanism, 
>> and this can and should be exploited to tell the edge host or router that 
>> ECN *might* be needed.  Also, with throttling feedback throughout the LAN, 
>> the Ethernet can for practical purposes be treated as almost-reliable.  This 
>> is *better* in terms of packet loss than ARQ or NACK, although if the 
>> Ethernet's buffers are large, it will still increase delay.  (With small 
>> buffers, it will just decrease throughput to the capacity, which is fine.)
> 
> It increases the delay anyway. It just pushes the retention buffer to another 
> place. What do you think the packet is doing during the "don't transmit" 
> interval?

Most packets delayed by Ethernet throttling would, with small buffers, end up 
waiting in the sending host (or router).  They thus spend more time in a 
potentially active queue instead of in a dumb one.  But even if the host queue 
is dumb, the overall delay is no worse than with the larger Ethernet buffers.

> Throughput never exceeds capacity. If I have a 10 Gbps link, I will never get 
> more than 10 Gbps through it. Buffer fill rate is statistically predictable. 
> With small buffers, the fill rate achieves the top sooner. They increase the 
> probability that the buffers are full, which is to say the drop probability. 
> Which puts us back to end-to-end retransmission, which is the worst case of 
> what you were worried about.

Let's suppose someone has generously provisioned an office with GigE 
throughout, using a two-level hierarchy of switches.  Some dumb schmuck then 
schedules every single computer to run its backups (to a single fileserver) at 
the same time.  That's, say, 100 computers all competing for one GigE link to 
the fileserver.  If the switches are fair, each computer should get 10 Mbps - 
that's the capacity.

With throttling, each computer sees the link closed 99% of the time.  It can 
send at link rate for the remaining 1% of the time.  On medium timescales, that 
looks like a 10 Mbps bottleneck at the first link.  So the throughput on that 
link equals the capacity, and hopefully the goodput matches it.  The only 
queue that is likely to overflow is the one on the sending computer, and one 
would hope there is enough feedback in a host's own TCP/IP stack to prevent 
that.
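The arithmetic behind that scenario is worth spelling out (the numbers below are just the assumptions stated above - 100 hosts, one GigE bottleneck):

```python
link_rate_bps = 1_000_000_000  # the single GigE link to the fileserver
n_hosts = 100                  # computers all backing up at once

# Fair share of the bottleneck: capacity divided by the number of hosts.
fair_share_bps = link_rate_bps / n_hosts          # 10 Mbps per host

# With PAUSE-style throttling, each host transmits at full link rate for
# some fraction of the time; the duty cycle that matches the fair share
# is 1/N, i.e. each host sees the link "closed" 99% of the time.
duty_cycle = fair_share_bps / link_rate_bps       # 0.01

print(fair_share_bps / 1e6, "Mbps at duty cycle", duty_cycle)
```

So on medium timescales a 1% duty cycle at 1 Gbps is indistinguishable from a steady 10 Mbps bottleneck, which is exactly the fair share.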

Without throttling but with ARQ, NACK or whatever you want to call it, the host 
has no signal to tell it to slow down - so the throughput on the edge link is 
more than 10 Mbps (but the goodput will be less).  The buffer in the outer 
switch fills up - no matter how big or small it is - and starts dropping 
packets.  The switch then won't ask for retransmission of packets it's just 
dropped, because it has nowhere to put them.  The same process then repeats at 
the inner switch.  Finally, the server sees the missing packets, and asks for 
the retransmission - but these requests have to be switched all the way back to 
the clients, because the missing packets aren't in the switches' buffers.  It's 
therefore no better than a TCP SACK retransmission.
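The cascade can be shown with a toy model of one switch interval. The function name and numbers are mine, chosen only to match the 100-host scenario above:

```python
def forward(arrivals, buffer_size, drain):
    """Toy model of one switch interval: 'arrivals' packets come in, at
    most 'drain' can leave, and at most 'buffer_size' can wait.  Returns
    (delivered, queued, dropped).  The dropped packets are simply gone:
    the switch cannot serve a retransmission request for frames it never
    had room to store."""
    delivered = arrivals[:drain]
    queued = arrivals[drain:drain + buffer_size]
    dropped = arrivals[drain + buffer_size:]
    return delivered, queued, dropped

# 100 hosts each contribute a packet in one interval; the edge switch
# forwards one and buffers a few dozen.  Everything else is lost and
# must be recovered from the original senders, end to end -- exactly as
# with a TCP SACK retransmission.
delivered, queued, dropped = forward(list(range(100)), buffer_size=32, drain=1)
```

Whatever `buffer_size` is, once it is exceeded the `dropped` list is non-empty and link-level retransmission has nothing to work with.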

So there you have a classic congested network scenario in which throttling 
solves the problem, but link-level retransmission can't.

Where ARQ and/or NACK come in handy is where the link itself is unreliable, 
such as on WLANs (hence the use in amateur radio) and last-mile links.  In that 
case, the reason for the packet loss is not a full receive buffer, so asking 
for a retransmission is not inherently self-defeating.

> I'm not going to argue against letting retransmission go end to end; it's an 
> endless debate. I'll simply note that several link layers, including but not 
> limited to those you mention, find that applications using them work better 
> if there is a high probability of retransmission in an interval on the 
> order of the link RTT as opposed to the end to end RTT. You brought up data 
> centers (aka variable delays in LAN networks); those have been heavily the 
> province of Fibre Channel, which is a link layer protocol with retransmission. 
> Think about it.

What I'd like to see is a complete absence of need for retransmission on a 
properly built wired network.  Obviously the capability still needs to be there 
to cope with the parts that aren't properly built or aren't wired, but TCP can 
do that. Throttling (in the form of Ethernet PAUSE) is simply the third 
possible method of signalling congestion in the network, alongside delay and 
loss - and it happens to be quite widely deployed already.
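For reference, an 802.3x PAUSE frame carries a 16-bit pause_time field measured in quanta of 512 bit times, so the actual pause duration scales inversely with link rate:

```python
# IEEE 802.3x PAUSE: pause_time is a 16-bit count of 512-bit-time quanta.
def pause_seconds(pause_quanta, link_rate_bps):
    """Duration of a PAUSE of the given quanta count at the given rate."""
    return pause_quanta * 512 / link_rate_bps

# The longest single PAUSE on GigE: 0xFFFF quanta, about 33.6 ms.
max_pause_gige = pause_seconds(0xFFFF, 1_000_000_000)
print(max_pause_gige)
```

At 10 Gbps the same maximum quanta count is only ~3.4 ms, which is one reason PAUSE has to be refreshed rather than sent once.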

 - Jonathan

_______________________________________________
Bloat mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/bloat