On Mon, Feb 08, 2021 at 07:46:46PM +0100, Alexander Bluhm wrote:
> On Mon, Feb 08, 2021 at 07:03:59PM +0100, Jan Klemkow wrote:
> > On Mon, Feb 08, 2021 at 03:42:54PM +0100, Alexander Bluhm wrote:
> > > On Wed, Feb 03, 2021 at 11:20:04AM +0100, Claudio Jeker wrote:
> > > > Just commit it. OK claudio@
> > > > If people see problems we can back it out again.
> > >
> > > This has huge impact on TCP performance.
> > >
> > > http://bluhm.genua.de/perform/results/2021-02-07T00%3A01%3A40Z/perform.html
> > >
> > > For a single TCP connection between two OpenBSD boxes, throughput
> > > drops by 77% from 3.1 GBit/sec to 710 MBit/sec.  But with 100
> > > parallel connections the throughput overall increases by 5%.
> >
> > For single connections our kernel is limited to send out 4 max TCP
> > segments.  I don't see that, because I just measured with 10 and 30
> > streams in parallel.
> >
> > FreeBSD disabled it 20 years ago.
> > https://github.com/freebsd/freebsd-src/commit/d912c694ee00de5ea0f46743295a0fc603cab562
>
> TCP_MAXBURST was added together with SACK in rev 1.12 of tcp_output.c
> to our code base.
>
> ----------------------------
> revision 1.12
> date: 1998/11/17 19:23:02;  author: provos;  state: Exp;  lines: +239 -14;
> NewReno, SACK and FACK support for TCP, adapted from code for BSDI
> by Hari Balakrishnan (h...@lcs.mit.edu), Tom Henderson (t...@cs.berkeley.edu)
> and Venkat Padmanabhan (padma...@cs.berkeley.edu) as part of the
> Daedalus research group at the University of California,
> (http://daedalus.cs.berkeley.edu).  [I was able to do this on time spent
> at the Center for Information Technology Integration (citi.umich.edu)]
> ----------------------------
>
> > I would suggest to remove the whole feature.
>
> Sending 4 segments per call to tcp_output() cannot scale.  Bandwidth
> increases, window size grows, but segment size has been 1500 for decades.
>
> With this diff on top of jan's delay ACK behavior I get 4.1 GBit/sec
> over a single TCP connection using tcpbench -S1000000.  Before both
> changes it was only 3.0.
>
> I recommend removing TCP_MAXBURST like FreeBSD did.
>

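A rough sketch of the scaling argument quoted above, as a toy model only.
It is not the tcp_output() code: the function name, the 256 KB window and
the assumption that an uncapped call can drain the whole window are all
made up for illustration. It counts how many output calls are needed to
send one window when each call may emit at most a fixed number of
1500-byte segments.

```python
# Hypothetical model of a per-call burst cap; not the OpenBSD tcp_output() code.
SEGMENT_BYTES = 1500  # segment size mentioned in the thread


def calls_to_drain(window_bytes, maxburst=None):
    """Count output calls needed to send window_bytes when each call
    may emit at most maxburst segments (None means no cap)."""
    segments = -(-window_bytes // SEGMENT_BYTES)  # ceiling division
    if maxburst is None:
        # Assumption: without a cap, one call can send the whole window.
        return 1 if segments else 0
    return -(-segments // maxburst)  # ceiling division again


window = 256 * 1024  # assumed window size, for illustration only
print(calls_to_drain(window, maxburst=4))  # capped at 4 segments per call
print(calls_to_drain(window))              # no cap
```

With the cap fixed at 4 and the segment size fixed at 1500, the number of
calls grows linearly with the window, which is one way to read "cannot
scale".
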
I agree that this maxburst limit is no longer adequate.
The TCP NewReno RFC (RFC 6582) has the following:

   In Section 3.2, step 3 above, it is noted that implementations should
   take measures to avoid a possible burst of data when leaving fast
   recovery, in case the amount of new data that the sender is eligible
   to send due to the new value of the congestion window is large.  This
   can arise during NewReno when ACKs are lost or treated as pure window
   updates, thereby causing the sender to underestimate the number of
   new segments that can be sent during the recovery procedure.
   Specifically, bursts can occur when the FlightSize is much less than
   the new congestion window when exiting from fast recovery.  One
   simple mechanism to avoid a burst of data when leaving fast recovery
   is to limit the number of data packets that can be sent in response
   to a single acknowledgment.  (This is known as "maxburst_" in ns-2
   [NS].)  Other possible mechanisms for avoiding bursts include
   rate-based pacing, or setting the slow start threshold to the
   resultant congestion window and then resetting the congestion window
   to FlightSize.  A recommendation on the general mechanism to avoid
   excessively bursty sending patterns is outside the scope of this
   document.

While I agree that bursts need to be limited, I think the implementation
of TCP_MAXBURST is bad.  Since FreeBSD removed the code, I guess nobody
really ran into issues of additional packet loss because of the bursts.

So go ahead and remove it. OK claudio@
-- 
:wq Claudio