On Wed, Jun 15, 2016 at 1:38 PM, Eric Dumazet <eduma...@google.com> wrote:
>
> On Wed, Jun 15, 2016 at 1:34 PM, Daniel Metz <dm...@mytum.de> wrote:
> > Yuchung Cheng | 2016-06-15 20:02:
> >> Let me explain in a different way:
> >>
> >> * RFC6298 applies a lower bound of 1 second to RTO (section 2.4)
> >>
> >> * Linux currently applies a lower bound of 200ms (min_rto) to
> >> K*RTTVAR, but /not/ RTO itself.
> >>
> >> * This patch applies the lower bound of 200ms to RTO, similar to RFC6298
> >>
> >>
> >> Let's say the SRTT is 100ms and RTT variations is 10ms. The variation
> >> is low because we've been sending large chunks, and RTT is fairly
> >> stable, and we sample on every ACK. The RTOs produced are
> >>
> >> RFC6298: RTO=1s
> >> Linux: RTO=300ms
> >> This patch: RTO=200ms
> >>
> >> Then we send 1 packet out. The receiver delays the ACK up to 200ms.
> >> The actual RTT can be longer because other network components further
> >> delay the data or the ACK. This patch would surely fire the RTO
> >> spuriously.
> >>
> >> so we can either implement RFC6298 faithfully, or apply the
> >> lower-bound as-is, or something in between. But the current patch
> >> as-is is more aggressive. Did I miss something?
> >
> > Thank you for the clarification. The fundamental thought of this patch was
> > to decrease Linux RTO overestimation. This also involved not clinging to the
> > RFC 6298 MinRTO of 1 second ((2.4) "[...] at the same time acknowledging
> > that at some future point, research may show that a smaller minimum RTO is
> > acceptable or superior."). A more aggressive RTO will of course increase the
> > amount of Spurious Retransmission. The question is, if the benefit is higher
> > than the sacrifice. The tests we conducted did not show significant negative
> > impact so far. However, for sender-limited TCP flows the results were
> > promising.
> >
>
> I guess the problem is that some folks use smaller rto than
> RTAX_RTO_MIN , look at tcp_rto_min()

Also, many other stacks (e.g., Windows until very recently) do not have
40ms delayed ACKs like Linux.

One thing we at least know is that the current 200ms lower bound on
K*RTTVAR has worked for a long time. That's why I propose keeping it.
In other words, change the RTT variation averaging, but not the lower
bound.

I will try to get an experiment going to test different min_rto values
so we have more data.
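
For reference, here is a rough sketch of the three calculations from the quoted example. It is plain Python, not kernel code. The function names are made up for illustration, K=4 is taken from RFC 6298, and the clock-granularity term and the kernel's fixed-point scaling are left out.

```python
# Illustrative only: these names are hypothetical, not kernel identifiers.
K = 4  # RTTVAR multiplier, as in RFC 6298


def rto_rfc6298(srtt_ms, rttvar_ms, min_rto_ms=1000):
    """RFC 6298 style: the lower bound applies to the whole RTO."""
    return max(min_rto_ms, srtt_ms + K * rttvar_ms)


def rto_linux_current(srtt_ms, rttvar_ms, min_rto_ms=200):
    """Current Linux behaviour as described above: bound K*RTTVAR only."""
    return srtt_ms + max(min_rto_ms, K * rttvar_ms)


def rto_with_patch(srtt_ms, rttvar_ms, min_rto_ms=200):
    """Patch under discussion: 200ms bound applied to the whole RTO."""
    return max(min_rto_ms, srtt_ms + K * rttvar_ms)


if __name__ == "__main__":
    srtt_ms, rttvar_ms = 100, 10  # values from the example above
    print("RFC6298:   ", rto_rfc6298(srtt_ms, rttvar_ms), "ms")        # 1000 ms
    print("Linux:     ", rto_linux_current(srtt_ms, rttvar_ms), "ms")  # 300 ms
    print("This patch:", rto_with_patch(srtt_ms, rttvar_ms), "ms")     # 200 ms
```

With SRTT=100ms and a receiver that delays the ACK by up to 200ms, a single-packet send can see an RTT of about 300ms, which is above the 200ms the patched formula gives.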