On 8/30/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: "Ian McDonald" <[EMAIL PROTECTED]>
> Date: Thu, 30 Aug 2007 09:32:38 +1200
>
> > So I'm suspecting that the default should be changed to 1000 to match
> > the RFC which would solve this issue. I note that the RFC is a SHOULD
> > rather than a MUST. I had a quick look around and not sure why Linux
> > overrides the RFC on this one.
>
> Everyone uses this value, even BSD since ancient times.
>
> None of the research folks want to commit to saying a lower value is
> OK, even though it's quite clear that on a local 10 gigabit link a
> minimum value of even 200 is absolutely and positively absurd.
>
I understand what you are saying. That is why I questioned it, as
200 msecs makes no sense on a LAN with < 1 msec RTT.

So if the current value is ridiculous and 1000 is even more so, why do
we use it? Just because that is how TCP is written, I'm guessing.

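To put rough numbers on the LAN case, here is a minimal userspace
sketch (not kernel code) of a floored retransmission timeout. It
assumes the usual TCP estimate RTO = SRTT + 4 * RTTVAR and leaves out
the clock granularity term. The function and macro names are made up
for illustration; only the 200 msec and 1000 msec floors come from the
discussion above.

```c
#include <stdint.h>

/* Floors under discussion, in microseconds (hypothetical macro names). */
#define RTO_FLOOR_CURRENT_US   200000u  /* the 200 msec minimum in use today */
#define RTO_FLOOR_RFC_US      1000000u  /* the 1000 msec the RFC says SHOULD be used */

/*
 * Hypothetical helper: compute SRTT + 4 * RTTVAR and round it up to
 * the given floor. All values are in microseconds.
 */
uint32_t rto_with_floor_us(uint32_t srtt_us, uint32_t rttvar_us,
                           uint32_t floor_us)
{
    uint32_t rto = srtt_us + 4u * rttvar_us;

    return rto > floor_us ? rto : floor_us;
}
```

With an SRTT of 500 usecs and an RTTVAR of 100 usecs the estimate is
900 usecs, so the result is whatever floor is passed in: the 200 msec
floor is more than 200 times the computed value, and the 1000 msec
floor more than 1000 times. With an SRTT of 600 msecs and an RTTVAR of
150 msecs the estimate is 1200 msecs and neither floor has any effect.
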
I know that in DCCP CCID3 the RTO is 4 x RTT (from memory - it might
be a slight variation) but we ended up putting a minimum on it, as you
also face a problem if it fires too frequently (i.e. when the link RTT
is in usecs). I might ask around on research lists and see why this
issue has never been revisited.

Now to the original issue - high RTT links. If that is an issue, and I
believe it would be, then it's probably better to do this on a per
route basis or similar, although then we're becoming a de facto
X x RTT type setup. Rereading the RFC, this actually doesn't seem to be
prohibited, and here is the code from DCCP CCID3 that we use:

        /*
         * Update timeout interval for the nofeedback timer.
         * We use a configuration option to increase the lower bound.
         * This can help avoid triggering the nofeedback timer too
         * often ('spinning') on LANs with small RTTs.
         */
        hctx->ccid3hctx_t_rto = max_t(u32, 4 * hctx->ccid3hctx_rtt,
                                      CONFIG_IP_DCCP_CCID3_RTO *
                                      (USEC_PER_SEC/1000));
        /*
         * Schedule no feedback timer to expire in
         * max(t_RTO, 2 * s/X)  =  max(t_RTO, 2 * t_ipi)
         */
        t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);

        ccid3_pr_debug("%s(%p), Scheduled no feedback timer to "
                       "expire in %lu jiffies (%luus)\n",
                       dccp_role(sk), sk,
                       usecs_to_jiffies(t_nfb), t_nfb);

        sk_reset_timer(sk, &hctx->ccid3hctx_no_feedback_timer,
                       jiffies + usecs_to_jiffies(t_nfb));

Maybe the TCP code could do this also (with a sysctl to turn the
behaviour on and off), and then it would save system administrators
from having to "tune" the TCP stack if they want this sort of
behaviour.

Ian
-- 
Web1: http://wand.net.nz/~iam4/
Web2: http://www.jandi.co.nz
Blog: http://iansblog.jandi.co.nz