Here's the consensus from the bugwash on IRC:

- we'll add stats counters for SESS_CLOSE
- then we'll check on larger production systems what impact increasing
  timeout_req to 7 seconds has by
  - comparing the RX_TIMEOUT to sess_conn
  - comparing tcp counters
scn wants to provide advice on which tcp stats to watch.

Nils

On 15/03/15 18:36, Nils Goroll wrote:
> Hi,
>
> some time has passed since my initial email regarding this suggestion and it
> still holds.
>
> Unless there is a strong argument against it, I think we really should
> increase the default timeout_req to 7 seconds. I think the argumentation for
> this value is sound and I haven't found any reasons against it.
>
> Please keep this suggestion separate from the suggestion to re-introduce
> SO_LINGER. I still need to do production system tests with it.
>
> Nils
>
> On 26/02/15 11:27, Nils Goroll wrote:
>> This tcpdump output illustrates an issue we seem to have with default Linux
>> tcp timeouts and the default timeout_req of 2 seconds:
>>
>> 16:47:44.542049 IP client.49550 > varnish.80: Flags [S], seq 29295818, win 4380, options [mss 1460,sackOK,eol], length 0
>> 16:47:44.542080 IP varnish.80 > client.49550: Flags [S.], seq 3652568857, ack 29295819, win 29200, options [mss 1460,nop,nop,sackOK], length 0
>> 16:47:44.542250 IP client.49550 > varnish.80: Flags [.], ack 1, win 4380, length 0
>> 16:47:46.080501 IP client.49550 > varnish.80: Flags [P.], seq 1:1453, ack 1, win 4380, length 1452
>> 16:47:46.080528 IP varnish.80 > client.49550: Flags [.], ack 1453, win 31944, length 0
>> 16:47:48.082783 IP varnish.80 > client.49550: Flags [F.], seq 1, ack 1453, win 31944, length 0
>> 16:47:48.083070 IP client.49550 > varnish.80: Flags [.], ack 2, win 4380, length 0
>> 16:47:48.350763 IP client.49550 > varnish.80: Flags [P.], seq 1453:2905, ack 2, win 4380, length 1452
>> 16:47:48.350792 IP varnish.80 > client.49550: Flags [R], seq 3652568859, win 0, length 0
>>
>> The packet at 16:47:46.080501 contains the first part of a request up to the
>> start of a very long cookie line.
>>
>> At 16:47:48 varnish closes after reaching timeout_req of 2s. Then, the client
>> immediately acks.
>>
>> My understanding is that the varnish->client ack 1453 got lost and the client
>> did not get around to retransmitting seq 1:1453 before we timed out.
>>
>>
>> The most helpful online reference regarding recommended initial tcp
>> retransmission timeouts I have found so far is
>> http://tools.ietf.org/html/rfc6298#ref-PA00
>>
>> In summary, an initial timeout (RTO) of 1s is now recommended, but the former
>> 3s RTO remains valid. So, for any client following the former 3s
>> recommendation, we currently don't even tolerate a single packet
>> retransmission after the 3-way handshake is complete. For those clients
>> following the new 1s recommended RTO, timing is also really tight: it seems
>> unlikely that we tolerate retransmission of two packets.
>>
>> Based on this, I'd suggest to raise the default timeout_req to 7 seconds to
>> allow for two retransmissions at RTO=3.
>>
>> This seems to be particularly relevant with the growing popularity of mobile
>> clients.
>>
>> The risk is increased resource usage for malicious requests. To address it,
>> I'd suggest to document that lowering timeout_req can be an option to mitigate
>> certain DoS (slowloris) attacks.
>>
>>
>> Nils
>>
>>
>> _______________________________________________
>> varnish-dev mailing list
>> [email protected]
>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
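
The retransmission timing argument in the quoted message can be checked with a
small model. This is only a rough sketch: the function names
retransmit_offsets and retransmissions_within are made up for illustration, it
ignores round-trip time and any RTO adjustment from RTT measurements, and it
assumes the timer starts when the lost segment was first sent. The backoff
factor is a parameter: 2.0 models the doubling described in RFC 6298, while
1.0 models a constant retransmission interval.

```python
def retransmit_offsets(initial_rto, count, backoff=2.0):
    """Return the times (seconds after the original send) of the first
    `count` retransmissions, starting from `initial_rto` and multiplying
    the timer by `backoff` after each retransmission."""
    offsets = []
    elapsed = 0.0
    rto = float(initial_rto)
    for _ in range(count):
        elapsed += rto           # the timer expires, the segment is resent
        offsets.append(elapsed)
        rto *= backoff           # back off the timer for the next attempt
    return offsets


def retransmissions_within(timeout, initial_rto, backoff=2.0, limit=16):
    """Count how many retransmissions happen no later than `timeout`
    seconds after the original send (only the first `limit` are checked)."""
    offsets = retransmit_offsets(initial_rto, limit, backoff)
    return sum(1 for t in offsets if t <= timeout)


if __name__ == "__main__":
    for timeout in (2.0, 7.0):
        for rto in (1.0, 3.0):
            for backoff in (1.0, 2.0):
                n = retransmissions_within(timeout, rto, backoff)
                print(f"timeout={timeout}s RTO={rto}s backoff={backoff}: "
                      f"{n} retransmission(s) fit")
```

In this model, a 2 second timeout leaves no room for any retransmission at
RTO=3. With a constant interval, two retransmissions at RTO=3 (at 3s and 6s)
fit within 7 seconds. With doubling, the second one would only arrive at 9s,
while RTO=1 gives retransmissions at 1s, 3s and 7s.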
