The new TCP send/recv socket buffer scaling uncovered a few bugs in our stack that seem to have been around for some time but were well hidden.

The first hunk removes a totally incorrect call of tcp_rscale(). It is
not allowed to recalculate the receive scaling value after the initial
SYN packet. We need to stick to the value the syncache calculated.
Because of this our window updates are totally wrong, which results in
a lot of "data after window" packets.

The second part has to do with the syncache not properly initializing
the initial TCP timestamp and by that causing havoc in 50% of all
connections (because of the timestamp modulation). This results in
non-working receive side buffer scaling and probably other nasty
effects.

Just to be sure, I also make sure that we do not use stack garbage in
the dummy tcpcb we pass to tcp_dooptions() in syn_cache_add().

These problems can be triggered by using tcpbench over the loopback.
-- 
:wq Claudio

Index: tcp_input.c
===================================================================
RCS file: /cvs/src/sys/netinet/tcp_input.c,v
retrieving revision 1.236
diff -u -p -r1.236 tcp_input.c
--- tcp_input.c	24 Sep 2010 02:59:45 -0000	1.236
+++ tcp_input.c	28 Sep 2010 16:27:14 -0000
@@ -765,11 +768,6 @@ findpcb:
 					if (tp == NULL)
 						goto badsyn;	/*XXX*/
 
-					/*
-					 * Compute proper scaling
-					 * value from buffer space
-					 */
-					tcp_rscale(tp, so->so_rcv.sb_hiwat);
 					goto after_listen;
 				}
 			} else {
@@ -3813,6 +3812,7 @@ syn_cache_get(struct sockaddr *src, stru
 #endif
 
 	tp->ts_modulate = sc->sc_modulate;
+	tp->ts_recent = sc->sc_timestamp;
 	tp->iss = sc->sc_iss;
 	tp->irs = sc->sc_irs;
 	tcp_sendseqinit(tp);
@@ -3993,6 +3993,7 @@ syn_cache_add(struct sockaddr *src, stru
 #else
 	if (optp) {
 #endif
+		bzero(&tb, sizeof(tb));
 		tb.pf = tp->pf;
 #ifdef TCP_SACK
 		tb.sack_enable = tp->sack_enable;