Re: svn commit: r221346 - head/sys/netinet

John Baldwin Sat, 14 May 2011 07:38:29 -0700

On 5/14/11 6:37 AM, Mikolaj Golub wrote:

Hi,


On Mon, 2 May 2011 21:05:52 +0000 (UTC) John Baldwin wrote:

  JB>  Author: jhb
  JB>  Date: Mon May  2 21:05:52 2011
  JB>  New Revision: 221346
  JB>  URL: http://svn.freebsd.org/changeset/base/221346

  JB>  Log:
  JB>    Handle a rare edge case with nearly full TCP receive buffers.  If a TCP
  JB>    buffer fills up causing the remote sender to enter into persist mode,
  JB>    but there is still room available in the receive buffer when a window
  JB>    probe arrives (either due to window scaling, or due to the local
  JB>    application very slowly draining data from the receive buffer), then
  JB>    the single byte of data in the window probe is accepted.  However,
  JB>    this can cause rcv_nxt to be greater than rcv_adv.  This condition
  JB>    will only last until the next ACK packet is pushed out via
  JB>    tcp_output(), and since the previous ACK advertised a zero window,
  JB>    the ACK should be pushed out while the TCP pcb is write-locked.
  JB>
  JB>    During the window while rcv_nxt is greater than rcv_adv, a few places
  JB>    would compute the remaining receive window via rcv_adv - rcv_nxt.
  JB>    However, this value was then (uint32_t)-1.  On a 64 bit machine this
  JB>    could expand to a positive 2^32 - 1 when cast to a long.  In
  JB>    particular, when calculating the receive window in tcp_output(), the
  JB>    result would be that the receive window was computed as 2^32 - 1,
  JB>    resulting in advertising a far larger window to the remote peer than
  JB>    actually existed.
  JB>
  JB>    Fix various places that compute the remaining receive window to either
  JB>    assert that it is not negative (i.e. rcv_nxt <= rcv_adv), or treat the
  JB>    window as full if rcv_nxt is greater than rcv_adv.
  JB>
  JB>    Reviewed by:        bz
  JB>    MFC after:        1 month
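
The arithmetic described in the log above can be reproduced in user space. The following is a minimal sketch, not the kernel code: the helper names (seq_gt, window_naive, window_clamped, window_asserted) are hypothetical, int64_t stands in for a 64-bit long, and assert() stands in for KASSERT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names are FreeBSD APIs. */

/* Sequence-space "a > b", in the style of the kernel's SEQ_GT() macro. */
static int seq_gt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/*
 * The problematic pattern: the subtraction is done in 32-bit unsigned
 * arithmetic, so a "negative" result wraps before it is widened.
 */
static int64_t window_naive(uint32_t rcv_adv, uint32_t rcv_nxt)
{
    return (int64_t)(rcv_adv - rcv_nxt);
}

/* One fix from the log: treat the window as full when rcv_nxt > rcv_adv. */
static int64_t window_clamped(uint32_t rcv_adv, uint32_t rcv_nxt)
{
    if (seq_gt(rcv_nxt, rcv_adv))
        return 0;
    return (int64_t)(rcv_adv - rcv_nxt);
}

/* The other fix: assert that rcv_nxt <= rcv_adv before subtracting. */
static int64_t window_asserted(uint32_t rcv_adv, uint32_t rcv_nxt)
{
    assert(!seq_gt(rcv_nxt, rcv_adv));
    return (int64_t)(rcv_adv - rcv_nxt);
}
```

With rcv_adv = 1000 and rcv_nxt = 1001 (one window-probe byte accepted past the advertised edge), window_naive() yields 4294967295 while window_clamped() yields 0.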

  JB>  Modified:
  JB>    head/sys/netinet/tcp_input.c
  JB>    head/sys/netinet/tcp_output.c
  JB>    head/sys/netinet/tcp_timewait.c

  JB>  Modified: head/sys/netinet/tcp_input.c
  JB>  ==============================================================================
  JB>  --- head/sys/netinet/tcp_input.c        Mon May  2 21:04:37 2011        (r221345)
  JB>  +++ head/sys/netinet/tcp_input.c        Mon May  2 21:05:52 2011        (r221346)
  JB>  @@ -1831,6 +1831,9 @@ tcp_do_segment(struct mbuf *m, struct tc
  JB>           win = sbspace(&so->so_rcv);
  JB>           if (win < 0)
  JB>                   win = 0;
  JB>  +        KASSERT(SEQ_GEQ(tp->rcv_adv, tp->rcv_nxt),
  JB>  +            ("tcp_input negative window: tp %p rcv_nxt %u rcv_adv %u", tp,
  JB>  +            tp->rcv_adv, tp->rcv_nxt));

I am hitting this assertion when running tests with HAST (both the primary
and the secondary HAST instances are on the same host).

HAST synchronizes data in MAXPHYS (131072 byte) blocks. The sender splits each
block into smaller chunks of MAX_SEND_SIZE (32768 bytes), while the receiver
reads the whole block with a single recv() call using the MSG_WAITALL option.
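
The traffic pattern described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a local AF_UNIX socketpair and fork() rather than a TCP connection, so it shows the chunked send and MSG_WAITALL receive but does not by itself exercise the TCP code path in question. The function names are hypothetical and are not taken from HAST; the two sizes are the ones quoted in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLOCK_SIZE    131072  /* MAXPHYS, as quoted in the message */
#define MAX_SEND_SIZE 32768   /* sender-side chunk size, as quoted */

/* Send len bytes in pieces of at most chunk bytes; returns 0 on success. */
static int send_chunked(int fd, const unsigned char *buf, size_t len,
    size_t chunk)
{
    assert(chunk > 0);
    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t n = send(fd, buf, want, 0);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Ask the kernel for the whole block in a single call. */
static ssize_t recv_whole(int fd, unsigned char *buf, size_t len)
{
    return recv(fd, buf, len, MSG_WAITALL);
}

/* Child sends one block in chunks; parent returns the bytes it received. */
static ssize_t chunked_roundtrip(void)
{
    int sv[2];
    unsigned char *buf = malloc(BLOCK_SIZE);

    if (buf == NULL)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        free(buf);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Sender side. */
        close(sv[0]);
        memset(buf, 0xAB, BLOCK_SIZE);
        _exit(send_chunked(sv[1], buf, BLOCK_SIZE, MAX_SEND_SIZE) == 0 ? 0 : 1);
    }
    /* Receiver side. */
    close(sv[1]);
    ssize_t got = recv_whole(sv[0], buf, BLOCK_SIZE);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    return got;
}
```

To approximate the reported setup more closely, the socketpair would have to be replaced by a TCP connection over the loopback interface, which is an assumption about the test environment rather than something stated in the message.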


Can you capture a tcpdump (probably easiest to do from the other host)?

--
John Baldwin
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Re: svn commit: r221346 - head/sys/netinet
