Hi Andre and RE team,

I've had a patch sitting in re@'s inbox for this problem since 15th Sep and have been waiting for their go-ahead to commit. The patch I submitted is at:

http://people.freebsd.org/~lstewart/patches/misctcp/tcpreassstackfix_9.x.r225576.diff

The proposed commit message was:

##########
Use a backup (stack allocated) struct tseg_qent when we are unable to allocate one from the TCP reassembly UMA zone and the incoming segment is the one we've been waiting for (i.e. th_seq == rcv_nxt). This avoids TCP connections stalling when the zone limit is reached.

PR:        kern/155407
Reported by:    Slawa Olhovchenkov and Steven Hartland
Tested by:    Steven Hartland
Submitted by:    andre
Reviewed by:    jhb
Approved by:    re (?)
MFC after:    1 week
##########

I feel the logging changes should have been committed separately to the fix, but other than that, what you committed achieves the same thing as the patch I proposed.

I should have updated the ML thread to say it was submitted and awaiting approval, so you weren't to know.

Anyhoo, I guess I'll leave it up to you and re@ to sort out how you want to proceed, but wanted to make sure everyone was on the same page as RE would have gotten confused when you requested your patch be MFCed.

Cheers,
Lawrence

On 10/08/11 03:39, Andre Oppermann wrote:
Author: andre
Date: Fri Oct  7 16:39:03 2011
New Revision: 226113
URL: http://svn.freebsd.org/changeset/base/226113

Log:
   Prevent TCP sessions from stalling indefinitely in reassembly
   when reaching the zone limit of reassembly queue entries.

   When the zone limit was reached not even the missing segment
   that would complete the sequence space could be processed
   preventing the TCP session forever from making any further
   progress.

   Solve this deadlock by using a temporary on-stack queue entry
   for the missing segment followed by an immediate dequeue again
   by delivering the contiguous sequence space to the socket.

   Add logging under net.inet.tcp.log_debug for reassembly queue
   issues.

   Reviewed by: lsteward (previous version)
   Tested by:   Steven Hartland<killing-at-multiplay.co.uk>
   MFC after:   3 days

Modified:
   head/sys/netinet/tcp_reass.c

Modified: head/sys/netinet/tcp_reass.c
==============================================================================
--- head/sys/netinet/tcp_reass.c        Fri Oct  7 16:09:44 2011        
(r226112)
+++ head/sys/netinet/tcp_reass.c        Fri Oct  7 16:39:03 2011        
(r226113)
@@ -177,7 +177,9 @@ tcp_reass(struct tcpcb *tp, struct tcphd
        struct tseg_qent *nq;
        struct tseg_qent *te = NULL;
        struct socket *so = tp->t_inpcb->inp_socket;
+       char *s = NULL;
        int flags;
+       struct tseg_qent tqs;

        INP_WLOCK_ASSERT(tp->t_inpcb);

@@ -215,19 +217,40 @@ tcp_reass(struct tcpcb *tp, struct tcphd
                TCPSTAT_INC(tcps_rcvmemdrop);
                m_freem(m);
                *tlenp = 0;
+               if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) 
{
+                       log(LOG_DEBUG, "%s; %s: queue limit reached, "
+                           "segment dropped\n", s, __func__);
+                       free(s, M_TCPLOG);
+               }
                return (0);
        }

        /*
         * Allocate a new queue entry. If we can't, or hit the zone limit
         * just drop the pkt.
+        *
+        * Use a temporary structure on the stack for the missing segment
+        * when the zone is exhausted. Otherwise we may get stuck.
         */
        te = uma_zalloc(V_tcp_reass_zone, M_NOWAIT);
-       if (te == NULL) {
+       if (te == NULL&&  th->th_seq != tp->rcv_nxt) {
                TCPSTAT_INC(tcps_rcvmemdrop);
                m_freem(m);
                *tlenp = 0;
+               if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) 
{
+                       log(LOG_DEBUG, "%s; %s: global zone limit reached, "
+                           "segment dropped\n", s, __func__);
+                       free(s, M_TCPLOG);
+               }
                return (0);
+       } else if (th->th_seq == tp->rcv_nxt) {
+               bzero(&tqs, sizeof(struct tseg_qent));
+               te =&tqs;
+               if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) 
{
+                       log(LOG_DEBUG, "%s; %s: global zone limit reached, "
+                           "using stack for missing segment\n", s, __func__);
+                       free(s, M_TCPLOG);
+               }
        }
        tp->t_segqlen++;

@@ -304,6 +327,8 @@ tcp_reass(struct tcpcb *tp, struct tcphd
        if (p == NULL) {
                LIST_INSERT_HEAD(&tp->t_segq, te, tqe_q);
        } else {
+               KASSERT(te !=&tqs, ("%s: temporary stack based entry not "
+                   "first element in queue", __func__));
                LIST_INSERT_AFTER(p, te, tqe_q);
        }

@@ -327,7 +352,8 @@ present:
                        m_freem(q->tqe_m);
                else
                        sbappendstream_locked(&so->so_rcv, q->tqe_m);
-               uma_zfree(V_tcp_reass_zone, q);
+               if (q !=&tqs)
+                       uma_zfree(V_tcp_reass_zone, q);
                tp->t_segqlen--;
                q = nq;
        } while (q&&  q->tqe_th->th_seq == tp->rcv_nxt);

_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Reply via email to