Please cherry-pick 4648dc97af9d496218a05353b0e442b3dfa6aaab into 3.2.11.
3.2 has been broken since 3.2.9, which included
daef52bab1fd26e24e8e9578f8fb33ba1d0cb412 without this fix, and a lot of
people seem to be hitting the resulting bug.

Thanks!

Simon-

----- Forwarded message from Neal Cardwell <[email protected]> -----

Date: Tue, 13 Mar 2012 12:44:00 -0400
From: Neal Cardwell <[email protected]>
To: Simon Kirby <[email protected]>
Cc: Eric Dumazet <[email protected]>, [email protected]
Subject: Re: [PATCH] tcp: fix syncookie regression

On Tue, Mar 13, 2012 at 3:26 AM, Simon Kirby <[email protected]> wrote:
>
> While deploying this on top of 3.2.9, we hit what seems to be the bug
> fixed by 4648dc97af9d496218a05353b0e442b3dfa6aaab in 3.3. I see 3.2.9 has
> daef52bab1fd26e24e8e9578f8fb33ba1d0cb412, so maybe this is exposed in 3.2
> now?
...
> net/ipv4/tcp_input.c:
>      3438 #if FASTRETRANS_DEBUG > 0
> ---> 3439         WARN_ON((int)tp->sacked_out < 0);
>      3440         WARN_ON((int)tp->lost_out < 0);
>      3441         WARN_ON((int)tp->retrans_out < 0);
>      3442         if (!tp->packets_out && tcp_is_sack(tp)) {
> ...
>      3057         /* D. Check consistency of the current state. */
> ---> 3058         tcp_verify_left_out(tp);

Yes, exactly. 3.2.9 and 3.2.10 have
daef52bab1fd26e24e8e9578f8fb33ba1d0cb412 but not the
4648dc97af9d496218a05353b0e442b3dfa6aaab that fixes the resulting
issue with sacked_out going negative, and these lines you've flagged
are exactly the symptoms we'd expect because of that. The fix is
queued up for the -stable series already, so it should be in 3.2.11, I
presume. The fix is already in 3.3-rc7.
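
To illustrate why a counter that is decremented once too often shows up
at exactly those WARN_ON() lines, here is a minimal user-space sketch. It
is not kernel code: struct demo_tp and the demo_* helpers are
hypothetical stand-ins, and the only thing taken from the trace above is
the cast-to-int check. It assumes the usual two's-complement conversion
from unsigned to signed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the unsigned counters kept in struct tcp_sock. */
struct demo_tp {
	uint32_t packets_out;
	uint32_t sacked_out;
};

/* Same idea as WARN_ON((int)tp->sacked_out < 0): an unsigned counter that
 * wrapped below zero reads as negative once it is viewed as signed. */
static bool demo_went_negative(uint32_t counter)
{
	return (int32_t)counter < 0;
}

/* Decrement with no underflow guard, standing in for whatever accounting
 * path is at fault (assumption: the real path is not reproduced here). */
static void demo_unguarded_dec(struct demo_tp *tp)
{
	tp->sacked_out--;
}

/* Walk through the failure: one decrement too many on a zero counter. */
static void demo_selftest(void)
{
	struct demo_tp tp = { .packets_out = 1, .sacked_out = 0 };

	assert(!demo_went_negative(tp.sacked_out));
	demo_unguarded_dec(&tp);	/* 0 - 1 wraps to UINT32_MAX */
	assert(tp.sacked_out == UINT32_MAX);
	assert(demo_went_negative(tp.sacked_out));
}
```

Calling demo_selftest() from a small main() exercises the sketch. The
point is only that the counter is unsigned, so a stray decrement does not
produce -1 but a huge positive value, which the cast in the check turns
back into a negative number.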

> Oops two seconds later, scrolled off console, didn't get written to disk
> or remote syslog server, all we have is syslog-broadcasted Oops and Code
> lines due to broken printk priorities:

As you can imagine, the oops is probably connected to sacked_out going
negative; others have reported an oops along with sacked_out going
negative. The fix should take care of that.

neal

----- End forwarded message -----