From: Neil Horman <nhor...@tuxdriver.com>

[ Upstream commit 20a785aa52c82246055a089e55df9dac47d67da1 ]

This BUG halt was reported a while back, but the patch somehow got
missed:

PID: 2879   TASK: c16adaa0  CPU: 1   COMMAND: "sctpn"
 #0 [f418dd28] crash_kexec at c04a7d8c
 #1 [f418dd7c] oops_end at c0863e02
 #2 [f418dd90] do_invalid_op at c040aaca
 #3 [f418de28] error_code (via invalid_op) at c08631a5
    EAX: f34baac0  EBX: 00000090  ECX: f418deb0  EDX: f5542950  EBP: 00000000
    DS:  007b      ESI: f34ba800  ES:  007b      EDI: f418dea0  GS:  00e0
    CS:  0060      EIP: c046fa5e  ERR: ffffffff  EFLAGS: 00010286
 #4 [f418de5c] add_timer at c046fa5e
 #5 [f418de68] sctp_do_sm at f8db8c77 [sctp]
 #6 [f418df30] sctp_primitive_SHUTDOWN at f8dcc1b5 [sctp]
 #7 [f418df48] inet_shutdown at c080baf9
 #8 [f418df5c] sys_shutdown at c079eedf
 #9 [f418df70] sys_socketcall at c079fe88
    EAX: ffffffda  EBX: 0000000d  ECX: bfceea90  EDX: 0937af98
    DS:  007b      ESI: 0000000c  ES:  007b      EDI: b7150ae4
    SS:  007b      ESP: bfceea7c  EBP: bfceeaa8  GS:  0033
    CS:  0073      EIP: b775c424  ERR: 00000066  EFLAGS: 00000282

It appears that the side effect that starts the shutdown timer was processed
multiple times, which can happen as multiple paths can trigger it.  This of
course leads to the BUG halt in add_timer getting called.

Fix seems pretty straightforward, just check before the timer is added if its
already been started.  If it has mod the timer instead to min(current
expiration, new expiration)

Its been tested but not confirmed to fix the problem, as the issue has only
occured in production environments where test kernels are enjoined from being
installed.  It appears to be a sane fix to me though.  Also, recentely,
Jere found a reproducer posted on list to confirm that this resolves the
issues

Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
CC: Vlad Yasevich <vyasev...@gmail.com>
CC: "David S. Miller" <da...@davemloft.net>
CC: jere.leppa...@nokia.com
CC: marcelo.leit...@gmail.com
CC: netdev@vger.kernel.org
Acked-by: Marcelo Ricardo Leitner <marcelo.leit...@gmail.com>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
---
 net/sctp/sm_sideeffect.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1523,9 +1523,17 @@ static int sctp_cmd_interpreter(enum sct
                        timeout = asoc->timeouts[cmd->obj.to];
                        BUG_ON(!timeout);
 
-                       timer->expires = jiffies + timeout;
-                       sctp_association_hold(asoc);
-                       add_timer(timer);
+                       /*
+                        * SCTP has a hard time with timer starts.  Because we 
process
+                        * timer starts as side effects, it can be hard to tell 
if we
+                        * have already started a timer or not, which leads to 
BUG
+                        * halts when we call add_timer. So here, instead of 
just starting
+                        * a timer, if the timer is already started, and just 
mod
+                        * the timer with the shorter of the two expiration 
times
+                        */
+                       if (!timer_pending(timer))
+                               sctp_association_hold(asoc);
+                       timer_reduce(timer, jiffies + timeout);
                        break;
 
                case SCTP_CMD_TIMER_RESTART:


Reply via email to