Re: kernel BUG at kernel/timer.c:407!

Gerrit Renker Mon, 05 Mar 2007 08:40:04 -0800

Hi Andre,

thank you for providing a detailed report.


The mentioned patch avoids sending received information to both rx/tx halves
of a CCID. There are two possibilities:
 (a) The patch itself has a bug
     The patch has been tested extensively over months (but using mainly
     CCID3, as CCID2 gave bad performance)
                                                or
 (b) There is a problem in the CCID2 code, since there have been many
     kernel changes and the code has not been well maintained.

I would much rather hunt down this bug than revert this patch, since
reverting it means that almost twice the work is done for each packet and
truly bidirectional connections are not currently supported.

In initial tests here I could not reproduce the bug, and I had difficulties
since I do not have all the libraries required for paraslash.

1. I see you are using loopback (127.0.0.1) where packets are routed back
   internally: does the bug persist when CCID2 sender and CCID2 receiver
   are on different hosts?

2. This is using CCID2, which has not been maintained for a while. Can
   you please try CCID 3 also, e.g. by using the following sysctls:
   
   sysctl -w net.dccp.default.rx_ccid=3 
   sysctl -w net.dccp.default.tx_ccid=3
   sysctl -w net.dccp.default.tx_qlen=5
   sysctl -w net.dccp.default.seq_window=100
   sysctl -w net.dccp.default.send_ackvec=0

3. Caveat: `Server' listens, `Client' connects. I could not build the
   paraslash app due to missing libraries, but found that dccp_recv_open()
   in dccp_recv.c calls connect() while dccp_open() in dccp_send.c calls
   listen(). It seems that the roles are reversed. Is it possible to swap
   this in the application, and does the problem persist then?

4. Notwithstanding (3), the BUG() condition in mod_timer should not be
   triggered, so any further information would help; in particular, the
   tests in (1) and (2) would be good to do.
 

Gerrit 

|  ccid2_hc_tx_send_packet: pipe=1 cwnd=1
|  ------------[ cut here ]------------
|  kernel BUG at kernel/timer.c:407!
|  invalid opcode: 0000 [#1]
|  PREEMPT 
|  CPU:    0
|  EIP:    0060:[<c0124ace>]    Not tainted VLI
|  EFLAGS: 00210246   (2.6.20.1 #12)
|  EIP is at mod_timer+0x28/0x2c
|  eax: c15807d4   ebx: c1580414   ecx: 00000000   edx: fffbda11
|  esi: c1580414   edi: dea75644   ebp: dc637db8   esp: dc637db8
|  ds: 007b   es: 007b   ss: 0068
|  Process para_server (pid: 1085, ti=dc636000 task=de5df030 task.ti=dc636000)
|  Stack: dc637dc4 c0369b66 00000001 dc637df8 c0407ddc dc637de0 c15804bc 00200292
|         dc637de4 c041588c c15804b0 dc637df8 00000000 dea75644 c1580414 dc7b2ec4
|         dc637e1c c040971d dc637e08 dc637e8c 00000000 00000000 c0529780 c1580414
|  Call Trace:
|   [<c0103c3b>] show_trace_log_lvl+0x1a/0x30
|   [<c0103cf2>] show_stack_log_lvl+0x8d/0xaa
|   [<c0103f07>] show_registers+0x1a8/0x312
|   [<c0104210>] die+0x109/0x226
|   [<c01043ab>] do_trap+0x7e/0xb4
|   [<c0104684>] do_invalid_op+0xa3/0xad
|   [<c0415b5c>] error_code+0x74/0x7c
|   [<c0369b66>] sk_reset_timer+0xf/0x19
|   [<c0407ddc>] dccp_write_xmit+0x224/0x22c
|   [<c040971d>] dccp_sendmsg+0x10a/0x15a
|   [<c03a9e26>] inet_sendmsg+0x44/0x55
|   [<c036666a>] do_sock_write+0x9a/0xa9
|   [<c03666e3>] sock_aio_write+0x6a/0x7b
|   [<c015afc6>] do_sync_write+0xc7/0x116
|   [<c015b197>] vfs_write+0x182/0x187
|   [<c015b23d>] sys_write+0x3d/0x64
|   [<c0102fa4>] syscall_call+0x7/0xb
|   =======================
The BUG is caused via the following chain:

1. dccp_sendmsg
2. dccp_write_xmit(sk, 0) (called with block = 0, hence !block below)
3. ccid2_hc_tx_send_packet -> with hctx->ccid2hctx_pipe >= hctx->ccid2hctx_cwnd
   (see above, pipe=cwnd=1) ==> returns 1
4. in dccp_write_xmit(sk, 0):
   if (!block) {                 /* this is true here */
                sk_reset_timer(sk, &dp->dccps_xmit_timer,
                                msecs_to_jiffies(err)+jiffies);
   ==> BUG()
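
The chain above can be modelled in userspace. The following is only a
minimal sketch, not kernel code: all model_* names are hypothetical, one
jiffy is treated as one millisecond, and it assumes that the check at
kernel/timer.c:407 guards against a timer whose handler function was never
set, which the report itself does not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel timer; not the real struct timer_list. */
struct model_timer {
    void (*function)(unsigned long); /* handler, NULL until the timer is set up */
    unsigned long expires;
    int armed;
};

/* Hypothetical stand-in for the socket state used in the chain. */
struct model_sock {
    unsigned int pipe;             /* cf. hctx->ccid2hctx_pipe */
    unsigned int cwnd;             /* cf. hctx->ccid2hctx_cwnd */
    struct model_timer xmit_timer; /* cf. dp->dccps_xmit_timer */
};

/* Placeholder handler so that an initialised timer can be modelled. */
void model_xmit_handler(unsigned long data) { (void)data; }

/* Models ccid2_hc_tx_send_packet: a full window asks for a retry in 1 ms. */
int model_tx_send_packet(const struct model_sock *sk)
{
    assert(sk != NULL);
    return sk->pipe >= sk->cwnd ? 1 : 0;
}

/* Returns -1 where the real mod_timer is ASSUMED to hit BUG(). */
int model_mod_timer(struct model_timer *t, unsigned long expires)
{
    if (t->function == NULL)
        return -1;
    t->expires = expires;
    t->armed = 1;
    return 0;
}

/*
 * Models dccp_write_xmit for a single packet.
 * Returns 0 if the packet was sent, 1 if the retry timer was armed,
 * 2 if a blocking caller would sleep, -1 for the modelled BUG() condition.
 */
int model_write_xmit(struct model_sock *sk, int block, unsigned long now)
{
    int err = model_tx_send_packet(sk);

    if (err == 0) {
        sk->pipe++;
        return 0;
    }
    if (block)
        return 2;
    /* Non-blocking path: arm the retry timer err "milliseconds" from now. */
    return model_mod_timer(&sk->xmit_timer, now + (unsigned long)err) == 0 ? 1 : -1;
}
```

With pipe = cwnd = 1 and no handler set, model_write_xmit(&sk, 0, now)
returns -1; once a handler is set, the same call returns 1 and arms the
timer instead.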
|   <7>dccp_set_state: listen(c1580030) LISTEN     -> CLOSED
This may be a clue: this socket has not gone past listen state (i.e. not
entered server)

