Comments/questions inline for Dave and Arnaldo (or anybody else!):
>
> ip_queue_xmit:
> push %ebp
> push %edi
> push %esi
> push %ebx
> sub $0xbc, %esp
> mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
> mov 0x8(%ebp), %ebx ! %ebx = skb->sk
> mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
How did you get the assembler code? I'm curious so that I can do it myself in the future...
> So find out what path in the DCCP stack allows an output packet
> SKB to not have skb->sk initialized. :-)
>
> This funny business with only doing a skb_set_owner_w(skb, sk)
> in dccp_transmit_skb() if the SKB is cloned is probably part
> of the problem.
>
> For example, the first OOPS goes back into dccp_retransmit_skb(). If
> the skb is already cloned, it makes a copy using pskb_copy() and
Can you only clone an skb once, or as many times as you like? I had a
quick look at the code for skb_clone and it looks like you can clone
repeatedly. If that is the case, Arnaldo, why did you choose to use
pskb_copy?
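To make the question concrete, here is a rough userspace sketch of how I understand clone versus copy. It is only a model under my own assumptions: struct model_skb, struct model_data and all the model_* functions are made up for illustration and are not the kernel's sk_buff code. It assumes a clone shares the data and bumps a reference count, a copy gets private data, and neither inherits the owner socket.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; NOT the kernel's struct sk_buff. */
struct model_data {
    int dataref;              /* number of headers sharing this buffer */
    size_t len;
    unsigned char bytes[64];
};

struct model_skb {
    struct model_data *data;  /* shared on clone, private on copy */
    void *sk;                 /* owner socket; NULL when unowned */
};

static struct model_skb *model_alloc(const void *payload, size_t len, void *sk)
{
    struct model_skb *skb;
    struct model_data *d;

    if (len > sizeof(d->bytes))
        return NULL;
    skb = malloc(sizeof(*skb));
    d = malloc(sizeof(*d));
    if (!skb || !d) {
        free(skb);
        free(d);
        return NULL;
    }
    d->dataref = 1;
    d->len = len;
    memcpy(d->bytes, payload, len);
    skb->data = d;
    skb->sk = sk;
    return skb;
}

/* Clone: new header, same data, dataref incremented, no owner (assumed). */
static struct model_skb *model_clone(struct model_skb *skb)
{
    struct model_skb *n;

    assert(skb != NULL);
    n = malloc(sizeof(*n));
    if (!n)
        return NULL;
    n->data = skb->data;
    n->data->dataref++;
    n->sk = NULL;
    return n;
}

/* Copy: new header and new data, so dataref starts again at 1. */
static struct model_skb *model_copy(const struct model_skb *skb)
{
    assert(skb != NULL);
    return model_alloc(skb->data->bytes, skb->data->len, NULL);
}

/* Model of the skb_cloned() test: is the data shared with anyone? */
static int model_cloned(const struct model_skb *skb)
{
    return skb->data->dataref != 1;
}

/* Drop one header; the data goes away with the last reference. */
static void model_free(struct model_skb *skb)
{
    if (!skb)
        return;
    if (--skb->data->dataref == 0)
        free(skb->data);
    free(skb);
}
```

In this model nothing stops a second or third model_clone() of the same skb, and a fresh model_copy() never counts as cloned.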
> passes that to dccp_transmit_skb(). That won't pass the
> "skb_cloned()" test, and will likely leave us with a NULL skb->sk.
>
> There needs to therefore be a better way to test "not DATA packet" in
> dccp_transmit_skb(), because "skb_cloned()" obviously does not always
> indicate that.
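If I follow the point above, the failure can be modelled in a few lines of userspace C. This is only a sketch under my own assumptions: struct model_skb and the model_* functions are hypothetical stand-ins, not the real dccp_transmit_skb() or ip_queue_xmit(). It shows a copied skb failing the "cloned" test and reaching transmit with no owner, and what changes when the owner is set unconditionally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; NOT the kernel structure. */
struct model_skb {
    int dataref;  /* > 1 means the data is shared with a clone */
    void *sk;     /* owner socket, NULL if never set */
};

/* Model of the skb_cloned() test. */
static int model_cloned(const struct model_skb *skb)
{
    return skb->dataref > 1;
}

/* Behaviour as described in the thread: owner is set only for clones. */
static void model_transmit_conditional(struct model_skb *skb, void *sk)
{
    assert(skb != NULL);
    if (model_cloned(skb))
        skb->sk = sk;
}

/* The alternative: set the owner regardless of how the skb was made. */
static void model_transmit_unconditional(struct model_skb *skb, void *sk)
{
    assert(skb != NULL);
    skb->sk = sk;
}

/* Stand-in for the IP output path, which dereferences skb->sk early on. */
static int model_xmit_would_oops(const struct model_skb *skb)
{
    return skb->sk == NULL;
}
```

With a copied skb (dataref of 1) the conditional version leaves sk as NULL, so model_xmit_would_oops() returns 1; the unconditional version returns 0 for both clones and copies.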
I tried altering dccp_retransmit_skb() so that it just clones unconditionally, i.e.:

    return dccp_transmit_skb(skb_clone(skb, GFP_ATOMIC));

and also altered dccp_transmit_skb() so that it sets the owner unconditionally, like:

    - if (skb_cloned(skb))
      skb_set_owner_w(skb, sk);

I tried one of these at a time.
In both cases ttcp now times out, which is good.
What's not good, however, is that performance was also really bad (10 kb
per sec), and I also get crashes like this after the timeout:
ttcp-t: buflen=256, nbuf=100, align=16384/+0, port=5001 dccp(inet)
-> 10.0.2.3
ttcp-t: socket
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
00000000
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: dccp_ccid3 dccp_tfrc_lib dccp e100 3c59x
CPU: 0
EIP: 0060:[<00000000>] Not tainted VLI
EFLAGS: 00010286 (2.6.14-rc3)
EIP is at 0x0
eax: c0440000 ebx: 00000000 ecx: 00000000 edx: 00000100
esi: 00000000 edi: 00000000 ebp: 00000000 esp: c0441f00
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0440000 task=c03bfba0)
Stack: 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 bfc49bd0 00000000 00000000 00000000 00000000 ffffffff ffffffff
ffffffff 00000000 00000006 0000000f c6f64000 c6f64000 00000000 00000000
Call Trace:
Code: Bad EIP value.
Now if I do both of those changes at the same time I get my
performance back, but I get the following crash, so there must be more
than one place where sk is not getting set in the skb:
Unable to handle kernel NULL pointer dereference at virtual address 0000013c
printing eip:
c031ad24
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: dccp_ccid3 dccp_tfrc_lib dccp e100 3c59x
CPU: 0
EIP: 0060:[<c031ad24>] Not tainted VLI
EFLAGS: 00010292 (2.6.14-rc2)
EIP is at ip_queue_xmit+0x14/0x4c0
eax: c8887ba0 ebx: 00000000 ecx: 00000001 edx: 00000000
esi: c7353aac edi: c7353ac0 ebp: c70899e0 esp: c043de08
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c043c000 task=c03bcba0)
Stack: c0357620 c0374d7d 00000004 c037647d c037641b 00000000 00000000 00000000
00000000 c01d32d3 0000005b c0357620 c0374d7d c7b66560 00000000 00000246
c7dc44c8 00000048 c0357620 c0374d7d c75fb620 00000000 c043deb8 00000009
Call Trace:
[<c01d32d3>] acpi_ev_sci_xrupt_handler+0x3f/0x46
[<c0134140>] handle_IRQ_event+0x30/0x70
[<c887e0ef>] dccp_v4_checksum+0x3f/0xb0 [dccp]
[<c887d9fe>] dccp_v4_send_check+0x2e/0x40 [dccp]
[<c88809d8>] dccp_transmit_skb+0x2f8/0x380 [dccp]
[<c8882e3a>] dccp_retransmit_timer+0x4a/0x190 [dccp]
[<c011ed85>] update_process_times+0x85/0x130
[<c8882f80>] dccp_write_timer+0x0/0xa0 [dccp]
[<c8882fe4>] dccp_write_timer+0x64/0xa0 [dccp]
[<c0106f22>] timer_interrupt+0x42/0x60
[<c011ef06>] run_timer_softirq+0xb6/0x1b0
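For what it's worth, the faulting address in this second oops lines up with Dave's disassembly: ebx (skb->sk) is 00000000 and the load is from 0x13c(%ebx), so the access lands at 0000013c. A small C sketch of that arithmetic follows. struct fake_inet_sock and both helper functions are made up for illustration, and the padding exists only to put opt at offset 0x13c; the real inet_sock layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: padded so that 'opt' sits at offset 0x13c. */
struct fake_inet_sock {
    unsigned char pad[0x13c];
    uint32_t opt;
};

/* Address the CPU touches when loading a member through a base pointer. */
static uintptr_t fault_address(uintptr_t base, size_t member_offset)
{
    assert(member_offset <= UINTPTR_MAX - base); /* no wrap-around */
    return base + member_offset;
}

/* Address touched when reading 'opt' through socket pointer 'sk'. */
static uintptr_t opt_fault_address(uintptr_t sk)
{
    return fault_address(sk, offsetof(struct fake_inet_sock, opt));
}
```

Calling opt_fault_address(0) gives 0x13c, which is why a small non-zero fault address like this one usually means a member was read through a NULL structure pointer.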
Anyway, I'll keep on hunting but don't know how far I'll get. What I
really need to do is understand the retransmit path better...
Hopefully this might give others some clues.
Ian