Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Tue, 2013-06-18 at 05:52 -0400, Jun Chen wrote: > > > There are many warning for tcp_recvmsg before this crash. I can't find > other memory warning in the logs, but I'm not sure whether there are > memory issues because of the length limitation of saved logs. I think > this logs will give you more information. > > <4>[ 7736.343742] [ cut here ] > > <4>[ 7736.343759] WARNING: > at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg > +0x3bf/0x910() > > <4>[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt > AB57F19F fl 0 > > <4>[ 7736.343845] Call Trace: > > <4>[ 7736.343865] [] warn_slowpath_common+0x72/0xa0 > > <4>[ 7736.343888] [] ? tcp_recvmsg+0x3bf/0x910 > > <4>[ 7736.343902] [] ? tcp_recvmsg+0x3bf/0x910 > > <4>[ 7736.343922] [] warn_slowpath_fmt+0x33/0x40 > > <4>[ 7736.343944] [] tcp_recvmsg+0x3bf/0x910 > > <4>[ 7736.343968] [] inet_recvmsg+0x85/0xa0 > > <4>[ 7736.343992] [] sock_aio_read+0x140/0x160 > > <4>[ 7736.344016] [] ? set_next_entity+0xc1/0xf0 > > <4>[ 7736.344039] [] do_sync_read+0xb7/0xf0 > > <4>[ 7736.344064] [] ? rw_verify_area+0x6c/0x120 > > <4>[ 7736.344077] [] ? sys_epoll_wait+0x68/0x360 > > <4>[ 7736.344098] [] vfs_read+0x149/0x160 > > <4>[ 7736.344120] [] ? fget_light+0x58/0xd0 > > <4>[ 7736.344142] [] sys_read+0x3d/0x70 > > <4>[ 7736.344164] [] syscall_call+0x7/0xb > > <4>[ 7736.344187] [] ? perf_cpu_notify+0x45/0x89 > > <4>[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]--- > Thats exactly the interesting stuff ;) This was fixed, or should be fixed if still happening on more recent kernels. Basically, once we are in this state, there is nothing we can do to prevent a crash. Please try to reproduce the issue using 3.9 or David trees (net-next or net ) Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 06:21 -0700, Eric Dumazet wrote: > On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote: > > On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: > > > On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: > > > > > > > > > hi, > > > > When the condition of tcp_win_from_space(skb->truesize) > skb->len is > > > > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final > > > > condition will be true. The follow line: > > > > int offset = start - TCP_SKB_CB(skb)->seq; > > > > BUG_ON(offset < 0); > > > > this BUG_ON will be triggered. > > > > > > > > > > Really this should never happen, we must track what's happening here. > > It's very very rare, but the logic of codes have such a little hole. > > > > > > Are you using a pristine kernel, without any patches ? > > The based kernel version is 3.4. > > > > > > Are you able to reproduce this bug in a short amount of time ? > > I can't reproduce it in short time, this log had just been found once > > for long long time tests on many devices . > > > > > > What kind of driver is in use ? (your stack trace was truncated) > > > > I attach the whole stack traces for you. > > > Any other suspect messages before this, a memory allocation error for > example ? > > I believe we have a bug in tcp_collapse() if one alloc_skb() returns > NULL while we were in the middle of collapsing a big GRO packet. > > gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated > > skb1: seq=X end_seq=X+4000 > skb2: seq=X+4000 end_seq=X+8000 > > grp_skb: seq=X end_seq=X+16000 > > Next time we call tcp_collapse(), we might split again the GRO packet > and get following incorrect queue : > > skb1: seq=X end_seq=X+4000 > skb2: seq=X+4000 end_seq=X+8000 > skb3: seq=X end_seq=X+4000 > skb4: seq=X+4000 end_seq=X+8000 > skb5: seq=X+8000 end_seq=X+12000 > skb6: seq=X+12000 end_seq=X+16000 > > > I would use the following patch instead, to narrow the problem > > If we really find in the ofo queue a skb with a lower seq than the > previous one, we should complain instead of lowering @start, since > this is going to crash later. > > receive_queue / ofo_queue should contain monotonically increasing > skb->seq. > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 46271cdc..5507a09 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk) > start = TCP_SKB_CB(skb)->seq; > end = TCP_SKB_CB(skb)->end_seq; > } else { > - if (before(TCP_SKB_CB(skb)->seq, start)) > - start = TCP_SKB_CB(skb)->seq; > + if (before(TCP_SKB_CB(skb)->seq, start)) { > + pr_err_once("tcp_collapse_ofo_queue() : seq > %08x before start %08X\n", > + TCP_SKB_CB(skb)->seq, start); > + } > if (after(TCP_SKB_CB(skb)->end_seq, end)) > end = TCP_SKB_CB(skb)->end_seq; > } > > There are many warning for tcp_recvmsg before this crash. I can't find other memory warning in the logs, but I'm not sure whether there are memory issues because of the length limitation of saved logs. I think this logs will give you more information. <4>[ 7736.343742] [ cut here ] <4>[ 7736.343759] WARNING: at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg +0x3bf/0x910() <4>[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt AB57F19F fl 0 <4>[ 7736.343845] Call Trace: <4>[ 7736.343865] [] warn_slowpath_common+0x72/0xa0 <4>[ 7736.343888] [] ? tcp_recvmsg+0x3bf/0x910 <4>[ 7736.343902] [] ? tcp_recvmsg+0x3bf/0x910 <4>[ 7736.343922] [] warn_slowpath_fmt+0x33/0x40 <4>[ 7736.343944] [] tcp_recvmsg+0x3bf/0x910 <4>[ 7736.343968] [] inet_recvmsg+0x85/0xa0 <4>[ 7736.343992] [] sock_aio_read+0x140/0x160 <4>[ 7736.344016] [] ? set_next_entity+0xc1/0xf0 <4>[ 7736.344039] [] do_sync_read+0xb7/0xf0 <4>[ 7736.344064] [] ? rw_verify_area+0x6c/0x120 <4>[ 7736.344077] [] ? sys_epoll_wait+0x68/0x360 <4>[ 7736.344098] [] vfs_read+0x149/0x160 <4>[ 7736.344120] [] ? fget_light+0x58/0xd0 <4>[ 7736.344142] [] sys_read+0x3d/0x70 <4>[ 7736.344164] [] syscall_call+0x7/0xb <4>[ 7736.344187] [] ? perf_cpu_notify+0x45/0x89 <4>[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote: > On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: > > On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: > > > > > > > hi, > > > When the condition of tcp_win_from_space(skb->truesize) > skb->len is > > > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final > > > condition will be true. The follow line: > > > int offset = start - TCP_SKB_CB(skb)->seq; > > > BUG_ON(offset < 0); > > > this BUG_ON will be triggered. > > > > > > > Really this should never happen, we must track what's happening here. > It's very very rare, but the logic of codes have such a little hole. > > > > Are you using a pristine kernel, without any patches ? > The based kernel version is 3.4. > > > > Are you able to reproduce this bug in a short amount of time ? > I can't reproduce it in short time, this log had just been found once > for long long time tests on many devices . > > > > What kind of driver is in use ? (your stack trace was truncated) > > I attach the whole stack traces for you. > > <0>[ 7736.348788] Call Trace: > > <4>[ 7736.348861] [] tcp_prune_queue+0x120/0x2f0 > > <4>[ 7736.348984] [] tcp_data_queue+0x777/0xf00 > > <4>[ 7736.349055] [] ? ipt_do_table+0x1f8/0x480 > > <4>[ 7736.349126] [] ? ipt_do_table+0x1f8/0x480 > > <4>[ 7736.349196] [] tcp_rcv_established+0x114/0x680 > > <4>[ 7736.349269] [] tcp_v4_do_rcv+0x164/0x350 > > <4>[ 7736.349396] [] ? nf_nat_fn+0xb1/0x1d0 > > <4>[ 7736.349470] [] tcp_v4_rcv+0x6f1/0x7a0 > > <4>[ 7736.349599] [] ? nf_hook_slow+0x10d/0x150 > > <4>[ 7736.349673] [] ip_local_deliver_finish+0x8b/0x200 > > <4>[ 7736.349796] [] ip_local_deliver+0x8f/0xa0 > > <4>[ 7736.349867] [] ? ip_rcv_finish+0x300/0x300 > > <4>[ 7736.349937] [] ip_rcv_finish+0xdf/0x300 > > <4>[ 7736.350062] [] ip_rcv+0x258/0x330 > > <4>[ 7736.350132] [] ? inet_del_protocol+0x30/0x30 > > <4>[ 7736.350258] [] __netif_receive_skb+0x325/0x410 > > <4>[ 7736.350331] [] process_backlog+0x96/0x150 > > <4>[ 7736.350455] [] net_rx_action+0x115/0x210 > > <4>[ 7736.350525] [] ? tcp_out_of_resources+0xb0/0xb0 > > <4>[ 7736.350652] [] __do_softirq+0x9b/0x220 > > <4>[ 7736.350723] [] ? local_bh_enable_ip+0xd0/0xd0 > Any other suspect messages before this, a memory allocation error for example ? I believe we have a bug in tcp_collapse() if one alloc_skb() returns NULL while we were in the middle of collapsing a big GRO packet. gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 grp_skb: seq=X end_seq=X+16000 Next time we call tcp_collapse(), we might split again the GRO packet and get following incorrect queue : skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 skb3: seq=X end_seq=X+4000 skb4: seq=X+4000 end_seq=X+8000 skb5: seq=X+8000 end_seq=X+12000 skb6: seq=X+12000 end_seq=X+16000 I would use the following patch instead, to narrow the problem If we really find in the ofo queue a skb with a lower seq than the previous one, we should complain instead of lowering @start, since this is going to crash later. receive_queue / ofo_queue should contain monotonically increasing skb->seq. diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 46271cdc..5507a09 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk) start = TCP_SKB_CB(skb)->seq; end = TCP_SKB_CB(skb)->end_seq; } else { - if (before(TCP_SKB_CB(skb)->seq, start)) - start = TCP_SKB_CB(skb)->seq; + if (before(TCP_SKB_CB(skb)->seq, start)) { + pr_err_once("tcp_collapse_ofo_queue() : seq %08x before start %08X\n", + TCP_SKB_CB(skb)->seq, start); + } if (after(TCP_SKB_CB(skb)->end_seq, end)) end = TCP_SKB_CB(skb)->end_seq; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: > On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: > > > > > hi, > > When the condition of tcp_win_from_space(skb->truesize) > skb->len is > > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final > > condition will be true. The follow line: > > int offset = start - TCP_SKB_CB(skb)->seq; > > BUG_ON(offset < 0); > > this BUG_ON will be triggered. > > > > Really this should never happen, we must track what's happening here. It's very very rare, but the logic of codes have such a little hole. > > Are you using a pristine kernel, without any patches ? The based kernel version is 3.4. > > Are you able to reproduce this bug in a short amount of time ? I can't reproduce it in short time, this log had just been found once for long long time tests on many devices . > > What kind of driver is in use ? (your stack trace was truncated) I attach the whole stack traces for you. <0>[ 7736.348788] Call Trace: <4>[ 7736.348861] [] tcp_prune_queue+0x120/0x2f0 <4>[ 7736.348984] [] tcp_data_queue+0x777/0xf00 <4>[ 7736.349055] [] ? ipt_do_table+0x1f8/0x480 <4>[ 7736.349126] [] ? ipt_do_table+0x1f8/0x480 <4>[ 7736.349196] [] tcp_rcv_established+0x114/0x680 <4>[ 7736.349269] [] tcp_v4_do_rcv+0x164/0x350 <4>[ 7736.349396] [] ? nf_nat_fn+0xb1/0x1d0 <4>[ 7736.349470] [] tcp_v4_rcv+0x6f1/0x7a0 <4>[ 7736.349599] [] ? nf_hook_slow+0x10d/0x150 <4>[ 7736.349673] [] ip_local_deliver_finish+0x8b/0x200 <4>[ 7736.349796] [] ip_local_deliver+0x8f/0xa0 <4>[ 7736.349867] [] ? ip_rcv_finish+0x300/0x300 <4>[ 7736.349937] [] ip_rcv_finish+0xdf/0x300 <4>[ 7736.350062] [] ip_rcv+0x258/0x330 <4>[ 7736.350132] [] ? inet_del_protocol+0x30/0x30 <4>[ 7736.350258] [] __netif_receive_skb+0x325/0x410 <4>[ 7736.350331] [] process_backlog+0x96/0x150 <4>[ 7736.350455] [] net_rx_action+0x115/0x210 <4>[ 7736.350525] [] ? tcp_out_of_resources+0xb0/0xb0 <4>[ 7736.350652] [] __do_softirq+0x9b/0x220 <4>[ 7736.350723] [] ? local_bh_enable_ip+0xd0/0xd0 > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: > > > hi, > When the condition of tcp_win_from_space(skb->truesize) > skb->len is > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final > condition will be true. The follow line: > int offset = start - TCP_SKB_CB(skb)->seq; > BUG_ON(offset < 0); > this BUG_ON will be triggered. > Really this should never happen, we must track what's happening here. Are you using a pristine kernel, without any patches ? Are you able to reproduce this bug in a short amount of time ? What kind of driver is in use ? (your stack trace was truncated) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 01:15 -0700, Eric Dumazet wrote: > On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote: > > When search the first skb to collapse,the condition of overlap to the next > > one have been > > reached,but the start is less than TCP_SKB_CB(skb)->seq at this time, then > > followed process > > will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)->seq). > > So this patch add one check (! before(start,TCP_SKB_CB(skb)->seq)) to avoid > > this ipanic. > > > > Signed-off-by: Chen Jun > > --- > > net/ipv4/tcp_input.c |3 ++- > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > index 9c62257..4c745c5 100644 > > --- a/net/ipv4/tcp_input.c > > +++ b/net/ipv4/tcp_input.c > > @@ -4465,7 +4465,8 @@ restart: > > * overlaps to the next one. > > */ > > if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin && > > - (tcp_win_from_space(skb->truesize) > skb->len || > > + ((tcp_win_from_space(skb->truesize) > skb->len && > > + !before(start, TCP_SKB_CB(skb)->seq)) || > > before(TCP_SKB_CB(skb)->seq, start))) { > > end_of_skbs = false; > > break; > > Hmm... I must say I do not understand this patch. > > If we find a skb with before(TCP_SKB_CB(skb)->seq, start), then the > final condition will be true. > > Let's rewrite your code to equivalent one : > > if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin && > (before(TCP_SKB_CB(skb)->seq, start) || > tcp_win_from_space(skb->truesize) > skb->len)) { > > So it seems your patch would not solve the problem for all > possible skbs (aka not bloated) ? > > Please tell us how you trigger this bug, and send the stack trace. > > Thanks > > hi, When the condition of tcp_win_from_space(skb->truesize) > skb->len is true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)->seq; BUG_ON(offset < 0); this BUG_ON will be triggered. Follow line is my error logs: <2>[ 7736.344508] kernel BUG at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp_input.c:4845! <4>[ 7736.344578] invalid opcode: [#1] PREEMPT SMP <4>[ 7736.344883] Modules linked in: atomisp lm3559 ov9724 imx1x5 bcm4335(O) cfg80211 bcm_bt_lpm videobuf_vmalloc videobuf_core matrix(C) <4>[ 7736.345681] <4>[ 7736.345748] Pid: 5189, comm: TimedEventQueue Tainted: GWC O 3.4.43-186445-g3ada675 #1 Intel Corporation Merrifield/SALT BAY <4>[ 7736.346059] EIP: 0060:[] EFLAGS: 00010297 CPU: 1 <4>[ 7736.346183] EIP is at tcp_collapse+0x3bd/0x3d0 <4>[ 7736.346250] EAX: ab57d2bb EBX: df428c00 ECX: c97dcd00 EDX: 10c0 <4>[ 7736.346372] ESI: df4289c0 EDI: fadb EBP: edca1d88 ESP: edca1d60 <4>[ 7736.346441] DS: 007b ES: 007b FS: 00d8 GS: 003b SS: 0068 <4>[ 7736.346560] CR0: 8005003b CR2: 41d310bc CR3: 2d30 CR4: 001007d0 <4>[ 7736.346629] DR0: DR1: DR2: DR3: <4>[ 7736.346749] DR6: 0ff0 DR7: 0400 <0>[ 7736.346816] Process TimedEventQueue (pid: 5189, ti=edca task=dc30b660 task.ti=c9a6e000) <0>[ 7736.346936] Stack: <4>[ 7736.347002] fadb c97dcd5c 0001 c97dcd00 0e32 c97dcd00 <4>[ 7736.347615] c97dcd00 df428180 edca1db0 c18addd0 ab57c870 ab57f19f c97dcd00 <4>[ 7736.348175] c97dd198 80c0 c97dcd00 df428180 edca1df0 c18aea27 c18dc8f8 <0>[ 7736.348788] Call Trace: <4>[ 7736.348861] [] tcp_prune_queue+0x120/0x2f0 <4>[ 7736.348984] [] tcp_data_queue+0x777/0xf00 <4>[ 7736.349055] [] ? ipt_do_table+0x1f8/0x480 <4>[ 7736.349126] [] ? ipt_do_table+0x1f8/0x480 <4>[ 7736.349196] [] tcp_rcv_established+0x114/0x680 <4>[ 7736.349269] [] tcp_v4_do_rcv+0x164/0x350 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote: > When search the first skb to collapse,the condition of overlap to the next > one have been > reached,but the start is less than TCP_SKB_CB(skb)->seq at this time, then > followed process > will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)->seq). > So this patch add one check (! before(start,TCP_SKB_CB(skb)->seq)) to avoid > this ipanic. > > Signed-off-by: Chen Jun > --- > net/ipv4/tcp_input.c |3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 9c62257..4c745c5 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -4465,7 +4465,8 @@ restart: >* overlaps to the next one. >*/ > if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin && > - (tcp_win_from_space(skb->truesize) > skb->len || > + ((tcp_win_from_space(skb->truesize) > skb->len && > + !before(start, TCP_SKB_CB(skb)->seq)) || >before(TCP_SKB_CB(skb)->seq, start))) { > end_of_skbs = false; > break; Hmm... I must say I do not understand this patch. If we find a skb with before(TCP_SKB_CB(skb)->seq, start), then the final condition will be true. Let's rewrite your code to equivalent one : if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin && (before(TCP_SKB_CB(skb)->seq, start) || tcp_win_from_space(skb->truesize) > skb->len)) { So it seems your patch would not solve the problem for all possible skbs (aka not bloated) ? Please tell us how you trigger this bug, and send the stack trace. Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote: When search the first skb to collapse,the condition of overlap to the next one have been reached,but the start is less than TCP_SKB_CB(skb)-seq at this time, then followed process will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)-seq). So this patch add one check (! before(start,TCP_SKB_CB(skb)-seq)) to avoid this ipanic. Signed-off-by: Chen Jun jun.d.c...@intel.com --- net/ipv4/tcp_input.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 9c62257..4c745c5 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4465,7 +4465,8 @@ restart: * overlaps to the next one. */ if (!tcp_hdr(skb)-syn !tcp_hdr(skb)-fin - (tcp_win_from_space(skb-truesize) skb-len || + ((tcp_win_from_space(skb-truesize) skb-len + !before(start, TCP_SKB_CB(skb)-seq)) || before(TCP_SKB_CB(skb)-seq, start))) { end_of_skbs = false; break; Hmm... I must say I do not understand this patch. If we find a skb with before(TCP_SKB_CB(skb)-seq, start), then the final condition will be true. Let's rewrite your code to equivalent one : if (!tcp_hdr(skb)-syn !tcp_hdr(skb)-fin (before(TCP_SKB_CB(skb)-seq, start) || tcp_win_from_space(skb-truesize) skb-len)) { So it seems your patch would not solve the problem for all possible skbs (aka not bloated) ? Please tell us how you trigger this bug, and send the stack trace. Thanks -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 01:15 -0700, Eric Dumazet wrote: On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote: When search the first skb to collapse,the condition of overlap to the next one have been reached,but the start is less than TCP_SKB_CB(skb)-seq at this time, then followed process will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)-seq). So this patch add one check (! before(start,TCP_SKB_CB(skb)-seq)) to avoid this ipanic. Signed-off-by: Chen Jun jun.d.c...@intel.com --- net/ipv4/tcp_input.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 9c62257..4c745c5 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4465,7 +4465,8 @@ restart: * overlaps to the next one. */ if (!tcp_hdr(skb)-syn !tcp_hdr(skb)-fin - (tcp_win_from_space(skb-truesize) skb-len || + ((tcp_win_from_space(skb-truesize) skb-len + !before(start, TCP_SKB_CB(skb)-seq)) || before(TCP_SKB_CB(skb)-seq, start))) { end_of_skbs = false; break; Hmm... I must say I do not understand this patch. If we find a skb with before(TCP_SKB_CB(skb)-seq, start), then the final condition will be true. Let's rewrite your code to equivalent one : if (!tcp_hdr(skb)-syn !tcp_hdr(skb)-fin (before(TCP_SKB_CB(skb)-seq, start) || tcp_win_from_space(skb-truesize) skb-len)) { So it seems your patch would not solve the problem for all possible skbs (aka not bloated) ? Please tell us how you trigger this bug, and send the stack trace. Thanks hi, When the condition of tcp_win_from_space(skb-truesize) skb-len is true but the before(start, TCP_SKB_CB(skb)-seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)-seq; BUG_ON(offset 0); this BUG_ON will be triggered. Follow line is my error logs: 2[ 7736.344508] kernel BUG at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp_input.c:4845! 4[ 7736.344578] invalid opcode: [#1] PREEMPT SMP 4[ 7736.344883] Modules linked in: atomisp lm3559 ov9724 imx1x5 bcm4335(O) cfg80211 bcm_bt_lpm videobuf_vmalloc videobuf_core matrix(C) 4[ 7736.345681] 4[ 7736.345748] Pid: 5189, comm: TimedEventQueue Tainted: GWC O 3.4.43-186445-g3ada675 #1 Intel Corporation Merrifield/SALT BAY 4[ 7736.346059] EIP: 0060:[c18ad61d] EFLAGS: 00010297 CPU: 1 4[ 7736.346183] EIP is at tcp_collapse+0x3bd/0x3d0 4[ 7736.346250] EAX: ab57d2bb EBX: df428c00 ECX: c97dcd00 EDX: 10c0 4[ 7736.346372] ESI: df4289c0 EDI: fadb EBP: edca1d88 ESP: edca1d60 4[ 7736.346441] DS: 007b ES: 007b FS: 00d8 GS: 003b SS: 0068 4[ 7736.346560] CR0: 8005003b CR2: 41d310bc CR3: 2d30 CR4: 001007d0 4[ 7736.346629] DR0: DR1: DR2: DR3: 4[ 7736.346749] DR6: 0ff0 DR7: 0400 0[ 7736.346816] Process TimedEventQueue (pid: 5189, ti=edca task=dc30b660 task.ti=c9a6e000) 0[ 7736.346936] Stack: 4[ 7736.347002] fadb c97dcd5c 0001 c97dcd00 0e32 c97dcd00 4[ 7736.347615] c97dcd00 df428180 edca1db0 c18addd0 ab57c870 ab57f19f c97dcd00 4[ 7736.348175] c97dd198 80c0 c97dcd00 df428180 edca1df0 c18aea27 c18dc8f8 0[ 7736.348788] Call Trace: 4[ 7736.348861] [c18addd0] tcp_prune_queue+0x120/0x2f0 4[ 7736.348984] [c18aea27] tcp_data_queue+0x777/0xf00 4[ 7736.349055] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349126] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349196] [c18b2e84] tcp_rcv_established+0x114/0x680 4[ 7736.349269] [c18bb034] tcp_v4_do_rcv+0x164/0x350 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: hi, When the condition of tcp_win_from_space(skb-truesize) skb-len is true but the before(start, TCP_SKB_CB(skb)-seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)-seq; BUG_ON(offset 0); this BUG_ON will be triggered. Really this should never happen, we must track what's happening here. Are you using a pristine kernel, without any patches ? Are you able to reproduce this bug in a short amount of time ? What kind of driver is in use ? (your stack trace was truncated) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: hi, When the condition of tcp_win_from_space(skb-truesize) skb-len is true but the before(start, TCP_SKB_CB(skb)-seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)-seq; BUG_ON(offset 0); this BUG_ON will be triggered. Really this should never happen, we must track what's happening here. It's very very rare, but the logic of codes have such a little hole. Are you using a pristine kernel, without any patches ? The based kernel version is 3.4. Are you able to reproduce this bug in a short amount of time ? I can't reproduce it in short time, this log had just been found once for long long time tests on many devices . What kind of driver is in use ? (your stack trace was truncated) I attach the whole stack traces for you. 0[ 7736.348788] Call Trace: 4[ 7736.348861] [c18addd0] tcp_prune_queue+0x120/0x2f0 4[ 7736.348984] [c18aea27] tcp_data_queue+0x777/0xf00 4[ 7736.349055] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349126] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349196] [c18b2e84] tcp_rcv_established+0x114/0x680 4[ 7736.349269] [c18bb034] tcp_v4_do_rcv+0x164/0x350 4[ 7736.349396] [c18de301] ? nf_nat_fn+0xb1/0x1d0 4[ 7736.349470] [c18bc0c1] tcp_v4_rcv+0x6f1/0x7a0 4[ 7736.349599] [c1881dad] ? nf_hook_slow+0x10d/0x150 4[ 7736.349673] [c189b30b] ip_local_deliver_finish+0x8b/0x200 4[ 7736.349796] [c189b60f] ip_local_deliver+0x8f/0xa0 4[ 7736.349867] [c189b280] ? ip_rcv_finish+0x300/0x300 4[ 7736.349937] [c189b05f] ip_rcv_finish+0xdf/0x300 4[ 7736.350062] [c189b878] ip_rcv+0x258/0x330 4[ 7736.350132] [c189af80] ? inet_del_protocol+0x30/0x30 4[ 7736.350258] [c1864175] __netif_receive_skb+0x325/0x410 4[ 7736.350331] [c1864956] process_backlog+0x96/0x150 4[ 7736.350455] [c1864ba5] net_rx_action+0x115/0x210 4[ 7736.350525] [c18b7680] ? tcp_out_of_resources+0xb0/0xb0 4[ 7736.350652] [c123dc0b] __do_softirq+0x9b/0x220 4[ 7736.350723] [c123db70] ? local_bh_enable_ip+0xd0/0xd0 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote: On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: hi, When the condition of tcp_win_from_space(skb-truesize) skb-len is true but the before(start, TCP_SKB_CB(skb)-seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)-seq; BUG_ON(offset 0); this BUG_ON will be triggered. Really this should never happen, we must track what's happening here. It's very very rare, but the logic of codes have such a little hole. Are you using a pristine kernel, without any patches ? The based kernel version is 3.4. Are you able to reproduce this bug in a short amount of time ? I can't reproduce it in short time, this log had just been found once for long long time tests on many devices . What kind of driver is in use ? (your stack trace was truncated) I attach the whole stack traces for you. 0[ 7736.348788] Call Trace: 4[ 7736.348861] [c18addd0] tcp_prune_queue+0x120/0x2f0 4[ 7736.348984] [c18aea27] tcp_data_queue+0x777/0xf00 4[ 7736.349055] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349126] [c18dc8f8] ? ipt_do_table+0x1f8/0x480 4[ 7736.349196] [c18b2e84] tcp_rcv_established+0x114/0x680 4[ 7736.349269] [c18bb034] tcp_v4_do_rcv+0x164/0x350 4[ 7736.349396] [c18de301] ? nf_nat_fn+0xb1/0x1d0 4[ 7736.349470] [c18bc0c1] tcp_v4_rcv+0x6f1/0x7a0 4[ 7736.349599] [c1881dad] ? nf_hook_slow+0x10d/0x150 4[ 7736.349673] [c189b30b] ip_local_deliver_finish+0x8b/0x200 4[ 7736.349796] [c189b60f] ip_local_deliver+0x8f/0xa0 4[ 7736.349867] [c189b280] ? ip_rcv_finish+0x300/0x300 4[ 7736.349937] [c189b05f] ip_rcv_finish+0xdf/0x300 4[ 7736.350062] [c189b878] ip_rcv+0x258/0x330 4[ 7736.350132] [c189af80] ? inet_del_protocol+0x30/0x30 4[ 7736.350258] [c1864175] __netif_receive_skb+0x325/0x410 4[ 7736.350331] [c1864956] process_backlog+0x96/0x150 4[ 7736.350455] [c1864ba5] net_rx_action+0x115/0x210 4[ 7736.350525] [c18b7680] ? tcp_out_of_resources+0xb0/0xb0 4[ 7736.350652] [c123dc0b] __do_softirq+0x9b/0x220 4[ 7736.350723] [c123db70] ? local_bh_enable_ip+0xd0/0xd0 Any other suspect messages before this, a memory allocation error for example ? I believe we have a bug in tcp_collapse() if one alloc_skb() returns NULL while we were in the middle of collapsing a big GRO packet. gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 missing grp_skb: seq=X end_seq=X+16000 Next time we call tcp_collapse(), we might split again the GRO packet and get following incorrect queue : skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 skb3: seq=X end_seq=X+4000 skb4: seq=X+4000 end_seq=X+8000 skb5: seq=X+8000 end_seq=X+12000 skb6: seq=X+12000 end_seq=X+16000 I would use the following patch instead, to narrow the problem If we really find in the ofo queue a skb with a lower seq than the previous one, we should complain instead of lowering @start, since this is going to crash later. receive_queue / ofo_queue should contain monotonically increasing skb-seq. diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 46271cdc..5507a09 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk) start = TCP_SKB_CB(skb)-seq; end = TCP_SKB_CB(skb)-end_seq; } else { - if (before(TCP_SKB_CB(skb)-seq, start)) - start = TCP_SKB_CB(skb)-seq; + if (before(TCP_SKB_CB(skb)-seq, start)) { + pr_err_once(tcp_collapse_ofo_queue() : seq %08x before start %08X\n, + TCP_SKB_CB(skb)-seq, start); + } if (after(TCP_SKB_CB(skb)-end_seq, end)) end = TCP_SKB_CB(skb)-end_seq; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Mon, 2013-06-17 at 06:21 -0700, Eric Dumazet wrote: On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote: On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote: On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote: hi, When the condition of tcp_win_from_space(skb-truesize) skb-len is true but the before(start, TCP_SKB_CB(skb)-seq) is also true, the final condition will be true. The follow line: int offset = start - TCP_SKB_CB(skb)-seq; BUG_ON(offset 0); this BUG_ON will be triggered. Really this should never happen, we must track what's happening here. It's very very rare, but the logic of codes have such a little hole. Are you using a pristine kernel, without any patches ? The based kernel version is 3.4. Are you able to reproduce this bug in a short amount of time ? I can't reproduce it in short time, this log had just been found once for long long time tests on many devices . What kind of driver is in use ? (your stack trace was truncated) I attach the whole stack traces for you. Any other suspect messages before this, a memory allocation error for example ? I believe we have a bug in tcp_collapse() if one alloc_skb() returns NULL while we were in the middle of collapsing a big GRO packet. gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 missing grp_skb: seq=X end_seq=X+16000 Next time we call tcp_collapse(), we might split again the GRO packet and get following incorrect queue : skb1: seq=X end_seq=X+4000 skb2: seq=X+4000 end_seq=X+8000 skb3: seq=X end_seq=X+4000 skb4: seq=X+4000 end_seq=X+8000 skb5: seq=X+8000 end_seq=X+12000 skb6: seq=X+12000 end_seq=X+16000 I would use the following patch instead, to narrow the problem If we really find in the ofo queue a skb with a lower seq than the previous one, we should complain instead of lowering @start, since this is going to crash later. receive_queue / ofo_queue should contain monotonically increasing skb-seq. diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 46271cdc..5507a09 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk) start = TCP_SKB_CB(skb)-seq; end = TCP_SKB_CB(skb)-end_seq; } else { - if (before(TCP_SKB_CB(skb)-seq, start)) - start = TCP_SKB_CB(skb)-seq; + if (before(TCP_SKB_CB(skb)-seq, start)) { + pr_err_once(tcp_collapse_ofo_queue() : seq %08x before start %08X\n, + TCP_SKB_CB(skb)-seq, start); + } if (after(TCP_SKB_CB(skb)-end_seq, end)) end = TCP_SKB_CB(skb)-end_seq; } There are many warning for tcp_recvmsg before this crash. I can't find other memory warning in the logs, but I'm not sure whether there are memory issues because of the length limitation of saved logs. I think this logs will give you more information. 4[ 7736.343742] [ cut here ] 4[ 7736.343759] WARNING: at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg +0x3bf/0x910() 4[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt AB57F19F fl 0 4[ 7736.343845] Call Trace: 4[ 7736.343865] [c1237032] warn_slowpath_common+0x72/0xa0 4[ 7736.343888] [c18a955f] ? tcp_recvmsg+0x3bf/0x910 4[ 7736.343902] [c18a955f] ? tcp_recvmsg+0x3bf/0x910 4[ 7736.343922] [c1237103] warn_slowpath_fmt+0x33/0x40 4[ 7736.343944] [c18a955f] tcp_recvmsg+0x3bf/0x910 4[ 7736.343968] [c18c9bb5] inet_recvmsg+0x85/0xa0 4[ 7736.343992] [c1852030] sock_aio_read+0x140/0x160 4[ 7736.344016] [c126b221] ? set_next_entity+0xc1/0xf0 4[ 7736.344039] [c130d627] do_sync_read+0xb7/0xf0 4[ 7736.344064] [c130dc6c] ? rw_verify_area+0x6c/0x120 4[ 7736.344077] [c1349aa8] ? sys_epoll_wait+0x68/0x360 4[ 7736.344098] [c130e1e9] vfs_read+0x149/0x160 4[ 7736.344120] [c130f518] ? fget_light+0x58/0xd0 4[ 7736.344142] [c130e23d] sys_read+0x3d/0x70 4[ 7736.344164] [c198c361] syscall_call+0x7/0xb 4[ 7736.344187] [c198] ? perf_cpu_notify+0x45/0x89 4[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]--- -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tcp: Modify the condition for the first skb to collapse
On Tue, 2013-06-18 at 05:52 -0400, Jun Chen wrote: There are many warning for tcp_recvmsg before this crash. I can't find other memory warning in the logs, but I'm not sure whether there are memory issues because of the length limitation of saved logs. I think this logs will give you more information. 4[ 7736.343742] [ cut here ] 4[ 7736.343759] WARNING: at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg +0x3bf/0x910() 4[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt AB57F19F fl 0 4[ 7736.343845] Call Trace: 4[ 7736.343865] [c1237032] warn_slowpath_common+0x72/0xa0 4[ 7736.343888] [c18a955f] ? tcp_recvmsg+0x3bf/0x910 4[ 7736.343902] [c18a955f] ? tcp_recvmsg+0x3bf/0x910 4[ 7736.343922] [c1237103] warn_slowpath_fmt+0x33/0x40 4[ 7736.343944] [c18a955f] tcp_recvmsg+0x3bf/0x910 4[ 7736.343968] [c18c9bb5] inet_recvmsg+0x85/0xa0 4[ 7736.343992] [c1852030] sock_aio_read+0x140/0x160 4[ 7736.344016] [c126b221] ? set_next_entity+0xc1/0xf0 4[ 7736.344039] [c130d627] do_sync_read+0xb7/0xf0 4[ 7736.344064] [c130dc6c] ? rw_verify_area+0x6c/0x120 4[ 7736.344077] [c1349aa8] ? sys_epoll_wait+0x68/0x360 4[ 7736.344098] [c130e1e9] vfs_read+0x149/0x160 4[ 7736.344120] [c130f518] ? fget_light+0x58/0xd0 4[ 7736.344142] [c130e23d] sys_read+0x3d/0x70 4[ 7736.344164] [c198c361] syscall_call+0x7/0xb 4[ 7736.344187] [c198] ? perf_cpu_notify+0x45/0x89 4[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]--- Thats exactly the interesting stuff ;) This was fixed, or should be fixed if still happening on more recent kernels. Basically, once we are in this state, there is nothing we can do to prevent a crash. Please try to reproduce the issue using 3.9 or David trees (net-next or net ) Thanks -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/