Re: BUG() can be hit in tcp_collapse()
On Wed, 2016-11-30 at 12:00 -0500, Vladis Dronov wrote:
> Hello, Eric, Marco, all,
>
> This is JFYI and a follow-up message.
>
> Further investigation was done to find the Linux kernel commit which
> introduced the flaw. It appears that previous Linux kernel versions are
> vulnerable, down to v3.6-rc1. This fact was hidden by
> 'net.ipv4.tcp_fastopen' being set to 0 by default, and it has been easier
> to notice since kernel v3.12 due to commit 0d41cca490, where the default
> was changed to 1. With 'net.ipv4.tcp_fastopen' set to 1, previous Linux
> kernels (including RHEL-7 ones) are also vulnerable.
>
> The bug has been present since the tcp-fastopen feature was introduced in
> kernel v3.6-rc1. The first commit at which the reproducer starts to panic
> the kernel with net.ipv4.tcp_fastopen=1 set is cf60af03ca, which is part of
> the commit sequence 2100c8d2d9..67da22d23f introducing the
> net-tcp-fastopen feature:
>
> $ git bisect bad cf60af03ca4e71134206809ea892e49b92a88896
> cf60af03ca4e71134206809ea892e49b92a88896 is the first bad commit
> commit cf60af03ca4e71134206809ea892e49b92a88896
> Author: Yuchung Cheng
> Date: Thu Jul 19 06:43:09 2012 +
>
> So, ideally, the upstream commit ac6e780070 which fixes the bug should have
> a "Fixes: cf60af03ca" statement. Unfortunately, this investigation was not
> completed at the time the patch was accepted upstream, and I do not see any
> way to add this information other than making notes in a comment in the
> related code, which seems weird.

Well, the crash can happen way before Yuchung's patch. It is a 0-day bug.
Re: BUG() can be hit in tcp_collapse()
Hello, Eric, Marco, all,

This is JFYI and a follow-up message.

Further investigation was done to find the Linux kernel commit which
introduced the flaw. It appears that previous Linux kernel versions are
vulnerable, down to v3.6-rc1. This fact was hidden by
'net.ipv4.tcp_fastopen' being set to 0 by default, and it has been easier
to notice since kernel v3.12 due to commit 0d41cca490, where the default
was changed to 1. With 'net.ipv4.tcp_fastopen' set to 1, previous Linux
kernels (including RHEL-7 ones) are also vulnerable.

The bug has been present since the tcp-fastopen feature was introduced in
kernel v3.6-rc1. The first commit at which the reproducer starts to panic
the kernel with net.ipv4.tcp_fastopen=1 set is cf60af03ca, which is part of
the commit sequence 2100c8d2d9..67da22d23f introducing the
net-tcp-fastopen feature:

$ git bisect bad cf60af03ca4e71134206809ea892e49b92a88896
cf60af03ca4e71134206809ea892e49b92a88896 is the first bad commit
commit cf60af03ca4e71134206809ea892e49b92a88896
Author: Yuchung Cheng
Date: Thu Jul 19 06:43:09 2012 +

So, ideally, the upstream commit ac6e780070 which fixes the bug should have
a "Fixes: cf60af03ca" statement. Unfortunately, this investigation was not
completed at the time the patch was accepted upstream, and I do not see any
way to add this information other than making notes in a comment in the
related code, which seems weird.

Best regards,
Vladis Dronov | Red Hat, Inc. | Product Security Engineer
Re: BUG() can be hit in tcp_collapse()
Hello, Eric,

> Another sk_filter() is used in tcp v6.
> So the correct patch would be :

Thank you very much for your research. I'm happy my report has resulted in
the proposed patch.

Best regards,
Vladis Dronov | Red Hat, Inc. | Product Security Engineer
Re: BUG() can be hit in tcp_collapse()
On Thu, 2016-11-10 at 11:49 -0800, Eric Dumazet wrote:
> On Thu, 2016-11-10 at 11:26 -0800, Eric Dumazet wrote:
>
> > The issue is that sk_filter() truncates an incoming packet to a smaller
> > value.
> >
> > Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.
> >
> > I guess other issues would also happen if the truncation also removes
> > part of tcp header.
> >
> > sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
> > or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
> > but no truncations.
>
> Something like :

Another sk_filter() is used in tcp v6.

So the correct patch would be :

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 61b7be303eec..0b8f575eefaa 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1676,7 +1676,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	nf_reset(skb);
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 	skb->dev = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6ca23c2e76f7..96525649a397 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1229,7 +1229,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	if (skb->protocol == htons(ETH_P_IP))
 		return tcp_v4_do_rcv(sk, skb);
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard;
 	/*
@@ -1457,7 +1457,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	if (tcp_v6_inbound_md5_hash(sk, skb))
 		goto discard_and_relse;
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 	skb->dev = NULL;
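
For readers following the patch, here is a minimal userspace sketch of the trimming rule that the cap argument controls. It is only a model under stated assumptions, not kernel code: the names model_trim_cap(), model_sk_filter() and FILTER_DROP are hypothetical. It assumes that a filter verdict of 0 drops the packet, that any other verdict trims the packet to max(cap, verdict) bytes, and that plain sk_filter() behaves like a cap of 1.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum { FILTER_DROP = -1 };

/*
 * Assumed rule: a verdict of 0 drops the packet; otherwise the packet
 * is trimmed to max(cap, verdict) bytes. Trimming never grows a packet.
 * Returns the resulting length, or FILTER_DROP.
 */
long model_trim_cap(size_t len, size_t verdict, size_t cap)
{
	size_t keep;

	if (verdict == 0)
		return FILTER_DROP;
	keep = verdict > cap ? verdict : cap;
	return (long)(keep < len ? keep : len);
}

/* Assumption: plain sk_filter() corresponds to a cap of 1 byte. */
long model_sk_filter(size_t len, size_t verdict)
{
	return model_trim_cap(len, verdict, 1);
}
```

In this model, a 100-byte packet with a filter verdict of 10 is cut to 10 bytes by model_sk_filter(), while model_trim_cap(100, 10, 100) leaves it at 100 bytes. That is the "only ACCEPT/DROP, no truncations" behaviour the patch asks for by passing skb->len as the cap.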
Re: BUG() can be hit in tcp_collapse()
On Thu, 2016-11-10 at 11:26 -0800, Eric Dumazet wrote:
> The issue is that sk_filter() truncates an incoming packet to a smaller
> value.
>
> Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.
>
> I guess other issues would also happen if the truncation also removes
> part of tcp header.
>
> sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
> or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
> but no truncations.

Something like :

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 61b7be303eec..0b8f575eefaa 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1676,7 +1676,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	nf_reset(skb);
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 	skb->dev = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6ca23c2e76f7..2c7a6f7f1113 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1229,7 +1229,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	if (skb->protocol == htons(ETH_P_IP))
 		return tcp_v4_do_rcv(sk, skb);
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard;
 	/*
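
The failure mode described above (data truncated, end_seq left stale) can be illustrated with a small userspace model. This is a sketch under assumptions, not the kernel implementation: struct model_skb and the model_*() helpers are hypothetical stand-ins, the bounds check only mimics the spirit of skb_copy_bits(), and the real loop's BUG_ON(offset < 0) and min(copy, size) clamp are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct model_skb {
	uint32_t seq;      /* first sequence number covered */
	uint32_t end_seq;  /* one past the last sequence number covered */
	size_t len;        /* bytes actually held in the buffer */
};

/* Bounds check in the spirit of skb_copy_bits(): fail when the
 * requested range runs past the data actually present. */
int model_copy_bits(const struct model_skb *skb, size_t offset, size_t len)
{
	if (offset > skb->len || len > skb->len - offset)
		return -1;
	return 0;
}

/* One step of the collapse copy loop: offset and size are derived
 * from sequence numbers only, never from the buffer length. */
int model_collapse_step(const struct model_skb *skb, uint32_t start)
{
	int32_t offset = (int32_t)(start - skb->seq);
	int32_t size = (int32_t)(skb->end_seq - start);

	if (offset < 0 || size <= 0)
		return 0;	/* nothing to copy in this model */
	return model_copy_bits(skb, (size_t)offset, (size_t)size);
}

/* Truncate the payload without touching end_seq, reproducing the
 * inconsistency described in the thread. */
void model_truncate(struct model_skb *skb, size_t new_len)
{
	if (new_len < skb->len)
		skb->len = new_len;
}
```

With seq = 1000, end_seq = 1100 and len = 100, model_collapse_step() succeeds. After model_truncate(&skb, 40) the sequence numbers still claim 100 bytes, so the copy step fails, which corresponds to the point where the quoted tcp_collapse() code calls BUG().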
Re: BUG() can be hit in tcp_collapse()
On Thu, Nov 10, 2016 at 09:47:26AM -0500, Vladis Dronov wrote:
> Hello,
>
> It was discovered by Marco Grassi (many thanks) that the latest stable
> Linux kernel v4.8.6 is crashing in tcp_collapse() after making certain
> syscalls:
>
> [9.622886] kernel BUG at net/ipv4/tcp_input.c:4813!
> [9.623299] invalid opcode: [#1] SMP
> [9.623642] Modules linked in: iptable_nat nf_nat_ipv4 nf_nat
> [9.624287] CPU: 2 PID: 2871 Comm: poc Not tainted 4.8.6 #2
> [9.624730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
> [9.625459] task: 8801387b9a00 task.stack: 8801380e4000
> [9.625929] RIP: 0010:[] [] tcp_collapse+0x3ac/0x3b0
> [9.626609] RSP: 0018:8801380e7b78 EFLAGS: 00010282
> [9.627028] RAX: fff2 RBX: 0ec0 RCX: 0ec0
> [9.627587] RDX: 8801365cd000 RSI: RDI: 8801364106e0
> [9.628142] RBP: 8801380e7bc8 R08: R09: 88013b003300
> [9.628704] R10: 8801365cd000 R11: R12: 0ec0
> [9.629259] R13: 88013663ae00 R14: cdf0ca26 R15: 8801364106e0
> [9.629819] FS: 7f2cef695800() GS:88013fc8() knlGS:
> [9.630945] CS: 0010 DS: ES: CR0: 80050033
> [9.631655] CR2: 2002a000 CR3: 000139d46000 CR4: 001406e0
> [9.632462] Stack:
> [9.632900] cdf0da260001 88013805 8801380500a8
> [9.634138] 8801 880138050688 0900 8801364136e0
> [9.635379] 88013805 880138050688 8801380e7c00 8178d630
> [9.636622] Call Trace:
> [9.637087] [] tcp_try_rmem_schedule+0x140/0x380
> [9.637834] [] tcp_data_queue+0x898/0xcf0
> [9.638538] [] tcp_rcv_established+0x20b/0x6c0
> [9.639268] [] ? sk_reset_timer+0x13/0x30
> [9.639968] [] tcp_v6_do_rcv+0x1b9/0x420
> [9.640666] [] __release_sock+0x82/0xf0
> [9.641353] [] release_sock+0x2b/0x90
> [9.642029] [] tcp_sendmsg+0x55a/0xb60
> [9.642714] [] inet_sendmsg+0x60/0x90
> [9.643389] [] sock_sendmsg+0x33/0x40
> [9.644064] [] SYSC_sendto+0xee/0x160
> [9.645530] [] SyS_sendto+0x9/0x10
> [9.646190] [] entry_SYSCALL_64_fastpath+0x1a/0xa4
> [9.646947] Code: 48 c7 07 00 00 00 00 48 89 42 08 48 89 10 e8 cc 7e f8 ff 49 8b 47 30 48 8b 80 80 01 00 00 65 48 ff 80 b0 01 00 00 e9 72 fd ff ff <0f> 0b 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 53 8b
> [9.651794] RIP [] tcp_collapse+0x3ac/0x3b0
> [9.652554] RSP
>
> The reproducer is generated by the syzkaller, please, see attached. The
> following BUG() is hit:
>
> [net/ipv4/tcp_input.c]
> static void
> tcp_collapse(struct sock *sk, struct sk_buff_head *list,
>              struct sk_buff *head, struct sk_buff *tail,
>              u32 start, u32 end)
> {
> ...
>         /* Copy data, releasing collapsed skbs. */
>         while (copy > 0) {
>                 int offset = start - TCP_SKB_CB(skb)->seq;
>                 int size = TCP_SKB_CB(skb)->end_seq - start;
>
>                 BUG_ON(offset < 0);
>                 if (size > 0) {
>                         size = min(copy, size);
> 4812:                   if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))
> 4813:                           BUG();
>
> /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4812
> 0x8178d390: mov %r12d,%esi
> 0x8178d393: callq 0x81713ce0
> 0x8178d398: mov -0x30(%rbp),%r8d
> 0x8178d39c: mov %r12d,%ecx
> 0x8178d39f: mov %rax,%rdx
> 0x8178d3a2: mov %r15,%rdi
> 0x8178d3a5: mov %r8d,%esi
> 0x8178d3a8: callq 0x81714b90
> 0x8178d3ad: test %eax,%eax
> 0x8178d3af: jne 0x8178d4ec
>
> ...
> /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4813
> 0x8178d4ec: ud2
>
> I have checked that the reproducer can cause hitting this BUG() in the
> kernels since, at least v4.0. I was not checking the earlier kernels except
> RHEL-7 ones (3.10.0-xxx) which are not vulnerable.
>
> The upstream kernels since v4.9-rc1 are not vulnerable too and I have
> bisected the repo to the commit c9c3321257 which fixes the issue.
>
> $ git tag --contain c9c3321257e1b95be9b375f811fb250162af8d39
> v4.9-rc1
>
> Stable v4.8.6 kernel with the c9c3321257 commit applied does not hit the
> BUG(), so I believe this commit should be backported to the stable branch.
> This commit applies cleanly to the v4.8.6 tree with just line offsets.

I'll be glad to take