Re: kernel BUG at net/key/af_key.c:LINE!
On Wed, Nov 15, 2017 at 12:29:19PM +0100, Steffen Klassert wrote:
> On Fri, Nov 10, 2017 at 02:14:06PM +1100, Herbert Xu wrote:
> > On Fri, Nov 10, 2017 at 01:30:38PM +1100, Herbert Xu wrote:
> > >
> > > I found the problem. This crap is coming from clone_policy. Now
> > > let me see where this code came from.
> >
> > ---8<---
> > Subject: xfrm: Copy policy family in clone_policy
> >
> > The syzbot found an ancient bug in the IPsec code. When we cloned
> > a socket policy (for example, for a child TCP socket derived from a
> > listening socket), we did not copy the family field. This results
> > in a live policy with a zero family field. This triggers a BUG_ON
> > check in the af_key code when the cloned policy is retrieved.
> >
> > This patch fixes it by copying the family field over.
> >
> > Reported-by: syzbot
> > Signed-off-by: Herbert Xu
>
> Patch applied, thanks Herbert!

And to tell the bot what fixes this:

#syz fix: xfrm: Copy policy family in clone_policy

Also, does this fix need to go to stable? The commit doesn't have
Cc: stable.
Re: kernel BUG at net/key/af_key.c:LINE!
On Fri, Nov 10, 2017 at 02:14:06PM +1100, Herbert Xu wrote:
> On Fri, Nov 10, 2017 at 01:30:38PM +1100, Herbert Xu wrote:
> >
> > I found the problem. This crap is coming from clone_policy. Now
> > let me see where this code came from.
>
> ---8<---
> Subject: xfrm: Copy policy family in clone_policy
>
> The syzbot found an ancient bug in the IPsec code. When we cloned
> a socket policy (for example, for a child TCP socket derived from a
> listening socket), we did not copy the family field. This results
> in a live policy with a zero family field. This triggers a BUG_ON
> check in the af_key code when the cloned policy is retrieved.
>
> This patch fixes it by copying the family field over.
>
> Reported-by: syzbot
> Signed-off-by: Herbert Xu

Patch applied, thanks Herbert!
Re: kernel BUG at net/key/af_key.c:LINE!
On Fri, Nov 10, 2017 at 01:30:38PM +1100, Herbert Xu wrote:
>
> I found the problem. This crap is coming from clone_policy. Now
> let me see where this code came from.

---8<---
Subject: xfrm: Copy policy family in clone_policy

The syzbot found an ancient bug in the IPsec code. When we cloned
a socket policy (for example, for a child TCP socket derived from a
listening socket), we did not copy the family field. This results
in a live policy with a zero family field. This triggers a BUG_ON
check in the af_key code when the cloned policy is retrieved.

This patch fixes it by copying the family field over.

Reported-by: syzbot
Signed-off-by: Herbert Xu

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 8cafb3c..c238959 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1306,6 +1306,7 @@ static struct xfrm_policy *clone_policy(const struct xfrm_policy *old, int dir)
 		newp->xfrm_nr = old->xfrm_nr;
 		newp->index = old->index;
 		newp->type = old->type;
+		newp->family = old->family;
 		memcpy(newp->xfrm_vec, old->xfrm_vec,
 		       newp->xfrm_nr*sizeof(struct xfrm_tmpl));
 		spin_lock_bh(&net->xfrm.xfrm_policy_lock);
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
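For readers who want to see the failure mode in isolation, here is a small userspace model of it. This is not kernel code: `struct toy_policy`, `toy_clone_buggy`, `toy_clone_fixed` and `toy_clone_demo` are hypothetical names, and the real clone_policy copies many more fields. The sketch assumes the new policy starts out zeroed, so any field that is not copied reads back as 0.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET6 */

/* Hypothetical, heavily reduced stand-in for struct xfrm_policy. */
struct toy_policy {
	unsigned int index;
	unsigned char type;
	unsigned short family;
};

/* Models the pre-patch clone: index and type are copied, family is not. */
void toy_clone_buggy(struct toy_policy *newp, const struct toy_policy *old)
{
	memset(newp, 0, sizeof(*newp));	/* assumption: allocation is zeroed */
	newp->index = old->index;
	newp->type = old->type;
	/* family is never assigned, so it stays 0 */
}

/* Models the patched clone: family is copied as well. */
void toy_clone_fixed(struct toy_policy *newp, const struct toy_policy *old)
{
	toy_clone_buggy(newp, old);
	newp->family = old->family;
}

/* Usage example: clone a policy both ways and compare the results. */
void toy_clone_demo(void)
{
	const struct toy_policy parent = {
		.index = 2083, .type = 0, .family = AF_INET6,
	};
	struct toy_policy child;

	toy_clone_buggy(&child, &parent);
	assert(child.index == parent.index);	/* same index as the parent */
	assert(child.family == 0);		/* the bogus family */

	toy_clone_fixed(&child, &parent);
	assert(child.family == AF_INET6);
}
```

The buggy variant yields a second policy with the parent's index and a zero family, which is consistent with the pair of policy dumps quoted in this thread.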
Re: kernel BUG at net/key/af_key.c:LINE!
On Fri, Nov 10, 2017 at 01:11:45PM +1100, Herbert Xu wrote:
>
> Oh and this is an important clue. We have two policies with
> identical index values. The index value is meant to be unique
> so clearly something funny is going on.

I found the problem. This crap is coming from clone_policy. Now
let me see where this code came from.
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: kernel BUG at net/key/af_key.c:LINE!
On Fri, Nov 10, 2017 at 01:04:59PM +1100, Herbert Xu wrote:
>
> By castrating the reproducer to not perform a pfkey dump I have
> captured the corrupted policy via xfrm:
>
> src ???/0 dst ???/0 uid 0
>     socket in action allow index 2083 priority 0 ptype main share any
>     flag (0x)
>     lifetime config:
>       limit: soft 0(bytes), hard 0(bytes)
>       limit: soft 0(packets), hard 0(packets)
>       expire add: soft 0(sec), hard 0(sec)
>       expire use: soft 0(sec), hard 0(sec)
>     lifetime current:
>       0(bytes), 0(packets)
>       add 2017-11-10 09:58:17 use 2017-11-10 09:58:20
>     tmpl src ac14:bb:: dst ::
>         proto 0 spi 0x(0) reqid 0(0x) mode transport
>         level 5 share any
>         enc-mask auth-mask comp-mask
>
> For comparison here is a good policy that was also created by the
> reproducer:
>
> src fe80::bb/0 dst ::/0 uid 0
>     socket in action allow index 2083 priority 0 ptype main share any
>     flag (0x)
>     lifetime config:
>       limit: soft 0(bytes), hard 0(bytes)
>       limit: soft 0(packets), hard 0(packets)
>       expire add: soft 0(sec), hard 0(sec)
>       expire use: soft 0(sec), hard 0(sec)
>     lifetime current:
>       0(bytes), 0(packets)
>       add 2017-11-10 09:58:17 use 2017-11-10 09:58:17
>     tmpl src ac14:bb:: dst ::
>         proto 0 spi 0x(0) reqid 0(0x) mode transport
>         level 5 share any
>         enc-mask auth-mask comp-mask

Oh and this is an important clue. We have two policies with
identical index values. The index value is meant to be unique
so clearly something funny is going on.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
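The uniqueness observation can be turned into a mechanical check. The following is only an illustrative userspace sketch: `struct toy_entry`, `toy_find_duplicate_index` and `toy_duplicate_demo` are hypothetical names, and the array stands in for whatever list of dumped policies one has at hand. The family values in the demo are assumptions chosen to mirror the two dumps above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET6 */

/* Hypothetical minimal record: only the fields compared here. */
struct toy_entry {
	unsigned int index;
	unsigned short family;
};

/*
 * Returns the position of the first entry whose index repeats an
 * earlier one, or -1 if all indices are distinct.
 * Quadratic, which is fine for a handful of dumped policies.
 */
int toy_find_duplicate_index(const struct toy_entry *e, size_t n)
{
	for (size_t i = 1; i < n; i++)
		for (size_t j = 0; j < i; j++)
			if (e[i].index == e[j].index)
				return (int)i;
	return -1;
}

/* Usage example: index 2083 appears twice, as in the two dumps above. */
void toy_duplicate_demo(void)
{
	const struct toy_entry db[] = {
		{ .index = 2083, .family = AF_INET6 },	/* assumed good policy */
		{ .index = 2083, .family = 0 },		/* assumed corrupted one */
	};

	assert(toy_find_duplicate_index(db, 2) == 1);
}
```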
Re: kernel BUG at net/key/af_key.c:LINE!
On Thu, Nov 09, 2017 at 10:38:57PM +1100, Herbert Xu wrote:
>
> The xfrm code path is meant to forbid the creation of such a policy.
> I don't currently see how this is bypassing that check. But
> clearly it has found a way through the check since it's crashing.

By castrating the reproducer to not perform a pfkey dump I have
captured the corrupted policy via xfrm:

src ???/0 dst ???/0 uid 0
    socket in action allow index 2083 priority 0 ptype main share any flag (0x)
    lifetime config:
      limit: soft 0(bytes), hard 0(bytes)
      limit: soft 0(packets), hard 0(packets)
      expire add: soft 0(sec), hard 0(sec)
      expire use: soft 0(sec), hard 0(sec)
    lifetime current:
      0(bytes), 0(packets)
      add 2017-11-10 09:58:17 use 2017-11-10 09:58:20
    tmpl src ac14:bb:: dst ::
        proto 0 spi 0x(0) reqid 0(0x) mode transport
        level 5 share any
        enc-mask auth-mask comp-mask

For comparison here is a good policy that was also created by the
reproducer:

src fe80::bb/0 dst ::/0 uid 0
    socket in action allow index 2083 priority 0 ptype main share any flag (0x)
    lifetime config:
      limit: soft 0(bytes), hard 0(bytes)
      limit: soft 0(packets), hard 0(packets)
      expire add: soft 0(sec), hard 0(sec)
      expire use: soft 0(sec), hard 0(sec)
    lifetime current:
      0(bytes), 0(packets)
      add 2017-11-10 09:58:17 use 2017-11-10 09:58:17
    tmpl src ac14:bb:: dst ::
        proto 0 spi 0x(0) reqid 0(0x) mode transport
        level 5 share any
        enc-mask auth-mask comp-mask

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: kernel BUG at net/key/af_key.c:LINE!
On Wed, Nov 08, 2017 at 08:59:15AM +0100, Dmitry Vyukov wrote:
>
> Also the repro needs to be compiled with -m32 (but it does not compile
> without it due to missing __NR_mmap2, so I guess you passed -m32).

OK that's what I was missing. I had hacked it to compile in 64-bit :)

However, I still don't understand why it's crashing yet. What is
clear is that we're getting a socket policy with xp->family set to
zero, and the policy is created via the xfrm code path (as opposed
to af_key).

The xfrm code path is meant to forbid the creation of such a policy.
I don't currently see how this is bypassing that check. But
clearly it has found a way through the check since it's crashing.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: kernel BUG at net/key/af_key.c:LINE!
On Tue, Oct 24, 2017 at 05:10:06PM +0200, Dmitry Vyukov wrote:
> On Tue, Oct 24, 2017 at 5:08 PM, syzbot wrote:
> > Hello,
> >
> > syzkaller hit the following crash on
> > 02a2b05395dde2f49eb67b51a5fbc6606943
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
>
> This also happened on more recent commits, including net-next
> 833e0e2f24fd0525090878f71e129a8a4cb8bf78 (Oct 10) with similar
> signature:

Unfortunately I cannot reproduce the crash with your reproducer.
Does it always crash for you?

> [ cut here ]
> kernel BUG at net/key/af_key.c:2068!
> invalid opcode: [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 11011 Comm: syz-executor1 Not tainted 4.14.0-rc4+ #80
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> task: 8801d4ecc1c0 task.stack: 8801c13f8000
> RIP: 0010:pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068

This shows that you have a xfrm policy that has a bogus family
field in your policy database. But it gives no clue as to how
it got there.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
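One way to picture why a bogus family only blows up at dump time: code that serialises a policy has to turn the family into a socket address length, and there is no sensible length for an unknown family. The sketch below is a userspace illustration under that assumption. `toy_sockaddr_size`, `toy_can_dump` and `toy_dump_demo` are hypothetical names, and this is not the code at af_key.c:2068.

```c
#include <assert.h>
#include <netinet/in.h>   /* struct sockaddr_in, struct sockaddr_in6 */
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Hypothetical helper: address length for a family, 0 if unknown. */
size_t toy_sockaddr_size(unsigned short family)
{
	switch (family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return 0;	/* includes family == 0 */
	}
}

/* A dumper can only proceed when it knows how large the address is. */
int toy_can_dump(unsigned short family)
{
	return toy_sockaddr_size(family) != 0;
}

/* Usage example: a valid family passes, a zero family does not. */
void toy_dump_demo(void)
{
	assert(toy_can_dump(AF_INET6));
	assert(!toy_can_dump(0));
}
```

In this model the creation of the policy succeeds silently and the problem surfaces only when something walks the database and tries to format each entry, which matches the shape of the report: the crash points at the dump, not at the place the bad entry was made.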
Re: kernel BUG at net/key/af_key.c:LINE!
On Wed, Nov 8, 2017 at 8:59 AM, Dmitry Vyukov wrote:
> On Wed, Nov 8, 2017 at 8:47 AM, Herbert Xu wrote:
>> On Tue, Oct 24, 2017 at 05:10:06PM +0200, Dmitry Vyukov wrote:
>>> On Tue, Oct 24, 2017 at 5:08 PM, syzbot wrote:
>>> > Hello,
>>> >
>>> > syzkaller hit the following crash on
>>> > 02a2b05395dde2f49eb67b51a5fbc6606943
>>> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>> > compiler: gcc (GCC) 7.1.1 20170620
>>> > .config is attached
>>> > Raw console output is attached.
>>> > C reproducer is attached
>>> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>>> > for information about syzkaller reproducers
>>>
>>> This also happened on more recent commits, including net-next
>>> 833e0e2f24fd0525090878f71e129a8a4cb8bf78 (Oct 10) with similar
>>> signature:
>>
>> Unfortunately I cannot reproduce the crash with your reproducer.
>> Does it always crash for you?
>>
>>> [ cut here ]
>>> kernel BUG at net/key/af_key.c:2068!
>>> invalid opcode: [#1] SMP KASAN
>>> Dumping ftrace buffer:
>>>    (ftrace buffer empty)
>>> Modules linked in:
>>> CPU: 1 PID: 11011 Comm: syz-executor1 Not tainted 4.14.0-rc4+ #80
>>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>>> BIOS Google 01/01/2011
>>> task: 8801d4ecc1c0 task.stack: 8801c13f8000
>>> RIP: 0010:pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068
>>
>> This shows that you have a xfrm policy that has a bogus family
>> field in your policy database. But it gives no clue as to how
>> it got there.
>
> Just triggered it within a second.
> Are you using the provided config?
> Also the repro needs to be compiled with -m32 (but it does not compile
> without it due to missing __NR_mmap2, so I guess you passed -m32).

That was on linux-next:

commit 8b82a8a7ab53ee1a065ac69c835737a701f46b2e (HEAD, tag: next-20171107, linux-next/master)
Author: Stephen Rothwell
Date:   Tue Nov 7 16:18:10 2017 +1100

    Add linux-next specific files for 20171107
Re: kernel BUG at net/key/af_key.c:LINE!
On Wed, Nov 8, 2017 at 8:47 AM, Herbert Xu wrote:
> On Tue, Oct 24, 2017 at 05:10:06PM +0200, Dmitry Vyukov wrote:
>> On Tue, Oct 24, 2017 at 5:08 PM, syzbot wrote:
>> > Hello,
>> >
>> > syzkaller hit the following crash on
>> > 02a2b05395dde2f49eb67b51a5fbc6606943
>> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> > compiler: gcc (GCC) 7.1.1 20170620
>> > .config is attached
>> > Raw console output is attached.
>> > C reproducer is attached
>> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> > for information about syzkaller reproducers
>>
>> This also happened on more recent commits, including net-next
>> 833e0e2f24fd0525090878f71e129a8a4cb8bf78 (Oct 10) with similar
>> signature:
>
> Unfortunately I cannot reproduce the crash with your reproducer.
> Does it always crash for you?
>
>> [ cut here ]
>> kernel BUG at net/key/af_key.c:2068!
>> invalid opcode: [#1] SMP KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 1 PID: 11011 Comm: syz-executor1 Not tainted 4.14.0-rc4+ #80
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> task: 8801d4ecc1c0 task.stack: 8801c13f8000
>> RIP: 0010:pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068
>
> This shows that you have a xfrm policy that has a bogus family
> field in your policy database. But it gives no clue as to how
> it got there.

Just triggered it within a second.
Are you using the provided config?
Also the repro needs to be compiled with -m32 (but it does not compile
without it due to missing __NR_mmap2, so I guess you passed -m32).
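The -m32 remark can be checked with a few lines of C. This is a sketch assuming a Linux toolchain where <sys/syscall.h> provides the __NR_* syscall numbers; `toy_have_mmap2` and `toy_mmap2_demo` are hypothetical helper names and are not part of the reproducer. A program that uses __NR_mmap2 unconditionally only compiles for targets where the function below would return 1.

```c
#include <assert.h>
#include <sys/syscall.h>   /* pulls in the __NR_* syscall numbers */

/* Returns 1 when the target ABI defines __NR_mmap2, 0 otherwise. */
int toy_have_mmap2(void)
{
#ifdef __NR_mmap2
	return 1;
#else
	return 0;
#endif
}

/*
 * Usage example: state the expectation for the two x86 ABIs.
 * Other architectures are left unchecked in this sketch.
 */
void toy_mmap2_demo(void)
{
#if defined(__x86_64__)
	assert(toy_have_mmap2() == 0);	/* 64-bit build: no mmap2 */
#elif defined(__i386__)
	assert(toy_have_mmap2() == 1);	/* 32-bit build, e.g. with -m32 */
#endif
}
```

Compiling the same file once with and once without -m32 on an x86-64 host should flip the result, which is the difference between the build that reproduces the crash and a reproducer hacked to build as 64-bit.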
Re: kernel BUG at net/key/af_key.c:LINE!
On Tue, Oct 24, 2017 at 5:08 PM, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 02a2b05395dde2f49eb67b51a5fbc6606943
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers

This also happened on more recent commits, including net-next
833e0e2f24fd0525090878f71e129a8a4cb8bf78 (Oct 10) with similar
signature:

[ cut here ]
kernel BUG at net/key/af_key.c:2068!
invalid opcode: [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 11011 Comm: syz-executor1 Not tainted 4.14.0-rc4+ #80
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
task: 8801d4ecc1c0 task.stack: 8801c13f8000
RIP: 0010:pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068
RSP: 0018:8801c13ff4b0 EFLAGS: 00010212
RAX: 0001 RBX: 8801ceaa828c RCX: c90001f3c000
RDX: 0599 RSI: 8444c4fc RDI: 8801ceaa812c
RBP: 8801c13ff588 R08: 0001 R09: 8801d55dbb40
R10: 001b R11: ed003aabb782 R12: 8801ceaa8148
R13: 8801ceaa8040 R14: 0008 R15: 0001
FS: 7fc611208700() GS:8801db30() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 2ff0 CR3: 0001a13b6000 CR4: 001406e0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Call Trace:
 dump_sp+0x14f/0x510 net/key/af_key.c:2669
 xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1015
 pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2692
 pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
 pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2719
 pfkey_process+0x60b/0x720 net/key/af_key.c:2809
 pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3648
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x320/0x5e0 net/socket.c:912
 call_write_iter include/linux/fs.h:1770 [inline]
 new_sync_write fs/read_write.c:468 [inline]
 __vfs_write+0x68a/0x970 fs/read_write.c:481
 vfs_write+0x18f/0x510 fs/read_write.c:543
 SYSC_write fs/read_write.c:588 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:580
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4520a9
RSP: 002b:7fc611207c08 EFLAGS: 0216 ORIG_RAX: 0001
RAX: ffda RBX: 00718000 RCX: 004520a9
RDX: 0010 RSI: 2ff0 RDI: 0019
RBP: 0086 R08: R09:
R10: R11: 0216 R12: 004bf3b0
R13: R14: 0005 R15: 0029
Code: ff ff 48 89 95 58 ff ff ff 89 8d 70 ff ff ff e8 fb 70 5e fd 48
8b 95 58 ff ff ff 8b 8d 70 ff ff ff e9 04 e3 ff ff e8 74 4c 29 fd <0f>
0b be 02 00 00 00 4c 89 f7 e8 15 72 5e fd e9 6f e3 ff ff 48
RIP: pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068 RSP: 8801c13ff4b0
---[ end trace 3103e09d7f60a307 ]---

> [ cut here ]
> kernel BUG at net/key/af_key.c:2068!
> invalid opcode: [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3024 Comm: syzkaller790413 Not tainted 4.14.0-rc2+ #16
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> task: 8801cddc8100 task.stack: 8801c0a88000
> RIP: 0010:pfkey_xfrm_policy2msg+0x209c/0x22b0 net/key/af_key.c:2068
> RSP: 0018:8801c0a8f318 EFLAGS: 00010297
> RAX: 8801cddc8100 RBX: 8801cea778cc RCX:
> RDX: RSI: 204e RDI: 8801cea7776c
> RBP: 8801c0a8f3f0 R08: 0001 R09: 8801d0b66dc0
> R10: 001b R11: ed003a16cdd2 R12: 8801cea77788
> R13: 8801cea77680 R14: 0008 R15: 0001
> FS: () GS:8801db20(0063) knlGS:ecf1fb40
> CS: 0010 DS: 002b ES: 002b CR0: 80050033
> CR2: 20002ff0 CR3: 0001d4b3c000 CR4: 001406f0
> DR0: DR1: DR2:
> DR3: DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  dump_sp+0x14f/0x510 net/key/af_key.c:2669
>  xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1015
>  pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2692
>  pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
>  pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2719
>  pfkey_process+0x60b/0x720 net/key/af_key.c:2809
>  pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3648
>  sock_sendmsg_nosec net/socket.c:633 [inline]
>  sock_sendmsg+0xca/0x110