Re: general protection fault in kernfs_kill_sb (2)
Hi Al, On Mon, 14 May 2018 05:04:15 +0100 Al Virowrote: > > On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote: > > > But there remains a refcount bug because deactivate_locked_super() from > > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via > > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() > > if kernfs_mount_ns() returned an error. > > Trivial: > > unfuck sysfs_mount() > > Signed-off-by: Al Viro I noticed this turn up in linux-next today. Can I ask that you please add an actual commit message to it before sending it on to Linus. -- Cheers, Stephen Rothwell pgpt80ZhcWVlk.pgp Description: OpenPGP digital signature
Re: general protection fault in kernfs_kill_sb (2)
Hi Al, On Mon, 14 May 2018 05:04:15 +0100 Al Viro wrote: > > On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote: > > > But there remains a refcount bug because deactivate_locked_super() from > > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via > > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() > > if kernfs_mount_ns() returned an error. > > Trivial: > > unfuck sysfs_mount() > > Signed-off-by: Al Viro I noticed this turn up in linux-next today. Can I ask that you please add an actual commit message to it before sending it on to Linus. -- Cheers, Stephen Rothwell pgpt80ZhcWVlk.pgp Description: OpenPGP digital signature
Re: general protection fault in kernfs_kill_sb (2)
On Mon, May 14, 2018 at 05:04:15AM +0100, Al Viro wrote: > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c > index b428d317ae92..92682fcc41f6 100644 > --- a/fs/sysfs/mount.c > +++ b/fs/sysfs/mount.c > @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type > *fs_type, > { > struct dentry *root; > void *ns; > - bool new_sb; > + bool new_sb = false; > > if (!(flags & SB_KERNMOUNT)) { > if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) > @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type > *fs_type, > ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); > root = kernfs_mount_ns(fs_type, flags, sysfs_root, > SYSFS_MAGIC, _sb, ns); > - if (IS_ERR(root) || !new_sb) > + if (!new_sb) > kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); > - else if (new_sb) > + else if (!IS_ERR(root)) > root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; > > return root; What we want for that kobj_ns_drop() is "no fs instances created" (== no ->kill_sb(), be it now or later, to drop that kobj reference); for setting ->s_iflags - "new instance successfully set up". That's it; all we need is new_sb that would be accurate on its own. The problem is with kludging over the cases when it's left uninitialized (early exits from kernfs_mount_ns()) with IS_ERR(root), which happens to grab the cases when new_sb *was* set to true. So the fix is to initialize new_sb properly and get rid of that kludge. Which turns the whole thing into if (!new_sb) ... if (!IS_ERR(root) && new_sb) ... i.e. if (!new_sb) ... else if (!IS_ERR(root)) ...
Re: general protection fault in kernfs_kill_sb (2)
On Mon, May 14, 2018 at 05:04:15AM +0100, Al Viro wrote: > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c > index b428d317ae92..92682fcc41f6 100644 > --- a/fs/sysfs/mount.c > +++ b/fs/sysfs/mount.c > @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type > *fs_type, > { > struct dentry *root; > void *ns; > - bool new_sb; > + bool new_sb = false; > > if (!(flags & SB_KERNMOUNT)) { > if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) > @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type > *fs_type, > ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); > root = kernfs_mount_ns(fs_type, flags, sysfs_root, > SYSFS_MAGIC, _sb, ns); > - if (IS_ERR(root) || !new_sb) > + if (!new_sb) > kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); > - else if (new_sb) > + else if (!IS_ERR(root)) > root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; > > return root; What we want for that kobj_ns_drop() is "no fs instances created" (== no ->kill_sb(), be it now or later, to drop that kobj reference); for setting ->s_iflags - "new instance successfully set up". That's it; all we need is new_sb that would be accurate on its own. The problem is with kludging over the cases when it's left uninitialized (early exits from kernfs_mount_ns()) with IS_ERR(root), which happens to grab the cases when new_sb *was* set to true. So the fix is to initialize new_sb properly and get rid of that kludge. Which turns the whole thing into if (!new_sb) ... if (!IS_ERR(root) && new_sb) ... i.e. if (!new_sb) ... else if (!IS_ERR(root)) ...
Re: general protection fault in kernfs_kill_sb (2)
On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote: > But there remains a refcount bug because deactivate_locked_super() from > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() > if kernfs_mount_ns() returned an error. Trivial: unfuck sysfs_mount() Signed-off-by: Al Viro--- diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c index b428d317ae92..92682fcc41f6 100644 --- a/fs/sysfs/mount.c +++ b/fs/sysfs/mount.c @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, { struct dentry *root; void *ns; - bool new_sb; + bool new_sb = false; if (!(flags & SB_KERNMOUNT)) { if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); root = kernfs_mount_ns(fs_type, flags, sysfs_root, SYSFS_MAGIC, _sb, ns); - if (IS_ERR(root) || !new_sb) + if (!new_sb) kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); - else if (new_sb) + else if (!IS_ERR(root)) root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; return root;
Re: general protection fault in kernfs_kill_sb (2)
On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote: > But there remains a refcount bug because deactivate_locked_super() from > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() > if kernfs_mount_ns() returned an error. Trivial: unfuck sysfs_mount() Signed-off-by: Al Viro --- diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c index b428d317ae92..92682fcc41f6 100644 --- a/fs/sysfs/mount.c +++ b/fs/sysfs/mount.c @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, { struct dentry *root; void *ns; - bool new_sb; + bool new_sb = false; if (!(flags & SB_KERNMOUNT)) { if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); root = kernfs_mount_ns(fs_type, flags, sysfs_root, SYSFS_MAGIC, _sb, ns); - if (IS_ERR(root) || !new_sb) + if (!new_sb) kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); - else if (new_sb) + else if (!IS_ERR(root)) root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; return root;
Re: general protection fault in kernfs_kill_sb (2)
On Sun, May 13, 2018 at 11:19:46AM +0900, Tetsuo Handa wrote: > This is what I reported at > https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ . > > We are currently waiting for comments from Al Viro. 1) the damn thing is unusable without javashit. Which gets about the same reaction as sending something.doc in attachment. Please, find a less obnoxious way to archive the thing (or to generate URLs that would work without that garbage). 2) deactivate_locked_super() *WILL* be called when fill_super() fails. Live with it; it allows to simplify a whole lot of cleanup logics in various filesystems. Again, we are not going for a model where ->kill_sb() is not called for something returned by sget(). Rationale: rarely exercised paths tend to rot, so anything that increases the duplication of bits and pieces of normal teardown into failure exits of foo_fill_super() is a bloody bad idea. If anything, we want to take a lot of stuff out of ->put_super() instances directly into ->kill_sb() ones, precisely because ->put_super() is only called for fully set up filesystems. 3) kernfs needs to be fixed. The rest of the dropped commits were made redundant by 8e04944f0ea8; this one wasn't. Mea culpa.
Re: general protection fault in kernfs_kill_sb (2)
On Sun, May 13, 2018 at 11:19:46AM +0900, Tetsuo Handa wrote: > This is what I reported at > https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ . > > We are currently waiting for comments from Al Viro. 1) the damn thing is unusable without javashit. Which gets about the same reaction as sending something.doc in attachment. Please, find a less obnoxious way to archive the thing (or to generate URLs that would work without that garbage). 2) deactivate_locked_super() *WILL* be called when fill_super() fails. Live with it; it allows to simplify a whole lot of cleanup logics in various filesystems. Again, we are not going for a model where ->kill_sb() is not called for something returned by sget(). Rationale: rarely exercised paths tend to rot, so anything that increases the duplication of bits and pieces of normal teardown into failure exits of foo_fill_super() is a bloody bad idea. If anything, we want to take a lot of stuff out of ->put_super() instances directly into ->kill_sb() ones, precisely because ->put_super() is only called for fully set up filesystems. 3) kernfs needs to be fixed. The rest of the dropped commits were made redundant by 8e04944f0ea8; this one wasn't. Mea culpa.
Re: general protection fault in kernfs_kill_sb (2)
On 2018/05/13 2:01, syzbot wrote: > Call Trace: > __list_del_entry include/linux/list.h:117 [inline] > list_del include/linux/list.h:125 [inline] > kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361 > sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 > deactivate_locked_super+0x97/0x100 fs/super.c:316 > kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335 > sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36 > mount_fs+0xae/0x328 fs/super.c:1267 > vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 > vfs_kern_mount fs/namespace.c:1027 [inline] > do_new_mount fs/namespace.c:2518 [inline] > do_mount+0x564/0x3070 fs/namespace.c:2848 > ksys_mount+0x12d/0x140 fs/namespace.c:3064 > __do_sys_mount fs/namespace.c:3078 [inline] > __se_sys_mount fs/namespace.c:3075 [inline] > __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075 > do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x4426a9 > RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5 > RAX: ffda RBX: RCX: 004426a9 > RDX: 20c0 RSI: 2080 RDI: 2040 > RBP: 7ffc558bda60 R08: R09: 0003 > R10: R11: 0246 R12: > R13: 0004 R14: 1380 R15: 7ffc558bd2f8 > Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 > 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f > 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8 > RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: > 8801aca6f860 > ---[ end trace d148f307a34e229f ]--- This is what I reported at https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ . We are currently waiting for comments from Al Viro.
Re: general protection fault in kernfs_kill_sb (2)
On 2018/05/13 2:01, syzbot wrote: > Call Trace: > __list_del_entry include/linux/list.h:117 [inline] > list_del include/linux/list.h:125 [inline] > kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361 > sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 > deactivate_locked_super+0x97/0x100 fs/super.c:316 > kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335 > sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36 > mount_fs+0xae/0x328 fs/super.c:1267 > vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 > vfs_kern_mount fs/namespace.c:1027 [inline] > do_new_mount fs/namespace.c:2518 [inline] > do_mount+0x564/0x3070 fs/namespace.c:2848 > ksys_mount+0x12d/0x140 fs/namespace.c:3064 > __do_sys_mount fs/namespace.c:3078 [inline] > __se_sys_mount fs/namespace.c:3075 [inline] > __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075 > do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x4426a9 > RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5 > RAX: ffda RBX: RCX: 004426a9 > RDX: 20c0 RSI: 2080 RDI: 2040 > RBP: 7ffc558bda60 R08: R09: 0003 > R10: R11: 0246 R12: > R13: 0004 R14: 1380 R15: 7ffc558bd2f8 > Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 > 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f > 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8 > RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: > 8801aca6f860 > ---[ end trace d148f307a34e229f ]--- This is what I reported at https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ . We are currently waiting for comments from Al Viro.
general protection fault in kernfs_kill_sb (2)
Hello, syzbot found the following crash on: HEAD commit:f0ab773f5c96 Merge branch 'akpm' (patches from Andrew) git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=140ce81780 kernel config: https://syzkaller.appspot.com/x/.config?x=fcce42b221691ff9 dashboard link: https://syzkaller.appspot.com/bug?extid=481ad819c717f6b78df9 compiler: gcc (GCC) 8.0.1 20180413 (experimental) syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15078e9780 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12bf369780 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+481ad819c717f6b78...@syzkaller.appspotmail.com RBP: 7ffc558bda60 R08: R09: 0003 R10: R11: 0246 R12: R13: 0004 R14: 1380 R15: 7ffc558bd2f8 kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4540 Comm: syz-executor884 Not tainted 4.17.0-rc4+ #43 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 0018:8801aca6f860 EFLAGS: 00010246 RAX: dc00 RBX: 8801ac528498 RCX: RDX: RSI: RDI: 8801ac5284a0 RBP: 8801aca6f878 R08: fbfff11cc855 R09: fbfff11cc854 R10: 8801aca6f878 R11: 88e642a7 R12: R13: R14: 8801aca6f908 R15: 8801ac528498 FS: 00f86880() GS:8801daf0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 004b6f3c CR3: 0001d973 CR4: 001406e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: __list_del_entry include/linux/list.h:117 [inline] list_del include/linux/list.h:125 [inline] kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 deactivate_locked_super+0x97/0x100 fs/super.c:316 kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335 sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36 mount_fs+0xae/0x328 fs/super.c:1267 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 vfs_kern_mount fs/namespace.c:1027 [inline] do_new_mount fs/namespace.c:2518 [inline] do_mount+0x564/0x3070 fs/namespace.c:2848 ksys_mount+0x12d/0x140 fs/namespace.c:3064 __do_sys_mount fs/namespace.c:3078 [inline] __se_sys_mount fs/namespace.c:3075 [inline] __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4426a9 RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 004426a9 RDX: 20c0 RSI: 2080 RDI: 2040 RBP: 7ffc558bda60 R08: R09: 0003 R10: R11: 0246 R12: R13: 0004 R14: 1380 R15: 7ffc558bd2f8 Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8 RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 8801aca6f860 ---[ end trace d148f307a34e229f ]--- --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. If you forgot to add the Reported-by tag, once the fix for this bug is merged into any tree, please reply to this email with: #syz fix: exact-commit-title If you want to test a patch for this bug, please reply with: #syz test: git://repo/address.git branch and provide the patch inline or as an attachment. To mark this as a duplicate of another syzbot report, please reply with: #syz dup: exact-subject-of-another-report If it's a one-off invalid bug report, please reply with: #syz invalid Note: if the crash happens again, it will cause creation of a new bug report. Note: all commands must start from beginning of the line in the email body.
general protection fault in kernfs_kill_sb (2)
Hello, syzbot found the following crash on: HEAD commit:f0ab773f5c96 Merge branch 'akpm' (patches from Andrew) git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=140ce81780 kernel config: https://syzkaller.appspot.com/x/.config?x=fcce42b221691ff9 dashboard link: https://syzkaller.appspot.com/bug?extid=481ad819c717f6b78df9 compiler: gcc (GCC) 8.0.1 20180413 (experimental) syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15078e9780 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12bf369780 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+481ad819c717f6b78...@syzkaller.appspotmail.com RBP: 7ffc558bda60 R08: R09: 0003 R10: R11: 0246 R12: R13: 0004 R14: 1380 R15: 7ffc558bd2f8 kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4540 Comm: syz-executor884 Not tainted 4.17.0-rc4+ #43 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 0018:8801aca6f860 EFLAGS: 00010246 RAX: dc00 RBX: 8801ac528498 RCX: RDX: RSI: RDI: 8801ac5284a0 RBP: 8801aca6f878 R08: fbfff11cc855 R09: fbfff11cc854 R10: 8801aca6f878 R11: 88e642a7 R12: R13: R14: 8801aca6f908 R15: 8801ac528498 FS: 00f86880() GS:8801daf0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 004b6f3c CR3: 0001d973 CR4: 001406e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: __list_del_entry include/linux/list.h:117 [inline] list_del include/linux/list.h:125 [inline] kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 deactivate_locked_super+0x97/0x100 fs/super.c:316 kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335 sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36 mount_fs+0xae/0x328 fs/super.c:1267 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 vfs_kern_mount fs/namespace.c:1027 [inline] do_new_mount fs/namespace.c:2518 [inline] do_mount+0x564/0x3070 fs/namespace.c:2848 ksys_mount+0x12d/0x140 fs/namespace.c:3064 __do_sys_mount fs/namespace.c:3078 [inline] __se_sys_mount fs/namespace.c:3075 [inline] __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4426a9 RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 004426a9 RDX: 20c0 RSI: 2080 RDI: 2040 RBP: 7ffc558bda60 R08: R09: 0003 R10: R11: 0246 R12: R13: 0004 R14: 1380 R15: 7ffc558bd2f8 Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8 RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 8801aca6f860 ---[ end trace d148f307a34e229f ]--- --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. If you forgot to add the Reported-by tag, once the fix for this bug is merged into any tree, please reply to this email with: #syz fix: exact-commit-title If you want to test a patch for this bug, please reply with: #syz test: git://repo/address.git branch and provide the patch inline or as an attachment. To mark this as a duplicate of another syzbot report, please reply with: #syz dup: exact-subject-of-another-report If it's a one-off invalid bug report, please reply with: #syz invalid Note: if the crash happens again, it will cause creation of a new bug report. Note: all commands must start from beginning of the line in the email body.
Re: general protection fault in kernfs_kill_sb
On 2018/04/20 11:44, Eric Biggers wrote: > Fix for the kernfs bug is now queued in vfs/for-linus: > > #syz fix: kernfs: deal with early sget() failures Well, the following patches rpc_pipefs: deal with early sget() failures kernfs: deal with early sget() failures procfs: deal with early sget() failures nfsd_umount(): deal with early sget() failures nfs: avoid double-free on early sget() failures are dropped from vfs.git#for-linus while this report is marked as "#syz fix: kernfs: deal with early sget() failures". The patch which actually went to linux.git is 8e04944f0ea8b838. #syz fix: mm,vmscan: Allow preallocating memory for register_shrinker(). By the way, we still have NULL pointer dereference (as of f2125992e7cb25ec on linux.git) shown below due to calling deactivate_locked_super() without successful fill_super(). -- [ 162.865231] BUG: unable to handle kernel NULL pointer dereference at [ 162.873678] PGD 130487067 P4D 130487067 PUD 138750067 PMD 0 [ 162.879845] Oops: [#1] SMP [ 162.883295] Modules linked in: [ 162.886648] CPU: 2 PID: 15505 Comm: a.out Kdump: loaded Tainted: G T 4.17.0-rc3+ #522 [ 162.894891] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 162.899415] RIP: 0010:__list_del_entry_valid+0x29/0x90 [ 162.901609] RSP: 0018:c90001e07ce0 EFLAGS: 00010207 [ 162.903834] RAX: RBX: 880132359580 RCX: dead0200 [ 162.906825] RDX: RSI: e85efffd RDI: 880132359598 [ 162.909863] RBP: c90001e07ce0 R08: 815269c5 R09: 0004 [ 162.912923] R10: c90001e07ce0 R11: 840f2060 R12: 880134c6e000 [ 162.915929] R13: 88013a014f00 R14: 880134c6e000 R15: 880132359580 [ 162.918927] FS: 7f6525b13740() GS:88013a68() knlGS: [ 162.922325] CS: 0010 DS: ES: CR0: 80050033 [ 162.924751] CR2: CR3: 000134b2e006 CR4: 000606e0 [ 162.927236] Call Trace: [ 162.928153] kernfs_kill_sb+0x2e/0x90 [ 162.929377] sysfs_kill_sb+0x22/0x40 [ 162.930571] deactivate_locked_super+0x50/0x90 [ 162.931958] kernfs_mount_ns+0x283/0x290 [ 162.933126] sysfs_mount+0x74/0xf0 [ 162.934146] mount_fs+0x46/0x1a0 [ 162.935137] vfs_kern_mount.part.28+0x67/0x190 [ 162.936449] do_mount+0x7b0/0x11f0 [ 162.937473] ? memdup_user+0x5e/0x90 [ 162.938541] ? copy_mount_options+0x1a4/0x2d0 [ 162.939828] ksys_mount+0xab/0x120 [ 162.940954] __x64_sys_mount+0x26/0x30 [ 162.942153] do_syscall_64+0x7b/0x260 [ 162.943274] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 162.944903] RIP: 0033:0x7f6525640aaa [ 162.946012] RSP: 002b:7ffc4f4e6f78 EFLAGS: 0246 ORIG_RAX: 00a5 [ 162.948291] RAX: ffda RBX: 000e RCX: 7f6525640aaa [ 162.950396] RDX: 00400896 RSI: 0040089c RDI: 00400896 [ 162.952480] RBP: 0003 R08: R09: 0002 [ 162.954545] R10: R11: 0246 R12: 00400694 [ 162.957760] R13: 7ffc4f4e7080 R14: R15: [ 162.963839] Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 2c <48> 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 [ 162.974795] RIP: __list_del_entry_valid+0x29/0x90 RSP: c90001e07ce0 [ 162.976752] CR2: -- Below patch can avoid NULL pointer dereference at kernfs_kill_sb(). -- diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index 26dd9a5..498c044 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -314,6 +314,7 @@ struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, if (!info) return ERR_PTR(-ENOMEM); + INIT_LIST_HEAD(>node); info->root = root; info->ns = ns; -- But there remains a refcount bug because deactivate_locked_super() from kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() if kernfs_mount_ns() returned an error. -- static void *net_grab_current_ns(void) { struct net *ns = current->nsproxy->net_ns; #ifdef CONFIG_NET_NS if (ns) refcount_inc(>passive); if (ns && !strcmp(current->comm, "a.out")) printk("net_grab_current_ns: %px %d %d\n", ns, refcount_read(>passive), refcount_read(>count)); #endif return ns; } void net_drop_ns(void *p) { struct net *ns = p; if (ns && !strcmp(current->comm, "a.out")) { printk("net_drop_ns: %px %d %d\n", ns, refcount_read(>passive), refcount_read(>count)); dump_stack(); } if (ns &&
Re: general protection fault in kernfs_kill_sb
On 2018/04/20 11:44, Eric Biggers wrote: > Fix for the kernfs bug is now queued in vfs/for-linus: > > #syz fix: kernfs: deal with early sget() failures Well, the following patches rpc_pipefs: deal with early sget() failures kernfs: deal with early sget() failures procfs: deal with early sget() failures nfsd_umount(): deal with early sget() failures nfs: avoid double-free on early sget() failures are dropped from vfs.git#for-linus while this report is marked as "#syz fix: kernfs: deal with early sget() failures". The patch which actually went to linux.git is 8e04944f0ea8b838. #syz fix: mm,vmscan: Allow preallocating memory for register_shrinker(). By the way, we still have NULL pointer dereference (as of f2125992e7cb25ec on linux.git) shown below due to calling deactivate_locked_super() without successful fill_super(). -- [ 162.865231] BUG: unable to handle kernel NULL pointer dereference at [ 162.873678] PGD 130487067 P4D 130487067 PUD 138750067 PMD 0 [ 162.879845] Oops: [#1] SMP [ 162.883295] Modules linked in: [ 162.886648] CPU: 2 PID: 15505 Comm: a.out Kdump: loaded Tainted: G T 4.17.0-rc3+ #522 [ 162.894891] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 162.899415] RIP: 0010:__list_del_entry_valid+0x29/0x90 [ 162.901609] RSP: 0018:c90001e07ce0 EFLAGS: 00010207 [ 162.903834] RAX: RBX: 880132359580 RCX: dead0200 [ 162.906825] RDX: RSI: e85efffd RDI: 880132359598 [ 162.909863] RBP: c90001e07ce0 R08: 815269c5 R09: 0004 [ 162.912923] R10: c90001e07ce0 R11: 840f2060 R12: 880134c6e000 [ 162.915929] R13: 88013a014f00 R14: 880134c6e000 R15: 880132359580 [ 162.918927] FS: 7f6525b13740() GS:88013a68() knlGS: [ 162.922325] CS: 0010 DS: ES: CR0: 80050033 [ 162.924751] CR2: CR3: 000134b2e006 CR4: 000606e0 [ 162.927236] Call Trace: [ 162.928153] kernfs_kill_sb+0x2e/0x90 [ 162.929377] sysfs_kill_sb+0x22/0x40 [ 162.930571] deactivate_locked_super+0x50/0x90 [ 162.931958] kernfs_mount_ns+0x283/0x290 [ 162.933126] sysfs_mount+0x74/0xf0 [ 162.934146] mount_fs+0x46/0x1a0 [ 162.935137] vfs_kern_mount.part.28+0x67/0x190 [ 162.936449] do_mount+0x7b0/0x11f0 [ 162.937473] ? memdup_user+0x5e/0x90 [ 162.938541] ? copy_mount_options+0x1a4/0x2d0 [ 162.939828] ksys_mount+0xab/0x120 [ 162.940954] __x64_sys_mount+0x26/0x30 [ 162.942153] do_syscall_64+0x7b/0x260 [ 162.943274] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 162.944903] RIP: 0033:0x7f6525640aaa [ 162.946012] RSP: 002b:7ffc4f4e6f78 EFLAGS: 0246 ORIG_RAX: 00a5 [ 162.948291] RAX: ffda RBX: 000e RCX: 7f6525640aaa [ 162.950396] RDX: 00400896 RSI: 0040089c RDI: 00400896 [ 162.952480] RBP: 0003 R08: R09: 0002 [ 162.954545] R10: R11: 0246 R12: 00400694 [ 162.957760] R13: 7ffc4f4e7080 R14: R15: [ 162.963839] Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 2c <48> 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 [ 162.974795] RIP: __list_del_entry_valid+0x29/0x90 RSP: c90001e07ce0 [ 162.976752] CR2: -- Below patch can avoid NULL pointer dereference at kernfs_kill_sb(). -- diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index 26dd9a5..498c044 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -314,6 +314,7 @@ struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, if (!info) return ERR_PTR(-ENOMEM); + INIT_LIST_HEAD(>node); info->root = root; info->ns = ns; -- But there remains a refcount bug because deactivate_locked_super() from kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount() if kernfs_mount_ns() returned an error. -- static void *net_grab_current_ns(void) { struct net *ns = current->nsproxy->net_ns; #ifdef CONFIG_NET_NS if (ns) refcount_inc(>passive); if (ns && !strcmp(current->comm, "a.out")) printk("net_grab_current_ns: %px %d %d\n", ns, refcount_read(>passive), refcount_read(>count)); #endif return ns; } void net_drop_ns(void *p) { struct net *ns = p; if (ns && !strcmp(current->comm, "a.out")) { printk("net_drop_ns: %px %d %d\n", ns, refcount_read(>passive), refcount_read(>count)); dump_stack(); } if (ns &&
Re: general protection fault in kernfs_kill_sb
On Fri, Apr 20, 2018 at 09:31:58AM +0200, Michal Hocko wrote: > On Fri 20-04-18 14:29:39, Tetsuo Handa wrote: > > Eric Biggers wrote: > > > But, there is still a related bug: when mounting sysfs, if > > > register_shrinker() > > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the > > > 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also > > > freed > > > in kernfs_mount_ns() by: > > > > > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, > > > flags, > > > _user_ns, info); > > > if (IS_ERR(sb) || sb->s_fs_info != info) > > > kfree(info); > > > if (IS_ERR(sb)) > > > return ERR_CAST(sb); > > > > > > I guess the problem is that sget_userns() shouldn't take ownership of the > > > 'info' > > > if it returns an error -- but, it actually does if register_shrinker() > > > fails, > > > resulting in a double free. > > > > > > Here is a reproducer and the KASAN splat. This is on Linus' tree > > > (87ef12027b9b) > > > with vfs/for-linus merged in. > > > > I'm waiting for response from Michal Hocko regarding > > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp > > . > > I didn't plan to respond util all the Al's concerns with the existing > scheme are resolved. This is not an urgent thing to fix so better fix it > properly. Your API change is kinda ugly so it would be preferable to do > it properly as suggested by Al. Maybe that will be more work but my > understanding is that the resulting code would be better. If that is not > the case then I do not really have any fundamental objection to your > patch except it is ugly. Okay, the fix was merged already as commit 8e04944f0ea8b8 ("mm,vmscan: Allow preallocating memory for register_shrinker()."). Thanks Tetsuo! - Eric
Re: general protection fault in kernfs_kill_sb
On Fri, Apr 20, 2018 at 09:31:58AM +0200, Michal Hocko wrote: > On Fri 20-04-18 14:29:39, Tetsuo Handa wrote: > > Eric Biggers wrote: > > > But, there is still a related bug: when mounting sysfs, if > > > register_shrinker() > > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the > > > 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also > > > freed > > > in kernfs_mount_ns() by: > > > > > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, > > > flags, > > > _user_ns, info); > > > if (IS_ERR(sb) || sb->s_fs_info != info) > > > kfree(info); > > > if (IS_ERR(sb)) > > > return ERR_CAST(sb); > > > > > > I guess the problem is that sget_userns() shouldn't take ownership of the > > > 'info' > > > if it returns an error -- but, it actually does if register_shrinker() > > > fails, > > > resulting in a double free. > > > > > > Here is a reproducer and the KASAN splat. This is on Linus' tree > > > (87ef12027b9b) > > > with vfs/for-linus merged in. > > > > I'm waiting for response from Michal Hocko regarding > > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp > > . > > I didn't plan to respond util all the Al's concerns with the existing > scheme are resolved. This is not an urgent thing to fix so better fix it > properly. Your API change is kinda ugly so it would be preferable to do > it properly as suggested by Al. Maybe that will be more work but my > understanding is that the resulting code would be better. If that is not > the case then I do not really have any fundamental objection to your > patch except it is ugly. Okay, the fix was merged already as commit 8e04944f0ea8b8 ("mm,vmscan: Allow preallocating memory for register_shrinker()."). Thanks Tetsuo! - Eric
Re: general protection fault in kernfs_kill_sb
On Fri 20-04-18 14:29:39, Tetsuo Handa wrote: > Eric Biggers wrote: > > But, there is still a related bug: when mounting sysfs, if > > register_shrinker() > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the > > 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also > > freed > > in kernfs_mount_ns() by: > > > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, > > flags, > > _user_ns, info); > > if (IS_ERR(sb) || sb->s_fs_info != info) > > kfree(info); > > if (IS_ERR(sb)) > > return ERR_CAST(sb); > > > > I guess the problem is that sget_userns() shouldn't take ownership of the > > 'info' > > if it returns an error -- but, it actually does if register_shrinker() > > fails, > > resulting in a double free. > > > > Here is a reproducer and the KASAN splat. This is on Linus' tree > > (87ef12027b9b) > > with vfs/for-linus merged in. > > I'm waiting for response from Michal Hocko regarding > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp > . I didn't plan to respond util all the Al's concerns with the existing scheme are resolved. This is not an urgent thing to fix so better fix it properly. Your API change is kinda ugly so it would be preferable to do it properly as suggested by Al. Maybe that will be more work but my understanding is that the resulting code would be better. If that is not the case then I do not really have any fundamental objection to your patch except it is ugly. -- Michal Hocko SUSE Labs
Re: general protection fault in kernfs_kill_sb
On Fri 20-04-18 14:29:39, Tetsuo Handa wrote: > Eric Biggers wrote: > > But, there is still a related bug: when mounting sysfs, if > > register_shrinker() > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the > > 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also > > freed > > in kernfs_mount_ns() by: > > > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, > > flags, > > _user_ns, info); > > if (IS_ERR(sb) || sb->s_fs_info != info) > > kfree(info); > > if (IS_ERR(sb)) > > return ERR_CAST(sb); > > > > I guess the problem is that sget_userns() shouldn't take ownership of the > > 'info' > > if it returns an error -- but, it actually does if register_shrinker() > > fails, > > resulting in a double free. > > > > Here is a reproducer and the KASAN splat. This is on Linus' tree > > (87ef12027b9b) > > with vfs/for-linus merged in. > > I'm waiting for response from Michal Hocko regarding > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp > . I didn't plan to respond util all the Al's concerns with the existing scheme are resolved. This is not an urgent thing to fix so better fix it properly. Your API change is kinda ugly so it would be preferable to do it properly as suggested by Al. Maybe that will be more work but my understanding is that the resulting code would be better. If that is not the case then I do not really have any fundamental objection to your patch except it is ugly. -- Michal Hocko SUSE Labs
Re: general protection fault in kernfs_kill_sb
On Thu, Apr 19, 2018 at 07:44:40PM -0700, Eric Biggers wrote: > On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote: > > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > > > > > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > > > without corresponding fill_super() is safe. We have so far crashed with > > > rpc_mount() and kernfs_mount_ns(). Is that really safe? > > > > Consider the case when fill_super() returns an error immediately. > > It is exactly the same situation. And ->kill_sb() *is* called in cases > > when fill_super() has failed. Always had been - it's much less boilerplate > > that way. > > > > deactivate_locked_super() on that failure exit is the least painful > > variant, unfortunately. > > > > Filesystems with ->kill_sb() instances that rely upon something > > done between sget() and the first failure exit after it need to be fixed. > > And yes, that should've been spotted back then. Sorry. > > > > Fortunately, we don't have many of those - kill_{block,litter,anon}_super() > > are safe and those are the majority. Looking through the rest uncovers > > some bugs; so far all I've seen were already there. Note that normally > > we have something like > > static void affs_kill_sb(struct super_block *sb) > > { > > struct affs_sb_info *sbi = AFFS_SB(sb); > > kill_block_super(sb); > > if (sbi) { > > affs_free_bitmap(sb); > > affs_brelse(sbi->s_root_bh); > > kfree(sbi->s_prefix); > > mutex_destroy(>s_bmlock); > > kfree(sbi); > > } > > } > > which basically does one of the safe ones augmented with something that > > takes care *not* to assume that e.g. ->s_fs_info has been allocated. > > Not everyone does, though: > > > > jffs2_fill_super(): > > c = kzalloc(sizeof(*c), GFP_KERNEL); > > if (!c) > > return -ENOMEM; > > in the very beginning. So we can return from it with NULL ->s_fs_info. > > Now, consider > > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); > > if (!(sb->s_flags & MS_RDONLY)) > > jffs2_stop_garbage_collect_thread(c); > > in jffs2_kill_sb() and > > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) > > { > > int wait = 0; > > spin_lock(>erase_completion_lock); > > if (c->gc_task) { > > > > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) > > and eat an oops. Always had been there, always hard to hit without > > fault injectors and fortunately trivial to fix. > > > > Similar in nfs_kill_super() calling nfs_free_server(). > > Similar in v9fs_kill_super() with > > v9fs_session_cancel()/v9fs_session_close() calls. > > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), > > cifs_kill_sb() > > (all trivial to fix) > > > > Aha... nfsd_umount() is a new regression. > > > > orangefs: old, trivial to fix. > > > > cgroup_kill_sb(): old, hopefully easy to fix. Note that > > kernfs_root_from_sb() > > can bloody well return NULL, making cgroup_root_from_kf() oops. Always had > > been > > there. > > > > AFAICS, after discarding the instances that do the right thing we are left > > with: > > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, > > btrfs_kill_super, > > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, > > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. > > > > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are > > regressions. > > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've > > spotted > > in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been > > catching those - they are also dependent upon a specific allocation failing > > at the right time). > > > > Fix for the kernfs bug is now queued in vfs/for-linus: > > #syz fix: kernfs: deal with early sget() failures > But, there is still a related bug: when mounting sysfs, if register_shrinker() fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also freed in kernfs_mount_ns() by: sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags, _user_ns, info); if (IS_ERR(sb) || sb->s_fs_info != info) kfree(info); if (IS_ERR(sb)) return ERR_CAST(sb); I guess the problem is that sget_userns() shouldn't take ownership of the 'info' if it returns an error -- but, it actually does if register_shrinker() fails, resulting in a double free. Here is a reproducer and the KASAN splat. This is on Linus' tree (87ef12027b9b) with vfs/for-linus merged in. #define _GNU_SOURCE #include #include #include #include #include #include #include int main() { int fd, i; char buf[16]; unshare(CLONE_NEWNET);
Re: general protection fault in kernfs_kill_sb
On Thu, Apr 19, 2018 at 07:44:40PM -0700, Eric Biggers wrote: > On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote: > > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > > > > > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > > > without corresponding fill_super() is safe. We have so far crashed with > > > rpc_mount() and kernfs_mount_ns(). Is that really safe? > > > > Consider the case when fill_super() returns an error immediately. > > It is exactly the same situation. And ->kill_sb() *is* called in cases > > when fill_super() has failed. Always had been - it's much less boilerplate > > that way. > > > > deactivate_locked_super() on that failure exit is the least painful > > variant, unfortunately. > > > > Filesystems with ->kill_sb() instances that rely upon something > > done between sget() and the first failure exit after it need to be fixed. > > And yes, that should've been spotted back then. Sorry. > > > > Fortunately, we don't have many of those - kill_{block,litter,anon}_super() > > are safe and those are the majority. Looking through the rest uncovers > > some bugs; so far all I've seen were already there. Note that normally > > we have something like > > static void affs_kill_sb(struct super_block *sb) > > { > > struct affs_sb_info *sbi = AFFS_SB(sb); > > kill_block_super(sb); > > if (sbi) { > > affs_free_bitmap(sb); > > affs_brelse(sbi->s_root_bh); > > kfree(sbi->s_prefix); > > mutex_destroy(>s_bmlock); > > kfree(sbi); > > } > > } > > which basically does one of the safe ones augmented with something that > > takes care *not* to assume that e.g. ->s_fs_info has been allocated. > > Not everyone does, though: > > > > jffs2_fill_super(): > > c = kzalloc(sizeof(*c), GFP_KERNEL); > > if (!c) > > return -ENOMEM; > > in the very beginning. So we can return from it with NULL ->s_fs_info. > > Now, consider > > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); > > if (!(sb->s_flags & MS_RDONLY)) > > jffs2_stop_garbage_collect_thread(c); > > in jffs2_kill_sb() and > > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) > > { > > int wait = 0; > > spin_lock(>erase_completion_lock); > > if (c->gc_task) { > > > > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) > > and eat an oops. Always had been there, always hard to hit without > > fault injectors and fortunately trivial to fix. > > > > Similar in nfs_kill_super() calling nfs_free_server(). > > Similar in v9fs_kill_super() with > > v9fs_session_cancel()/v9fs_session_close() calls. > > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), > > cifs_kill_sb() > > (all trivial to fix) > > > > Aha... nfsd_umount() is a new regression. > > > > orangefs: old, trivial to fix. > > > > cgroup_kill_sb(): old, hopefully easy to fix. Note that > > kernfs_root_from_sb() > > can bloody well return NULL, making cgroup_root_from_kf() oops. Always had > > been > > there. > > > > AFAICS, after discarding the instances that do the right thing we are left > > with: > > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, > > btrfs_kill_super, > > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, > > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. > > > > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are > > regressions. > > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've > > spotted > > in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been > > catching those - they are also dependent upon a specific allocation failing > > at the right time). > > > > Fix for the kernfs bug is now queued in vfs/for-linus: > > #syz fix: kernfs: deal with early sget() failures > But, there is still a related bug: when mounting sysfs, if register_shrinker() fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the 'struct kernfs_super_info'. But, the 'struct kernfs_super_info' is also freed in kernfs_mount_ns() by: sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags, _user_ns, info); if (IS_ERR(sb) || sb->s_fs_info != info) kfree(info); if (IS_ERR(sb)) return ERR_CAST(sb); I guess the problem is that sget_userns() shouldn't take ownership of the 'info' if it returns an error -- but, it actually does if register_shrinker() fails, resulting in a double free. Here is a reproducer and the KASAN splat. This is on Linus' tree (87ef12027b9b) with vfs/for-linus merged in. #define _GNU_SOURCE #include #include #include #include #include #include #include int main() { int fd, i; char buf[16]; unshare(CLONE_NEWNET);
Re: general protection fault in kernfs_kill_sb
On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote: > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > > > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > > without corresponding fill_super() is safe. We have so far crashed with > > rpc_mount() and kernfs_mount_ns(). Is that really safe? > > Consider the case when fill_super() returns an error immediately. > It is exactly the same situation. And ->kill_sb() *is* called in cases > when fill_super() has failed. Always had been - it's much less boilerplate > that way. > > deactivate_locked_super() on that failure exit is the least painful > variant, unfortunately. > > Filesystems with ->kill_sb() instances that rely upon something > done between sget() and the first failure exit after it need to be fixed. > And yes, that should've been spotted back then. Sorry. > > Fortunately, we don't have many of those - kill_{block,litter,anon}_super() > are safe and those are the majority. Looking through the rest uncovers > some bugs; so far all I've seen were already there. Note that normally > we have something like > static void affs_kill_sb(struct super_block *sb) > { > struct affs_sb_info *sbi = AFFS_SB(sb); > kill_block_super(sb); > if (sbi) { > affs_free_bitmap(sb); > affs_brelse(sbi->s_root_bh); > kfree(sbi->s_prefix); > mutex_destroy(>s_bmlock); > kfree(sbi); > } > } > which basically does one of the safe ones augmented with something that > takes care *not* to assume that e.g. ->s_fs_info has been allocated. > Not everyone does, though: > > jffs2_fill_super(): > c = kzalloc(sizeof(*c), GFP_KERNEL); > if (!c) > return -ENOMEM; > in the very beginning. So we can return from it with NULL ->s_fs_info. > Now, consider > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); > if (!(sb->s_flags & MS_RDONLY)) > jffs2_stop_garbage_collect_thread(c); > in jffs2_kill_sb() and > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) > { > int wait = 0; > spin_lock(>erase_completion_lock); > if (c->gc_task) { > > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) > and eat an oops. Always had been there, always hard to hit without > fault injectors and fortunately trivial to fix. > > Similar in nfs_kill_super() calling nfs_free_server(). > Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() > calls. > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), > cifs_kill_sb() > (all trivial to fix) > > Aha... nfsd_umount() is a new regression. > > orangefs: old, trivial to fix. > > cgroup_kill_sb(): old, hopefully easy to fix. Note that kernfs_root_from_sb() > can bloody well return NULL, making cgroup_root_from_kf() oops. Always had > been > there. > > AFAICS, after discarding the instances that do the right thing we are left > with: > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, > btrfs_kill_super, > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. > > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions. > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted > in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been > catching those - they are also dependent upon a specific allocation failing > at the right time). > Fix for the kernfs bug is now queued in vfs/for-linus: #syz fix: kernfs: deal with early sget() failures syzkaller just recently (3 weeks ago) gained the ability to mount filesystem images, so that's the main reason for the increase in filesystem bug reports. Each time syzkaller is updated to cover more code, bugs are found. - Eric
Re: general protection fault in kernfs_kill_sb
On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote: > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > > > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > > without corresponding fill_super() is safe. We have so far crashed with > > rpc_mount() and kernfs_mount_ns(). Is that really safe? > > Consider the case when fill_super() returns an error immediately. > It is exactly the same situation. And ->kill_sb() *is* called in cases > when fill_super() has failed. Always had been - it's much less boilerplate > that way. > > deactivate_locked_super() on that failure exit is the least painful > variant, unfortunately. > > Filesystems with ->kill_sb() instances that rely upon something > done between sget() and the first failure exit after it need to be fixed. > And yes, that should've been spotted back then. Sorry. > > Fortunately, we don't have many of those - kill_{block,litter,anon}_super() > are safe and those are the majority. Looking through the rest uncovers > some bugs; so far all I've seen were already there. Note that normally > we have something like > static void affs_kill_sb(struct super_block *sb) > { > struct affs_sb_info *sbi = AFFS_SB(sb); > kill_block_super(sb); > if (sbi) { > affs_free_bitmap(sb); > affs_brelse(sbi->s_root_bh); > kfree(sbi->s_prefix); > mutex_destroy(>s_bmlock); > kfree(sbi); > } > } > which basically does one of the safe ones augmented with something that > takes care *not* to assume that e.g. ->s_fs_info has been allocated. > Not everyone does, though: > > jffs2_fill_super(): > c = kzalloc(sizeof(*c), GFP_KERNEL); > if (!c) > return -ENOMEM; > in the very beginning. So we can return from it with NULL ->s_fs_info. > Now, consider > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); > if (!(sb->s_flags & MS_RDONLY)) > jffs2_stop_garbage_collect_thread(c); > in jffs2_kill_sb() and > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) > { > int wait = 0; > spin_lock(>erase_completion_lock); > if (c->gc_task) { > > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) > and eat an oops. Always had been there, always hard to hit without > fault injectors and fortunately trivial to fix. > > Similar in nfs_kill_super() calling nfs_free_server(). > Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() > calls. > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), > cifs_kill_sb() > (all trivial to fix) > > Aha... nfsd_umount() is a new regression. > > orangefs: old, trivial to fix. > > cgroup_kill_sb(): old, hopefully easy to fix. Note that kernfs_root_from_sb() > can bloody well return NULL, making cgroup_root_from_kf() oops. Always had > been > there. > > AFAICS, after discarding the instances that do the right thing we are left > with: > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, > btrfs_kill_super, > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. > > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions. > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted > in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been > catching those - they are also dependent upon a specific allocation failing > at the right time). > Fix for the kernfs bug is now queued in vfs/for-linus: #syz fix: kernfs: deal with early sget() failures syzkaller just recently (3 weeks ago) gained the ability to mount filesystem images, so that's the main reason for the increase in filesystem bug reports. Each time syzkaller is updated to cover more code, bugs are found. - Eric
Re: general protection fault in kernfs_kill_sb
On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > without corresponding fill_super() is safe. We have so far crashed with > rpc_mount() and kernfs_mount_ns(). Is that really safe? Consider the case when fill_super() returns an error immediately. It is exactly the same situation. And ->kill_sb() *is* called in cases when fill_super() has failed. Always had been - it's much less boilerplate that way. deactivate_locked_super() on that failure exit is the least painful variant, unfortunately. Filesystems with ->kill_sb() instances that rely upon something done between sget() and the first failure exit after it need to be fixed. And yes, that should've been spotted back then. Sorry. Fortunately, we don't have many of those - kill_{block,litter,anon}_super() are safe and those are the majority. Looking through the rest uncovers some bugs; so far all I've seen were already there. Note that normally we have something like static void affs_kill_sb(struct super_block *sb) { struct affs_sb_info *sbi = AFFS_SB(sb); kill_block_super(sb); if (sbi) { affs_free_bitmap(sb); affs_brelse(sbi->s_root_bh); kfree(sbi->s_prefix); mutex_destroy(>s_bmlock); kfree(sbi); } } which basically does one of the safe ones augmented with something that takes care *not* to assume that e.g. ->s_fs_info has been allocated. Not everyone does, though: jffs2_fill_super(): c = kzalloc(sizeof(*c), GFP_KERNEL); if (!c) return -ENOMEM; in the very beginning. So we can return from it with NULL ->s_fs_info. Now, consider struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); if (!(sb->s_flags & MS_RDONLY)) jffs2_stop_garbage_collect_thread(c); in jffs2_kill_sb() and void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) { int wait = 0; spin_lock(>erase_completion_lock); if (c->gc_task) { IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) and eat an oops. Always had been there, always hard to hit without fault injectors and fortunately trivial to fix. Similar in nfs_kill_super() calling nfs_free_server(). Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() calls. Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), cifs_kill_sb() (all trivial to fix) Aha... nfsd_umount() is a new regression. orangefs: old, trivial to fix. cgroup_kill_sb(): old, hopefully easy to fix. Note that kernfs_root_from_sb() can bloody well return NULL, making cgroup_root_from_kf() oops. Always had been there. AFAICS, after discarding the instances that do the right thing we are left with: hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, btrfs_kill_super, cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions. So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been catching those - they are also dependent upon a specific allocation failing at the right time).
Re: general protection fault in kernfs_kill_sb
On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote: > That commit assumes that calling kill_sb() from deactivate_locked_super(s) > without corresponding fill_super() is safe. We have so far crashed with > rpc_mount() and kernfs_mount_ns(). Is that really safe? Consider the case when fill_super() returns an error immediately. It is exactly the same situation. And ->kill_sb() *is* called in cases when fill_super() has failed. Always had been - it's much less boilerplate that way. deactivate_locked_super() on that failure exit is the least painful variant, unfortunately. Filesystems with ->kill_sb() instances that rely upon something done between sget() and the first failure exit after it need to be fixed. And yes, that should've been spotted back then. Sorry. Fortunately, we don't have many of those - kill_{block,litter,anon}_super() are safe and those are the majority. Looking through the rest uncovers some bugs; so far all I've seen were already there. Note that normally we have something like static void affs_kill_sb(struct super_block *sb) { struct affs_sb_info *sbi = AFFS_SB(sb); kill_block_super(sb); if (sbi) { affs_free_bitmap(sb); affs_brelse(sbi->s_root_bh); kfree(sbi->s_prefix); mutex_destroy(>s_bmlock); kfree(sbi); } } which basically does one of the safe ones augmented with something that takes care *not* to assume that e.g. ->s_fs_info has been allocated. Not everyone does, though: jffs2_fill_super(): c = kzalloc(sizeof(*c), GFP_KERNEL); if (!c) return -ENOMEM; in the very beginning. So we can return from it with NULL ->s_fs_info. Now, consider struct jffs2_sb_info *c = JFFS2_SB_INFO(sb); if (!(sb->s_flags & MS_RDONLY)) jffs2_stop_garbage_collect_thread(c); in jffs2_kill_sb() and void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c) { int wait = 0; spin_lock(>erase_completion_lock); if (c->gc_task) { IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker()) and eat an oops. Always had been there, always hard to hit without fault injectors and fortunately trivial to fix. Similar in nfs_kill_super() calling nfs_free_server(). Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() calls. Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), cifs_kill_sb() (all trivial to fix) Aha... nfsd_umount() is a new regression. orangefs: old, trivial to fix. cgroup_kill_sb(): old, hopefully easy to fix. Note that kernfs_root_from_sb() can bloody well return NULL, making cgroup_root_from_kf() oops. Always had been there. AFAICS, after discarding the instances that do the right thing we are left with: hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, btrfs_kill_super, cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb, proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb. Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions. So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted in kernfs_kill_sb()). The rest are old (and I wonder if syzbot had been catching those - they are also dependent upon a specific allocation failing at the right time).
Re: general protection fault in kernfs_kill_sb
On 2018/04/02 2:01, syzbot wrote: > Hello, > > syzbot hit the following crash on bpf-next commit > 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +) > Merge branch 'bpf-cgroup-bind-connect' > syzbot dashboard link: > https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026 > > So far this crash happened 3 times on bpf-next. > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488 > syzkaller reproducer: > https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536 > Raw console output: > https://syzkaller.appspot.com/x/log.txt?id=5798498637185024 > Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926 > compiler: gcc (GCC) 7.1.1 20170620 Al, I think this is another example of crash triggered by commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()"). [ 23.407545] FAULT_INJECTION: forcing a failure. [ 23.407545] name failslab, interval 1, probability 0, space 0, times 1 [ 23.414735] CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43 [ 23.433147] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 23.442491] Call Trace: [ 23.445074] dump_stack+0x194/0x24d [ 23.448689] ? arch_local_irq_restore+0x53/0x53 [ 23.453347] ? find_held_lock+0x35/0x1d0 [ 23.457401] should_fail+0x8c0/0xa40 [ 23.461100] ? __list_lru_init+0x352/0x750 [ 23.465331] ? fault_create_debugfs_attr+0x1f0/0x1f0 [ 23.470453] ? find_held_lock+0x35/0x1d0 [ 23.474503] ? __lock_is_held+0xb6/0x140 [ 23.478556] ? check_same_owner+0x320/0x320 [ 23.482870] ? rcu_note_context_switch+0x710/0x710 [ 23.487785] ? find_held_lock+0x35/0x1d0 [ 23.491931] should_failslab+0xec/0x120 [ 23.495895] __kmalloc+0x63/0x760 [ 23.499332] ? lock_downgrade+0x980/0x980 [ 23.503469] ? _raw_spin_unlock+0x22/0x30 [ 23.507605] ? register_shrinker+0x10e/0x2d0 [ 23.511999] ? trace_event_raw_event_module_request+0x320/0x320 [ 23.518044] register_shrinker+0x10e/0x2d0 [ 23.522265] ? __bpf_trace_mm_vmscan_wakeup_kswapd+0x40/0x40 [ 23.528051] ? memcpy+0x45/0x50 [ 23.531588] sget_userns+0xbbf/0xe40 [ 23.535296] ? kernfs_sop_show_path+0x190/0x190 [ 23.539959] ? kernfs_sop_show_options+0x180/0x180 [ 23.544876] ? destroy_unused_super.part.6+0xd0/0xd0 [ 23.549972] ? check_same_owner+0x320/0x320 [ 23.554281] ? rcu_pm_notify+0xc0/0xc0 [ 23.558161] ? rcu_read_lock_sched_held+0x108/0x120 [ 23.563168] ? kmem_cache_alloc_trace+0x459/0x740 [ 23.567997] ? lock_downgrade+0x980/0x980 [ 23.572142] kernfs_mount_ns+0x13d/0x8b0 [ 23.576192] ? kernfs_super_ns+0x70/0x70 [ 23.580244] sysfs_mount+0xc2/0x1c0 That commit assumes that calling kill_sb() from deactivate_locked_super(s) without corresponding fill_super() is safe. We have so far crashed with rpc_mount() and kernfs_mount_ns(). Is that really safe? Also, I think struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, struct kernfs_root *root, unsigned long magic, bool *new_sb_created, const void *ns) { (...snipped...) if (!sb->s_root) { struct kernfs_super_info *info = kernfs_info(sb); error = kernfs_fill_super(sb, magic); if (error) { deactivate_locked_super(sb); // <= this call return ERR_PTR(error); } sb->s_flags |= SB_ACTIVE; mutex_lock(_mutex); list_add(>node, >supers); mutex_unlock(_mutex); } (...snipped...) } is not safe, for list_del() is called via kill_sb() without corresponding list_add(). void kernfs_kill_sb(struct super_block *sb) { struct kernfs_super_info *info = kernfs_info(sb); mutex_lock(_mutex); list_del(>node); // <= NULL pointer dereference mutex_unlock(_mutex); /* * Remove the superblock from fs_supers/s_instances * so we can't find it, before freeing kernfs_super_info. */ kill_anon_super(sb); kfree(info); } > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for details. > If you forward the report, please keep this part and the footer. > > kasan: GPF could be caused by NULL-ptr deref or user memory access > should_failslab+0xec/0x120 mm/failslab.c:32 > slab_pre_alloc_hook mm/slab.h:422 [inline] > slab_alloc mm/slab.c:3365 [inline] > __do_kmalloc mm/slab.c:3703 [inline] > __kmalloc+0x63/0x760
Re: general protection fault in kernfs_kill_sb
On 2018/04/02 2:01, syzbot wrote: > Hello, > > syzbot hit the following crash on bpf-next commit > 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +) > Merge branch 'bpf-cgroup-bind-connect' > syzbot dashboard link: > https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026 > > So far this crash happened 3 times on bpf-next. > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488 > syzkaller reproducer: > https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536 > Raw console output: > https://syzkaller.appspot.com/x/log.txt?id=5798498637185024 > Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926 > compiler: gcc (GCC) 7.1.1 20170620 Al, I think this is another example of crash triggered by commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()"). [ 23.407545] FAULT_INJECTION: forcing a failure. [ 23.407545] name failslab, interval 1, probability 0, space 0, times 1 [ 23.414735] CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43 [ 23.433147] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 23.442491] Call Trace: [ 23.445074] dump_stack+0x194/0x24d [ 23.448689] ? arch_local_irq_restore+0x53/0x53 [ 23.453347] ? find_held_lock+0x35/0x1d0 [ 23.457401] should_fail+0x8c0/0xa40 [ 23.461100] ? __list_lru_init+0x352/0x750 [ 23.465331] ? fault_create_debugfs_attr+0x1f0/0x1f0 [ 23.470453] ? find_held_lock+0x35/0x1d0 [ 23.474503] ? __lock_is_held+0xb6/0x140 [ 23.478556] ? check_same_owner+0x320/0x320 [ 23.482870] ? rcu_note_context_switch+0x710/0x710 [ 23.487785] ? find_held_lock+0x35/0x1d0 [ 23.491931] should_failslab+0xec/0x120 [ 23.495895] __kmalloc+0x63/0x760 [ 23.499332] ? lock_downgrade+0x980/0x980 [ 23.503469] ? _raw_spin_unlock+0x22/0x30 [ 23.507605] ? register_shrinker+0x10e/0x2d0 [ 23.511999] ? trace_event_raw_event_module_request+0x320/0x320 [ 23.518044] register_shrinker+0x10e/0x2d0 [ 23.522265] ? __bpf_trace_mm_vmscan_wakeup_kswapd+0x40/0x40 [ 23.528051] ? memcpy+0x45/0x50 [ 23.531588] sget_userns+0xbbf/0xe40 [ 23.535296] ? kernfs_sop_show_path+0x190/0x190 [ 23.539959] ? kernfs_sop_show_options+0x180/0x180 [ 23.544876] ? destroy_unused_super.part.6+0xd0/0xd0 [ 23.549972] ? check_same_owner+0x320/0x320 [ 23.554281] ? rcu_pm_notify+0xc0/0xc0 [ 23.558161] ? rcu_read_lock_sched_held+0x108/0x120 [ 23.563168] ? kmem_cache_alloc_trace+0x459/0x740 [ 23.567997] ? lock_downgrade+0x980/0x980 [ 23.572142] kernfs_mount_ns+0x13d/0x8b0 [ 23.576192] ? kernfs_super_ns+0x70/0x70 [ 23.580244] sysfs_mount+0xc2/0x1c0 That commit assumes that calling kill_sb() from deactivate_locked_super(s) without corresponding fill_super() is safe. We have so far crashed with rpc_mount() and kernfs_mount_ns(). Is that really safe? Also, I think struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags, struct kernfs_root *root, unsigned long magic, bool *new_sb_created, const void *ns) { (...snipped...) if (!sb->s_root) { struct kernfs_super_info *info = kernfs_info(sb); error = kernfs_fill_super(sb, magic); if (error) { deactivate_locked_super(sb); // <= this call return ERR_PTR(error); } sb->s_flags |= SB_ACTIVE; mutex_lock(_mutex); list_add(>node, >supers); mutex_unlock(_mutex); } (...snipped...) } is not safe, for list_del() is called via kill_sb() without corresponding list_add(). void kernfs_kill_sb(struct super_block *sb) { struct kernfs_super_info *info = kernfs_info(sb); mutex_lock(_mutex); list_del(>node); // <= NULL pointer dereference mutex_unlock(_mutex); /* * Remove the superblock from fs_supers/s_instances * so we can't find it, before freeing kernfs_super_info. */ kill_anon_super(sb); kfree(info); } > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for details. > If you forward the report, please keep this part and the footer. > > kasan: GPF could be caused by NULL-ptr deref or user memory access > should_failslab+0xec/0x120 mm/failslab.c:32 > slab_pre_alloc_hook mm/slab.h:422 [inline] > slab_alloc mm/slab.c:3365 [inline] > __do_kmalloc mm/slab.c:3703 [inline] > __kmalloc+0x63/0x760
general protection fault in kernfs_kill_sb
Hello, syzbot hit the following crash on bpf-next commit 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +) Merge branch 'bpf-cgroup-bind-connect' syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026 So far this crash happened 3 times on bpf-next. C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488 syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536 Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5798498637185024 Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926 compiler: gcc (GCC) 7.1.1 20170620 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. kasan: GPF could be caused by NULL-ptr deref or user memory access should_failslab+0xec/0x120 mm/failslab.c:32 slab_pre_alloc_hook mm/slab.h:422 [inline] slab_alloc mm/slab.c:3365 [inline] __do_kmalloc mm/slab.c:3703 [inline] __kmalloc+0x63/0x760 mm/slab.c:3714 general protection fault: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) kmalloc include/linux/slab.h:517 [inline] kzalloc include/linux/slab.h:701 [inline] register_shrinker+0x10e/0x2d0 mm/vmscan.c:268 Modules linked in: CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43 sget_userns+0xbbf/0xe40 fs/super.c:520 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP: 0018:8801ae017658 EFLAGS: 00010246 RAX: dc00 RBX: RCX: RDX: RSI: 8801d97a6e98 RDI: 8801d97a6ea0 RBP: 8801ae017670 R08: 81d39d22 R09: 0004 R10: 8801ae017670 R11: R12: R13: 8801d91dec00 R14: 8801ae017700 R15: 8801d97a6e98 FS: 01569880() GS:8801db10() knlGS: kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320 CS: 0010 DS: ES: CR0: 80050033 CR2: 006d0188 CR3: 0001da40c005 CR4: 001606e0 DR0: DR1: DR2: sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36 DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: __list_del_entry include/linux/list.h:117 [inline] list_del include/linux/list.h:125 [inline] kernfs_kill_sb+0x9e/0x330 fs/kernfs/mount.c:361 mount_fs+0x66/0x2d0 fs/super.c:1222 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 vfs_kern_mount fs/namespace.c:2509 [inline] do_new_mount fs/namespace.c:2512 [inline] do_mount+0xea4/0x2bb0 fs/namespace.c:2842 deactivate_locked_super+0x88/0xd0 fs/super.c:312 sget_userns+0xbda/0xe40 fs/super.c:522 SYSC_mount fs/namespace.c:3058 [inline] SyS_mount+0xab/0x120 fs/namespace.c:3035 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36 mount_fs+0x66/0x2d0 fs/super.c:1222 entry_SYSCALL_64_after_hwframe+0x42/0xb7 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037 RIP: 0033:0x442609 RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 00442609 RDX: 2140 RSI: 2040 RDI: 2000 RBP: 7fff40a28190 R08: 22c0 R09: 0003 vfs_kern_mount fs/namespace.c:2509 [inline] do_new_mount fs/namespace.c:2512 [inline] do_mount+0xea4/0x2bb0 fs/namespace.c:2842 R10: R11: 0246 R12: R13: 0003 R14: 1380 R15: 7fff40a27a28 SYSC_mount fs/namespace.c:3058 [inline] SyS_mount+0xab/0x120 fs/namespace.c:3035 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x442609 RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 00442609 RDX: 2140 RSI: 2040 RDI: 2000 RBP: 7fff40a28190 R08: 22c0 R09: 0003 R10: R11: 0246 R12: R13: 0003 R14: 1380 R15: 7fff40a27a28 Code: 00 00 00 00 ad de 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48 89 da 48 39 c3 74 65 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8 RIP: __list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP: 8801ae017658 ---[ end trace b14d521943ecadbd ]--- --- This bug is generated by a dumb bot. It may contain errors. See https://goo.gl/tpsmEJ for details. Direct
general protection fault in kernfs_kill_sb
Hello, syzbot hit the following crash on bpf-next commit 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +) Merge branch 'bpf-cgroup-bind-connect' syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026 So far this crash happened 3 times on bpf-next. C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488 syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536 Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5798498637185024 Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926 compiler: gcc (GCC) 7.1.1 20170620 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. kasan: GPF could be caused by NULL-ptr deref or user memory access should_failslab+0xec/0x120 mm/failslab.c:32 slab_pre_alloc_hook mm/slab.h:422 [inline] slab_alloc mm/slab.c:3365 [inline] __do_kmalloc mm/slab.c:3703 [inline] __kmalloc+0x63/0x760 mm/slab.c:3714 general protection fault: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) kmalloc include/linux/slab.h:517 [inline] kzalloc include/linux/slab.h:701 [inline] register_shrinker+0x10e/0x2d0 mm/vmscan.c:268 Modules linked in: CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43 sget_userns+0xbbf/0xe40 fs/super.c:520 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP: 0018:8801ae017658 EFLAGS: 00010246 RAX: dc00 RBX: RCX: RDX: RSI: 8801d97a6e98 RDI: 8801d97a6ea0 RBP: 8801ae017670 R08: 81d39d22 R09: 0004 R10: 8801ae017670 R11: R12: R13: 8801d91dec00 R14: 8801ae017700 R15: 8801d97a6e98 FS: 01569880() GS:8801db10() knlGS: kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320 CS: 0010 DS: ES: CR0: 80050033 CR2: 006d0188 CR3: 0001da40c005 CR4: 001606e0 DR0: DR1: DR2: sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36 DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: __list_del_entry include/linux/list.h:117 [inline] list_del include/linux/list.h:125 [inline] kernfs_kill_sb+0x9e/0x330 fs/kernfs/mount.c:361 mount_fs+0x66/0x2d0 fs/super.c:1222 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50 vfs_kern_mount fs/namespace.c:2509 [inline] do_new_mount fs/namespace.c:2512 [inline] do_mount+0xea4/0x2bb0 fs/namespace.c:2842 deactivate_locked_super+0x88/0xd0 fs/super.c:312 sget_userns+0xbda/0xe40 fs/super.c:522 SYSC_mount fs/namespace.c:3058 [inline] SyS_mount+0xab/0x120 fs/namespace.c:3035 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36 mount_fs+0x66/0x2d0 fs/super.c:1222 entry_SYSCALL_64_after_hwframe+0x42/0xb7 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037 RIP: 0033:0x442609 RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 00442609 RDX: 2140 RSI: 2040 RDI: 2000 RBP: 7fff40a28190 R08: 22c0 R09: 0003 vfs_kern_mount fs/namespace.c:2509 [inline] do_new_mount fs/namespace.c:2512 [inline] do_mount+0xea4/0x2bb0 fs/namespace.c:2842 R10: R11: 0246 R12: R13: 0003 R14: 1380 R15: 7fff40a27a28 SYSC_mount fs/namespace.c:3058 [inline] SyS_mount+0xab/0x120 fs/namespace.c:3035 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x442609 RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5 RAX: ffda RBX: RCX: 00442609 RDX: 2140 RSI: 2040 RDI: 2000 RBP: 7fff40a28190 R08: 22c0 R09: 0003 R10: R11: 0246 R12: R13: 0003 R14: 1380 R15: 7fff40a27a28 Code: 00 00 00 00 ad de 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48 89 da 48 39 c3 74 65 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8 RIP: __list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP: 8801ae017658 ---[ end trace b14d521943ecadbd ]--- --- This bug is generated by a dumb bot. It may contain errors. See https://goo.gl/tpsmEJ for details. Direct