Re: general protection fault in kernfs_kill_sb (2)

2018-05-14 Thread Stephen Rothwell
Hi Al,

On Mon, 14 May 2018 05:04:15 +0100 Al Viro  wrote:
>
> On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote:
> 
> > But there remains a refcount bug because deactivate_locked_super() from
> > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
> > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
> > if kernfs_mount_ns() returned an error.  
> 
> Trivial:
> 
> unfuck sysfs_mount()
> 
> Signed-off-by: Al Viro 

I noticed this turn up in linux-next today.  Can I ask that you please
add an actual commit message to it before sending it on to Linus.
-- 
Cheers,
Stephen Rothwell


pgpt80ZhcWVlk.pgp
Description: OpenPGP digital signature


Re: general protection fault in kernfs_kill_sb (2)

2018-05-14 Thread Stephen Rothwell
Hi Al,

On Mon, 14 May 2018 05:04:15 +0100 Al Viro  wrote:
>
> On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote:
> 
> > But there remains a refcount bug because deactivate_locked_super() from
> > kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
> > sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
> > if kernfs_mount_ns() returned an error.  
> 
> Trivial:
> 
> unfuck sysfs_mount()
> 
> Signed-off-by: Al Viro 

I noticed this turn up in linux-next today.  Can I ask that you please
add an actual commit message to it before sending it on to Linus.
-- 
Cheers,
Stephen Rothwell


pgpt80ZhcWVlk.pgp
Description: OpenPGP digital signature


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 05:04:15AM +0100, Al Viro wrote:
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index b428d317ae92..92682fcc41f6 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type 
> *fs_type,
>  {
>   struct dentry *root;
>   void *ns;
> - bool new_sb;
> + bool new_sb = false;
>  
>   if (!(flags & SB_KERNMOUNT)) {
>   if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
> @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type 
> *fs_type,
>   ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
>   root = kernfs_mount_ns(fs_type, flags, sysfs_root,
>   SYSFS_MAGIC, _sb, ns);
> - if (IS_ERR(root) || !new_sb)
> + if (!new_sb)
>   kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
> - else if (new_sb)
> + else if (!IS_ERR(root))
>   root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
>  
>   return root;

What we want for that kobj_ns_drop() is "no fs instances created" (== no
->kill_sb(), be it now or later, to drop that kobj reference); for setting
->s_iflags - "new instance successfully set up".

That's it; all we need is new_sb that would be accurate on its own.
The problem is with kludging over the cases when it's left uninitialized
(early exits from kernfs_mount_ns()) with IS_ERR(root), which happens to
grab the cases when new_sb *was* set to true.  So the fix is to initialize
new_sb properly and get rid of that kludge.  Which turns the whole thing
into
if (!new_sb)
...
if (!IS_ERR(root) && new_sb)
...
i.e.
if (!new_sb)
...
else if (!IS_ERR(root))
...


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 05:04:15AM +0100, Al Viro wrote:
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index b428d317ae92..92682fcc41f6 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type 
> *fs_type,
>  {
>   struct dentry *root;
>   void *ns;
> - bool new_sb;
> + bool new_sb = false;
>  
>   if (!(flags & SB_KERNMOUNT)) {
>   if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
> @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type 
> *fs_type,
>   ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
>   root = kernfs_mount_ns(fs_type, flags, sysfs_root,
>   SYSFS_MAGIC, _sb, ns);
> - if (IS_ERR(root) || !new_sb)
> + if (!new_sb)
>   kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
> - else if (new_sb)
> + else if (!IS_ERR(root))
>   root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
>  
>   return root;

What we want for that kobj_ns_drop() is "no fs instances created" (== no
->kill_sb(), be it now or later, to drop that kobj reference); for setting
->s_iflags - "new instance successfully set up".

That's it; all we need is new_sb that would be accurate on its own.
The problem is with kludging over the cases when it's left uninitialized
(early exits from kernfs_mount_ns()) with IS_ERR(root), which happens to
grab the cases when new_sb *was* set to true.  So the fix is to initialize
new_sb properly and get rid of that kludge.  Which turns the whole thing
into
if (!new_sb)
...
if (!IS_ERR(root) && new_sb)
...
i.e.
if (!new_sb)
...
else if (!IS_ERR(root))
...


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote:

> But there remains a refcount bug because deactivate_locked_super() from
> kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
> sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
> if kernfs_mount_ns() returned an error.

Trivial:

unfuck sysfs_mount()

Signed-off-by: Al Viro 
---
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index b428d317ae92..92682fcc41f6 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type 
*fs_type,
 {
struct dentry *root;
void *ns;
-   bool new_sb;
+   bool new_sb = false;
 
if (!(flags & SB_KERNMOUNT)) {
if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
@@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type 
*fs_type,
ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
root = kernfs_mount_ns(fs_type, flags, sysfs_root,
SYSFS_MAGIC, _sb, ns);
-   if (IS_ERR(root) || !new_sb)
+   if (!new_sb)
kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
-   else if (new_sb)
+   else if (!IS_ERR(root))
root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
 
return root;


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote:

> But there remains a refcount bug because deactivate_locked_super() from
> kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
> sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
> if kernfs_mount_ns() returned an error.

Trivial:

unfuck sysfs_mount()

Signed-off-by: Al Viro 
---
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index b428d317ae92..92682fcc41f6 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type 
*fs_type,
 {
struct dentry *root;
void *ns;
-   bool new_sb;
+   bool new_sb = false;
 
if (!(flags & SB_KERNMOUNT)) {
if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
@@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type 
*fs_type,
ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
root = kernfs_mount_ns(fs_type, flags, sysfs_root,
SYSFS_MAGIC, _sb, ns);
-   if (IS_ERR(root) || !new_sb)
+   if (!new_sb)
kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
-   else if (new_sb)
+   else if (!IS_ERR(root))
root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
 
return root;


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Sun, May 13, 2018 at 11:19:46AM +0900, Tetsuo Handa wrote:

> This is what I reported at
> https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ .
> 
> We are currently waiting for comments from Al Viro.

1) the damn thing is unusable without javashit.  Which gets about
the same reaction as sending something.doc in attachment.  Please,
find a less obnoxious way to archive the thing (or to generate
URLs that would work without that garbage).

2) deactivate_locked_super() *WILL* be called when fill_super() fails.
Live with it; it allows to simplify a whole lot of cleanup logics
in various filesystems.  Again, we are not going for a model where
->kill_sb() is not called for something returned by sget().
Rationale: rarely exercised paths tend to rot, so anything that increases
the duplication of bits and pieces of normal teardown into failure exits
of foo_fill_super() is a bloody bad idea.  If anything, we want to take
a lot of stuff out of ->put_super() instances directly into ->kill_sb()
ones, precisely because ->put_super() is only called for fully set up
filesystems.

3) kernfs needs to be fixed.  The rest of the dropped commits were
made redundant by 8e04944f0ea8; this one wasn't.  Mea culpa.


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Sun, May 13, 2018 at 11:19:46AM +0900, Tetsuo Handa wrote:

> This is what I reported at
> https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ .
> 
> We are currently waiting for comments from Al Viro.

1) the damn thing is unusable without javashit.  Which gets about
the same reaction as sending something.doc in attachment.  Please,
find a less obnoxious way to archive the thing (or to generate
URLs that would work without that garbage).

2) deactivate_locked_super() *WILL* be called when fill_super() fails.
Live with it; it allows to simplify a whole lot of cleanup logics
in various filesystems.  Again, we are not going for a model where
->kill_sb() is not called for something returned by sget().
Rationale: rarely exercised paths tend to rot, so anything that increases
the duplication of bits and pieces of normal teardown into failure exits
of foo_fill_super() is a bloody bad idea.  If anything, we want to take
a lot of stuff out of ->put_super() instances directly into ->kill_sb()
ones, precisely because ->put_super() is only called for fully set up
filesystems.

3) kernfs needs to be fixed.  The rest of the dropped commits were
made redundant by 8e04944f0ea8; this one wasn't.  Mea culpa.


Re: general protection fault in kernfs_kill_sb (2)

2018-05-12 Thread Tetsuo Handa
On 2018/05/13 2:01, syzbot wrote:
> Call Trace:
>  __list_del_entry include/linux/list.h:117 [inline]
>  list_del include/linux/list.h:125 [inline]
>  kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361
>  sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
>  deactivate_locked_super+0x97/0x100 fs/super.c:316
>  kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335
>  sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36
>  mount_fs+0xae/0x328 fs/super.c:1267
>  vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
>  vfs_kern_mount fs/namespace.c:1027 [inline]
>  do_new_mount fs/namespace.c:2518 [inline]
>  do_mount+0x564/0x3070 fs/namespace.c:2848
>  ksys_mount+0x12d/0x140 fs/namespace.c:3064
>  __do_sys_mount fs/namespace.c:3078 [inline]
>  __se_sys_mount fs/namespace.c:3075 [inline]
>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4426a9
> RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5
> RAX: ffda RBX:  RCX: 004426a9
> RDX: 20c0 RSI: 2080 RDI: 2040
> RBP: 7ffc558bda60 R08:  R09: 0003
> R10:  R11: 0246 R12: 
> R13: 0004 R14: 1380 R15: 7ffc558bd2f8
> Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 
> 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f 
> 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8
> RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 
> 8801aca6f860
> ---[ end trace d148f307a34e229f ]---

This is what I reported at
https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ .

We are currently waiting for comments from Al Viro.


Re: general protection fault in kernfs_kill_sb (2)

2018-05-12 Thread Tetsuo Handa
On 2018/05/13 2:01, syzbot wrote:
> Call Trace:
>  __list_del_entry include/linux/list.h:117 [inline]
>  list_del include/linux/list.h:125 [inline]
>  kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361
>  sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
>  deactivate_locked_super+0x97/0x100 fs/super.c:316
>  kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335
>  sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36
>  mount_fs+0xae/0x328 fs/super.c:1267
>  vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
>  vfs_kern_mount fs/namespace.c:1027 [inline]
>  do_new_mount fs/namespace.c:2518 [inline]
>  do_mount+0x564/0x3070 fs/namespace.c:2848
>  ksys_mount+0x12d/0x140 fs/namespace.c:3064
>  __do_sys_mount fs/namespace.c:3078 [inline]
>  __se_sys_mount fs/namespace.c:3075 [inline]
>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4426a9
> RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5
> RAX: ffda RBX:  RCX: 004426a9
> RDX: 20c0 RSI: 2080 RDI: 2040
> RBP: 7ffc558bda60 R08:  R09: 0003
> R10:  R11: 0246 R12: 
> R13: 0004 R14: 1380 R15: 7ffc558bd2f8
> Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5 00 
> 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 5f 
> 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8
> RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP: 
> 8801aca6f860
> ---[ end trace d148f307a34e229f ]---

This is what I reported at
https://groups.google.com/d/msg/syzkaller-bugs/ISOJlV2I2QM/qHslGMi3AwAJ .

We are currently waiting for comments from Al Viro.


general protection fault in kernfs_kill_sb (2)

2018-05-12 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:f0ab773f5c96 Merge branch 'akpm' (patches from Andrew)
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=140ce81780
kernel config:  https://syzkaller.appspot.com/x/.config?x=fcce42b221691ff9
dashboard link: https://syzkaller.appspot.com/bug?extid=481ad819c717f6b78df9
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15078e9780
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12bf369780

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+481ad819c717f6b78...@syzkaller.appspotmail.com

RBP: 7ffc558bda60 R08:  R09: 0003
R10:  R11: 0246 R12: 
R13: 0004 R14: 1380 R15: 7ffc558bd2f8
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 4540 Comm: syz-executor884 Not tainted 4.17.0-rc4+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:__list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51
RSP: 0018:8801aca6f860 EFLAGS: 00010246
RAX: dc00 RBX: 8801ac528498 RCX: 
RDX:  RSI:  RDI: 8801ac5284a0
RBP: 8801aca6f878 R08: fbfff11cc855 R09: fbfff11cc854
R10: 8801aca6f878 R11: 88e642a7 R12: 
R13:  R14: 8801aca6f908 R15: 8801ac528498
FS:  00f86880() GS:8801daf0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004b6f3c CR3: 0001d973 CR4: 001406e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 __list_del_entry include/linux/list.h:117 [inline]
 list_del include/linux/list.h:125 [inline]
 kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361
 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
 deactivate_locked_super+0x97/0x100 fs/super.c:316
 kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335
 sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36
 mount_fs+0xae/0x328 fs/super.c:1267
 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:1027 [inline]
 do_new_mount fs/namespace.c:2518 [inline]
 do_mount+0x564/0x3070 fs/namespace.c:2848
 ksys_mount+0x12d/0x140 fs/namespace.c:3064
 __do_sys_mount fs/namespace.c:3078 [inline]
 __se_sys_mount fs/namespace.c:3075 [inline]
 __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4426a9
RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 004426a9
RDX: 20c0 RSI: 2080 RDI: 2040
RBP: 7ffc558bda60 R08:  R09: 0003
R10:  R11: 0246 R12: 
R13: 0004 R14: 1380 R15: 7ffc558bd2f8
Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5  
00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00  
75 5f 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8
RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP:  
8801aca6f860

---[ end trace d148f307a34e229f ]---


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is  
merged

into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug  
report.

Note: all commands must start from beginning of the line in the email body.


general protection fault in kernfs_kill_sb (2)

2018-05-12 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:f0ab773f5c96 Merge branch 'akpm' (patches from Andrew)
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=140ce81780
kernel config:  https://syzkaller.appspot.com/x/.config?x=fcce42b221691ff9
dashboard link: https://syzkaller.appspot.com/bug?extid=481ad819c717f6b78df9
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15078e9780
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12bf369780

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+481ad819c717f6b78...@syzkaller.appspotmail.com

RBP: 7ffc558bda60 R08:  R09: 0003
R10:  R11: 0246 R12: 
R13: 0004 R14: 1380 R15: 7ffc558bd2f8
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 4540 Comm: syz-executor884 Not tainted 4.17.0-rc4+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:__list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51
RSP: 0018:8801aca6f860 EFLAGS: 00010246
RAX: dc00 RBX: 8801ac528498 RCX: 
RDX:  RSI:  RDI: 8801ac5284a0
RBP: 8801aca6f878 R08: fbfff11cc855 R09: fbfff11cc854
R10: 8801aca6f878 R11: 88e642a7 R12: 
R13:  R14: 8801aca6f908 R15: 8801ac528498
FS:  00f86880() GS:8801daf0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004b6f3c CR3: 0001d973 CR4: 001406e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 __list_del_entry include/linux/list.h:117 [inline]
 list_del include/linux/list.h:125 [inline]
 kernfs_kill_sb+0xa0/0x350 fs/kernfs/mount.c:361
 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
 deactivate_locked_super+0x97/0x100 fs/super.c:316
 kernfs_mount_ns+0x753/0x8e0 fs/kernfs/mount.c:335
 sysfs_mount+0xdf/0x200 fs/sysfs/mount.c:36
 mount_fs+0xae/0x328 fs/super.c:1267
 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:1027 [inline]
 do_new_mount fs/namespace.c:2518 [inline]
 do_mount+0x564/0x3070 fs/namespace.c:2848
 ksys_mount+0x12d/0x140 fs/namespace.c:3064
 __do_sys_mount fs/namespace.c:3078 [inline]
 __se_sys_mount fs/namespace.c:3075 [inline]
 __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4426a9
RSP: 002b:7ffc558bd1b8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 004426a9
RDX: 20c0 RSI: 2080 RDI: 2040
RBP: 7ffc558bda60 R08:  R09: 0003
R10:  R11: 0246 R12: 
R13: 0004 R14: 1380 R15: 7ffc558bd2f8
Code: c5 0f 84 cc 00 00 00 48 b8 00 02 00 00 00 00 ad de 49 39 c4 0f 84 a5  
00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00  
75 5f 49 8b 14 24 48 39 da 0f 85 ba 00 00 00 48 b8
RIP: __list_del_entry_valid+0x84/0xf3 lib/list_debug.c:51 RSP:  
8801aca6f860

---[ end trace d148f307a34e229f ]---


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is  
merged

into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug  
report.

Note: all commands must start from beginning of the line in the email body.


Re: general protection fault in kernfs_kill_sb

2018-05-02 Thread Tetsuo Handa
On 2018/04/20 11:44, Eric Biggers wrote:
> Fix for the kernfs bug is now queued in vfs/for-linus:
> 
> #syz fix: kernfs: deal with early sget() failures

Well, the following patches

  rpc_pipefs: deal with early sget() failures
  kernfs: deal with early sget() failures
  procfs: deal with early sget() failures
  nfsd_umount(): deal with early sget() failures
  nfs: avoid double-free on early sget() failures

are dropped from vfs.git#for-linus while this report is marked
as "#syz fix: kernfs: deal with early sget() failures". The patch which
actually went to linux.git is 8e04944f0ea8b838.

#syz fix: mm,vmscan: Allow preallocating memory for register_shrinker().



By the way, we still have NULL pointer dereference (as of f2125992e7cb25ec
on linux.git) shown below due to calling deactivate_locked_super() without
successful fill_super().

--
[  162.865231] BUG: unable to handle kernel NULL pointer dereference at 

[  162.873678] PGD 130487067 P4D 130487067 PUD 138750067 PMD 0 
[  162.879845] Oops:  [#1] SMP
[  162.883295] Modules linked in:
[  162.886648] CPU: 2 PID: 15505 Comm: a.out Kdump: loaded Tainted: G   
 T 4.17.0-rc3+ #522
[  162.894891] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 05/19/2017
[  162.899415] RIP: 0010:__list_del_entry_valid+0x29/0x90
[  162.901609] RSP: 0018:c90001e07ce0 EFLAGS: 00010207
[  162.903834] RAX:  RBX: 880132359580 RCX: dead0200
[  162.906825] RDX:  RSI: e85efffd RDI: 880132359598
[  162.909863] RBP: c90001e07ce0 R08: 815269c5 R09: 0004
[  162.912923] R10: c90001e07ce0 R11: 840f2060 R12: 880134c6e000
[  162.915929] R13: 88013a014f00 R14: 880134c6e000 R15: 880132359580
[  162.918927] FS:  7f6525b13740() GS:88013a68() 
knlGS:
[  162.922325] CS:  0010 DS:  ES:  CR0: 80050033
[  162.924751] CR2:  CR3: 000134b2e006 CR4: 000606e0
[  162.927236] Call Trace:
[  162.928153]  kernfs_kill_sb+0x2e/0x90
[  162.929377]  sysfs_kill_sb+0x22/0x40
[  162.930571]  deactivate_locked_super+0x50/0x90
[  162.931958]  kernfs_mount_ns+0x283/0x290
[  162.933126]  sysfs_mount+0x74/0xf0
[  162.934146]  mount_fs+0x46/0x1a0
[  162.935137]  vfs_kern_mount.part.28+0x67/0x190
[  162.936449]  do_mount+0x7b0/0x11f0
[  162.937473]  ? memdup_user+0x5e/0x90
[  162.938541]  ? copy_mount_options+0x1a4/0x2d0
[  162.939828]  ksys_mount+0xab/0x120
[  162.940954]  __x64_sys_mount+0x26/0x30
[  162.942153]  do_syscall_64+0x7b/0x260
[  162.943274]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  162.944903] RIP: 0033:0x7f6525640aaa
[  162.946012] RSP: 002b:7ffc4f4e6f78 EFLAGS: 0246 ORIG_RAX: 
00a5
[  162.948291] RAX: ffda RBX: 000e RCX: 7f6525640aaa
[  162.950396] RDX: 00400896 RSI: 0040089c RDI: 00400896
[  162.952480] RBP: 0003 R08:  R09: 0002
[  162.954545] R10:  R11: 0246 R12: 00400694
[  162.957760] R13: 7ffc4f4e7080 R14:  R15: 
[  162.963839] Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 
08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 2c <48> 8b 
32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 
[  162.974795] RIP: __list_del_entry_valid+0x29/0x90 RSP: c90001e07ce0
[  162.976752] CR2: 
--

Below patch can avoid NULL pointer dereference at kernfs_kill_sb().

--
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 26dd9a5..498c044 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -314,6 +314,7 @@ struct dentry *kernfs_mount_ns(struct file_system_type 
*fs_type, int flags,
if (!info)
return ERR_PTR(-ENOMEM);
 
+   INIT_LIST_HEAD(>node);
info->root = root;
info->ns = ns;
 
--

But there remains a refcount bug because deactivate_locked_super() from
kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
if kernfs_mount_ns() returned an error.

--
 static void *net_grab_current_ns(void)
 {
struct net *ns = current->nsproxy->net_ns;
 #ifdef CONFIG_NET_NS
if (ns)
refcount_inc(>passive);
if (ns && !strcmp(current->comm, "a.out"))
printk("net_grab_current_ns: %px %d %d\n", ns,
   refcount_read(>passive), refcount_read(>count));
 #endif
return ns;
 }

 void net_drop_ns(void *p)
 {
struct net *ns = p;
if (ns && !strcmp(current->comm, "a.out")) {
printk("net_drop_ns: %px %d %d\n", ns,
   refcount_read(>passive), refcount_read(>count));
dump_stack();
}
if (ns && 

Re: general protection fault in kernfs_kill_sb

2018-05-02 Thread Tetsuo Handa
On 2018/04/20 11:44, Eric Biggers wrote:
> Fix for the kernfs bug is now queued in vfs/for-linus:
> 
> #syz fix: kernfs: deal with early sget() failures

Well, the following patches

  rpc_pipefs: deal with early sget() failures
  kernfs: deal with early sget() failures
  procfs: deal with early sget() failures
  nfsd_umount(): deal with early sget() failures
  nfs: avoid double-free on early sget() failures

are dropped from vfs.git#for-linus while this report is marked
as "#syz fix: kernfs: deal with early sget() failures". The patch which
actually went to linux.git is 8e04944f0ea8b838.

#syz fix: mm,vmscan: Allow preallocating memory for register_shrinker().



By the way, we still have NULL pointer dereference (as of f2125992e7cb25ec
on linux.git) shown below due to calling deactivate_locked_super() without
successful fill_super().

--
[  162.865231] BUG: unable to handle kernel NULL pointer dereference at 

[  162.873678] PGD 130487067 P4D 130487067 PUD 138750067 PMD 0 
[  162.879845] Oops:  [#1] SMP
[  162.883295] Modules linked in:
[  162.886648] CPU: 2 PID: 15505 Comm: a.out Kdump: loaded Tainted: G   
 T 4.17.0-rc3+ #522
[  162.894891] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 05/19/2017
[  162.899415] RIP: 0010:__list_del_entry_valid+0x29/0x90
[  162.901609] RSP: 0018:c90001e07ce0 EFLAGS: 00010207
[  162.903834] RAX:  RBX: 880132359580 RCX: dead0200
[  162.906825] RDX:  RSI: e85efffd RDI: 880132359598
[  162.909863] RBP: c90001e07ce0 R08: 815269c5 R09: 0004
[  162.912923] R10: c90001e07ce0 R11: 840f2060 R12: 880134c6e000
[  162.915929] R13: 88013a014f00 R14: 880134c6e000 R15: 880132359580
[  162.918927] FS:  7f6525b13740() GS:88013a68() 
knlGS:
[  162.922325] CS:  0010 DS:  ES:  CR0: 80050033
[  162.924751] CR2:  CR3: 000134b2e006 CR4: 000606e0
[  162.927236] Call Trace:
[  162.928153]  kernfs_kill_sb+0x2e/0x90
[  162.929377]  sysfs_kill_sb+0x22/0x40
[  162.930571]  deactivate_locked_super+0x50/0x90
[  162.931958]  kernfs_mount_ns+0x283/0x290
[  162.933126]  sysfs_mount+0x74/0xf0
[  162.934146]  mount_fs+0x46/0x1a0
[  162.935137]  vfs_kern_mount.part.28+0x67/0x190
[  162.936449]  do_mount+0x7b0/0x11f0
[  162.937473]  ? memdup_user+0x5e/0x90
[  162.938541]  ? copy_mount_options+0x1a4/0x2d0
[  162.939828]  ksys_mount+0xab/0x120
[  162.940954]  __x64_sys_mount+0x26/0x30
[  162.942153]  do_syscall_64+0x7b/0x260
[  162.943274]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  162.944903] RIP: 0033:0x7f6525640aaa
[  162.946012] RSP: 002b:7ffc4f4e6f78 EFLAGS: 0246 ORIG_RAX: 
00a5
[  162.948291] RAX: ffda RBX: 000e RCX: 7f6525640aaa
[  162.950396] RDX: 00400896 RSI: 0040089c RDI: 00400896
[  162.952480] RBP: 0003 R08:  R09: 0002
[  162.954545] R10:  R11: 0246 R12: 00400694
[  162.957760] R13: 7ffc4f4e7080 R14:  R15: 
[  162.963839] Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 
08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 2c <48> 8b 
32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 
[  162.974795] RIP: __list_del_entry_valid+0x29/0x90 RSP: c90001e07ce0
[  162.976752] CR2: 
--

Below patch can avoid NULL pointer dereference at kernfs_kill_sb().

--
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 26dd9a5..498c044 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -314,6 +314,7 @@ struct dentry *kernfs_mount_ns(struct file_system_type 
*fs_type, int flags,
if (!info)
return ERR_PTR(-ENOMEM);
 
+   INIT_LIST_HEAD(>node);
info->root = root;
info->ns = ns;
 
--

But there remains a refcount bug because deactivate_locked_super() from
kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
if kernfs_mount_ns() returned an error.

--
 static void *net_grab_current_ns(void)
 {
struct net *ns = current->nsproxy->net_ns;
 #ifdef CONFIG_NET_NS
if (ns)
refcount_inc(>passive);
if (ns && !strcmp(current->comm, "a.out"))
printk("net_grab_current_ns: %px %d %d\n", ns,
   refcount_read(>passive), refcount_read(>count));
 #endif
return ns;
 }

 void net_drop_ns(void *p)
 {
struct net *ns = p;
if (ns && !strcmp(current->comm, "a.out")) {
printk("net_drop_ns: %px %d %d\n", ns,
   refcount_read(>passive), refcount_read(>count));
dump_stack();
}
if (ns && 

Re: general protection fault in kernfs_kill_sb

2018-04-20 Thread Eric Biggers
On Fri, Apr 20, 2018 at 09:31:58AM +0200, Michal Hocko wrote:
> On Fri 20-04-18 14:29:39, Tetsuo Handa wrote:
> > Eric Biggers wrote:
> > > But, there is still a related bug: when mounting sysfs, if 
> > > register_shrinker()
> > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
> > > 'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also 
> > > freed
> > > in kernfs_mount_ns() by:
> > > 
> > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, 
> > > flags,
> > >  _user_ns, info);
> > > if (IS_ERR(sb) || sb->s_fs_info != info)
> > > kfree(info);
> > > if (IS_ERR(sb))
> > > return ERR_CAST(sb);
> > > 
> > > I guess the problem is that sget_userns() shouldn't take ownership of the 
> > > 'info'
> > > if it returns an error -- but, it actually does if register_shrinker() 
> > > fails,
> > > resulting in a double free.
> > > 
> > > Here is a reproducer and the KASAN splat.  This is on Linus' tree 
> > > (87ef12027b9b)
> > > with vfs/for-linus merged in.
> > 
> > I'm waiting for response from Michal Hocko regarding
> > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp
> >  .
> 
> I didn't plan to respond util all the Al's concerns with the existing
> scheme are resolved. This is not an urgent thing to fix so better fix it
> properly. Your API change is kinda ugly so it would be preferable to do
> it properly as suggested by Al. Maybe that will be more work but my
> understanding is that the resulting code would be better. If that is not
> the case then I do not really have any fundamental objection to your
> patch except it is ugly.

Okay, the fix was merged already as commit 8e04944f0ea8b8 ("mm,vmscan: Allow
preallocating memory for register_shrinker().").  Thanks Tetsuo!

- Eric


Re: general protection fault in kernfs_kill_sb

2018-04-20 Thread Eric Biggers
On Fri, Apr 20, 2018 at 09:31:58AM +0200, Michal Hocko wrote:
> On Fri 20-04-18 14:29:39, Tetsuo Handa wrote:
> > Eric Biggers wrote:
> > > But, there is still a related bug: when mounting sysfs, if 
> > > register_shrinker()
> > > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
> > > 'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also 
> > > freed
> > > in kernfs_mount_ns() by:
> > > 
> > > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, 
> > > flags,
> > >  _user_ns, info);
> > > if (IS_ERR(sb) || sb->s_fs_info != info)
> > > kfree(info);
> > > if (IS_ERR(sb))
> > > return ERR_CAST(sb);
> > > 
> > > I guess the problem is that sget_userns() shouldn't take ownership of the 
> > > 'info'
> > > if it returns an error -- but, it actually does if register_shrinker() 
> > > fails,
> > > resulting in a double free.
> > > 
> > > Here is a reproducer and the KASAN splat.  This is on Linus' tree 
> > > (87ef12027b9b)
> > > with vfs/for-linus merged in.
> > 
> > I'm waiting for response from Michal Hocko regarding
> > http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp
> >  .
> 
> I didn't plan to respond util all the Al's concerns with the existing
> scheme are resolved. This is not an urgent thing to fix so better fix it
> properly. Your API change is kinda ugly so it would be preferable to do
> it properly as suggested by Al. Maybe that will be more work but my
> understanding is that the resulting code would be better. If that is not
> the case then I do not really have any fundamental objection to your
> patch except it is ugly.

Okay, the fix was merged already as commit 8e04944f0ea8b8 ("mm,vmscan: Allow
preallocating memory for register_shrinker().").  Thanks Tetsuo!

- Eric


Re: general protection fault in kernfs_kill_sb

2018-04-20 Thread Michal Hocko
On Fri 20-04-18 14:29:39, Tetsuo Handa wrote:
> Eric Biggers wrote:
> > But, there is still a related bug: when mounting sysfs, if 
> > register_shrinker()
> > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
> > 'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also 
> > freed
> > in kernfs_mount_ns() by:
> > 
> > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, 
> > flags,
> >  _user_ns, info);
> > if (IS_ERR(sb) || sb->s_fs_info != info)
> > kfree(info);
> > if (IS_ERR(sb))
> > return ERR_CAST(sb);
> > 
> > I guess the problem is that sget_userns() shouldn't take ownership of the 
> > 'info'
> > if it returns an error -- but, it actually does if register_shrinker() 
> > fails,
> > resulting in a double free.
> > 
> > Here is a reproducer and the KASAN splat.  This is on Linus' tree 
> > (87ef12027b9b)
> > with vfs/for-linus merged in.
> 
> I'm waiting for response from Michal Hocko regarding
> http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp
>  .

I didn't plan to respond util all the Al's concerns with the existing
scheme are resolved. This is not an urgent thing to fix so better fix it
properly. Your API change is kinda ugly so it would be preferable to do
it properly as suggested by Al. Maybe that will be more work but my
understanding is that the resulting code would be better. If that is not
the case then I do not really have any fundamental objection to your
patch except it is ugly.
-- 
Michal Hocko
SUSE Labs


Re: general protection fault in kernfs_kill_sb

2018-04-20 Thread Michal Hocko
On Fri 20-04-18 14:29:39, Tetsuo Handa wrote:
> Eric Biggers wrote:
> > But, there is still a related bug: when mounting sysfs, if 
> > register_shrinker()
> > fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
> > 'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also 
> > freed
> > in kernfs_mount_ns() by:
> > 
> > sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, 
> > flags,
> >  _user_ns, info);
> > if (IS_ERR(sb) || sb->s_fs_info != info)
> > kfree(info);
> > if (IS_ERR(sb))
> > return ERR_CAST(sb);
> > 
> > I guess the problem is that sget_userns() shouldn't take ownership of the 
> > 'info'
> > if it returns an error -- but, it actually does if register_shrinker() 
> > fails,
> > resulting in a double free.
> > 
> > Here is a reproducer and the KASAN splat.  This is on Linus' tree 
> > (87ef12027b9b)
> > with vfs/for-linus merged in.
> 
> I'm waiting for response from Michal Hocko regarding
> http://lkml.kernel.org/r/201804111909.egc64586.qsflfjfovho...@i-love.sakura.ne.jp
>  .

I didn't plan to respond util all the Al's concerns with the existing
scheme are resolved. This is not an urgent thing to fix so better fix it
properly. Your API change is kinda ugly so it would be preferable to do
it properly as suggested by Al. Maybe that will be more work but my
understanding is that the resulting code would be better. If that is not
the case then I do not really have any fundamental objection to your
patch except it is ugly.
-- 
Michal Hocko
SUSE Labs


Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers
On Thu, Apr 19, 2018 at 07:44:40PM -0700, Eric Biggers wrote:
> On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> > 
> > > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > > without corresponding fill_super() is safe. We have so far crashed with
> > > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> > 
> > Consider the case when fill_super() returns an error immediately.
> > It is exactly the same situation.  And ->kill_sb() *is* called in cases
> > when fill_super() has failed.  Always had been - it's much less boilerplate
> > that way.
> > 
> > deactivate_locked_super() on that failure exit is the least painful
> > variant, unfortunately.
> > 
> > Filesystems with ->kill_sb() instances that rely upon something
> > done between sget() and the first failure exit after it need to be fixed.
> > And yes, that should've been spotted back then.  Sorry.
> > 
> > Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> > are safe and those are the majority.  Looking through the rest uncovers
> > some bugs; so far all I've seen were already there.  Note that normally
> > we have something like
> > static void affs_kill_sb(struct super_block *sb)
> > {
> > struct affs_sb_info *sbi = AFFS_SB(sb);
> > kill_block_super(sb);
> > if (sbi) {
> > affs_free_bitmap(sb);
> > affs_brelse(sbi->s_root_bh);
> > kfree(sbi->s_prefix);
> > mutex_destroy(>s_bmlock);
> > kfree(sbi);
> > }
> > }
> > which basically does one of the safe ones augmented with something that
> > takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> > Not everyone does, though:
> > 
> > jffs2_fill_super():
> > c = kzalloc(sizeof(*c), GFP_KERNEL);
> > if (!c)
> > return -ENOMEM;
> > in the very beginning.  So we can return from it with NULL ->s_fs_info.
> > Now, consider
> > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> > if (!(sb->s_flags & MS_RDONLY))
> > jffs2_stop_garbage_collect_thread(c);
> > in jffs2_kill_sb() and
> > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> > {
> > int wait = 0;
> > spin_lock(>erase_completion_lock);
> > if (c->gc_task) {
> > 
> > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> > and eat an oops.  Always had been there, always hard to hit without
> > fault injectors and fortunately trivial to fix.
> > 
> > Similar in nfs_kill_super() calling nfs_free_server().
> > Similar in v9fs_kill_super() with 
> > v9fs_session_cancel()/v9fs_session_close() calls.
> > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
> > cifs_kill_sb()
> > (all trivial to fix)
> > 
> > Aha... nfsd_umount() is a new regression.
> > 
> > orangefs: old, trivial to fix.
> > 
> > cgroup_kill_sb(): old, hopefully easy to fix.  Note that 
> > kernfs_root_from_sb()
> > can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had 
> > been
> > there.
> > 
> > AFAICS, after discarding the instances that do the right thing we are left 
> > with:
> > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
> > btrfs_kill_super,
> > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
> > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> > 
> > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are 
> > regressions.
> > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've 
> > spotted
> > in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> > catching those - they are also dependent upon a specific allocation failing
> > at the right time).
> > 
> 
> Fix for the kernfs bug is now queued in vfs/for-linus:
> 
> #syz fix: kernfs: deal with early sget() failures
> 

But, there is still a related bug: when mounting sysfs, if register_shrinker()
fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also freed
in kernfs_mount_ns() by:

sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags,
 _user_ns, info);
if (IS_ERR(sb) || sb->s_fs_info != info)
kfree(info);
if (IS_ERR(sb))
return ERR_CAST(sb);

I guess the problem is that sget_userns() shouldn't take ownership of the 'info'
if it returns an error -- but, it actually does if register_shrinker() fails,
resulting in a double free.

Here is a reproducer and the KASAN splat.  This is on Linus' tree (87ef12027b9b)
with vfs/for-linus merged in.

#define _GNU_SOURCE
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main()
{
int fd, i;
char buf[16];

unshare(CLONE_NEWNET);
 

Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers
On Thu, Apr 19, 2018 at 07:44:40PM -0700, Eric Biggers wrote:
> On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> > 
> > > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > > without corresponding fill_super() is safe. We have so far crashed with
> > > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> > 
> > Consider the case when fill_super() returns an error immediately.
> > It is exactly the same situation.  And ->kill_sb() *is* called in cases
> > when fill_super() has failed.  Always had been - it's much less boilerplate
> > that way.
> > 
> > deactivate_locked_super() on that failure exit is the least painful
> > variant, unfortunately.
> > 
> > Filesystems with ->kill_sb() instances that rely upon something
> > done between sget() and the first failure exit after it need to be fixed.
> > And yes, that should've been spotted back then.  Sorry.
> > 
> > Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> > are safe and those are the majority.  Looking through the rest uncovers
> > some bugs; so far all I've seen were already there.  Note that normally
> > we have something like
> > static void affs_kill_sb(struct super_block *sb)
> > {
> > struct affs_sb_info *sbi = AFFS_SB(sb);
> > kill_block_super(sb);
> > if (sbi) {
> > affs_free_bitmap(sb);
> > affs_brelse(sbi->s_root_bh);
> > kfree(sbi->s_prefix);
> > mutex_destroy(>s_bmlock);
> > kfree(sbi);
> > }
> > }
> > which basically does one of the safe ones augmented with something that
> > takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> > Not everyone does, though:
> > 
> > jffs2_fill_super():
> > c = kzalloc(sizeof(*c), GFP_KERNEL);
> > if (!c)
> > return -ENOMEM;
> > in the very beginning.  So we can return from it with NULL ->s_fs_info.
> > Now, consider
> > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> > if (!(sb->s_flags & MS_RDONLY))
> > jffs2_stop_garbage_collect_thread(c);
> > in jffs2_kill_sb() and
> > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> > {
> > int wait = 0;
> > spin_lock(>erase_completion_lock);
> > if (c->gc_task) {
> > 
> > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> > and eat an oops.  Always had been there, always hard to hit without
> > fault injectors and fortunately trivial to fix.
> > 
> > Similar in nfs_kill_super() calling nfs_free_server().
> > Similar in v9fs_kill_super() with 
> > v9fs_session_cancel()/v9fs_session_close() calls.
> > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
> > cifs_kill_sb()
> > (all trivial to fix)
> > 
> > Aha... nfsd_umount() is a new regression.
> > 
> > orangefs: old, trivial to fix.
> > 
> > cgroup_kill_sb(): old, hopefully easy to fix.  Note that 
> > kernfs_root_from_sb()
> > can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had 
> > been
> > there.
> > 
> > AFAICS, after discarding the instances that do the right thing we are left 
> > with:
> > hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
> > btrfs_kill_super,
> > cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
> > proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> > 
> > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are 
> > regressions.
> > So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've 
> > spotted
> > in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> > catching those - they are also dependent upon a specific allocation failing
> > at the right time).
> > 
> 
> Fix for the kernfs bug is now queued in vfs/for-linus:
> 
> #syz fix: kernfs: deal with early sget() failures
> 

But, there is still a related bug: when mounting sysfs, if register_shrinker()
fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also freed
in kernfs_mount_ns() by:

sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags,
 _user_ns, info);
if (IS_ERR(sb) || sb->s_fs_info != info)
kfree(info);
if (IS_ERR(sb))
return ERR_CAST(sb);

I guess the problem is that sget_userns() shouldn't take ownership of the 'info'
if it returns an error -- but, it actually does if register_shrinker() fails,
resulting in a double free.

Here is a reproducer and the KASAN splat.  This is on Linus' tree (87ef12027b9b)
with vfs/for-linus merged in.

#define _GNU_SOURCE
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main()
{
int fd, i;
char buf[16];

unshare(CLONE_NEWNET);
 

Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers
On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> 
> > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > without corresponding fill_super() is safe. We have so far crashed with
> > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> 
>   Consider the case when fill_super() returns an error immediately.
> It is exactly the same situation.  And ->kill_sb() *is* called in cases
> when fill_super() has failed.  Always had been - it's much less boilerplate
> that way.
> 
>   deactivate_locked_super() on that failure exit is the least painful
> variant, unfortunately.
> 
>   Filesystems with ->kill_sb() instances that rely upon something
> done between sget() and the first failure exit after it need to be fixed.
> And yes, that should've been spotted back then.  Sorry.
> 
> Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> are safe and those are the majority.  Looking through the rest uncovers
> some bugs; so far all I've seen were already there.  Note that normally
> we have something like
> static void affs_kill_sb(struct super_block *sb)
> {
> struct affs_sb_info *sbi = AFFS_SB(sb);
> kill_block_super(sb);
> if (sbi) {
> affs_free_bitmap(sb);
> affs_brelse(sbi->s_root_bh);
> kfree(sbi->s_prefix);
> mutex_destroy(>s_bmlock);
> kfree(sbi);
> }
> }
> which basically does one of the safe ones augmented with something that
> takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> Not everyone does, though:
> 
> jffs2_fill_super():
> c = kzalloc(sizeof(*c), GFP_KERNEL);
> if (!c)
> return -ENOMEM;
> in the very beginning.  So we can return from it with NULL ->s_fs_info.
> Now, consider
> struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> if (!(sb->s_flags & MS_RDONLY))
> jffs2_stop_garbage_collect_thread(c);
> in jffs2_kill_sb() and
> void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> {
> int wait = 0;
> spin_lock(>erase_completion_lock);
> if (c->gc_task) {
> 
> IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> and eat an oops.  Always had been there, always hard to hit without
> fault injectors and fortunately trivial to fix.
> 
> Similar in nfs_kill_super() calling nfs_free_server().
> Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() 
> calls.
> Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
> cifs_kill_sb()
> (all trivial to fix)
> 
> Aha... nfsd_umount() is a new regression.
> 
> orangefs: old, trivial to fix.
> 
> cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
> can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had 
> been
> there.
> 
> AFAICS, after discarding the instances that do the right thing we are left 
> with:
> hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
> btrfs_kill_super,
> cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
> proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> 
> Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions.
> So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted
> in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> catching those - they are also dependent upon a specific allocation failing
> at the right time).
> 

Fix for the kernfs bug is now queued in vfs/for-linus:

#syz fix: kernfs: deal with early sget() failures

syzkaller just recently (3 weeks ago) gained the ability to mount filesystem
images, so that's the main reason for the increase in filesystem bug reports.
Each time syzkaller is updated to cover more code, bugs are found.

- Eric


Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers
On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> 
> > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > without corresponding fill_super() is safe. We have so far crashed with
> > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> 
>   Consider the case when fill_super() returns an error immediately.
> It is exactly the same situation.  And ->kill_sb() *is* called in cases
> when fill_super() has failed.  Always had been - it's much less boilerplate
> that way.
> 
>   deactivate_locked_super() on that failure exit is the least painful
> variant, unfortunately.
> 
>   Filesystems with ->kill_sb() instances that rely upon something
> done between sget() and the first failure exit after it need to be fixed.
> And yes, that should've been spotted back then.  Sorry.
> 
> Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> are safe and those are the majority.  Looking through the rest uncovers
> some bugs; so far all I've seen were already there.  Note that normally
> we have something like
> static void affs_kill_sb(struct super_block *sb)
> {
> struct affs_sb_info *sbi = AFFS_SB(sb);
> kill_block_super(sb);
> if (sbi) {
> affs_free_bitmap(sb);
> affs_brelse(sbi->s_root_bh);
> kfree(sbi->s_prefix);
> mutex_destroy(>s_bmlock);
> kfree(sbi);
> }
> }
> which basically does one of the safe ones augmented with something that
> takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> Not everyone does, though:
> 
> jffs2_fill_super():
> c = kzalloc(sizeof(*c), GFP_KERNEL);
> if (!c)
> return -ENOMEM;
> in the very beginning.  So we can return from it with NULL ->s_fs_info.
> Now, consider
> struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> if (!(sb->s_flags & MS_RDONLY))
> jffs2_stop_garbage_collect_thread(c);
> in jffs2_kill_sb() and
> void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> {
> int wait = 0;
> spin_lock(>erase_completion_lock);
> if (c->gc_task) {
> 
> IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> and eat an oops.  Always had been there, always hard to hit without
> fault injectors and fortunately trivial to fix.
> 
> Similar in nfs_kill_super() calling nfs_free_server().
> Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() 
> calls.
> Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
> cifs_kill_sb()
> (all trivial to fix)
> 
> Aha... nfsd_umount() is a new regression.
> 
> orangefs: old, trivial to fix.
> 
> cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
> can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had 
> been
> there.
> 
> AFAICS, after discarding the instances that do the right thing we are left 
> with:
> hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
> btrfs_kill_super,
> cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
> proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> 
> Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions.
> So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted
> in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> catching those - they are also dependent upon a specific allocation failing
> at the right time).
> 

Fix for the kernfs bug is now queued in vfs/for-linus:

#syz fix: kernfs: deal with early sget() failures

syzkaller just recently (3 weeks ago) gained the ability to mount filesystem
images, so that's the main reason for the increase in filesystem bug reports.
Each time syzkaller is updated to cover more code, bugs are found.

- Eric


Re: general protection fault in kernfs_kill_sb

2018-04-02 Thread Al Viro
On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:

> That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> without corresponding fill_super() is safe. We have so far crashed with
> rpc_mount() and kernfs_mount_ns(). Is that really safe?

Consider the case when fill_super() returns an error immediately.
It is exactly the same situation.  And ->kill_sb() *is* called in cases
when fill_super() has failed.  Always had been - it's much less boilerplate
that way.

deactivate_locked_super() on that failure exit is the least painful
variant, unfortunately.

Filesystems with ->kill_sb() instances that rely upon something
done between sget() and the first failure exit after it need to be fixed.
And yes, that should've been spotted back then.  Sorry.

Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
are safe and those are the majority.  Looking through the rest uncovers
some bugs; so far all I've seen were already there.  Note that normally
we have something like
static void affs_kill_sb(struct super_block *sb)
{
struct affs_sb_info *sbi = AFFS_SB(sb);
kill_block_super(sb);
if (sbi) {
affs_free_bitmap(sb);
affs_brelse(sbi->s_root_bh);
kfree(sbi->s_prefix);
mutex_destroy(>s_bmlock);
kfree(sbi);
}
}
which basically does one of the safe ones augmented with something that
takes care *not* to assume that e.g. ->s_fs_info has been allocated.
Not everyone does, though:

jffs2_fill_super():
c = kzalloc(sizeof(*c), GFP_KERNEL);
if (!c)
return -ENOMEM;
in the very beginning.  So we can return from it with NULL ->s_fs_info.
Now, consider
struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
if (!(sb->s_flags & MS_RDONLY))
jffs2_stop_garbage_collect_thread(c);
in jffs2_kill_sb() and
void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
{
int wait = 0;
spin_lock(>erase_completion_lock);
if (c->gc_task) {

IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
and eat an oops.  Always had been there, always hard to hit without
fault injectors and fortunately trivial to fix.

Similar in nfs_kill_super() calling nfs_free_server().
Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() 
calls.
Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
cifs_kill_sb()
(all trivial to fix)

Aha... nfsd_umount() is a new regression.

orangefs: old, trivial to fix.

cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had been
there.

AFAICS, after discarding the instances that do the right thing we are left with:
hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
btrfs_kill_super,
cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.

Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions.
So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted
in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
catching those - they are also dependent upon a specific allocation failing
at the right time).


Re: general protection fault in kernfs_kill_sb

2018-04-02 Thread Al Viro
On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:

> That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> without corresponding fill_super() is safe. We have so far crashed with
> rpc_mount() and kernfs_mount_ns(). Is that really safe?

Consider the case when fill_super() returns an error immediately.
It is exactly the same situation.  And ->kill_sb() *is* called in cases
when fill_super() has failed.  Always had been - it's much less boilerplate
that way.

deactivate_locked_super() on that failure exit is the least painful
variant, unfortunately.

Filesystems with ->kill_sb() instances that rely upon something
done between sget() and the first failure exit after it need to be fixed.
And yes, that should've been spotted back then.  Sorry.

Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
are safe and those are the majority.  Looking through the rest uncovers
some bugs; so far all I've seen were already there.  Note that normally
we have something like
static void affs_kill_sb(struct super_block *sb)
{
struct affs_sb_info *sbi = AFFS_SB(sb);
kill_block_super(sb);
if (sbi) {
affs_free_bitmap(sb);
affs_brelse(sbi->s_root_bh);
kfree(sbi->s_prefix);
mutex_destroy(>s_bmlock);
kfree(sbi);
}
}
which basically does one of the safe ones augmented with something that
takes care *not* to assume that e.g. ->s_fs_info has been allocated.
Not everyone does, though:

jffs2_fill_super():
c = kzalloc(sizeof(*c), GFP_KERNEL);
if (!c)
return -ENOMEM;
in the very beginning.  So we can return from it with NULL ->s_fs_info.
Now, consider
struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
if (!(sb->s_flags & MS_RDONLY))
jffs2_stop_garbage_collect_thread(c);
in jffs2_kill_sb() and
void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
{
int wait = 0;
spin_lock(>erase_completion_lock);
if (c->gc_task) {

IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
and eat an oops.  Always had been there, always hard to hit without
fault injectors and fortunately trivial to fix.

Similar in nfs_kill_super() calling nfs_free_server().
Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() 
calls.
Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
cifs_kill_sb()
(all trivial to fix)

Aha... nfsd_umount() is a new regression.

orangefs: old, trivial to fix.

cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had been
there.

AFAICS, after discarding the instances that do the right thing we are left with:
hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
btrfs_kill_super,
cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.

Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions.
So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted
in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
catching those - they are also dependent upon a specific allocation failing
at the right time).


Re: general protection fault in kernfs_kill_sb

2018-04-02 Thread Tetsuo Handa
On 2018/04/02 2:01, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on bpf-next commit
> 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +)
> Merge branch 'bpf-cgroup-bind-connect'
> syzbot dashboard link: 
> https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026
> 
> So far this crash happened 3 times on bpf-next.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488
> syzkaller reproducer: 
> https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536
> Raw console output: 
> https://syzkaller.appspot.com/x/log.txt?id=5798498637185024
> Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926
> compiler: gcc (GCC) 7.1.1 20170620

Al, I think this is another example of crash triggered by
commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()").


[   23.407545] FAULT_INJECTION: forcing a failure.
[   23.407545] name failslab, interval 1, probability 0, space 0, times 1
[   23.414735] CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ 
#43
[   23.433147] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[   23.442491] Call Trace:
[   23.445074]  dump_stack+0x194/0x24d
[   23.448689]  ? arch_local_irq_restore+0x53/0x53
[   23.453347]  ? find_held_lock+0x35/0x1d0
[   23.457401]  should_fail+0x8c0/0xa40
[   23.461100]  ? __list_lru_init+0x352/0x750
[   23.465331]  ? fault_create_debugfs_attr+0x1f0/0x1f0
[   23.470453]  ? find_held_lock+0x35/0x1d0
[   23.474503]  ? __lock_is_held+0xb6/0x140
[   23.478556]  ? check_same_owner+0x320/0x320
[   23.482870]  ? rcu_note_context_switch+0x710/0x710
[   23.487785]  ? find_held_lock+0x35/0x1d0
[   23.491931]  should_failslab+0xec/0x120
[   23.495895]  __kmalloc+0x63/0x760
[   23.499332]  ? lock_downgrade+0x980/0x980
[   23.503469]  ? _raw_spin_unlock+0x22/0x30
[   23.507605]  ? register_shrinker+0x10e/0x2d0
[   23.511999]  ? trace_event_raw_event_module_request+0x320/0x320
[   23.518044]  register_shrinker+0x10e/0x2d0
[   23.522265]  ? __bpf_trace_mm_vmscan_wakeup_kswapd+0x40/0x40
[   23.528051]  ? memcpy+0x45/0x50
[   23.531588]  sget_userns+0xbbf/0xe40
[   23.535296]  ? kernfs_sop_show_path+0x190/0x190
[   23.539959]  ? kernfs_sop_show_options+0x180/0x180
[   23.544876]  ? destroy_unused_super.part.6+0xd0/0xd0
[   23.549972]  ? check_same_owner+0x320/0x320
[   23.554281]  ? rcu_pm_notify+0xc0/0xc0
[   23.558161]  ? rcu_read_lock_sched_held+0x108/0x120
[   23.563168]  ? kmem_cache_alloc_trace+0x459/0x740
[   23.567997]  ? lock_downgrade+0x980/0x980
[   23.572142]  kernfs_mount_ns+0x13d/0x8b0
[   23.576192]  ? kernfs_super_ns+0x70/0x70
[   23.580244]  sysfs_mount+0xc2/0x1c0


That commit assumes that calling kill_sb() from deactivate_locked_super(s)
without corresponding fill_super() is safe. We have so far crashed with
rpc_mount() and kernfs_mount_ns(). Is that really safe?

Also, I think


struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags,
   struct kernfs_root *root, unsigned long magic,
   bool *new_sb_created, const void *ns)
{
(...snipped...)
if (!sb->s_root) {
struct kernfs_super_info *info = kernfs_info(sb);

error = kernfs_fill_super(sb, magic);
if (error) {
deactivate_locked_super(sb); // <= this call
return ERR_PTR(error);
}
sb->s_flags |= SB_ACTIVE;

mutex_lock(_mutex);
list_add(>node, >supers);
mutex_unlock(_mutex);
}
(...snipped...)
}


is not safe, for list_del() is called via kill_sb() without
corresponding list_add().


void kernfs_kill_sb(struct super_block *sb)
{
struct kernfs_super_info *info = kernfs_info(sb);

mutex_lock(_mutex);
list_del(>node); // <= NULL pointer dereference
mutex_unlock(_mutex);

/*
 * Remove the superblock from fs_supers/s_instances
 * so we can't find it, before freeing kernfs_super_info.
 */
kill_anon_super(sb);
kfree(info);
}


> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.
> 
> kasan: GPF could be caused by NULL-ptr deref or user memory access
>  should_failslab+0xec/0x120 mm/failslab.c:32
>  slab_pre_alloc_hook mm/slab.h:422 [inline]
>  slab_alloc mm/slab.c:3365 [inline]
>  __do_kmalloc mm/slab.c:3703 [inline]
>  __kmalloc+0x63/0x760 

Re: general protection fault in kernfs_kill_sb

2018-04-02 Thread Tetsuo Handa
On 2018/04/02 2:01, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on bpf-next commit
> 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +)
> Merge branch 'bpf-cgroup-bind-connect'
> syzbot dashboard link: 
> https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026
> 
> So far this crash happened 3 times on bpf-next.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488
> syzkaller reproducer: 
> https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536
> Raw console output: 
> https://syzkaller.appspot.com/x/log.txt?id=5798498637185024
> Kernel config: https://syzkaller.appspot.com/x/.config?id=5909223872832634926
> compiler: gcc (GCC) 7.1.1 20170620

Al, I think this is another example of crash triggered by
commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()").


[   23.407545] FAULT_INJECTION: forcing a failure.
[   23.407545] name failslab, interval 1, probability 0, space 0, times 1
[   23.414735] CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ 
#43
[   23.433147] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[   23.442491] Call Trace:
[   23.445074]  dump_stack+0x194/0x24d
[   23.448689]  ? arch_local_irq_restore+0x53/0x53
[   23.453347]  ? find_held_lock+0x35/0x1d0
[   23.457401]  should_fail+0x8c0/0xa40
[   23.461100]  ? __list_lru_init+0x352/0x750
[   23.465331]  ? fault_create_debugfs_attr+0x1f0/0x1f0
[   23.470453]  ? find_held_lock+0x35/0x1d0
[   23.474503]  ? __lock_is_held+0xb6/0x140
[   23.478556]  ? check_same_owner+0x320/0x320
[   23.482870]  ? rcu_note_context_switch+0x710/0x710
[   23.487785]  ? find_held_lock+0x35/0x1d0
[   23.491931]  should_failslab+0xec/0x120
[   23.495895]  __kmalloc+0x63/0x760
[   23.499332]  ? lock_downgrade+0x980/0x980
[   23.503469]  ? _raw_spin_unlock+0x22/0x30
[   23.507605]  ? register_shrinker+0x10e/0x2d0
[   23.511999]  ? trace_event_raw_event_module_request+0x320/0x320
[   23.518044]  register_shrinker+0x10e/0x2d0
[   23.522265]  ? __bpf_trace_mm_vmscan_wakeup_kswapd+0x40/0x40
[   23.528051]  ? memcpy+0x45/0x50
[   23.531588]  sget_userns+0xbbf/0xe40
[   23.535296]  ? kernfs_sop_show_path+0x190/0x190
[   23.539959]  ? kernfs_sop_show_options+0x180/0x180
[   23.544876]  ? destroy_unused_super.part.6+0xd0/0xd0
[   23.549972]  ? check_same_owner+0x320/0x320
[   23.554281]  ? rcu_pm_notify+0xc0/0xc0
[   23.558161]  ? rcu_read_lock_sched_held+0x108/0x120
[   23.563168]  ? kmem_cache_alloc_trace+0x459/0x740
[   23.567997]  ? lock_downgrade+0x980/0x980
[   23.572142]  kernfs_mount_ns+0x13d/0x8b0
[   23.576192]  ? kernfs_super_ns+0x70/0x70
[   23.580244]  sysfs_mount+0xc2/0x1c0


That commit assumes that calling kill_sb() from deactivate_locked_super(s)
without corresponding fill_super() is safe. We have so far crashed with
rpc_mount() and kernfs_mount_ns(). Is that really safe?

Also, I think


struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags,
   struct kernfs_root *root, unsigned long magic,
   bool *new_sb_created, const void *ns)
{
(...snipped...)
if (!sb->s_root) {
struct kernfs_super_info *info = kernfs_info(sb);

error = kernfs_fill_super(sb, magic);
if (error) {
deactivate_locked_super(sb); // <= this call
return ERR_PTR(error);
}
sb->s_flags |= SB_ACTIVE;

mutex_lock(_mutex);
list_add(>node, >supers);
mutex_unlock(_mutex);
}
(...snipped...)
}


is not safe, for list_del() is called via kill_sb() without
corresponding list_add().


void kernfs_kill_sb(struct super_block *sb)
{
struct kernfs_super_info *info = kernfs_info(sb);

mutex_lock(_mutex);
list_del(>node); // <= NULL pointer dereference
mutex_unlock(_mutex);

/*
 * Remove the superblock from fs_supers/s_instances
 * so we can't find it, before freeing kernfs_super_info.
 */
kill_anon_super(sb);
kfree(info);
}


> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.
> 
> kasan: GPF could be caused by NULL-ptr deref or user memory access
>  should_failslab+0xec/0x120 mm/failslab.c:32
>  slab_pre_alloc_hook mm/slab.h:422 [inline]
>  slab_alloc mm/slab.c:3365 [inline]
>  __do_kmalloc mm/slab.c:3703 [inline]
>  __kmalloc+0x63/0x760 

general protection fault in kernfs_kill_sb

2018-04-01 Thread syzbot

Hello,

syzbot hit the following crash on bpf-next commit
7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +)
Merge branch 'bpf-cgroup-bind-connect'
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026


So far this crash happened 3 times on bpf-next.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=5798498637185024
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=5909223872832634926

compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.

If you forward the report, please keep this part and the footer.

kasan: GPF could be caused by NULL-ptr deref or user memory access
 should_failslab+0xec/0x120 mm/failslab.c:32
 slab_pre_alloc_hook mm/slab.h:422 [inline]
 slab_alloc mm/slab.c:3365 [inline]
 __do_kmalloc mm/slab.c:3703 [inline]
 __kmalloc+0x63/0x760 mm/slab.c:3714
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
 kmalloc include/linux/slab.h:517 [inline]
 kzalloc include/linux/slab.h:701 [inline]
 register_shrinker+0x10e/0x2d0 mm/vmscan.c:268
Modules linked in:
CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43
 sget_userns+0xbbf/0xe40 fs/super.c:520
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
RSP: 0018:8801ae017658 EFLAGS: 00010246
RAX: dc00 RBX:  RCX: 
RDX:  RSI: 8801d97a6e98 RDI: 8801d97a6ea0
RBP: 8801ae017670 R08: 81d39d22 R09: 0004
R10: 8801ae017670 R11:  R12: 
R13: 8801d91dec00 R14: 8801ae017700 R15: 8801d97a6e98
FS:  01569880() GS:8801db10() knlGS:
 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 006d0188 CR3: 0001da40c005 CR4: 001606e0
DR0:  DR1:  DR2: 
 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 __list_del_entry include/linux/list.h:117 [inline]
 list_del include/linux/list.h:125 [inline]
 kernfs_kill_sb+0x9e/0x330 fs/kernfs/mount.c:361
 mount_fs+0x66/0x2d0 fs/super.c:1222
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
 vfs_kern_mount fs/namespace.c:2509 [inline]
 do_new_mount fs/namespace.c:2512 [inline]
 do_mount+0xea4/0x2bb0 fs/namespace.c:2842
 deactivate_locked_super+0x88/0xd0 fs/super.c:312
 sget_userns+0xbda/0xe40 fs/super.c:522
 SYSC_mount fs/namespace.c:3058 [inline]
 SyS_mount+0xab/0x120 fs/namespace.c:3035
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320
 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36
 mount_fs+0x66/0x2d0 fs/super.c:1222
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
RIP: 0033:0x442609
RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 00442609
RDX: 2140 RSI: 2040 RDI: 2000
RBP: 7fff40a28190 R08: 22c0 R09: 0003
 vfs_kern_mount fs/namespace.c:2509 [inline]
 do_new_mount fs/namespace.c:2512 [inline]
 do_mount+0xea4/0x2bb0 fs/namespace.c:2842
R10:  R11: 0246 R12: 
R13: 0003 R14: 1380 R15: 7fff40a27a28
 SYSC_mount fs/namespace.c:3058 [inline]
 SyS_mount+0xab/0x120 fs/namespace.c:3035
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x442609
RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 00442609
RDX: 2140 RSI: 2040 RDI: 2000
RBP: 7fff40a28190 R08: 22c0 R09: 0003
R10:  R11: 0246 R12: 
R13: 0003 R14: 1380 R15: 7fff40a27a28
Code: 00 00 00 00 ad de 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48 89  
da 48 39 c3 74 65 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00  
75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8
RIP: __list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP:  
8801ae017658

---[ end trace b14d521943ecadbd ]---


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct 

general protection fault in kernfs_kill_sb

2018-04-01 Thread syzbot

Hello,

syzbot hit the following crash on bpf-next commit
7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +)
Merge branch 'bpf-cgroup-bind-connect'
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=151de3f2be6b40ac8026


So far this crash happened 3 times on bpf-next.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4857382450495488
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=4644052230209536
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=5798498637185024
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=5909223872832634926

compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+151de3f2be6b40ac8...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.

If you forward the report, please keep this part and the footer.

kasan: GPF could be caused by NULL-ptr deref or user memory access
 should_failslab+0xec/0x120 mm/failslab.c:32
 slab_pre_alloc_hook mm/slab.h:422 [inline]
 slab_alloc mm/slab.c:3365 [inline]
 __do_kmalloc mm/slab.c:3703 [inline]
 __kmalloc+0x63/0x760 mm/slab.c:3714
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
 kmalloc include/linux/slab.h:517 [inline]
 kzalloc include/linux/slab.h:701 [inline]
 register_shrinker+0x10e/0x2d0 mm/vmscan.c:268
Modules linked in:
CPU: 1 PID: 4471 Comm: syzkaller129261 Not tainted 4.16.0-rc6+ #43
 sget_userns+0xbbf/0xe40 fs/super.c:520
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
RSP: 0018:8801ae017658 EFLAGS: 00010246
RAX: dc00 RBX:  RCX: 
RDX:  RSI: 8801d97a6e98 RDI: 8801d97a6ea0
RBP: 8801ae017670 R08: 81d39d22 R09: 0004
R10: 8801ae017670 R11:  R12: 
R13: 8801d91dec00 R14: 8801ae017700 R15: 8801d97a6e98
FS:  01569880() GS:8801db10() knlGS:
 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 006d0188 CR3: 0001da40c005 CR4: 001606e0
DR0:  DR1:  DR2: 
 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 __list_del_entry include/linux/list.h:117 [inline]
 list_del include/linux/list.h:125 [inline]
 kernfs_kill_sb+0x9e/0x330 fs/kernfs/mount.c:361
 mount_fs+0x66/0x2d0 fs/super.c:1222
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
 sysfs_kill_sb+0x22/0x40 fs/sysfs/mount.c:50
 vfs_kern_mount fs/namespace.c:2509 [inline]
 do_new_mount fs/namespace.c:2512 [inline]
 do_mount+0xea4/0x2bb0 fs/namespace.c:2842
 deactivate_locked_super+0x88/0xd0 fs/super.c:312
 sget_userns+0xbda/0xe40 fs/super.c:522
 SYSC_mount fs/namespace.c:3058 [inline]
 SyS_mount+0xab/0x120 fs/namespace.c:3035
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 kernfs_mount_ns+0x13d/0x8b0 fs/kernfs/mount.c:320
 sysfs_mount+0xc2/0x1c0 fs/sysfs/mount.c:36
 mount_fs+0x66/0x2d0 fs/super.c:1222
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
RIP: 0033:0x442609
RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 00442609
RDX: 2140 RSI: 2040 RDI: 2000
RBP: 7fff40a28190 R08: 22c0 R09: 0003
 vfs_kern_mount fs/namespace.c:2509 [inline]
 do_new_mount fs/namespace.c:2512 [inline]
 do_mount+0xea4/0x2bb0 fs/namespace.c:2842
R10:  R11: 0246 R12: 
R13: 0003 R14: 1380 R15: 7fff40a27a28
 SYSC_mount fs/namespace.c:3058 [inline]
 SyS_mount+0xab/0x120 fs/namespace.c:3035
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x442609
RSP: 002b:7fff40a278e8 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX:  RCX: 00442609
RDX: 2140 RSI: 2040 RDI: 2000
RBP: 7fff40a28190 R08: 22c0 R09: 0003
R10:  R11: 0246 R12: 
R13: 0003 R14: 1380 R15: 7fff40a27a28
Code: 00 00 00 00 ad de 49 39 c4 74 66 48 b8 00 02 00 00 00 00 ad de 48 89  
da 48 39 c3 74 65 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00  
75 7b 48 8b 13 48 39 f2 75 57 49 8d 7c 24 08 48 b8
RIP: __list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51 RSP:  
8801ae017658

---[ end trace b14d521943ecadbd ]---


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct