Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Amir Goldstein
On Thu, Oct 18, 2018 at 11:41 AM Miklos Szeredi  wrote:
>
> On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein  wrote:
>
> > Can someone tell me what the expected behavior of a nested
> > mutex_lock_interruptible(); ?
> >
> > Why does the reproducer only warn and not really deadlock.
> > It is because that is considered the lesser evil?
> > and obviously, then inner unlock releases the outer lock?
>
> No, it's not the same lock, just the same lock class (first one is
> OVL_I(d_inode(old))->lock, the other is
> OVL_I(d_inode(new->d_parent)))->lock).

Doh! of course.

>
> So we could possibly get away with annotating with
> mutex_lock_nested().  Is this the only place that ovl_i_lock is
> nested?
>

As far as I can see it is.
But how would we annotate it for consistent and clear locking order?
NLINK/COPYUP?
if we want this annotation to maintain locking order we need to patch
I posted (does copyup of new->parent prior to nlink_start), so then
we don't need nested annotations anymore.

Nah, I don't think we need to add nested annotations.

FYI, I am working on a small cleanup series for ovl_nlink_start/end
and ovl_inode_lock/unlock, but it is independent of the fix patch I posted
for this bug.

Thanks,
Amir.


Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Amir Goldstein
On Thu, Oct 18, 2018 at 11:41 AM Miklos Szeredi  wrote:
>
> On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein  wrote:
>
> > Can someone tell me what the expected behavior of a nested
> > mutex_lock_interruptible(); ?
> >
> > Why does the reproducer only warn and not really deadlock.
> > It is because that is considered the lesser evil?
> > and obviously, then inner unlock releases the outer lock?
>
> No, it's not the same lock, just the same lock class (first one is
> OVL_I(d_inode(old))->lock, the other is
> OVL_I(d_inode(new->d_parent)))->lock).

Doh! of course.

>
> So we could possibly get away with annotating with
> mutex_lock_nested().  Is this the only place that ovl_i_lock is
> nested?
>

As far as I can see it is.
But how would we annotate it for consistent and clear locking order?
NLINK/COPYUP?
if we want this annotation to maintain locking order we need to patch
I posted (does copyup of new->parent prior to nlink_start), so then
we don't need nested annotations anymore.

Nah, I don't think we need to add nested annotations.

FYI, I am working on a small cleanup series for ovl_nlink_start/end
and ovl_inode_lock/unlock, but it is independent of the fix patch I posted
for this bug.

Thanks,
Amir.


Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Miklos Szeredi
On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein  wrote:

> Can someone tell me what the expected behavior of a nested
> mutex_lock_interruptible(); ?
>
> Why does the reproducer only warn and not really deadlock.
> It is because that is considered the lesser evil?
> and obviously, then inner unlock releases the outer lock?

No, it's not the same lock, just the same lock class (first one is
OVL_I(d_inode(old))->lock, the other is
OVL_I(d_inode(new->d_parent)))->lock).

So we could possibly get away with annotating with
mutex_lock_nested().  Is this the only place that ovl_i_lock is
nested?

Thanks,
Miklos


Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Miklos Szeredi
On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein  wrote:

> Can someone tell me what the expected behavior of a nested
> mutex_lock_interruptible(); ?
>
> Why does the reproducer only warn and not really deadlock.
> It is because that is considered the lesser evil?
> and obviously, then inner unlock releases the outer lock?

No, it's not the same lock, just the same lock class (first one is
OVL_I(d_inode(old))->lock, the other is
OVL_I(d_inode(new->d_parent)))->lock).

So we could possibly get away with annotating with
mutex_lock_nested().  Is this the only place that ovl_i_lock is
nested?

Thanks,
Miklos


Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Amir Goldstein
On Thu, Oct 18, 2018 at 7:48 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
> dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be
> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.

Reproducer is simple:
link a non-copied-up file into a non-copied-up parent:

~/unionmount-testsuite# ./run --ov -s
~/unionmount-testsuite# ln /mnt/a/foo100 /mnt/a/dir100/

>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com
>

FYI, this is the fix:
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 276914ae3c60..e1a55ecb7aba 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -663,6 +663,10 @@ static int ovl_link(struct dentry *old, struct
inode *newdir,
if (err)
goto out_drop_write;

+   err = ovl_copy_up(new->d_parent);
+   if (err)
+   goto out_drop_write;
+
if (ovl_is_metacopy_dentry(old)) {
err = ovl_set_redirect(old, false);
if (err)

> overlayfs: filesystem on './file0' not supported as upperdir
> XFS (loop3): unknown mount option [uid<].
>
> kobject: 'loop2' (ce85f3f9): kobject_uevent_env
> 
> WARNING: possible recursive locking detected
> kobject: 'loop2' (ce85f3f9): fill_kobj_path: path
> = '/devices/virtual/block/loop2'
> 4.19.0-rc8+ #65 Not tainted
> 
> syz-executor2/8184 is trying to acquire lock:
> d7157f3f (_i_lock_key[depth]){+.+.}, at:
> ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528
>
> but task is already holding lock:
> 6f802695 (_i_lock_key[depth]){+.+.}, at:
> ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771
>
> other info that might help us debug this:
>   Possible unsafe locking scenario:
>
> CPU0
> 
>lock(_i_lock_key[depth]);
>lock(_i_lock_key[depth]);
>
>   *** DEADLOCK ***
>

Can someone tell me what the expected behavior of a nested
mutex_lock_interruptible(); ?

Why does the reproducer only warn and not really deadlock.
It is because that is considered the lesser evil?
and obviously, then inner unlock releases the outer lock?

Thanks,
Amir.


Re: possible deadlock in ovl_copy_up_start

2018-10-18 Thread Amir Goldstein
On Thu, Oct 18, 2018 at 7:48 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
> dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be
> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.

Reproducer is simple:
link a non-copied-up file into a non-copied-up parent:

~/unionmount-testsuite# ./run --ov -s
~/unionmount-testsuite# ln /mnt/a/foo100 /mnt/a/dir100/

>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com
>

FYI, this is the fix:
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 276914ae3c60..e1a55ecb7aba 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -663,6 +663,10 @@ static int ovl_link(struct dentry *old, struct
inode *newdir,
if (err)
goto out_drop_write;

+   err = ovl_copy_up(new->d_parent);
+   if (err)
+   goto out_drop_write;
+
if (ovl_is_metacopy_dentry(old)) {
err = ovl_set_redirect(old, false);
if (err)

> overlayfs: filesystem on './file0' not supported as upperdir
> XFS (loop3): unknown mount option [uid<].
>
> kobject: 'loop2' (ce85f3f9): kobject_uevent_env
> 
> WARNING: possible recursive locking detected
> kobject: 'loop2' (ce85f3f9): fill_kobj_path: path
> = '/devices/virtual/block/loop2'
> 4.19.0-rc8+ #65 Not tainted
> 
> syz-executor2/8184 is trying to acquire lock:
> d7157f3f (_i_lock_key[depth]){+.+.}, at:
> ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528
>
> but task is already holding lock:
> 6f802695 (_i_lock_key[depth]){+.+.}, at:
> ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771
>
> other info that might help us debug this:
>   Possible unsafe locking scenario:
>
> CPU0
> 
>lock(_i_lock_key[depth]);
>lock(_i_lock_key[depth]);
>
>   *** DEADLOCK ***
>

Can someone tell me what the expected behavior of a nested
mutex_lock_interruptible(); ?

Why does the reproducer only warn and not really deadlock.
It is because that is considered the lesser evil?
and obviously, then inner unlock releases the outer lock?

Thanks,
Amir.


possible deadlock in ovl_copy_up_start

2018-10-17 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com

overlayfs: filesystem on './file0' not supported as upperdir
XFS (loop3): unknown mount option [uid<].

kobject: 'loop2' (ce85f3f9): kobject_uevent_env

WARNING: possible recursive locking detected
kobject: 'loop2' (ce85f3f9): fill_kobj_path: path  
= '/devices/virtual/block/loop2'

4.19.0-rc8+ #65 Not tainted

syz-executor2/8184 is trying to acquire lock:
d7157f3f (_i_lock_key[depth]){+.+.}, at:  
ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528


but task is already holding lock:
6f802695 (_i_lock_key[depth]){+.+.}, at:  
ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771


other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(_i_lock_key[depth]);
  lock(_i_lock_key[depth]);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by syz-executor2/8184:
 #0: 75695ecf (sb_writers#17){.+.+}, at: sb_start_write  
include/linux/fs.h:1566 [inline]
 #0: 75695ecf (sb_writers#17){.+.+}, at: mnt_want_write+0x3f/0xc0  
fs/namespace.c:360
 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at:  
inode_lock_nested include/linux/fs.h:773 [inline]
 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at:  
filename_create+0x1b2/0x5b0 fs/namei.c:3635
 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: inode_lock  
include/linux/fs.h:738 [inline]
 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at:  
vfs_link+0x543/0xb70 fs/namei.c:4232
 #3: 8b285486 (sb_writers#3){.+.+}, at: sb_start_write  
include/linux/fs.h:1566 [inline]
 #3: 8b285486 (sb_writers#3){.+.+}, at: mnt_want_write+0x3f/0xc0  
fs/namespace.c:360
 #4: 6f802695 (_i_lock_key[depth]){+.+.}, at:  
ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771


stack backtrace:
CPU: 1 PID: 8184 Comm: syz-executor2 Not tainted 4.19.0-rc8+ #65
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
 check_deadlock kernel/locking/lockdep.c:1803 [inline]
 validate_chain kernel/locking/lockdep.c:2399 [inline]
 __lock_acquire.cold.61+0x1fb/0x482 kernel/locking/lockdep.c:3411
 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900
 __mutex_lock_common kernel/locking/mutex.c:925 [inline]
 __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072
 mutex_lock_interruptible_nested+0x16/0x20 kernel/locking/mutex.c:1109
 ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528
 ovl_copy_up_one+0x51f/0x1970 fs/overlayfs/copy_up.c:775
 ovl_copy_up_flags+0x14e/0x1d0 fs/overlayfs/copy_up.c:827
 ovl_copy_up+0x17/0x1a fs/overlayfs/copy_up.c:874
 ovl_create_or_link+0xc7/0x1450 fs/overlayfs/dir.c:543
 ovl_link+0x28b/0x37c fs/overlayfs/dir.c:679
 vfs_link+0x7a9/0xb70 fs/namei.c:4241
 do_linkat+0x724/0xa90 fs/namei.c:4309
 __do_sys_linkat fs/namei.c:4333 [inline]
 __se_sys_linkat fs/namei.c:4330 [inline]
 __x64_sys_linkat+0xbe/0x150 fs/namei.c:4330
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7f8ab8a13c78 EFLAGS: 0246 ORIG_RAX: 0109
RAX: ffda RBX: 0005 RCX: 00457569
RDX: 000f RSI: 2200 RDI: 000f
RBP: 0072c040 R08: 0400 R09: 
R10: 2180 R11: 0246 R12: 7f8ab8a146d4
R13: 004c31d1 R14: 004d3840 R15: 
kobject: 'bluetooth' (bd8be530): kobject_add_internal:  
parent: 'virtual', set: '(null)'

kobject: 'loop3' (7c159c39): kobject_uevent_env
kobject: 'kvm' (3101062f): kobject_uevent_env
kobject: 'loop3' (7c159c39): fill_kobj_path: path  
= '/devices/virtual/block/loop3'
kobject: 'hci0' (6b48ee74): kobject_add_internal:  
parent: 'bluetooth', set: 'devices'
kobject: 'kvm' (3101062f): fill_kobj_path: path  
= 

possible deadlock in ovl_copy_up_start

2018-10-17 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com

overlayfs: filesystem on './file0' not supported as upperdir
XFS (loop3): unknown mount option [uid<].

kobject: 'loop2' (ce85f3f9): kobject_uevent_env

WARNING: possible recursive locking detected
kobject: 'loop2' (ce85f3f9): fill_kobj_path: path  
= '/devices/virtual/block/loop2'

4.19.0-rc8+ #65 Not tainted

syz-executor2/8184 is trying to acquire lock:
d7157f3f (_i_lock_key[depth]){+.+.}, at:  
ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528


but task is already holding lock:
6f802695 (_i_lock_key[depth]){+.+.}, at:  
ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771


other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(_i_lock_key[depth]);
  lock(_i_lock_key[depth]);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by syz-executor2/8184:
 #0: 75695ecf (sb_writers#17){.+.+}, at: sb_start_write  
include/linux/fs.h:1566 [inline]
 #0: 75695ecf (sb_writers#17){.+.+}, at: mnt_want_write+0x3f/0xc0  
fs/namespace.c:360
 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at:  
inode_lock_nested include/linux/fs.h:773 [inline]
 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at:  
filename_create+0x1b2/0x5b0 fs/namei.c:3635
 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: inode_lock  
include/linux/fs.h:738 [inline]
 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at:  
vfs_link+0x543/0xb70 fs/namei.c:4232
 #3: 8b285486 (sb_writers#3){.+.+}, at: sb_start_write  
include/linux/fs.h:1566 [inline]
 #3: 8b285486 (sb_writers#3){.+.+}, at: mnt_want_write+0x3f/0xc0  
fs/namespace.c:360
 #4: 6f802695 (_i_lock_key[depth]){+.+.}, at:  
ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771


stack backtrace:
CPU: 1 PID: 8184 Comm: syz-executor2 Not tainted 4.19.0-rc8+ #65
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
 check_deadlock kernel/locking/lockdep.c:1803 [inline]
 validate_chain kernel/locking/lockdep.c:2399 [inline]
 __lock_acquire.cold.61+0x1fb/0x482 kernel/locking/lockdep.c:3411
 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900
 __mutex_lock_common kernel/locking/mutex.c:925 [inline]
 __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072
 mutex_lock_interruptible_nested+0x16/0x20 kernel/locking/mutex.c:1109
 ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528
 ovl_copy_up_one+0x51f/0x1970 fs/overlayfs/copy_up.c:775
 ovl_copy_up_flags+0x14e/0x1d0 fs/overlayfs/copy_up.c:827
 ovl_copy_up+0x17/0x1a fs/overlayfs/copy_up.c:874
 ovl_create_or_link+0xc7/0x1450 fs/overlayfs/dir.c:543
 ovl_link+0x28b/0x37c fs/overlayfs/dir.c:679
 vfs_link+0x7a9/0xb70 fs/namei.c:4241
 do_linkat+0x724/0xa90 fs/namei.c:4309
 __do_sys_linkat fs/namei.c:4333 [inline]
 __se_sys_linkat fs/namei.c:4330 [inline]
 __x64_sys_linkat+0xbe/0x150 fs/namei.c:4330
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7f8ab8a13c78 EFLAGS: 0246 ORIG_RAX: 0109
RAX: ffda RBX: 0005 RCX: 00457569
RDX: 000f RSI: 2200 RDI: 000f
RBP: 0072c040 R08: 0400 R09: 
R10: 2180 R11: 0246 R12: 7f8ab8a146d4
R13: 004c31d1 R14: 004d3840 R15: 
kobject: 'bluetooth' (bd8be530): kobject_add_internal:  
parent: 'virtual', set: '(null)'

kobject: 'loop3' (7c159c39): kobject_uevent_env
kobject: 'kvm' (3101062f): kobject_uevent_env
kobject: 'loop3' (7c159c39): fill_kobj_path: path  
= '/devices/virtual/block/loop3'
kobject: 'hci0' (6b48ee74): kobject_add_internal:  
parent: 'bluetooth', set: 'devices'
kobject: 'kvm' (3101062f): fill_kobj_path: path  
=