Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 11:41 AM Miklos Szeredi wrote: > > On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein wrote: > > > Can someone tell me what the expected behavior of a nested > > mutex_lock_interruptible(); ? > > > > Why does the reproducer only warn and not really deadlock. > > It is because that is considered the lesser evil? > > and obviously, then inner unlock releases the outer lock? > > No, it's not the same lock, just the same lock class (first one is > OVL_I(d_inode(old))->lock, the other is > OVL_I(d_inode(new->d_parent)))->lock). Doh! of course. > > So we could possibly get away with annotating with > mutex_lock_nested(). Is this the only place that ovl_i_lock is > nested? > As far as I can see it is. But how would we annotate it for consistent and clear locking order? NLINK/COPYUP? if we want this annotation to maintain locking order we need to patch I posted (does copyup of new->parent prior to nlink_start), so then we don't need nested annotations anymore. Nah, I don't think we need to add nested annotations. FYI, I am working on a small cleanup series for ovl_nlink_start/end and ovl_inode_lock/unlock, but it is independent of the fix patch I posted for this bug. Thanks, Amir.
Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 11:41 AM Miklos Szeredi wrote: > > On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein wrote: > > > Can someone tell me what the expected behavior of a nested > > mutex_lock_interruptible(); ? > > > > Why does the reproducer only warn and not really deadlock. > > It is because that is considered the lesser evil? > > and obviously, then inner unlock releases the outer lock? > > No, it's not the same lock, just the same lock class (first one is > OVL_I(d_inode(old))->lock, the other is > OVL_I(d_inode(new->d_parent)))->lock). Doh! of course. > > So we could possibly get away with annotating with > mutex_lock_nested(). Is this the only place that ovl_i_lock is > nested? > As far as I can see it is. But how would we annotate it for consistent and clear locking order? NLINK/COPYUP? if we want this annotation to maintain locking order we need to patch I posted (does copyup of new->parent prior to nlink_start), so then we don't need nested annotations anymore. Nah, I don't think we need to add nested annotations. FYI, I am working on a small cleanup series for ovl_nlink_start/end and ovl_inode_lock/unlock, but it is independent of the fix patch I posted for this bug. Thanks, Amir.
Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein wrote: > Can someone tell me what the expected behavior of a nested > mutex_lock_interruptible(); ? > > Why does the reproducer only warn and not really deadlock. > It is because that is considered the lesser evil? > and obviously, then inner unlock releases the outer lock? No, it's not the same lock, just the same lock class (first one is OVL_I(d_inode(old))->lock, the other is OVL_I(d_inode(new->d_parent)))->lock). So we could possibly get away with annotating with mutex_lock_nested(). Is this the only place that ovl_i_lock is nested? Thanks, Miklos
Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 8:26 AM, Amir Goldstein wrote: > Can someone tell me what the expected behavior of a nested > mutex_lock_interruptible(); ? > > Why does the reproducer only warn and not really deadlock. > It is because that is considered the lesser evil? > and obviously, then inner unlock releases the outer lock? No, it's not the same lock, just the same lock class (first one is OVL_I(d_inode(old))->lock, the other is OVL_I(d_inode(new->d_parent)))->lock). So we could possibly get away with annotating with mutex_lock_nested(). Is this the only place that ovl_i_lock is nested? Thanks, Miklos
Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 7:48 AM syzbot wrote: > > Hello, > > syzbot found the following crash on: > > HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40 > kernel config: https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33 > dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > > Unfortunately, I don't have any reproducer for this crash yet. Reproducer is simple: link a non-copied-up file into a non-copied-up parent: ~/unionmount-testsuite# ./run --ov -s ~/unionmount-testsuite# ln /mnt/a/foo100 /mnt/a/dir100/ > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com > FYI, this is the fix: diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index 276914ae3c60..e1a55ecb7aba 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -663,6 +663,10 @@ static int ovl_link(struct dentry *old, struct inode *newdir, if (err) goto out_drop_write; + err = ovl_copy_up(new->d_parent); + if (err) + goto out_drop_write; + if (ovl_is_metacopy_dentry(old)) { err = ovl_set_redirect(old, false); if (err) > overlayfs: filesystem on './file0' not supported as upperdir > XFS (loop3): unknown mount option [uid<]. > > kobject: 'loop2' (ce85f3f9): kobject_uevent_env > > WARNING: possible recursive locking detected > kobject: 'loop2' (ce85f3f9): fill_kobj_path: path > = '/devices/virtual/block/loop2' > 4.19.0-rc8+ #65 Not tainted > > syz-executor2/8184 is trying to acquire lock: > d7157f3f (_i_lock_key[depth]){+.+.}, at: > ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 > > but task is already holding lock: > 6f802695 (_i_lock_key[depth]){+.+.}, at: > ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > >lock(_i_lock_key[depth]); >lock(_i_lock_key[depth]); > > *** DEADLOCK *** > Can someone tell me what the expected behavior of a nested mutex_lock_interruptible(); ? Why does the reproducer only warn and not really deadlock. It is because that is considered the lesser evil? and obviously, then inner unlock releases the outer lock? Thanks, Amir.
Re: possible deadlock in ovl_copy_up_start
On Thu, Oct 18, 2018 at 7:48 AM syzbot wrote: > > Hello, > > syzbot found the following crash on: > > HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40 > kernel config: https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33 > dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > > Unfortunately, I don't have any reproducer for this crash yet. Reproducer is simple: link a non-copied-up file into a non-copied-up parent: ~/unionmount-testsuite# ./run --ov -s ~/unionmount-testsuite# ln /mnt/a/foo100 /mnt/a/dir100/ > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com > FYI, this is the fix: diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index 276914ae3c60..e1a55ecb7aba 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -663,6 +663,10 @@ static int ovl_link(struct dentry *old, struct inode *newdir, if (err) goto out_drop_write; + err = ovl_copy_up(new->d_parent); + if (err) + goto out_drop_write; + if (ovl_is_metacopy_dentry(old)) { err = ovl_set_redirect(old, false); if (err) > overlayfs: filesystem on './file0' not supported as upperdir > XFS (loop3): unknown mount option [uid<]. > > kobject: 'loop2' (ce85f3f9): kobject_uevent_env > > WARNING: possible recursive locking detected > kobject: 'loop2' (ce85f3f9): fill_kobj_path: path > = '/devices/virtual/block/loop2' > 4.19.0-rc8+ #65 Not tainted > > syz-executor2/8184 is trying to acquire lock: > d7157f3f (_i_lock_key[depth]){+.+.}, at: > ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 > > but task is already holding lock: > 6f802695 (_i_lock_key[depth]){+.+.}, at: > ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > >lock(_i_lock_key[depth]); >lock(_i_lock_key[depth]); > > *** DEADLOCK *** > Can someone tell me what the expected behavior of a nested mutex_lock_interruptible(); ? Why does the reproducer only warn and not really deadlock. It is because that is considered the lesser evil? and obviously, then inner unlock releases the outer lock? Thanks, Amir.
possible deadlock in ovl_copy_up_start
Hello, syzbot found the following crash on: HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40 kernel config: https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33 dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be compiler: gcc (GCC) 8.0.1 20180413 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com overlayfs: filesystem on './file0' not supported as upperdir XFS (loop3): unknown mount option [uid<]. kobject: 'loop2' (ce85f3f9): kobject_uevent_env WARNING: possible recursive locking detected kobject: 'loop2' (ce85f3f9): fill_kobj_path: path = '/devices/virtual/block/loop2' 4.19.0-rc8+ #65 Not tainted syz-executor2/8184 is trying to acquire lock: d7157f3f (_i_lock_key[depth]){+.+.}, at: ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 but task is already holding lock: 6f802695 (_i_lock_key[depth]){+.+.}, at: ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 other info that might help us debug this: Possible unsafe locking scenario: CPU0 lock(_i_lock_key[depth]); lock(_i_lock_key[depth]); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by syz-executor2/8184: #0: 75695ecf (sb_writers#17){.+.+}, at: sb_start_write include/linux/fs.h:1566 [inline] #0: 75695ecf (sb_writers#17){.+.+}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:360 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at: inode_lock_nested include/linux/fs.h:773 [inline] #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at: filename_create+0x1b2/0x5b0 fs/namei.c:3635 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: inode_lock include/linux/fs.h:738 [inline] #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: vfs_link+0x543/0xb70 fs/namei.c:4232 #3: 8b285486 (sb_writers#3){.+.+}, at: sb_start_write include/linux/fs.h:1566 [inline] #3: 8b285486 (sb_writers#3){.+.+}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:360 #4: 6f802695 (_i_lock_key[depth]){+.+.}, at: ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 stack backtrace: CPU: 1 PID: 8184 Comm: syz-executor2 Not tainted 4.19.0-rc8+ #65 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline] check_deadlock kernel/locking/lockdep.c:1803 [inline] validate_chain kernel/locking/lockdep.c:2399 [inline] __lock_acquire.cold.61+0x1fb/0x482 kernel/locking/lockdep.c:3411 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900 __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072 mutex_lock_interruptible_nested+0x16/0x20 kernel/locking/mutex.c:1109 ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 ovl_copy_up_one+0x51f/0x1970 fs/overlayfs/copy_up.c:775 ovl_copy_up_flags+0x14e/0x1d0 fs/overlayfs/copy_up.c:827 ovl_copy_up+0x17/0x1a fs/overlayfs/copy_up.c:874 ovl_create_or_link+0xc7/0x1450 fs/overlayfs/dir.c:543 ovl_link+0x28b/0x37c fs/overlayfs/dir.c:679 vfs_link+0x7a9/0xb70 fs/namei.c:4241 do_linkat+0x724/0xa90 fs/namei.c:4309 __do_sys_linkat fs/namei.c:4333 [inline] __se_sys_linkat fs/namei.c:4330 [inline] __x64_sys_linkat+0xbe/0x150 fs/namei.c:4330 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457569 Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f8ab8a13c78 EFLAGS: 0246 ORIG_RAX: 0109 RAX: ffda RBX: 0005 RCX: 00457569 RDX: 000f RSI: 2200 RDI: 000f RBP: 0072c040 R08: 0400 R09: R10: 2180 R11: 0246 R12: 7f8ab8a146d4 R13: 004c31d1 R14: 004d3840 R15: kobject: 'bluetooth' (bd8be530): kobject_add_internal: parent: 'virtual', set: '(null)' kobject: 'loop3' (7c159c39): kobject_uevent_env kobject: 'kvm' (3101062f): kobject_uevent_env kobject: 'loop3' (7c159c39): fill_kobj_path: path = '/devices/virtual/block/loop3' kobject: 'hci0' (6b48ee74): kobject_add_internal: parent: 'bluetooth', set: 'devices' kobject: 'kvm' (3101062f): fill_kobj_path: path =
possible deadlock in ovl_copy_up_start
Hello, syzbot found the following crash on: HEAD commit:c343db455eb3 Merge branch 'parisc-4.19-3' of git://git.ker.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=167d08ee40 kernel config: https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33 dashboard link: https://syzkaller.appspot.com/bug?extid=3ef5c0d1a5cb0b21e6be compiler: gcc (GCC) 8.0.1 20180413 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+3ef5c0d1a5cb0b21e...@syzkaller.appspotmail.com overlayfs: filesystem on './file0' not supported as upperdir XFS (loop3): unknown mount option [uid<]. kobject: 'loop2' (ce85f3f9): kobject_uevent_env WARNING: possible recursive locking detected kobject: 'loop2' (ce85f3f9): fill_kobj_path: path = '/devices/virtual/block/loop2' 4.19.0-rc8+ #65 Not tainted syz-executor2/8184 is trying to acquire lock: d7157f3f (_i_lock_key[depth]){+.+.}, at: ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 but task is already holding lock: 6f802695 (_i_lock_key[depth]){+.+.}, at: ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 other info that might help us debug this: Possible unsafe locking scenario: CPU0 lock(_i_lock_key[depth]); lock(_i_lock_key[depth]); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by syz-executor2/8184: #0: 75695ecf (sb_writers#17){.+.+}, at: sb_start_write include/linux/fs.h:1566 [inline] #0: 75695ecf (sb_writers#17){.+.+}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:360 #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at: inode_lock_nested include/linux/fs.h:773 [inline] #1: ee99eb71 (_i_mutex_dir_key[depth]/1){+.+.}, at: filename_create+0x1b2/0x5b0 fs/namei.c:3635 #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: inode_lock include/linux/fs.h:738 [inline] #2: 65f3eeeb (_i_mutex_key[depth]){+.+.}, at: vfs_link+0x543/0xb70 fs/namei.c:4232 #3: 8b285486 (sb_writers#3){.+.+}, at: sb_start_write include/linux/fs.h:1566 [inline] #3: 8b285486 (sb_writers#3){.+.+}, at: mnt_want_write+0x3f/0xc0 fs/namespace.c:360 #4: 6f802695 (_i_lock_key[depth]){+.+.}, at: ovl_nlink_start+0xe0/0x350 fs/overlayfs/util.c:771 stack backtrace: CPU: 1 PID: 8184 Comm: syz-executor2 Not tainted 4.19.0-rc8+ #65 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline] check_deadlock kernel/locking/lockdep.c:1803 [inline] validate_chain kernel/locking/lockdep.c:2399 [inline] __lock_acquire.cold.61+0x1fb/0x482 kernel/locking/lockdep.c:3411 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900 __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072 mutex_lock_interruptible_nested+0x16/0x20 kernel/locking/mutex.c:1109 ovl_copy_up_start+0x9c/0x2e0 fs/overlayfs/util.c:528 ovl_copy_up_one+0x51f/0x1970 fs/overlayfs/copy_up.c:775 ovl_copy_up_flags+0x14e/0x1d0 fs/overlayfs/copy_up.c:827 ovl_copy_up+0x17/0x1a fs/overlayfs/copy_up.c:874 ovl_create_or_link+0xc7/0x1450 fs/overlayfs/dir.c:543 ovl_link+0x28b/0x37c fs/overlayfs/dir.c:679 vfs_link+0x7a9/0xb70 fs/namei.c:4241 do_linkat+0x724/0xa90 fs/namei.c:4309 __do_sys_linkat fs/namei.c:4333 [inline] __se_sys_linkat fs/namei.c:4330 [inline] __x64_sys_linkat+0xbe/0x150 fs/namei.c:4330 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457569 Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f8ab8a13c78 EFLAGS: 0246 ORIG_RAX: 0109 RAX: ffda RBX: 0005 RCX: 00457569 RDX: 000f RSI: 2200 RDI: 000f RBP: 0072c040 R08: 0400 R09: R10: 2180 R11: 0246 R12: 7f8ab8a146d4 R13: 004c31d1 R14: 004d3840 R15: kobject: 'bluetooth' (bd8be530): kobject_add_internal: parent: 'virtual', set: '(null)' kobject: 'loop3' (7c159c39): kobject_uevent_env kobject: 'kvm' (3101062f): kobject_uevent_env kobject: 'loop3' (7c159c39): fill_kobj_path: path = '/devices/virtual/block/loop3' kobject: 'hci0' (6b48ee74): kobject_add_internal: parent: 'bluetooth', set: 'devices' kobject: 'kvm' (3101062f): fill_kobj_path: path =