Re: INFO: task hung in lo_ioctl
On Sun, Jan 20, 2019 at 3:36 AM Tetsuo Handa wrote: > > On 2019/01/20 3:56, Dmitry Vyukov wrote: > >> Seems easy enough to fix without resorting to __mutex_owner() (untested): > >> > >> > >> diff --git a/drivers/block/loop.c b/drivers/block/loop.c > >> index 264abaaff662..cee258d12a1e 100644 > >> --- a/drivers/block/loop.c > >> +++ b/drivers/block/loop.c > >> @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct > >> loop_info __user *arg) { > >> static int > >> loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) > >> { > >> struct loop_info64 info64; > >> - int err = 0; > >> + int err; > >> > >> - if (!arg) > >> - err = -EINVAL; > >> - if (!err) > >> - err = loop_get_status(lo, ); > >> + if (!arg) { > >> + mutex_unlock(>lo_ctl_mutex); > >> + return -EINVAL; > >> + } > >> + err = loop_get_status(lo, ); > >> if (!err && copy_to_user(arg, , sizeof(info64))) > >> err = -EFAULT; > >> > >> > >> I'll test it and send it up when I get into the office. > > > > > > Was this ever submitted? Or some other fix for this? > > > > The bug is still open, but last happened 289 days ago: > > https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce > > > > But it also has 10 duplicates, some of which happened much more recently. > > If a fix was submitted, but Reported-by tag wasn't added this open bug > > can now mask lots of other new bugs. > > > > The commit for this specific patch is bdac616db9bbadb9 ("loop: fix > LOOP_GET_STATUS > lock imbalance"). But the root cause of these hung tasks would be fixed by a > series > containing commit 1dded9acf6dc9a34 ("Avoid circular locking dependency between > loop_ctl_mutex and bd_mutex") or commit 04906b2f542c2362 ("blockdev: Fix > livelocks > on loop device"). We were not aware of these bugs when you marked these > reports as > duplicates on 2017/12/12. You can undup them and fix them if you want. OK, let's just do then: #syz fix: blockdev: Fix livelocks on loop device
Re: INFO: task hung in lo_ioctl
On 2019/01/20 3:56, Dmitry Vyukov wrote: >> Seems easy enough to fix without resorting to __mutex_owner() (untested): >> >> >> diff --git a/drivers/block/loop.c b/drivers/block/loop.c >> index 264abaaff662..cee258d12a1e 100644 >> --- a/drivers/block/loop.c >> +++ b/drivers/block/loop.c >> @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct >> loop_info __user *arg) { >> static int >> loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) { >> struct loop_info64 info64; >> - int err = 0; >> + int err; >> >> - if (!arg) >> - err = -EINVAL; >> - if (!err) >> - err = loop_get_status(lo, ); >> + if (!arg) { >> + mutex_unlock(>lo_ctl_mutex); >> + return -EINVAL; >> + } >> + err = loop_get_status(lo, ); >> if (!err && copy_to_user(arg, , sizeof(info64))) >> err = -EFAULT; >> >> >> I'll test it and send it up when I get into the office. > > > Was this ever submitted? Or some other fix for this? > > The bug is still open, but last happened 289 days ago: > https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce > > But it also has 10 duplicates, some of which happened much more recently. > If a fix was submitted, but Reported-by tag wasn't added this open bug > can now mask lots of other new bugs. > The commit for this specific patch is bdac616db9bbadb9 ("loop: fix LOOP_GET_STATUS lock imbalance"). But the root cause of these hung tasks would be fixed by a series containing commit 1dded9acf6dc9a34 ("Avoid circular locking dependency between loop_ctl_mutex and bd_mutex") or commit 04906b2f542c2362 ("blockdev: Fix livelocks on loop device"). We were not aware of these bugs when you marked these reports as duplicates on 2017/12/12. You can undup them and fix them if you want.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 6, 2018 at 5:59 PM Omar Sandoval wrote: > > On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote: > > On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote: > > > Peter Zijlstra wrote: > > > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > > > > + /* Temporary hack for handling lock imbalance. */ > > > > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > > > > + mutex_unlock(>lo_ctl_mutex); > > > > > > > > ARGGH.. you didn't read the comment we put on that? > > > > > > > > > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state > > > tracking") > > > is using __mutex_owner(). ;-) > > > > That got removed and the warning added. > > Seems easy enough to fix without resorting to __mutex_owner() (untested): > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index 264abaaff662..cee258d12a1e 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct > loop_info __user *arg) { > static int > loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) { > struct loop_info64 info64; > - int err = 0; > + int err; > > - if (!arg) > - err = -EINVAL; > - if (!err) > - err = loop_get_status(lo, ); > + if (!arg) { > + mutex_unlock(>lo_ctl_mutex); > + return -EINVAL; > + } > + err = loop_get_status(lo, ); > if (!err && copy_to_user(arg, , sizeof(info64))) > err = -EFAULT; > > > I'll test it and send it up when I get into the office. Was this ever submitted? Or some other fix for this? The bug is still open, but last happened 289 days ago: https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce But it also has 10 duplicates, some of which happened much more recently. If a fix was submitted, but Reported-by tag wasn't added this open bug can now mask lots of other new bugs.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote: > On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote: > > Peter Zijlstra wrote: > > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > > > + /* Temporary hack for handling lock imbalance. */ > > > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > > > + mutex_unlock(>lo_ctl_mutex); > > > > > > ARGGH.. you didn't read the comment we put on that? > > > > > > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state > > tracking") > > is using __mutex_owner(). ;-) > > That got removed and the warning added. Seems easy enough to fix without resorting to __mutex_owner() (untested): diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 264abaaff662..cee258d12a1e 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct loop_info __user *arg) { static int loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) { struct loop_info64 info64; - int err = 0; + int err; - if (!arg) - err = -EINVAL; - if (!err) - err = loop_get_status(lo, ); + if (!arg) { + mutex_unlock(>lo_ctl_mutex); + return -EINVAL; + } + err = loop_get_status(lo, ); if (!err && copy_to_user(arg, , sizeof(info64))) err = -EFAULT; I'll test it and send it up when I get into the office.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote: > On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote: > > Peter Zijlstra wrote: > > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > > > + /* Temporary hack for handling lock imbalance. */ > > > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > > > + mutex_unlock(>lo_ctl_mutex); > > > > > > ARGGH.. you didn't read the comment we put on that? > > > > > > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state > > tracking") > > is using __mutex_owner(). ;-) > > That got removed and the warning added. Seems easy enough to fix without resorting to __mutex_owner() (untested): diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 264abaaff662..cee258d12a1e 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct loop_info __user *arg) { static int loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) { struct loop_info64 info64; - int err = 0; + int err; - if (!arg) - err = -EINVAL; - if (!err) - err = loop_get_status(lo, ); + if (!arg) { + mutex_unlock(>lo_ctl_mutex); + return -EINVAL; + } + err = loop_get_status(lo, ); if (!err && copy_to_user(arg, , sizeof(info64))) err = -EFAULT; I'll test it and send it up when I get into the office.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote: > Peter Zijlstra wrote: > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > > + /* Temporary hack for handling lock imbalance. */ > > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > > + mutex_unlock(>lo_ctl_mutex); > > > > ARGGH.. you didn't read the comment we put on that? > > > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking") > is using __mutex_owner(). ;-) That got removed and the warning added.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote: > Peter Zijlstra wrote: > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > > + /* Temporary hack for handling lock imbalance. */ > > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > > + mutex_unlock(>lo_ctl_mutex); > > > > ARGGH.. you didn't read the comment we put on that? > > > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking") > is using __mutex_owner(). ;-) That got removed and the warning added.
Re: INFO: task hung in lo_ioctl
Peter Zijlstra wrote: > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > + /* Temporary hack for handling lock imbalance. */ > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > + mutex_unlock(>lo_ctl_mutex); > > ARGGH.. you didn't read the comment we put on that? > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking") is using __mutex_owner(). ;-) Of course, regarding loop module, we will be able to add a flag variable associated with lo->lo_ctl_mutex. But that will be done after we solved the deadlock problem. I think whether we need to drop lo->lo_ctl_mutex is not clear. Maybe - err = mutex_lock_killable_nested(>lo_ctl_mutex, 1); + err = mutex_lock_killable(>lo_ctl_mutex); on top of this patch and listen to the lockdep? Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") says A simple way to silence lockdep could be to mark the lo_ctl_mutex in ioctl to be a sub class, but this might mask some other real bugs. and we are currently hitting a deadlock problem.
Re: INFO: task hung in lo_ioctl
Peter Zijlstra wrote: > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > > + /* Temporary hack for handling lock imbalance. */ > > + if (__mutex_owner(>lo_ctl_mutex) == current) > > + mutex_unlock(>lo_ctl_mutex); > > ARGGH.. you didn't read the comment we put on that? > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking") is using __mutex_owner(). ;-) Of course, regarding loop module, we will be able to add a flag variable associated with lo->lo_ctl_mutex. But that will be done after we solved the deadlock problem. I think whether we need to drop lo->lo_ctl_mutex is not clear. Maybe - err = mutex_lock_killable_nested(>lo_ctl_mutex, 1); + err = mutex_lock_killable(>lo_ctl_mutex); on top of this patch and listen to the lockdep? Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") says A simple way to silence lockdep could be to mark the lo_ctl_mutex in ioctl to be a sub class, but this might mask some other real bugs. and we are currently hitting a deadlock problem.
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > + /* Temporary hack for handling lock imbalance. */ > + if (__mutex_owner(>lo_ctl_mutex) == current) > + mutex_unlock(>lo_ctl_mutex); ARGGH.. you didn't read the comment we put on that?
Re: INFO: task hung in lo_ioctl
On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote: > + /* Temporary hack for handling lock imbalance. */ > + if (__mutex_owner(>lo_ctl_mutex) == current) > + mutex_unlock(>lo_ctl_mutex); ARGGH.. you didn't read the comment we put on that?
Re: INFO: task hung in lo_ioctl
On 2018/04/05 0:23, Tetsuo Handa wrote: > This seems to be an AB-BA deadlock where the lockdep cannot report (due to > use of nested lock?). > What is happening here? > This patch does not directly fix the bug syzbot is reporting, but could be relevant. Maybe try replacing mutex_lock_killable_nested() with mutex_lock_killable() and check what the lockdep will say? >From 0b006d536b2e439f347eb857e482fc304e84fd1d Mon Sep 17 00:00:00 2001 From: Tetsuo HandaDate: Fri, 6 Apr 2018 20:52:10 +0900 Subject: [PATCH] block/loop: fix lock imbalance bug at lo_ioctl Commit 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding lo_ctl_mutex") introduced mutex lock imbalance bug where syzbot would trigger. The caller of loop_get_status64() assumes that mutex_unlock() is already called by loop_get_status() but loop_get_status64() does not always call loop_get_status(). Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") also dropped the mutex for deadlock avoidance reason. But we should recheck whether we have to drop the mutex. Dropping the mutex at the middle of an ioctl() request is not nice, but syzbot is reporting a deadlock inside loop_reread_partitions() which is called by loop_set_fd() and loop_change_fd(). Signed-off-by: Tetsuo Handa Fixes: 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding lo_ctl_mutex") Cc: Omar Sandoval Cc: Nikanth Karthikesan Cc: Jens Axboe --- drivers/block/loop.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 264abaa..64033e7 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1362,7 +1362,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, err = mutex_lock_killable_nested(>lo_ctl_mutex, 1); if (err) - goto out_unlocked; + return err; switch (cmd) { case LOOP_SET_FD: @@ -1372,10 +1372,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, err = loop_change_fd(lo, bdev, arg); break; case LOOP_CLR_FD: - /* loop_clr_fd would have unlocked lo_ctl_mutex on success */ err = loop_clr_fd(lo); - if (!err) - goto out_unlocked; break; case LOOP_SET_STATUS: err = -EPERM; @@ -1385,8 +1382,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, break; case LOOP_GET_STATUS: err = loop_get_status_old(lo, (struct loop_info __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ - goto out_unlocked; + break; case LOOP_SET_STATUS64: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) @@ -1395,8 +1391,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, break; case LOOP_GET_STATUS64: err = loop_get_status64(lo, (struct loop_info64 __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ - goto out_unlocked; + break; case LOOP_SET_CAPACITY: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) @@ -1415,9 +1410,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, default: err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; } - mutex_unlock(>lo_ctl_mutex); - -out_unlocked: + /* Temporary hack for handling lock imbalance. */ + if (__mutex_owner(>lo_ctl_mutex) == current) + mutex_unlock(>lo_ctl_mutex); return err; } -- 1.8.3.1
Re: INFO: task hung in lo_ioctl
On 2018/04/05 0:23, Tetsuo Handa wrote: > This seems to be an AB-BA deadlock where the lockdep cannot report (due to > use of nested lock?). > What is happening here? > This patch does not directly fix the bug syzbot is reporting, but could be relevant. Maybe try replacing mutex_lock_killable_nested() with mutex_lock_killable() and check what the lockdep will say? >From 0b006d536b2e439f347eb857e482fc304e84fd1d Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Fri, 6 Apr 2018 20:52:10 +0900 Subject: [PATCH] block/loop: fix lock imbalance bug at lo_ioctl Commit 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding lo_ctl_mutex") introduced mutex lock imbalance bug where syzbot would trigger. The caller of loop_get_status64() assumes that mutex_unlock() is already called by loop_get_status() but loop_get_status64() does not always call loop_get_status(). Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") also dropped the mutex for deadlock avoidance reason. But we should recheck whether we have to drop the mutex. Dropping the mutex at the middle of an ioctl() request is not nice, but syzbot is reporting a deadlock inside loop_reread_partitions() which is called by loop_set_fd() and loop_change_fd(). Signed-off-by: Tetsuo Handa Fixes: 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding lo_ctl_mutex") Cc: Omar Sandoval Cc: Nikanth Karthikesan Cc: Jens Axboe --- drivers/block/loop.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 264abaa..64033e7 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1362,7 +1362,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, err = mutex_lock_killable_nested(>lo_ctl_mutex, 1); if (err) - goto out_unlocked; + return err; switch (cmd) { case LOOP_SET_FD: @@ -1372,10 +1372,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, err = loop_change_fd(lo, bdev, arg); break; case LOOP_CLR_FD: - /* loop_clr_fd would have unlocked lo_ctl_mutex on success */ err = loop_clr_fd(lo); - if (!err) - goto out_unlocked; break; case LOOP_SET_STATUS: err = -EPERM; @@ -1385,8 +1382,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, break; case LOOP_GET_STATUS: err = loop_get_status_old(lo, (struct loop_info __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ - goto out_unlocked; + break; case LOOP_SET_STATUS64: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) @@ -1395,8 +1391,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, break; case LOOP_GET_STATUS64: err = loop_get_status64(lo, (struct loop_info64 __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ - goto out_unlocked; + break; case LOOP_SET_CAPACITY: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) @@ -1415,9 +1410,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, default: err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; } - mutex_unlock(>lo_ctl_mutex); - -out_unlocked: + /* Temporary hack for handling lock imbalance. */ + if (__mutex_owner(>lo_ctl_mutex) == current) + mutex_unlock(>lo_ctl_mutex); return err; } -- 1.8.3.1
Re: INFO: task hung in lo_ioctl
This seems to be an AB-BA deadlock where the lockdep cannot report (due to use of nested lock?). When PID=6540 was (reported as hung) at mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836), it was already holding down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) (id=566d4c39). But when PID=6541 was (which would have been reported as hung if sysctl_hung_task_panic were not set) at down_read(>s_umount) (id=566d4c39), it was already holding mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836). INFO: task syz-executor0:6540 blocked for more than 120 seconds. Not tainted 4.16.0+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor0 D23560 6540 4521 0x8004 Call Trace: context_switch kernel/sched/core.c:2848 [inline] __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490 schedule+0xf5/0x430 kernel/sched/core.c:3549 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607 __mutex_lock_common kernel/locking/mutex.c:833 [inline] __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 __blkdev_driver_ioctl block/ioctl.c:303 [inline] blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060 isofs_get_last_session fs/isofs/inode.c:567 [inline] isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660 mount_bdev+0x2b7/0x370 fs/super.c:1119 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560 mount_fs+0x66/0x2d0 fs/super.c:1222 2 locks held by syz-executor0/6540: #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline] #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(>s_umount, SINGLE_DEPTH_NESTING); */ #1: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */ 3 locks held by syz-executor7/6541: #0: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */ #1: 7bf3d3f9 (>bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192 #2: 566d4c39 (>s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(>s_umount); */ sget() is using down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) with a comment block asserting that there is no risk of deadlock /* * sget() can have s_umount recursion. * * When it cannot find a suitable sb, it allocates a new * one (this one), and tries again to find a suitable old * one. * * In case that succeeds, it will acquire the s_umount * lock of the old one. Since these are clearly distrinct * locks, and this object isn't exposed yet, there's no * risk of deadlocks. * * Annotate this by putting this lock in a different * subclass. */ but this object (id=566d4c39) is already locked by other thread. What is happening here?
Re: INFO: task hung in lo_ioctl
This seems to be an AB-BA deadlock where the lockdep cannot report (due to use of nested lock?). When PID=6540 was (reported as hung) at mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836), it was already holding down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) (id=566d4c39). But when PID=6541 was (which would have been reported as hung if sysctl_hung_task_panic were not set) at down_read(>s_umount) (id=566d4c39), it was already holding mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836). INFO: task syz-executor0:6540 blocked for more than 120 seconds. Not tainted 4.16.0+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor0 D23560 6540 4521 0x8004 Call Trace: context_switch kernel/sched/core.c:2848 [inline] __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490 schedule+0xf5/0x430 kernel/sched/core.c:3549 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607 __mutex_lock_common kernel/locking/mutex.c:833 [inline] __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 __blkdev_driver_ioctl block/ioctl.c:303 [inline] blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060 isofs_get_last_session fs/isofs/inode.c:567 [inline] isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660 mount_bdev+0x2b7/0x370 fs/super.c:1119 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560 mount_fs+0x66/0x2d0 fs/super.c:1222 2 locks held by syz-executor0/6540: #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline] #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(>s_umount, SINGLE_DEPTH_NESTING); */ #1: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */ 3 locks held by syz-executor7/6541: #0: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */ #1: 7bf3d3f9 (>bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192 #2: 566d4c39 (>s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(>s_umount); */ sget() is using down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) with a comment block asserting that there is no risk of deadlock /* * sget() can have s_umount recursion. * * When it cannot find a suitable sb, it allocates a new * one (this one), and tries again to find a suitable old * one. * * In case that succeeds, it will acquire the s_umount * lock of the old one. Since these are clearly distrinct * locks, and this object isn't exposed yet, there's no * risk of deadlocks. * * Annotate this by putting this lock in a different * subclass. */ but this object (id=566d4c39) is already locked by other thread. What is happening here?
Re: INFO: task hung in lo_ioctl
On Sun, Dec 10, 2017 at 2:32 PM, syzbotwrote: > Hello, > > syzkaller hit the following crash on > ad4dac17f9d563b9e34aab78a34293b10993e9b5 > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > > Unfortunately, I don't have any reproducer for this bug yet. > > > INFO: task syz-executor3:12157 blocked for more than 120 seconds. > Not tainted 4.15.0-rc2-next-20171208+ #63 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > syz-executor3 D25536 12157 3362 0x0004 > Call Trace: > context_switch kernel/sched/core.c:2800 [inline] > __schedule+0x8eb/0x2060 kernel/sched/core.c:3376 > schedule+0xf5/0x430 kernel/sched/core.c:3435 > schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493 > __mutex_lock_common kernel/locking/mutex.c:833 [inline] > __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893 > mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > __blkdev_driver_ioctl block/ioctl.c:303 [inline] > blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601 > block_ioctl+0xea/0x130 fs/block_dev.c:1860 > vfs_ioctl fs/ioctl.c:46 [inline] > do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686 > SYSC_ioctl fs/ioctl.c:701 [inline] > SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692 > entry_SYSCALL_64_fastpath+0x1f/0x96 > RIP: 0033:0x452a39 > RSP: 002b:7fd85b57ec58 EFLAGS: 0212 ORIG_RAX: 0010 > RAX: ffda RBX: 00758190 RCX: 00452a39 > RDX: 20e67fd8 RSI: 400454ca RDI: 0016 > RBP: 005b R08: R09: > R10: R11: 0212 R12: 006ee928 > R13: R14: 7fd85b57f6d4 R15: 0002 > > Showing all locks held in the system: > 2 locks held by khungtaskd/671: > #0: (rcu_read_lock){}, at: [<53bc3983>] > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] > #0: (rcu_read_lock){}, at: [<53bc3983>] watchdog+0x1c5/0xd60 > kernel/hung_task.c:249 > #1: (tasklist_lock){.+.+}, at: [ ] > debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554 > 2 locks held by getty/3115: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3116: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3117: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3118: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3119: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3120: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 1 lock held by syz-executor3/12157: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor7/12154: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor7/12168: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor5/12216: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 2 locks held by getty/12231: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 1 lock held by blkid/12232: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > > = > > NMI backtrace for cpu 0 > CPU: 0 PID: 671 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171208+ #63 > Hardware name: Google Google
Re: INFO: task hung in lo_ioctl
On Sun, Dec 10, 2017 at 2:32 PM, syzbot wrote: > Hello, > > syzkaller hit the following crash on > ad4dac17f9d563b9e34aab78a34293b10993e9b5 > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > > Unfortunately, I don't have any reproducer for this bug yet. > > > INFO: task syz-executor3:12157 blocked for more than 120 seconds. > Not tainted 4.15.0-rc2-next-20171208+ #63 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > syz-executor3 D25536 12157 3362 0x0004 > Call Trace: > context_switch kernel/sched/core.c:2800 [inline] > __schedule+0x8eb/0x2060 kernel/sched/core.c:3376 > schedule+0xf5/0x430 kernel/sched/core.c:3435 > schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493 > __mutex_lock_common kernel/locking/mutex.c:833 [inline] > __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893 > mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > __blkdev_driver_ioctl block/ioctl.c:303 [inline] > blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601 > block_ioctl+0xea/0x130 fs/block_dev.c:1860 > vfs_ioctl fs/ioctl.c:46 [inline] > do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686 > SYSC_ioctl fs/ioctl.c:701 [inline] > SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692 > entry_SYSCALL_64_fastpath+0x1f/0x96 > RIP: 0033:0x452a39 > RSP: 002b:7fd85b57ec58 EFLAGS: 0212 ORIG_RAX: 0010 > RAX: ffda RBX: 00758190 RCX: 00452a39 > RDX: 20e67fd8 RSI: 400454ca RDI: 0016 > RBP: 005b R08: R09: > R10: R11: 0212 R12: 006ee928 > R13: R14: 7fd85b57f6d4 R15: 0002 > > Showing all locks held in the system: > 2 locks held by khungtaskd/671: > #0: (rcu_read_lock){}, at: [<53bc3983>] > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] > #0: (rcu_read_lock){}, at: [<53bc3983>] watchdog+0x1c5/0xd60 > kernel/hung_task.c:249 > #1: (tasklist_lock){.+.+}, at: [] > debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554 > 2 locks held by getty/3115: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3116: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3117: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3118: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3119: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 2 locks held by getty/3120: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 1 lock held by syz-executor3/12157: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor7/12154: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor7/12168: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 1 lock held by syz-executor5/12216: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > 2 locks held by getty/12231: > #0: (>ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (>atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131 > 1 lock held by blkid/12232: > #0: (>lo_ctl_mutex/1){+.+.}, at: [ ] > lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355 > > = > > NMI backtrace for cpu 0 > CPU: 0 PID: 671 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171208+ #63 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call