Re: INFO: task hung in lo_ioctl

2019-01-20 Thread Dmitry Vyukov
On Sun, Jan 20, 2019 at 3:36 AM Tetsuo Handa
 wrote:
>
> On 2019/01/20 3:56, Dmitry Vyukov wrote:
> >> Seems easy enough to fix without resorting to __mutex_owner() (untested):
> >>
> >>
> >> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> >> index 264abaaff662..cee258d12a1e 100644
> >> --- a/drivers/block/loop.c
> >> +++ b/drivers/block/loop.c
> >> @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct 
> >> loop_info __user *arg) {
> >>  static int
> >>  loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) 
> >> {
> >> struct loop_info64 info64;
> >> -   int err = 0;
> >> +   int err;
> >>
> >> -   if (!arg)
> >> -   err = -EINVAL;
> >> -   if (!err)
> >> -   err = loop_get_status(lo, );
> >> +   if (!arg) {
> >> +   mutex_unlock(>lo_ctl_mutex);
> >> +   return -EINVAL;
> >> +   }
> >> +   err = loop_get_status(lo, );
> >> if (!err && copy_to_user(arg, , sizeof(info64)))
> >> err = -EFAULT;
> >>
> >>
> >> I'll test it and send it up when I get into the office.
> >
> >
> > Was this ever submitted? Or some other fix for this?
> >
> > The bug is still open, but last happened 289 days ago:
> > https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce
> >
> > But it also has 10 duplicates, some of which happened much more recently.
> > If a fix was submitted, but Reported-by tag wasn't added this open bug
> > can now mask lots of other new bugs.
> >
>
> The commit for this specific patch is bdac616db9bbadb9 ("loop: fix 
> LOOP_GET_STATUS
> lock imbalance"). But the root cause of these hung tasks would be fixed by a 
> series
> containing commit 1dded9acf6dc9a34 ("Avoid circular locking dependency between
> loop_ctl_mutex and bd_mutex") or commit 04906b2f542c2362 ("blockdev: Fix 
> livelocks
> on loop device"). We were not aware of these bugs when you marked these 
> reports as
> duplicates on 2017/12/12. You can undup them and fix them if you want.

OK, let's just do then:

#syz fix: blockdev: Fix livelocks on loop device


Re: INFO: task hung in lo_ioctl

2019-01-19 Thread Tetsuo Handa
On 2019/01/20 3:56, Dmitry Vyukov wrote:
>> Seems easy enough to fix without resorting to __mutex_owner() (untested):
>>
>>
>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index 264abaaff662..cee258d12a1e 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct 
>> loop_info __user *arg) {
>>  static int
>>  loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) {
>> struct loop_info64 info64;
>> -   int err = 0;
>> +   int err;
>>
>> -   if (!arg)
>> -   err = -EINVAL;
>> -   if (!err)
>> -   err = loop_get_status(lo, );
>> +   if (!arg) {
>> +   mutex_unlock(>lo_ctl_mutex);
>> +   return -EINVAL;
>> +   }
>> +   err = loop_get_status(lo, );
>> if (!err && copy_to_user(arg, , sizeof(info64)))
>> err = -EFAULT;
>>
>>
>> I'll test it and send it up when I get into the office.
> 
> 
> Was this ever submitted? Or some other fix for this?
> 
> The bug is still open, but last happened 289 days ago:
> https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce
> 
> But it also has 10 duplicates, some of which happened much more recently.
> If a fix was submitted, but Reported-by tag wasn't added this open bug
> can now mask lots of other new bugs.
> 

The commit for this specific patch is bdac616db9bbadb9 ("loop: fix 
LOOP_GET_STATUS
lock imbalance"). But the root cause of these hung tasks would be fixed by a 
series
containing commit 1dded9acf6dc9a34 ("Avoid circular locking dependency between
loop_ctl_mutex and bd_mutex") or commit 04906b2f542c2362 ("blockdev: Fix 
livelocks
on loop device"). We were not aware of these bugs when you marked these reports 
as
duplicates on 2017/12/12. You can undup them and fix them if you want.


Re: INFO: task hung in lo_ioctl

2019-01-19 Thread Dmitry Vyukov
On Fri, Apr 6, 2018 at 5:59 PM Omar Sandoval  wrote:
>
> On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote:
> > > Peter Zijlstra wrote:
> > > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > >
 > > +   /* Temporary hack for handling lock imbalance. */
> > > > > +   if (__mutex_owner(>lo_ctl_mutex) == current)
> > > > > +   mutex_unlock(>lo_ctl_mutex);
> > > >
> > > > ARGGH.. you didn't read the comment we put on that?
> > > >
> > >
> > > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state 
> > > tracking")
> > > is using __mutex_owner(). ;-)
> >
> > That got removed and the warning added.
>
> Seems easy enough to fix without resorting to __mutex_owner() (untested):
>
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 264abaaff662..cee258d12a1e 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct 
> loop_info __user *arg) {
>  static int
>  loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) {
> struct loop_info64 info64;
> -   int err = 0;
> +   int err;
>
> -   if (!arg)
> -   err = -EINVAL;
> -   if (!err)
> -   err = loop_get_status(lo, );
> +   if (!arg) {
> +   mutex_unlock(>lo_ctl_mutex);
> +   return -EINVAL;
> +   }
> +   err = loop_get_status(lo, );
> if (!err && copy_to_user(arg, , sizeof(info64)))
> err = -EFAULT;
>
>
> I'll test it and send it up when I get into the office.


Was this ever submitted? Or some other fix for this?

The bug is still open, but last happened 289 days ago:
https://syzkaller.appspot.com/bug?id=608144371e7fc2cb6285b9ed871fb1eb817a61ce

But it also has 10 duplicates, some of which happened much more recently.
If a fix was submitted, but Reported-by tag wasn't added this open bug
can now mask lots of other new bugs.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Omar Sandoval
On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote:
> > Peter Zijlstra wrote:
> > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > > > +   /* Temporary hack for handling lock imbalance. */
> > > > +   if (__mutex_owner(>lo_ctl_mutex) == current)
> > > > +   mutex_unlock(>lo_ctl_mutex);
> > > 
> > > ARGGH.. you didn't read the comment we put on that?
> > > 
> > 
> > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state 
> > tracking")
> > is using __mutex_owner(). ;-)
> 
> That got removed and the warning added.

Seems easy enough to fix without resorting to __mutex_owner() (untested):


diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaaff662..cee258d12a1e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct 
loop_info __user *arg) {
 static int
 loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) {
struct loop_info64 info64;
-   int err = 0;
+   int err;
 
-   if (!arg)
-   err = -EINVAL;
-   if (!err)
-   err = loop_get_status(lo, );
+   if (!arg) {
+   mutex_unlock(>lo_ctl_mutex);
+   return -EINVAL;
+   }
+   err = loop_get_status(lo, );
if (!err && copy_to_user(arg, , sizeof(info64)))
err = -EFAULT;
 

I'll test it and send it up when I get into the office.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Omar Sandoval
On Fri, Apr 06, 2018 at 05:43:43PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote:
> > Peter Zijlstra wrote:
> > > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > > > +   /* Temporary hack for handling lock imbalance. */
> > > > +   if (__mutex_owner(>lo_ctl_mutex) == current)
> > > > +   mutex_unlock(>lo_ctl_mutex);
> > > 
> > > ARGGH.. you didn't read the comment we put on that?
> > > 
> > 
> > Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state 
> > tracking")
> > is using __mutex_owner(). ;-)
> 
> That got removed and the warning added.

Seems easy enough to fix without resorting to __mutex_owner() (untested):


diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaaff662..cee258d12a1e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1300,12 +1300,13 @@ loop_get_status_old(struct loop_device *lo, struct 
loop_info __user *arg) {
 static int
 loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) {
struct loop_info64 info64;
-   int err = 0;
+   int err;
 
-   if (!arg)
-   err = -EINVAL;
-   if (!err)
-   err = loop_get_status(lo, );
+   if (!arg) {
+   mutex_unlock(>lo_ctl_mutex);
+   return -EINVAL;
+   }
+   err = loop_get_status(lo, );
if (!err && copy_to_user(arg, , sizeof(info64)))
err = -EFAULT;
 

I'll test it and send it up when I get into the office.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote:
> Peter Zijlstra wrote:
> > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > > + /* Temporary hack for handling lock imbalance. */
> > > + if (__mutex_owner(>lo_ctl_mutex) == current)
> > > + mutex_unlock(>lo_ctl_mutex);
> > 
> > ARGGH.. you didn't read the comment we put on that?
> > 
> 
> Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking")
> is using __mutex_owner(). ;-)

That got removed and the warning added.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 10:55:03PM +0900, Tetsuo Handa wrote:
> Peter Zijlstra wrote:
> > On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > > + /* Temporary hack for handling lock imbalance. */
> > > + if (__mutex_owner(>lo_ctl_mutex) == current)
> > > + mutex_unlock(>lo_ctl_mutex);
> > 
> > ARGGH.. you didn't read the comment we put on that?
> > 
> 
> Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking")
> is using __mutex_owner(). ;-)

That got removed and the warning added.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Tetsuo Handa
Peter Zijlstra wrote:
> On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > +   /* Temporary hack for handling lock imbalance. */
> > +   if (__mutex_owner(>lo_ctl_mutex) == current)
> > +   mutex_unlock(>lo_ctl_mutex);
> 
> ARGGH.. you didn't read the comment we put on that?
> 

Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking")
is using __mutex_owner(). ;-)

Of course, regarding loop module, we will be able to add a flag variable
associated with lo->lo_ctl_mutex. But that will be done after we solved
the deadlock problem. I think whether we need to drop lo->lo_ctl_mutex is
not clear. Maybe

-   err = mutex_lock_killable_nested(>lo_ctl_mutex, 1);
+   err = mutex_lock_killable(>lo_ctl_mutex);

on top of this patch and listen to the lockdep?

Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") says

  A simple way to silence lockdep could be to mark the lo_ctl_mutex
  in ioctl to be a sub class, but this might mask some other real bugs.

and we are currently hitting a deadlock problem.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Tetsuo Handa
Peter Zijlstra wrote:
> On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> > +   /* Temporary hack for handling lock imbalance. */
> > +   if (__mutex_owner(>lo_ctl_mutex) == current)
> > +   mutex_unlock(>lo_ctl_mutex);
> 
> ARGGH.. you didn't read the comment we put on that?
> 

Commit 5b52330bbfe63b33 ("audit: fix auditd/kernel connection state tracking")
is using __mutex_owner(). ;-)

Of course, regarding loop module, we will be able to add a flag variable
associated with lo->lo_ctl_mutex. But that will be done after we solved
the deadlock problem. I think whether we need to drop lo->lo_ctl_mutex is
not clear. Maybe

-   err = mutex_lock_killable_nested(>lo_ctl_mutex, 1);
+   err = mutex_lock_killable(>lo_ctl_mutex);

on top of this patch and listen to the lockdep?

Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()") says

  A simple way to silence lockdep could be to mark the lo_ctl_mutex
  in ioctl to be a sub class, but this might mask some other real bugs.

and we are currently hitting a deadlock problem.


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> + /* Temporary hack for handling lock imbalance. */
> + if (__mutex_owner(>lo_ctl_mutex) == current)
> + mutex_unlock(>lo_ctl_mutex);

ARGGH.. you didn't read the comment we put on that?


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Peter Zijlstra
On Fri, Apr 06, 2018 at 09:04:18PM +0900, Tetsuo Handa wrote:
> + /* Temporary hack for handling lock imbalance. */
> + if (__mutex_owner(>lo_ctl_mutex) == current)
> + mutex_unlock(>lo_ctl_mutex);

ARGGH.. you didn't read the comment we put on that?


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Tetsuo Handa
On 2018/04/05 0:23, Tetsuo Handa wrote:
> This seems to be an AB-BA deadlock where the lockdep cannot report (due to 
> use of nested lock?).
> What is happening here?
> 

This patch does not directly fix the bug syzbot is reporting, but could be 
relevant.
Maybe try replacing mutex_lock_killable_nested() with mutex_lock_killable() and
check what the lockdep will say?



>From 0b006d536b2e439f347eb857e482fc304e84fd1d Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Fri, 6 Apr 2018 20:52:10 +0900
Subject: [PATCH] block/loop: fix lock imbalance bug at lo_ioctl

Commit 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding
lo_ctl_mutex") introduced mutex lock imbalance bug where syzbot would
trigger. The caller of loop_get_status64() assumes that mutex_unlock() is
already called by loop_get_status() but loop_get_status64() does not
always call loop_get_status().

Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()")
also dropped the mutex for deadlock avoidance reason. But we should
recheck whether we have to drop the mutex. Dropping the mutex at the
middle of an ioctl() request is not nice, but syzbot is reporting a
deadlock inside loop_reread_partitions() which is called by loop_set_fd()
and loop_change_fd().

Signed-off-by: Tetsuo Handa 
Fixes: 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding 
lo_ctl_mutex")
Cc: Omar Sandoval 
Cc: Nikanth Karthikesan 
Cc: Jens Axboe 
---
 drivers/block/loop.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaa..64033e7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1362,7 +1362,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
 
err = mutex_lock_killable_nested(>lo_ctl_mutex, 1);
if (err)
-   goto out_unlocked;
+   return err;
 
switch (cmd) {
case LOOP_SET_FD:
@@ -1372,10 +1372,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
err = loop_change_fd(lo, bdev, arg);
break;
case LOOP_CLR_FD:
-   /* loop_clr_fd would have unlocked lo_ctl_mutex on success */
err = loop_clr_fd(lo);
-   if (!err)
-   goto out_unlocked;
break;
case LOOP_SET_STATUS:
err = -EPERM;
@@ -1385,8 +1382,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
case LOOP_GET_STATUS:
err = loop_get_status_old(lo, (struct loop_info __user *) arg);
-   /* loop_get_status() unlocks lo_ctl_mutex */
-   goto out_unlocked;
+   break;
case LOOP_SET_STATUS64:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN))
@@ -1395,8 +1391,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
case LOOP_GET_STATUS64:
err = loop_get_status64(lo, (struct loop_info64 __user *) arg);
-   /* loop_get_status() unlocks lo_ctl_mutex */
-   goto out_unlocked;
+   break;
case LOOP_SET_CAPACITY:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN))
@@ -1415,9 +1410,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
-   mutex_unlock(>lo_ctl_mutex);
-
-out_unlocked:
+   /* Temporary hack for handling lock imbalance. */
+   if (__mutex_owner(>lo_ctl_mutex) == current)
+   mutex_unlock(>lo_ctl_mutex);
return err;
 }
 
-- 
1.8.3.1


Re: INFO: task hung in lo_ioctl

2018-04-06 Thread Tetsuo Handa
On 2018/04/05 0:23, Tetsuo Handa wrote:
> This seems to be an AB-BA deadlock where the lockdep cannot report (due to 
> use of nested lock?).
> What is happening here?
> 

This patch does not directly fix the bug syzbot is reporting, but could be 
relevant.
Maybe try replacing mutex_lock_killable_nested() with mutex_lock_killable() and
check what the lockdep will say?



>From 0b006d536b2e439f347eb857e482fc304e84fd1d Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Fri, 6 Apr 2018 20:52:10 +0900
Subject: [PATCH] block/loop: fix lock imbalance bug at lo_ioctl

Commit 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding
lo_ctl_mutex") introduced mutex lock imbalance bug where syzbot would
trigger. The caller of loop_get_status64() assumes that mutex_unlock() is
already called by loop_get_status() but loop_get_status64() does not
always call loop_get_status().

Commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()")
also dropped the mutex for deadlock avoidance reason. But we should
recheck whether we have to drop the mutex. Dropping the mutex at the
middle of an ioctl() request is not nice, but syzbot is reporting a
deadlock inside loop_reread_partitions() which is called by loop_set_fd()
and loop_change_fd().

Signed-off-by: Tetsuo Handa 
Fixes: 2d1d4c1e591fd40b ("loop: don't call into filesystem while holding 
lo_ctl_mutex")
Cc: Omar Sandoval 
Cc: Nikanth Karthikesan 
Cc: Jens Axboe 
---
 drivers/block/loop.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaa..64033e7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1362,7 +1362,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
 
err = mutex_lock_killable_nested(>lo_ctl_mutex, 1);
if (err)
-   goto out_unlocked;
+   return err;
 
switch (cmd) {
case LOOP_SET_FD:
@@ -1372,10 +1372,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
err = loop_change_fd(lo, bdev, arg);
break;
case LOOP_CLR_FD:
-   /* loop_clr_fd would have unlocked lo_ctl_mutex on success */
err = loop_clr_fd(lo);
-   if (!err)
-   goto out_unlocked;
break;
case LOOP_SET_STATUS:
err = -EPERM;
@@ -1385,8 +1382,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
case LOOP_GET_STATUS:
err = loop_get_status_old(lo, (struct loop_info __user *) arg);
-   /* loop_get_status() unlocks lo_ctl_mutex */
-   goto out_unlocked;
+   break;
case LOOP_SET_STATUS64:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN))
@@ -1395,8 +1391,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
case LOOP_GET_STATUS64:
err = loop_get_status64(lo, (struct loop_info64 __user *) arg);
-   /* loop_get_status() unlocks lo_ctl_mutex */
-   goto out_unlocked;
+   break;
case LOOP_SET_CAPACITY:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN))
@@ -1415,9 +1410,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t 
mode,
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
-   mutex_unlock(>lo_ctl_mutex);
-
-out_unlocked:
+   /* Temporary hack for handling lock imbalance. */
+   if (__mutex_owner(>lo_ctl_mutex) == current)
+   mutex_unlock(>lo_ctl_mutex);
return err;
 }
 
-- 
1.8.3.1


Re: INFO: task hung in lo_ioctl

2018-04-04 Thread Tetsuo Handa
This seems to be an AB-BA deadlock where the lockdep cannot report (due to use 
of nested lock?).

When PID=6540 was (reported as hung) at mutex_lock_nested(>lo_ctl_mutex, 1) 
(id=43ca8836),
it was already holding down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) 
(id=566d4c39).
But when PID=6541 was (which would have been reported as hung if 
sysctl_hung_task_panic
were not set) at down_read(>s_umount) (id=566d4c39), it was already holding
mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836).


INFO: task syz-executor0:6540 blocked for more than 120 seconds.
  Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0   D23560  6540   4521 0x8004
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
 schedule+0xf5/0x430 kernel/sched/core.c:3549
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
 isofs_get_last_session fs/isofs/inode.c:567 [inline]
 isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
 mount_bdev+0x2b7/0x370 fs/super.c:1119
 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
 mount_fs+0x66/0x2d0 fs/super.c:1222

2 locks held by syz-executor0/6540:
 #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: alloc_super 
fs/super.c:211 [inline]
 #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: 
sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(>s_umount, 
SINGLE_DEPTH_NESTING); */
 #1: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 
drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */
3 locks held by syz-executor7/6541:
 #0: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 
drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */
 #1: 7bf3d3f9 (>bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 
block/ioctl.c:192
 #2: 566d4c39 (>s_umount_key#50){.+.+}, at: 
__get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(>s_umount); */


sget() is using down_write_nested(>s_umount, SINGLE_DEPTH_NESTING)
with a comment block asserting that there is no risk of deadlock

/*
 * sget() can have s_umount recursion.
 *
 * When it cannot find a suitable sb, it allocates a new
 * one (this one), and tries again to find a suitable old
 * one.
 *
 * In case that succeeds, it will acquire the s_umount
 * lock of the old one. Since these are clearly distrinct
 * locks, and this object isn't exposed yet, there's no
 * risk of deadlocks.
 *
 * Annotate this by putting this lock in a different
 * subclass.
 */

but this object (id=566d4c39) is already locked by other thread.
What is happening here?



Re: INFO: task hung in lo_ioctl

2018-04-04 Thread Tetsuo Handa
This seems to be an AB-BA deadlock where the lockdep cannot report (due to use 
of nested lock?).

When PID=6540 was (reported as hung) at mutex_lock_nested(>lo_ctl_mutex, 1) 
(id=43ca8836),
it was already holding down_write_nested(>s_umount, SINGLE_DEPTH_NESTING) 
(id=566d4c39).
But when PID=6541 was (which would have been reported as hung if 
sysctl_hung_task_panic
were not set) at down_read(>s_umount) (id=566d4c39), it was already holding
mutex_lock_nested(>lo_ctl_mutex, 1) (id=43ca8836).


INFO: task syz-executor0:6540 blocked for more than 120 seconds.
  Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0   D23560  6540   4521 0x8004
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
 schedule+0xf5/0x430 kernel/sched/core.c:3549
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
 isofs_get_last_session fs/isofs/inode.c:567 [inline]
 isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
 mount_bdev+0x2b7/0x370 fs/super.c:1119
 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
 mount_fs+0x66/0x2d0 fs/super.c:1222

2 locks held by syz-executor0/6540:
 #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: alloc_super 
fs/super.c:211 [inline]
 #0: 566d4c39 (>s_umount_key#49/1){+.+.}, at: 
sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(>s_umount, 
SINGLE_DEPTH_NESTING); */
 #1: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 
drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */
3 locks held by syz-executor7/6541:
 #0: 43ca8836 (>lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 
drivers/block/loop.c:1355 /* mutex_lock_nested(>lo_ctl_mutex, 1); */
 #1: 7bf3d3f9 (>bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 
block/ioctl.c:192
 #2: 566d4c39 (>s_umount_key#50){.+.+}, at: 
__get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(>s_umount); */


sget() is using down_write_nested(>s_umount, SINGLE_DEPTH_NESTING)
with a comment block asserting that there is no risk of deadlock

/*
 * sget() can have s_umount recursion.
 *
 * When it cannot find a suitable sb, it allocates a new
 * one (this one), and tries again to find a suitable old
 * one.
 *
 * In case that succeeds, it will acquire the s_umount
 * lock of the old one. Since these are clearly distrinct
 * locks, and this object isn't exposed yet, there's no
 * risk of deadlocks.
 *
 * Annotate this by putting this lock in a different
 * subclass.
 */

but this object (id=566d4c39) is already locked by other thread.
What is happening here?



Re: INFO: task hung in lo_ioctl

2017-12-12 Thread Dmitry Vyukov
On Sun, Dec 10, 2017 at 2:32 PM, syzbot

wrote:
> Hello,
>
> syzkaller hit the following crash on
> ad4dac17f9d563b9e34aab78a34293b10993e9b5
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> INFO: task syz-executor3:12157 blocked for more than 120 seconds.
>   Not tainted 4.15.0-rc2-next-20171208+ #63
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor3   D25536 12157   3362 0x0004
> Call Trace:
>  context_switch kernel/sched/core.c:2800 [inline]
>  __schedule+0x8eb/0x2060 kernel/sched/core.c:3376
>  schedule+0xf5/0x430 kernel/sched/core.c:3435
>  schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
>  __mutex_lock_common kernel/locking/mutex.c:833 [inline]
>  __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>  lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
>  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
>  blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
>  block_ioctl+0xea/0x130 fs/block_dev.c:1860
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x452a39
> RSP: 002b:7fd85b57ec58 EFLAGS: 0212 ORIG_RAX: 0010
> RAX: ffda RBX: 00758190 RCX: 00452a39
> RDX: 20e67fd8 RSI: 400454ca RDI: 0016
> RBP: 005b R08:  R09: 
> R10:  R11: 0212 R12: 006ee928
> R13:  R14: 7fd85b57f6d4 R15: 0002
>
> Showing all locks held in the system:
> 2 locks held by khungtaskd/671:
>  #0:  (rcu_read_lock){}, at: [<53bc3983>]
> check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
>  #0:  (rcu_read_lock){}, at: [<53bc3983>] watchdog+0x1c5/0xd60
> kernel/hung_task.c:249
>  #1:  (tasklist_lock){.+.+}, at: []
> debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554
> 2 locks held by getty/3115:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3116:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3117:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3118:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3119:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3120:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 1 lock held by syz-executor3/12157:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor7/12154:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor7/12168:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor5/12216:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 2 locks held by getty/12231:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 1 lock held by blkid/12232:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
>
> =
>
> NMI backtrace for cpu 0
> CPU: 0 PID: 671 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171208+ #63
> Hardware name: Google Google 

Re: INFO: task hung in lo_ioctl

2017-12-12 Thread Dmitry Vyukov
On Sun, Dec 10, 2017 at 2:32 PM, syzbot

wrote:
> Hello,
>
> syzkaller hit the following crash on
> ad4dac17f9d563b9e34aab78a34293b10993e9b5
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> INFO: task syz-executor3:12157 blocked for more than 120 seconds.
>   Not tainted 4.15.0-rc2-next-20171208+ #63
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor3   D25536 12157   3362 0x0004
> Call Trace:
>  context_switch kernel/sched/core.c:2800 [inline]
>  __schedule+0x8eb/0x2060 kernel/sched/core.c:3376
>  schedule+0xf5/0x430 kernel/sched/core.c:3435
>  schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
>  __mutex_lock_common kernel/locking/mutex.c:833 [inline]
>  __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>  lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
>  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
>  blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
>  block_ioctl+0xea/0x130 fs/block_dev.c:1860
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686
>  SYSC_ioctl fs/ioctl.c:701 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x452a39
> RSP: 002b:7fd85b57ec58 EFLAGS: 0212 ORIG_RAX: 0010
> RAX: ffda RBX: 00758190 RCX: 00452a39
> RDX: 20e67fd8 RSI: 400454ca RDI: 0016
> RBP: 005b R08:  R09: 
> R10:  R11: 0212 R12: 006ee928
> R13:  R14: 7fd85b57f6d4 R15: 0002
>
> Showing all locks held in the system:
> 2 locks held by khungtaskd/671:
>  #0:  (rcu_read_lock){}, at: [<53bc3983>]
> check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
>  #0:  (rcu_read_lock){}, at: [<53bc3983>] watchdog+0x1c5/0xd60
> kernel/hung_task.c:249
>  #1:  (tasklist_lock){.+.+}, at: []
> debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554
> 2 locks held by getty/3115:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3116:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3117:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3118:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3119:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 2 locks held by getty/3120:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 1 lock held by syz-executor3/12157:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor7/12154:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor7/12168:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 1 lock held by syz-executor5/12216:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
> 2 locks held by getty/12231:
>  #0:  (>ldisc_sem){}, at: []
> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>  #1:  (>atomic_read_lock){+.+.}, at: []
> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
> 1 lock held by blkid/12232:
>  #0:  (>lo_ctl_mutex/1){+.+.}, at: []
> lo_ioctl+0x8b/0x1b90 drivers/block/loop.c:1355
>
> =
>
> NMI backtrace for cpu 0
> CPU: 0 PID: 671 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171208+ #63
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call