We triggered soft-lockup under stress test which
open/access/write/close one file concurrently on more than
five different CPUs:

WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
...
[<ffffffc0003986f8>] dput+0x100/0x298
[<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
[<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
[<ffffffc00038f780>] filename_lookup+0x38/0xf0
[<ffffffc000391180>] user_path_at_empty+0x78/0xd0
[<ffffffc0003911f4>] user_path_at+0x1c/0x28
[<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230

->d_lock trylock may failed many times because of concurrently
operations, and dput() may execute a long time.

Fix this by replacing cpu_relax() with cond_resched().
dput() used to be sleepable, so make it sleepable again
should be safe.

Cc: <sta...@vger.kernel.org>
Signed-off-by: Wei Fang <fangw...@huawei.com>
---
Changes v1->v2:
- add might_sleep() to annotate that dput() can sleep
Changes v2->v3:
- put cond_resched() in dput()

 fs/dcache.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index ad4a542..77f6687 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -589,7 +589,6 @@ static struct dentry *dentry_kill(struct dentry *dentry)
 
 failed:
        spin_unlock(&dentry->d_lock);
-       cpu_relax();
        return dentry; /* try again with same dentry */
 }
 
@@ -763,6 +762,8 @@ void dput(struct dentry *dentry)
                return;
 
 repeat:
+       might_sleep();
+
        rcu_read_lock();
        if (likely(fast_dput(dentry))) {
                rcu_read_unlock();
@@ -796,8 +797,10 @@ repeat:
 
 kill_it:
        dentry = dentry_kill(dentry);
-       if (dentry)
+       if (dentry) {
+               cond_resched();
                goto repeat;
+       }
 }
 EXPORT_SYMBOL(dput);
 
-- 
1.7.1

Reply via email to