On 2021/11/23 10:59, Jianhua1 Hao 郝建华 via Linux-erofs wrote:
We also found that it is easy to trigger a deadlock in the kswap scenario. We observed the following deadlock in a stress test under low memory, the same one described in "erofs: fix deadlock when shrink erofs slab".

Thread A:

    erofs_try_to_release_workgroup(grp = 0xFFFFFF87ADFEE610)
      erofs_workgroup_try_to_freeze(grp, 1)
        // set ref count to EROFS_LOCKED_MAGIC
        atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)
      xa_erase(&sbi->managed_pslots, grp->index)
        // stuck there to wait for xa lock, already held by thread B
        xa_lock(xa);

Thread B:

    erofs_insert_workgroup()
      // xa lock is held here
      xa_lock(&sbi->managed_pslots);
      pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index, NULL, grp, GFP_NOFS);
      erofs_workgroup_get(pre)  // pre = grp = 0xFFFFFF87ADFEE610
        erofs_wait_on_workgroup_freezed(grp);
          // wait ref count to be unlocked, which should be done by thread A
          atomic_cond_read_relaxed(&grp->refcount, VAL != EROFS_LOCKED_MAGIC);

Follow-up fix: do we need to hold the xa lock before freezing the workgroup, because we are going to operate on the xarray?
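
For readers following the trace, the freeze step can be modelled in userspace as a compare-and-swap on the reference count. This is only a minimal sketch using C11 atomics: `LOCKED_MAGIC`, `struct workgroup` and the three helper names are hypothetical stand-ins for the EROFS ones, and the magic value is illustrative rather than the kernel's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for EROFS_LOCKED_MAGIC; the value is illustrative. */
#define LOCKED_MAGIC (-1)

struct workgroup {
	atomic_int refcount;
};

/* Freeze only succeeds if the refcount still equals the expected value. */
static bool workgroup_try_to_freeze(struct workgroup *grp, int expected)
{
	int val = expected;

	return atomic_compare_exchange_strong(&grp->refcount, &val,
					      LOCKED_MAGIC);
}

/* Unfreeze publishes the new refcount and releases any waiter. */
static void workgroup_unfreeze(struct workgroup *grp, int new_count)
{
	atomic_store(&grp->refcount, new_count);
}

/* While this returns true, a thread trying to get a reference must wait. */
static bool workgroup_is_frozen(struct workgroup *grp)
{
	return atomic_load(&grp->refcount) == LOCKED_MAGIC;
}
```

In this model, thread B corresponds to a caller spinning on `workgroup_is_frozen()` while holding the table lock, and thread A cannot reach `workgroup_unfreeze()` because it needs that same lock first.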

Hi Jianhua,

The fix is in the following patch; please kindly test it if you are able to.
https://lore.kernel.org/linux-erofs/YZcJpDs3FKpSfzAE@B-P7TQMD6M-0146/T/#t

---
 fs/erofs/utils.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index 84da2c280012..84a59f075dd1 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -150,7 +150,7 @@ static bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
      * however in order to avoid some race conditions, add a
      * DBG_BUGON to observe this in advance.
      */
-    DBG_BUGON(xa_erase(&sbi->managed_pslots, grp->index) != grp);
+    DBG_BUGON(__xa_erase(&sbi->managed_pslots, grp->index) != grp);

     /* last refcount should be connected with its managed pslot. */
     erofs_workgroup_unfreeze(grp, 0);
@@ -165,15 +165,20 @@ static unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
     unsigned int freed = 0;
     unsigned long index;

+    xa_lock(&sbi->managed_pslots);
     xa_for_each(&sbi->managed_pslots, index, grp) {
         /* try to shrink each valid workgroup */
         if (!erofs_try_to_release_workgroup(sbi, grp))
             continue;
+        xa_unlock(&sbi->managed_pslots);

         ++freed;
         if (!--nr_shrink)
-            break;
+            return freed;
+        xa_lock(&sbi->managed_pslots);
     }
+    xa_unlock(&sbi->managed_pslots);
+
     return freed;
 }
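
The lock ordering in the diff above can be tried out in userspace. The following is a rough sketch, not the kernel code: a `pthread_mutex_t` plays the role of the xa lock, a fixed array stands in for `managed_pslots`, and `struct table`, `try_to_release()`, `shrink()` and `LOCKED_MAGIC` are hypothetical names. The point it illustrates is that the freeze and the erase both happen with the lock already held, and the lock is dropped after each successful release.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCKED_MAGIC (-1)	/* illustrative value, not the kernel's */
#define NR_SLOTS 4

struct workgroup {
	atomic_int refcount;
};

struct table {
	pthread_mutex_t lock;		/* plays the role of the xa lock */
	struct workgroup *slots[NR_SLOTS];
};

/* Caller must hold t->lock, mirroring the switch to __xa_erase(). */
static bool try_to_release(struct table *t, size_t i)
{
	struct workgroup *grp = t->slots[i];
	int expected = 1;

	/* Freeze: only a workgroup whose last reference is the table's. */
	if (!grp || !atomic_compare_exchange_strong(&grp->refcount,
						    &expected, LOCKED_MAGIC))
		return false;
	t->slots[i] = NULL;			/* erase under the lock */
	atomic_store(&grp->refcount, 0);	/* unfreeze */
	return true;
}

static unsigned int shrink(struct table *t, unsigned long nr_shrink)
{
	unsigned int freed = 0;

	if (!nr_shrink)		/* guard added for this sketch only */
		return 0;

	pthread_mutex_lock(&t->lock);
	for (size_t i = 0; i < NR_SLOTS; i++) {
		if (!try_to_release(t, i))
			continue;
		/* Drop the lock after each successful release. */
		pthread_mutex_unlock(&t->lock);

		++freed;
		if (!--nr_shrink)
			return freed;	/* lock is already released here */
		pthread_mutex_lock(&t->lock);
	}
	pthread_mutex_unlock(&t->lock);
	return freed;
}
```

Because the lock is taken before the freeze, an inserter that holds the lock can never observe a frozen workgroup whose owner is still waiting for that lock, which is the cycle shown in the trace.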


Thanks,
Jianan


------------------------------------------------------------------------
Jianhua1 Hao