Hi Jianan and Jianhua,

On Tue, Nov 23, 2021 at 11:58:32AM +0800, Huang Jianan wrote:
> On 2021/11/23 10:59, Jianhua1 Hao 郝建华 via Linux-erofs wrote:
> > We also found that it is easy to trigger a deadlock in the kswap
> > scenario. We observed the following deadlock in a stress test under
> > low memory. Same as "erofs: fix deadlock when shrink erofs slab".
> >
> > Thread A:
> >
> >   erofs_try_to_release_workgroup(grp = 0xFFFFFF87ADFEE610)
> >     erofs_workgroup_try_to_freeze(grp, 1)
> >       //set ref count to EROFS_LOCKED_MAGIC
> >       atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)
> >     xa_erase(&sbi->managed_pslots, grp->index)
> >       //stuck there to wait for xa lock, already held by thread B
> >       xa_lock(xa);
> >
> > Thread B:
> >
> >   erofs_insert_workgroup()
> >     //xa lock is held here
> >     xa_lock(&sbi->managed_pslots);
> >     pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index,
> >                        NULL, grp, GFP_NOFS);
> >     erofs_workgroup_get(pre) //pre = grp = 0xFFFFFF87ADFEE610
> >       erofs_wait_on_workgroup_freezed(grp);
> >         //wait ref count to be unlocked, which should be done by
> >         //thread A
> >         atomic_cond_read_relaxed(&grp->refcount,
> >                                  VAL != EROFS_LOCKED_MAGIC);
> >
> > Follow-up fix: should the xa lock be held before freezing the
> > workgroup, since we will operate on the xarray?
> >
> Hi, JianHua,
>
> The fix is in the patch below; please test it if you are able to.
> https://lore.kernel.org/linux-erofs/YZcJpDs3FKpSfzAE@B-P7TQMD6M-0146/T/#t
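
To make the ordering question in the quoted trace concrete, here is a rough
userspace model of "take the lock first, then freeze, then erase". It is
only a sketch under assumptions, not the kernel code and not the patch
itself: a pthread mutex stands in for the xa lock, a single pointer stands
in for the xarray, and slot_table, table_init, try_release, insert_or_get
and LOCKED_MAGIC are made-up names. It also assumes the release path is
the only one that ever freezes a workgroup, which is not true in the
kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCKED_MAGIC (-1)        /* hypothetical stand-in for EROFS_LOCKED_MAGIC */

struct workgroup {
	atomic_int refcount;     /* 1 means only the table holds a reference */
};

struct slot_table {
	pthread_mutex_t lock;    /* stand-in for the xa lock */
	struct workgroup *slot;  /* one slot instead of a real xarray */
};

void table_init(struct slot_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->slot = NULL;
}

/*
 * Release side: the lock is taken BEFORE the refcount is frozen, and the
 * entry is erased under that same lock. Nobody holding the lock can
 * therefore see an entry that is frozen but still present in the table.
 */
bool try_release(struct slot_table *t, struct workgroup *grp)
{
	int expected = 1;
	bool released = false;

	pthread_mutex_lock(&t->lock);
	if (t->slot == grp &&
	    atomic_compare_exchange_strong(&grp->refcount, &expected,
					   LOCKED_MAGIC)) {
		t->slot = NULL;  /* erase while still holding the lock */
		released = true;
	}
	pthread_mutex_unlock(&t->lock);
	return released;
}

/* Insert side: returns the entry already in the slot, or installs grp. */
struct workgroup *insert_or_get(struct slot_table *t, struct workgroup *grp)
{
	struct workgroup *pre;

	pthread_mutex_lock(&t->lock);
	pre = t->slot;
	if (!pre) {
		t->slot = grp;   /* grp->refcount was set to 1 by the caller */
		pre = grp;
	} else {
		/* Under the model's assumptions there is nothing to wait for. */
		assert(atomic_load(&pre->refcount) != LOCKED_MAGIC);
		atomic_fetch_add(&pre->refcount, 1);
	}
	pthread_mutex_unlock(&t->lock);
	return pre;
}
```

In this model the insert side never has to spin on a frozen refcount while
holding the lock, because freezing and erasing happen as one step under
that lock. In the trace above the freeze happens outside the lock, which
is what lets thread B find the frozen entry and wait forever.
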
Thanks for the report, I had some other work to do just now. I've pushed
this patch to the fixes branch and will send it to Linus this week:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git/commit/?id=deccd444d2844f1e89314dfc3956cccfdb813b65

As Jianan said, I believe this patch fixes your issue, so please give it
a try in advance.

Also, it doesn't affect the v4.19 and v5.4 LTS kernels; only v5.10 and
v5.15 LTS are impacted.

Thanks,
Gao Xiang
