On 10/03/2021 at 08:59, Vaibhav Jain wrote:
While removing a large number of mappings from hash page tables on
large memory systems, a soft-lockup is reported because of the time
spent inside htab_remove_mapping(), like the one below:

  watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
  <snip>
  NIP plpar_hcall+0x38/0x58
  LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
  Call Trace:
   0x1fffffffffff000 (unreliable)
   pSeries_lpar_hpte_removebolted+0x9c/0x230
   hash__remove_section_mapping+0xec/0x1c0
   remove_section_mapping+0x28/0x3c
   arch_remove_memory+0xfc/0x150
   devm_memremap_pages_release+0x180/0x2f0
   devm_action_release+0x30/0x50
   release_nodes+0x28c/0x300
   device_release_driver_internal+0x16c/0x280
   unbind_store+0x124/0x170
   drv_attr_store+0x44/0x60
   sysfs_kf_write+0x64/0x90
   kernfs_fop_write+0x1b0/0x290
   __vfs_write+0x3c/0x70
   vfs_write+0xd4/0x270
   ksys_write+0xdc/0x130
   system_call+0x5c/0x70

Fix this by adding a cond_resched() to the loop in
htab_remove_mapping() that issues the hcall to remove an hpte mapping.
This should prevent the soft-lockup from being reported.

Isn't it overkill to call it at each iteration?

Looking at a few other places, there is some mitigation. For instance, fadump_free_reserved_memory() does it based on elapsed time. Another example is drmem_lmb_next(), which does it every 16 iterations.
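
To illustrate the "every N iterations" idea, here is a rough userspace sketch, not actual kernel code. The names cond_resched_stub(), remove_one_stub(), remove_range() and RESCHED_INTERVAL are hypothetical stand-ins for illustration only; the interval of 16 is simply borrowed from the drmem_lmb_next() pattern and may not be the right value for this loop.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins. In the kernel these would be
 * cond_resched() and the hcall-backed removal of one bolted hpte. */
static unsigned long resched_calls;

static void cond_resched_stub(void)
{
	resched_calls++;
}

static int remove_one_stub(unsigned long vaddr)
{
	(void)vaddr;
	return 0;
}

/* Assumed interval, mirroring the "every 16 iterations" pattern. */
#define RESCHED_INTERVAL 16

/* Walk [vstart, vend) in 'step' increments, yielding only once per
 * RESCHED_INTERVAL iterations instead of on every pass. */
static int remove_range(unsigned long vstart, unsigned long vend,
			unsigned long step)
{
	unsigned long vaddr, n = 0;
	int rc;

	assert(step != 0);	/* a zero step would never terminate */

	for (vaddr = vstart; vaddr < vend; vaddr += step) {
		rc = remove_one_stub(vaddr);
		if (rc < 0)
			return rc;

		if (++n % RESCHED_INTERVAL == 0)
			cond_resched_stub();
	}
	return 0;
}
```

With this sketch, walking 64 steps of 4096 bytes calls the stub 4 times rather than 64. Whether that saving matters next to the cost of an hcall is a separate question.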



Suggested-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Vaibhav Jain <[email protected]>
---
  arch/powerpc/mm/book3s64/hash_utils.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 581b20a2feaf..ea3945c70b18 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -359,6 +359,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
                }
                if (rc < 0)
                        return rc;
+
+               cond_resched();
        }
        return ret;


Christophe
