While performing LPM (Live Partition Mobility) on a large system (for instance a Brazos system with 1024 CPUs and 12TB of memory) under heavy load (I ran 'stress-ng --futex 500 -vm 5'), watchdog hard lockups are seen when the hypervisor takes too long handling the page tables to track page changes.

When this happens, the system may hang in a deadlock between the watchdog lock and the console owner lock.

The first patch of this series prevents that deadlock by not calling printk while holding the watchdog lock, and by not sending an IPI (and waiting up to 1s for the CPU's answer) while holding the watchdog lock.

The second patch ensures that the watchdog's data is accessed under the protection of the watchdog lock.

Laurent Dufour (2):
  powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  powerpc/watchdog: ensure watchdog data accesses are protected

 arch/powerpc/kernel/watchdog.c | 45 +++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 20 deletions(-)

--
2.33.1