While performing LPM (Live Partition Mobility) on a large system (for
instance a Brazos system with 1024 CPUs and 12TB of memory) under a heavy
load (I ran 'stress-ng --futex 500 -vm 5'), watchdog hard lockups are seen
when the hypervisor takes too much time handling the page tables to track
page changes.

When this happens, the system may hang in a deadlock between the watchdog
lock and the console owner lock.

The first patch of this series prevents that deadlock by not calling printk
while holding the watchdog lock, and by not sending an IPI (and waiting up
to 1s for the CPU's answer) while holding the watchdog lock.
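
To illustrate the idea behind the first patch, here is a minimal userspace
sketch of the "update state under the lock, report after releasing it"
pattern. It is not the code from arch/powerpc/kernel/watchdog.c: wd_lock,
cpu_stuck, NR_CPUS, mark_stuck() and report_lockup() are hypothetical names
chosen for this example, a pthread mutex stands in for the watchdog lock,
and printf() stands in for printk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8	/* hypothetical value, for illustration only */

/* Hypothetical stand-ins for the watchdog lock and the data it protects. */
static pthread_mutex_t wd_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cpu_stuck[NR_CPUS];

/*
 * Mark @cpu as stuck. Only the state update happens under the lock.
 * Returns true if this call changed the state, meaning the caller is
 * the one that should report the lockup.
 */
bool mark_stuck(int cpu)
{
	bool newly_stuck;

	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&wd_lock);
	newly_stuck = !cpu_stuck[cpu];
	cpu_stuck[cpu] = true;
	pthread_mutex_unlock(&wd_lock);

	return newly_stuck;
}

/*
 * Report a lockup on @cpu. The slow reporting step runs only after
 * wd_lock has been released, so it cannot deadlock against another
 * lock taken inside the reporting path.
 * Returns the printf() result, or 0 if nothing was reported.
 */
int report_lockup(int cpu)
{
	if (!mark_stuck(cpu))
		return 0;

	return printf("Watchdog: CPU %d appears stuck\n", cpu);
}
```

The point of the split is that the lock is held only for the short state
update, so whatever the reporting path needs to take (the console owner
lock in the real case) is never acquired while wd_lock is held.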

The second patch ensures that the watchdog's data are accessed under the
protection of the watchdog lock.

Laurent Dufour (2):
  powerpc/watchdog: prevent printk and send IPI while holding the wd
    lock
  powerpc/watchdog: ensure watchdog data accesses are protected

 arch/powerpc/kernel/watchdog.c | 45 +++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 20 deletions(-)

-- 
2.33.1
