Point HealthCheckProgram at a wrapper shell script that starts with: sleep $(( (RANDOM % 10) + 1 )), or similar, and then calls nhc.
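
A minimal sketch of such a wrapper follows, in case it helps. It is untested against a real cluster. The script name (nhc-jitter), the MAX_DELAY and NHC_BIN variables, and the function names are all made up for illustration; only /usr/sbin/nhc comes from the original question. It assumes bash, since $RANDOM is a bash feature.

```shell
#!/bin/bash
# Hypothetical wrapper, e.g. saved as /usr/local/sbin/nhc-jitter (name is an assumption).
# Sleeps a random 1..MAX_DELAY seconds, then runs the real health check.

MAX_DELAY="${MAX_DELAY:-10}"          # upper bound of the random delay, in seconds
NHC_BIN="${NHC_BIN:-/usr/sbin/nhc}"   # path taken from the original question

# Print a random integer between 1 and MAX_DELAY (inclusive).
random_delay() {
    echo $(( (RANDOM % MAX_DELAY) + 1 ))
}

# Fail fast if the health check binary is missing, otherwise
# wait for the random delay and run it, passing arguments through.
run_with_jitter() {
    if [[ ! -x "$NHC_BIN" ]]; then
        echo "health check program not found: $NHC_BIN" >&2
        return 1
    fi
    sleep "$(random_delay)"
    "$NHC_BIN" "$@"
}

# Run only when executed directly, not when sourced by another script.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
    run_with_jitter "$@"
fi
```

With this in place, HealthCheckProgram in slurm.conf would point at the wrapper instead of /usr/sbin/nhc. The wrapper's exit status is that of nhc. Keep the maximum delay small enough that the sleep plus the checks themselves still finish within whatever time limit slurmd applies to the health check program.
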
M.K.
________________________________
From: slurm-users <[email protected]> on behalf of SJTU <[email protected]>
Sent: Thursday, November 26, 2020 8:24 PM
To: [email protected] <[email protected]>
Subject: [slurm-users] Set a random offset when starting node health check in SLURM

Hi,

We use HealthCheckProgram = /usr/sbin/nhc in Slurm to check node health every 600 seconds. However, some NHC checks point to the same central resource, so starting these checks simultaneously may lead to false alarms of service degradation. Is it possible to add a random offset to the time at which HealthCheckProgram starts?

Thank you!

Jianwen
