Re: [slurm-users] How to run one maintenance job on each node in the cluster
Jeffrey Tunison wrote:
> Is there a straightforward way to create a batch job that runs once on every
> node in the cluster?

A wrapper around reboot configured as RebootProgram in slurm.conf?
Re: [slurm-users] How to run one maintenance job on each node in the cluster
On 23-12-2023 05:09, Jeffrey Tunison wrote:
> Is there a straightforward way to create a batch job that runs once on every
> node in the cluster? A technique simpler than generating a list from sinfo
> output and dispatching the job in a for loop for the N nodes. That’s not very
> hard, but I thought there might be an elegant solution which would make
> dispatching maintenance jobs easier.

One solution is the method in this script:
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/update.sh
This works very reliably for us when we need to apply OS or firmware updates.

> SLURM 22.05.09

Note: You should apply the recent Slurm security updates ASAP!

/Ole
[slurm-users] How to run one maintenance job on each node in the cluster
Is there a straightforward way to create a batch job that runs once on every node in the cluster? A technique simpler than generating a list from sinfo output and dispatching the job in a for loop for the N nodes.

That’s not very hard, but I thought there might be an elegant solution which would make dispatching maintenance jobs easier.

SLURM 22.05.09

Thanks,
Jeffrey
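
For reference, the baseline approach mentioned in the question (list the nodes with sinfo, then submit one job per node in a loop) might look roughly like the sketch below. It is only an illustration, not the method used by any script linked in this thread. The script name maintenance.sh, the "maint-" job-name prefix, and the helper function names are hypothetical, and --exclusive is an assumption that the maintenance job should have the node to itself.

```python
import subprocess


def list_nodes():
    """Return the unique node names reported by sinfo, sorted."""
    # -h: no header, -N: one line per node, -o "%N": print only the node name.
    # A node that belongs to several partitions is listed once per partition,
    # so duplicates are removed with set().
    result = subprocess.run(
        ["sinfo", "-h", "-N", "-o", "%N"],
        check=True, capture_output=True, text=True,
    )
    return sorted(set(result.stdout.split()))


def build_sbatch_command(node, script):
    """Build an sbatch command line that pins one job to one node."""
    return [
        "sbatch",
        f"--nodelist={node}",        # run on exactly this node
        "--nodes=1",
        "--exclusive",               # assumption: job wants the whole node
        f"--job-name=maint-{node}",  # hypothetical naming scheme
        script,
    ]


def submit_to_all(script, dry_run=True):
    """Submit (or, with dry_run, only print) one job per node."""
    for node in list_nodes():
        cmd = build_sbatch_command(node, script)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling submit_to_all("maintenance.sh") prints the sbatch commands without submitting anything; passing dry_run=False submits them. Each job stays pending until its node is free, so nodes that are down or drained would need separate handling.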