Typically due to non-killable processes. Slurm will repeatedly send SIGKILL,
but the job stays in the CG state. Check for leftover processes, then either
reboot the node or cold-start slurmd on the affected nodes (leaving the
processes around).
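
A rough sketch of the "check for processes" step, to be run on the node that
still holds the job. It assumes a Linux node whose ps accepts
"-eo pid,stat,user,comm", and it assumes the stuck processes are in
uninterruptible sleep (state D), which is the usual reason a process ignores
SIGKILL. The helper name find_uninterruptible is made up for this example and
is not part of Slurm.

```python
import subprocess


def find_uninterruptible(ps_output):
    """Return (pid, stat, user, command) tuples for processes in D state.

    Expects the text produced by `ps -eo pid,stat,user,comm`, header included.
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # blank or malformed line
        pid, stat, user, command = fields
        # A STAT value starting with "D" means uninterruptible sleep,
        # typically a process blocked on I/O that cannot act on SIGKILL.
        if stat.startswith("D"):
            stuck.append((int(pid), stat, user, command.strip()))
    return stuck


if __name__ == "__main__":
    # Assumption: procps-style ps is available on the compute node.
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,user,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, stat, user, command in find_uninterruptible(out):
        print(pid, stat, user, command)
```

If this lists processes owned by the job's user, those are the likely reason
the job cannot finish completing. An empty list does not rule out other causes.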

Michel Bourget <[email protected]> wrote:


Hi all,

What could cause a job to remain in the Completing (CG) state? It can't be
killed via scancel either. Is there any solution for this? I noticed it
happens many times when I "play" with Slurm 2.2.7. I'd speculate it is less
frequent with version 2.3.3.

TIA

--

_____________________________________________

Michel Bourget - SGI - Linux Software Engineering
"Past BIOS POST, everything else is extra" (travis)
_____________________________________________
