Hi, does anyone know what conditions might cause srun to peg a CPU (it looks like mostly system time)? Might it have to do with keeping stdout/stderr open?
I have a script that submits batches of jobs using srun (I find this more convenient than using sbatch), but when it submits too many jobs at the same time (20+), some of the srun instances can start to use 100% CPU and make the head node angry. I’d like to be able to submit as many as I can but prevent this from happening. Thanks, -Craig
