That error is reported when a job reaches the configured value of MaxStepCount. What problems are you seeing with larger step counts?

From "man slurm.conf":
MaxStepCount
The maximum number of steps that any job can initiate. This parameter is intended to limit the effect of bad batch scripts. The default value is 40000 steps.
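
To confirm which value the running controller is actually using, something like the following may help. This is only a minimal sketch: it assumes "scontrol show config" prints a line of the form "MaxStepCount = <number>", and the helper names parse_max_step_count and running_max_step_count are hypothetical, not part of SLURM.

```python
import re
import subprocess


def parse_max_step_count(config_text):
    """Return the MaxStepCount value found in the given text, or None.

    Assumes the text looks like `scontrol show config` output, i.e. it
    contains a line such as "MaxStepCount = 40000".
    """
    match = re.search(r"^\s*MaxStepCount\s*=\s*(\d+)", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def running_max_step_count():
    """Ask the controller for its active configuration (needs scontrol on PATH)."""
    result = subprocess.run(
        ["scontrol", "show", "config"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_max_step_count(result.stdout)


# Made-up excerpt for illustration only, not real cluster output.
sample = "MaxJobCount             = 10000\nMaxStepCount            = 600\n"
print(parse_max_step_count(sample))
```

On a cluster node, running_max_step_count() can be compared against the value written in slurm.conf to see whether the two agree.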

Quoting Jeroen Meijer <[email protected]>:

So we have had some issues configuring SLURM to cope with 25000 job steps.
Therefore we have configured the SLURM control daemon to accept only 600
job steps per job. So we split the job steps over multiple jobs. I have
verified that we only have max 600 job steps per job, but still SLURM
outputs:

srun: error: Unable to create job step: Step limit reached for this job
srun: error: Unable to create job step: Step limit reached for this job
srun: error: Unable to create job step: Step limit reached for this job

How could this be happening?
