Quoting "Holmes, Christopher (CMU)" <[email protected]>:
Thanks Moe.
For further clarification, they are single-threaded jobs, and they
may be in a job array. I noticed that the job array size is limited
by its data type:
ctl_conf_ptr->max_array_sz = (uint16_t) NO_VAL;
That goes to 32 bits in version 14.11, but for various practical
reasons it should be limited to 1M or so.
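
For reference, here is a minimal sketch of what the 16-bit cast quoted above works out to. It assumes NO_VAL is 0xfffffffe, as I believe slurm.h defines it. The names SKETCH_NO_VAL, max_array_sz_16bit and array_size_fits are made up for illustration and are not Slurm code. It shows only the ceiling implied by the data type, not any site's configured limit.

```c
#include <stdint.h>

/* ASSUMPTION: mirrors the NO_VAL sentinel in slurm.h (0xfffffffe).
 * Renamed so this sketch does not depend on the Slurm headers. */
#define SKETCH_NO_VAL 0xfffffffeU

/* Value that ends up in a 16-bit field after the cast shown in the
 * quoted assignment: only the low 16 bits survive (0xfffe). */
uint16_t max_array_sz_16bit(void)
{
    return (uint16_t) SKETCH_NO_VAL;
}

/* Hypothetical helper: does a requested array size fit under the
 * ceiling implied by the 16-bit field? */
int array_size_fits(uint32_t requested)
{
    return requested <= max_array_sz_16bit();
}
```

Under that assumption max_array_sz_16bit() returns 65534, so a request for 1M array tasks would not fit in the 16-bit field but would fit once the field is 32 bits wide.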
Does anyone know if job steps in a batch script are limited in size?
I’m guessing that they are unlimited.
The counter is a 32-bit field.
I’m also guessing that job arrays and job steps have faster
throughput capability since the resources are already scheduled. Is
there much difference between job steps and job arrays in terms of
throughput or scale?
The job arrays don't really help in throughput or scale until version
14.11. Both throughput and scale should be better with job steps than
job arrays, if the use case permits that.
Thanks,
--Chris
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, October 13, 2014 12:45 PM
To: slurm-dev
Subject: [slurm-dev] Re: SLURM experience with high throughout of
short-running jobs?
That is highly configuration dependent. It is also notable that each
major release of Slurm over the past few years has been
significantly faster than the previous release. Generally you should
be able to sustain a few hundred jobs per second.
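
To put "a few hundred jobs per second" in context, here is a rough sizing sketch. It assumes a steady state in which every execution slot is kept busy, so the required start rate is simply slots divided by job runtime. The function name required_start_rate and the slot counts used below are illustrative assumptions, not figures from this thread.

```c
/* Steady-state start rate (jobs per second) needed to keep `slots`
 * execution slots busy when each job runs for `runtime_s` seconds.
 * Returns 0.0 for a non-positive runtime instead of dividing by zero. */
double required_start_rate(double slots, double runtime_s)
{
    if (runtime_s <= 0.0)
        return 0.0;
    return slots / runtime_s;
}
```

With one single-threaded job per node on 2000 nodes and 60-second jobs, required_start_rate(2000.0, 60.0) comes to roughly 33 jobs per second. With an assumed 16 jobs per node (32000 slots) it is roughly 533 per second, and shorter jobs push the number up proportionally.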
Quoting "Holmes, Christopher (CMU)" <[email protected]>:
Hello,
Can anyone provide some information or experience with using SLURM to
manage a high volume of short-running jobs (<60 seconds) on a large
(2000+ node) cluster? Any rough numbers on throughput (e.g.,
1000+ jobs scheduled per second) would be appreciated!
Thanks,
--Chris
--
Morris "Moe" Jette
CTO, SchedMD LLC