Quoting "Holmes, Christopher (CMU)" <[email protected]>:

Thanks Moe.

For further clarification, they are single-threaded jobs, and they may be in a job array. I noticed that the job array size is limited by its data type:

ctl_conf_ptr->max_array_sz              = (uint16_t) NO_VAL;

That becomes a 32-bit field in version 14.11, but for various practical reasons the array size should be limited to 1M or so.
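
To make the data-type limit concrete, here is a rough Python sketch of what the cast in that line does. It assumes NO_VAL is defined as 0xfffffffe, as I believe it is in slurm.h of this era; please check the header for your version. The helper name cast_to_uint16 is made up for illustration and is not part of Slurm.

```python
# Assumption: NO_VAL is 0xfffffffe (check slurm.h for your release).
NO_VAL = 0xFFFFFFFE


def cast_to_uint16(value):
    """Mimic the C cast (uint16_t) value by keeping only the low 16 bits."""
    return value & 0xFFFF


# What (uint16_t) NO_VAL evaluates to under the assumption above.
max_array_sz_16 = cast_to_uint16(NO_VAL)

# Largest value any 16-bit unsigned field can hold.
uint16_max = (1 << 16) - 1

print(max_array_sz_16, uint16_max)
```

Whatever the sentinel is, a 16-bit field cannot represent an array size above 65535, which is why the wider field in 14.11 matters for large arrays.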

Does anyone know if jobsteps in a batch script are limited in size? I’m guessing that they are unlimited.

The counter is a 32-bit field.

I’m also guessing that job arrays and job steps have faster throughput capability since the resources are already scheduled. Is there much difference between jobsteps and job arrays in terms of throughput or scale?

Job arrays don't really help with throughput or scale until version 14.11. Both throughput and scale should be better with job steps than with job arrays, if the use case permits that.

Thanks,
--Chris

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, October 13, 2014 12:45 PM
To: slurm-dev
Subject: [slurm-dev] Re: SLURM experience with high throughout of short-running jobs?


That is highly configuration dependent. It is also notable that each major release of Slurm over the past few years has been significantly faster than the previous release. Generally you should be able to sustain a few hundred jobs per second.
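
As a back-of-the-envelope check on what "a few hundred jobs per second" means for this workload, the sketch below estimates the job start rate needed to keep every slot busy: slots divided by job runtime. The 2000 nodes and 60-second runtime come from the question; the 16 cores per node is purely a hypothetical figure, since the thread does not give one.

```python
def required_jobs_per_second(nodes, cores_per_node, job_seconds):
    """Job starts per second needed to keep every core busy with
    single-threaded jobs that each run for job_seconds."""
    slots = nodes * cores_per_node
    return slots / job_seconds


# One single-threaded job per node at a time.
one_per_node = required_jobs_per_second(2000, 1, 60)

# Hypothetical 16-core nodes, one job per core (assumed, not from the thread).
one_per_core = required_jobs_per_second(2000, 16, 60)

print(round(one_per_node, 1), round(one_per_core, 1))
```

Shorter jobs raise the required rate proportionally, so a cluster full of jobs well under 60 seconds can need more than the scheduler will sustain.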

Quoting "Holmes, Christopher (CMU)" <[email protected]>:

Hello,

Can anyone provide some information or experience with using SLURM to
manage a high volume of short-running jobs (<60 seconds) on a large
(2000+ node) cluster? Any rough numbers on throughput (e.g.
1000+ jobs scheduled per second) would be appreciated!

Thanks,
--Chris


--
Morris "Moe" Jette
CTO, SchedMD LLC