[slurm-dev] Re: SLURM experience with high throughout of short-running jobs?

jette Tue, 14 Oct 2014 08:27:38 -0700


Quoting "Holmes, Christopher (CMU)" <[email protected]>:

Thanks Moe.
For further clarification, they are single-threaded jobs, and they may be in a job array. I noticed that the job array size is limited by its data type:
ctl_conf_ptr->max_array_sz              = (uint16_t) NO_VAL;

That goes to 32 bits in version 14.11, but for various practical reasons should be limited to 1M or so.
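To make the 16-bit limit concrete, here is a minimal sketch of what the quoted cast does. It assumes NO_VAL is the 32-bit sentinel 0xfffffffe, as defined in slurm.h; check the header of the release you run. The helper name max_array_sz_16bit is hypothetical and only illustrates the narrowing.

```c
#include <stdint.h>

/* Assumption: Slurm's 32-bit "no value" sentinel from slurm.h.
 * Verify against the header of the release in use. */
#define NO_VAL 0xfffffffeU

/* Hypothetical helper mirroring the quoted assignment: the sentinel is
 * narrowed to 16 bits, so only the low 16 bits (0xfffe) survive. */
static uint16_t max_array_sz_16bit(void)
{
    return (uint16_t) NO_VAL;
}
```

Under that assumption the field can never hold more than 0xfffe (65534), which is why the array size is capped by the data type before 14.11.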

Does anyone know if job steps in a batch script are limited in size? I’m guessing that they are unlimited.


The counter is a 32-bit field.

I’m also guessing that job arrays and job steps have faster throughput capability since the resources are already scheduled. Is there much difference between job steps and job arrays in terms of throughput or scale?

The job arrays don't really help in throughput or scale until version 14.11. Both throughput and scale should be better with job steps than job arrays, if the use case permits that.

Thanks,
--Chris

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, October 13, 2014 12:45 PM
To: slurm-dev
Subject: [slurm-dev] Re: SLURM experience with high throughout of short-running jobs?

That is highly configuration dependent. It is also notable that each major release of Slurm over the past few years has been significantly faster than the previous release. Generally you should be able to sustain a few hundred jobs per second.

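For a rough sense of how "a few hundred jobs per second" compares with the workload in the original question (2000+ nodes, jobs under 60 seconds), the steady-state start rate needed to keep every slot busy is simply slots divided by job runtime. The sketch below is back-of-the-envelope arithmetic only; the function name and the slots-per-node parameter are hypothetical and not taken from Slurm.

```c
/* Job starts per second needed to keep every slot busy in steady state.
 * slots_per_node is a hypothetical parameter: the number of single-threaded
 * jobs that run concurrently on one node. */
static double required_starts_per_sec(unsigned nodes,
                                      unsigned slots_per_node,
                                      double runtime_sec)
{
    return (double) nodes * slots_per_node / runtime_sec;
}
```

With one job per node, 2000 nodes and 60-second jobs need about 33 starts per second. With an assumed 16 slots per node the same cluster needs about 533 per second, which is at or beyond the "few hundred" range and is where packing work into job steps becomes attractive.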
Quoting "Holmes, Christopher (CMU)" <[email protected]>:
Hello,

Can anyone provide some information or experience with using SLURM to
manage a high volume of short-running jobs (<60 seconds) on a large
(2000+ node) cluster? Any rough numbers on throughput (ex.
1000+jobs scheduled per second) would be appreciated!

Thanks,
--Chris
--
Morris "Moe" Jette
CTO, SchedMD LLC



