I agree that having multiple partitions will decrease the efficiency of the scheduler. That said, if you have to do it, you have to do it. Using features is a good way to go if people need specific hardware. I could see having multiple partitions so you can charge differently for each generation of hardware, as run times will invariably be different. Still, if that isn't a concern, just have a single queue.
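For example (just a sketch with made-up partition and node names; the limits would need to match your site), splitting by hardware generation in slurm.conf might look like:

PartitionName=gen1 Nodes=gen1-[001-032] Default=YES MaxTime=7-00:00:00 State=UP
PartitionName=gen2 Nodes=gen2-[001-032] MaxTime=7-00:00:00 State=UP

versus a single PartitionName line covering all the nodes.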

For multifactor I would turn on fairshare and age. JobSize really isn't useful unless you have people running multicore jobs and you want to prioritize or deprioritize those.
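As a rough sketch (the weights below are placeholders, not recommendations), the relevant slurm.conf settings would be along the lines of:

PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0       # how quickly past usage stops counting against you
PriorityWeightFairshare=100000   # make fairshare the dominant factor
PriorityWeightAge=1000           # jobs slowly gain priority while they wait
PriorityMaxAge=7-0               # age contribution tops out after a week
PriorityWeightJobSize=0          # leave job size out, per the above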

If you end up in a multipartition scenario then I recommend having a backfill partition that underlies all the other partitions and setting up REQUEUE on that partition. That way people can farm idle cycles. This is especially good for people who are hardware agnostic and don't really care when their jobs get done, but rather just have a ton of work that can be interrupted at any moment. That's what we do here, and we have 110 partitions. Our backfill queue does a pretty good job of picking up the idle cores, but with that many partitions there are still structural inefficiencies, so we never get above about 70% usage of our hardware.
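For what it's worth, a minimal sketch of that kind of setup (partition names, node lists and preemption settings are purely illustrative):

PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
PartitionName=backfill Nodes=ALL PriorityTier=1 MaxTime=7-00:00:00 State=UP
PartitionName=gen1 Nodes=gen1-[001-032] PriorityTier=10 MaxTime=7-00:00:00 State=UP

Jobs in the backfill partition get requeued whenever a higher-tier partition wants the nodes, so they need to be safe to interrupt.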

So just keep that in mind when you are setting things up. More partitions mean more structural inefficiency, but they do give you other benefits, such as isolating hardware for specific uses. It really depends on what you need. I highly recommend experimenting to figure out what fits you and your users best.

-Paul Edmon-

On 1/16/2017 10:16 AM, Loris Bennett wrote:
David WALTER <david.wal...@ens.fr> writes:

Dear Loris,

Thanks for your response !

I'm going to look at these features in slurm.conf.  I only configured
the CPUs, Sockets, etc. per node.  Do you have any example or link to
explain how it works and what I can use?
It's not very complicated.  A feature is just a label, so if you had
some nodes with Intel processors and some with AMD, you could attach the
features, e.g.

NodeName=node[001,002] Procs=12 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=42000 State=unknown Feature=intel
NodeName=node[003,004] Procs=12 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=42000 State=unknown Feature=amd

Users then just request the required CPU type in their batch scripts as
a constraint, e.g.:

#SBATCH --constraint="intel"

My goal is to respond to people's needs and launch their jobs as fast as
possible without losing time when one partition is idle while the
others are fully loaded.
The easiest way to avoid the problem you describe is to just have one
partition.  If you have multiple partitions, the users have to
understand what the differences are so that they can choose sensibly.

That's why I thought the fair share factor was the best solution.
Fairshare won't really help you with the problem that one partition
might be full while another is empty.  It will just affect the ordering
of jobs in the full partition, although the weight of the partition term
in the priority expression can affect the relative attractiveness of the
partitions.
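(For reference, that term is controlled by PriorityWeightPartition in
slurm.conf together with a per-partition PriorityJobFactor, e.g., with
purely illustrative values:

PriorityWeightPartition=1000
PartitionName=new_hw Nodes=node[001,002] PriorityJobFactor=10
PartitionName=old_hw Nodes=node[003,004] PriorityJobFactor=1

so jobs in new_hw would get a larger partition contribution to their
priority.)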

In general, however, I would suggest you start with a simple set-up.
You can always add to it later to address specific issues as they arise.
For instance, you could start with one partition and two QOS: one for
normal jobs and one for test jobs.  The latter could have a higher
priority, but only a short maximum run-time and possibly a low maximum
number of jobs per user.
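As a rough sketch (the QOS name and limits here are just examples), the
test QOS could be created with sacctmgr:

sacctmgr add qos test
sacctmgr modify qos test set Priority=100 MaxWall=00:30:00 MaxJobsPerUser=2

together with something like AccountingStorageEnforce=limits,qos in
slurm.conf so the limits are actually enforced.  Users would then submit
their test jobs with

#SBATCH --qos=test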

Cheers,

Loris
