I'm setting up a test installation of SLURM 2.4.
I'm trying to get GANG scheduling to work, so I've got the following 
configuration:

FastSchedule=0
DefMemPerCPU=1536
MaxMemPerCPU=32768

SchedulerType=sched/backfill
SchedulerParameters=defer
SchedulerPort=7321
SelectType=select/cons_res


PreemptMode=SUSPEND,GANG
PreemptType=preempt/partition_prio

PartitionName=batch Shared=FORCE:2 ...

I have several questions about the behavior I get.

First: I would like SLURM to overcommit CPUs only, not memory. From reading the 
docs, I'm unsure whether Shared=FORCE:2 also allows a 2x overcommit of memory.

Second: With SLURM 2.4 I get the following if I set 
SelectTypeParameters=CR_Memory:

[2011-11-17T09:10:25] cons_res: select_p_node_init
[2011-11-17T09:10:25] fatal: Invalid SelectTypeParameter: CR_MEMORY

Is this expected?

Third: I was expecting GANG to suspend jobs only if a job with a higher 
priority is present in the queue, but what I see is that jobs with _equal_ 
priority also get preempted. Is there a way to avoid this?

The resulting behavior is not what I expected: twice as many jobs are 
immediately scheduled on the nodes (due to FORCE:2), saturating the resources. 
As a result, a newly submitted job with a higher priority has to wait twice as 
long before it starts running, which is exactly the opposite of what I would 
like to accomplish.
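
To make the intent concrete, here is a sketch of a variant that might express 
it. It rests on two assumptions taken from the general SLURM documentation, 
neither verified on 2.4: that Shared=FORCE:1 limits each resource to one 
running job per partition, so that only jobs from a higher-priority partition 
trigger suspension, and that SelectTypeParameters=CR_Core_Memory makes memory a 
consumable resource, so that only CPUs are shared:

# Hypothetical variant, untested on 2.4
SelectType=select/cons_res
# Assumption: memory is tracked per job and never overcommitted
SelectTypeParameters=CR_Core_Memory
PreemptMode=SUSPEND,GANG
PreemptType=preempt/partition_prio
# Assumption: FORCE:1 prevents timeslicing between equal-priority jobs
PartitionName=batch Shared=FORCE:1 ...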

Thanks for any hint.
