I'm setting up a test installation of SLURM 2.4. I'm trying to get GANG scheduling to work, so I've got the following configuration:

    FastSchedule=0
    DefMemPerCPU=1536
    MaxMemPerCPU=32768
    SchedulerType=sched/backfill
    SchedulerParameters=defer
    SchedulerPort=7321
    SelectType=select/cons_res
    PreemptMode=SUSPEND,GANG
    PreemptType=preempt/partition_prio
    PartitionName=batch Shared=FORCE:2 ...

I have several questions about the behavior I get.

First: I would like SLURM to overcommit CPUs only, not memory. From reading the docs, I'm unsure whether Shared=FORCE:2 also allows a 2x overcommit of memory.

Second: With SLURM 2.4 I get the following if I set SelectTypeParameters=CR_Memory:

    [2011-11-17T09:10:25] cons_res: select_p_node_init
    [2011-11-17T09:10:25] fatal: Invalid SelectTypeParameter: CR_MEMORY

Is this expected?

Third: I was expecting GANG to suspend jobs only if a job with a higher priority is present in the queue, but what I see is that jobs with _equal_ priority also get preempted. Is there a way to avoid this? The resulting behavior is unexpected to me: twice as many jobs are immediately scheduled on the nodes (due to FORCE:2), saturating the resources. As a result, a newly submitted job with a higher priority has to wait twice as long before it starts running, which is exactly the opposite of what I would like to accomplish.

Thanks for any hint.
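
P.S. To make the third point concrete, here is a toy model of the arithmetic as I understand it. It is only a sketch, not how SLURM's scheduler is actually implemented: the function name, the one-hour job length, and the 30-second timeslice are my own assumptions, and it assumes a newly submitted job can only start once one of the time-sliced jobs on a CPU finishes.

```python
def first_slot_free_time(job_runtime, jobs_per_cpu, timeslice=30):
    """Wall-clock seconds until the first of several equal jobs finishes.

    Toy model (hypothetical, not SLURM code): `jobs_per_cpu` identical jobs,
    each needing `job_runtime` CPU-seconds, take turns on one CPU in
    round-robin slices of `timeslice` seconds.
    """
    remaining = [job_runtime] * jobs_per_cpu
    clock = 0
    while True:
        for i in range(jobs_per_cpu):
            # Run job i for one slice, or less if it needs less to finish.
            run = min(timeslice, remaining[i])
            clock += run
            remaining[i] -= run
            if remaining[i] == 0:
                # First job done: a waiting job could take its place now.
                return clock


# One job per CPU versus two jobs sharing a CPU (as with Shared=FORCE:2).
exclusive_wait = first_slot_free_time(3600, jobs_per_cpu=1)
shared_wait = first_slot_free_time(3600, jobs_per_cpu=2)
print(exclusive_wait, shared_wait)
```

Under these assumptions the waiting job sees a slot free up after 3600 seconds in the exclusive case and after 7170 seconds in the shared case, which is the roughly doubled wait I described.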
