Hi,

I would like to have a partition of N nodes without statically defining which 
nodes should belong to a partition and I'm trying to work out the best way to 
achieve this.

Currently I have partitions which span across all the nodes in my cluster with 
differing settings, but I would like some of these to only occupy a subset of 
the cluster. I could, say, define partition A, which can use all nodes, while 
partition B may only access nodes 01-10. But I would like to avoid partition B 
being reduced in size in the event of maintenance or hardware failure.

I'm thinking the way to do this would be via a plugin. I would keep all 
partitions spanning all nodes in the cluster but upon submission check how many 
nodes are in use on the requested partition. If there were say already 10 nodes 
in use in partition B the job should be queued. However things then get a bit 
more complex as to when slurm should de-queue and then run the job.
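
To make the idea concrete, here is a rough sketch in Python of the check I have 
in mind. It is not Slurm plugin code and calls no Slurm API; the names 
(NODE_CAPS, can_start, drain_queue) and the cap of 10 nodes for partition B are 
made up for illustration only.

```python
# Hypothetical cap on the total number of nodes in use per partition,
# counted across all running jobs. Partitions not listed are uncapped.
NODE_CAPS = {"B": 10}


def can_start(partition, nodes_requested, nodes_in_use, caps=NODE_CAPS):
    """Return True if starting the job keeps the partition within its cap."""
    cap = caps.get(partition)
    if cap is None:
        return True  # no cap defined for this partition
    return nodes_in_use + nodes_requested <= cap


def drain_queue(pending, in_use, caps=NODE_CAPS):
    """Walk the queue in order and start every job that fits under the cap.

    pending: list of (job_id, partition, nodes) tuples in queue order.
    in_use:  dict mapping partition name to nodes currently in use.
    Returns (started_job_ids, still_waiting_job_ids).
    """
    in_use = dict(in_use)  # work on a copy, leave the caller's dict alone
    started, waiting = [], []
    for job_id, partition, nodes in pending:
        used = in_use.get(partition, 0)
        if can_start(partition, nodes, used, caps):
            in_use[partition] = used + nodes
            started.append(job_id)
        else:
            waiting.append(job_id)
    return started, waiting
```

Something like drain_queue would have to be re-run each time a job finishes and 
frees nodes, which is the part I am unsure how to do from a plugin. Note that 
this sketch lets a smaller job start ahead of a larger one that does not fit, 
which may or may not be desirable.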

Is there a native method to do this in Slurm? Essentially I would like 
something like the MaxNodes option that exists for partitions today but have it 
limit the total number of nodes used by jobs submitted to that partition rather 
than just a limit per job.

Many thanks,
George
