Hi, I would like to have a partition of N nodes without statically defining which nodes belong to it, and I'm trying to work out the best way to achieve this.
Currently I have partitions that span all the nodes in my cluster with differing settings, but I would like some of them to occupy only a subset of the cluster. I could, say, define partition A, which can use all nodes, and partition B, which may only access nodes 01-10. However, I would like to avoid partition B shrinking in the event of maintenance or hardware failure.

I'm thinking the way to do this would be via a plugin. I would keep all partitions spanning all nodes in the cluster, but upon submission check how many nodes are in use on the requested partition. If there were, say, already 10 nodes in use in partition B, the job should be queued. However, things then get a bit more complex as to when Slurm should dequeue and run the job.

Is there a native method to do this in Slurm? Essentially, I would like something like the MaxNodes option that exists for partitions today, but limiting the total number of nodes used by all jobs submitted to that partition rather than the number of nodes per job.

Many thanks,
George
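
P.S. To make the check I have in mind more concrete, here is a rough, untested sketch in Python. It is only an illustration of the logic, not a real Slurm plugin. The function names and the cap of 10 are my own placeholders, and I am assuming that `squeue -h -p <partition> -t RUNNING -o %D` prints one node count per running job. Summing those counts would overcount if jobs share nodes, so treat it as an approximation.

```python
import subprocess


def running_node_counts(partition: str) -> str:
    """Return raw squeue output: one node count per running job (assumed format)."""
    result = subprocess.run(
        ["squeue", "-h", "-p", partition, "-t", "RUNNING", "-o", "%D"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def nodes_in_use(squeue_output: str) -> int:
    """Sum the per-job node counts, ignoring blank lines."""
    total = 0
    for line in squeue_output.splitlines():
        line = line.strip()
        if line:
            total += int(line)
    return total


def should_hold(squeue_output: str, requested_nodes: int, node_cap: int = 10) -> bool:
    """True if starting the job would push the partition past node_cap nodes."""
    return nodes_in_use(squeue_output) + requested_nodes > node_cap
```

For example, `should_hold(running_node_counts("B"), requested_nodes=2)` would tell me whether a 2-node job fits under the cap right now. It does not address the harder part, which is deciding when a held job should be released.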
