Hi,

To be more specific, I want to limit the number of nodes or cores that can run jobs with a wall time of up to one week, with all other nodes defaulting to 1 day. So I'd set, say, Q1 to have a MaxWallDurationPerJob of 24 hours and make it the default, then add another QOS with a MaxWallDurationPerJob of one week and GrpNodes set to N. Where this was previously two partitions, I would now have a single partition with a max wall time of 1 week and two QOSes. In this case I'd want users to be able to do exactly what you highlight as a bug.
For completeness I would also set the DenyOnLimit flag. In your case perhaps the MaxNodes setting in sacctmgr may help? In 14.03, features were added that allow you to control more finely which accounts get used with partitions as well as QOSes; this may be of interest.

Kind regards,
George

________________________________________
From: Tingyang Xu [[email protected]]
Sent: 30 October 2014 19:42
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition

Hello George,

We have the same issue now. I think the QOS solution has a bug. For example, assume that Partition B allows two QOSes, Q1 and Q2, and you set GrpNodes=10 on both Q1 and Q2. Then users can actually use 20 nodes if they submit jobs to Q1 and Q2, respectively.

Best,
Tingyang Xu

-----Original Message-----
From: Brown George Andrew
Sent: Thursday, October 30, 2014 2:36 PM
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition

Thanks for the quick replies!

Indeed a QOS seems like what I want here. Sorry, I was stuck thinking in partitions and clearly was having some tunnel vision.

Cheers,
George

________________________________________
From: [email protected] [[email protected]]
Sent: 30 October 2014 19:08
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition

In addition to a QOS, an advanced reservation may also satisfy your needs:
http://slurm.schedmd.com/reservations.html

Quoting Ryan Cox <[email protected]>:

> George,
>
> Wouldn't a QOS with GrpNodes=10 accomplish that?
>
> Ryan
>
> On 10/30/2014 11:47 AM, Brown George Andrew wrote:
>> Hi,
>>
>> I would like to have a partition of N nodes without statically
>> defining which nodes should belong to a partition and I'm trying to
>> work out the best way to achieve this.
>>
>> Currently I have partitions which span across all the nodes in my
>> cluster with differing settings, but I would like some of these to
>> only occupy a subset of the cluster. I could, say, define partition A
>> which can use all nodes but partition B may only access nodes
>> 01-10. But I would like to avoid partition B being reduced in size in
>> the event of maintenance or hardware failure.
>>
>> I'm thinking the way to do this would be via a plugin. I would keep
>> all partitions spanning all nodes in the cluster but upon
>> submission check how many nodes are in use on the requested
>> partition. If there were, say, already 10 nodes in use in partition B
>> the job should be queued. However, things then get a bit more
>> complex as to when slurm should de-queue and then run the job.
>>
>> Is there a native method to do this in slurm? Essentially I would
>> like something like the MaxNodes option that exists for partitions
>> today but have it limit the total number of nodes used by jobs
>> submitted to that partition rather than just a limit per job.
>>
>> Many thanks,
>> George

--
Morris "Moe" Jette
CTO, SchedMD LLC
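
For anyone reading this thread later, the GrpNodes behaviour discussed above can be sketched with a small toy model. This is not Slurm code and does not call any Slurm API. The class `ToyQOS`, its `submit` method, and the job sizes are made up for illustration, and it assumes, as described in the thread, that each QOS counts only the nodes used by its own jobs. Rejecting an over-limit job outright is only meant to resemble what DenyOnLimit is for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyQOS:
    """Hypothetical stand-in for a QOS; not part of Slurm."""
    name: str
    max_wall_hours: int               # per-job wall time limit
    grp_nodes: Optional[int] = None   # cap on nodes used by all running jobs in this QOS
    nodes_in_use: int = 0

    def submit(self, nodes: int, wall_hours: int) -> str:
        # A job that can never satisfy the limits is refused outright.
        if wall_hours > self.max_wall_hours:
            return "rejected"
        if self.grp_nodes is not None and nodes > self.grp_nodes:
            return "rejected"
        # A job that would push this QOS over its group limit has to wait.
        if self.grp_nodes is not None and self.nodes_in_use + nodes > self.grp_nodes:
            return "pending"
        self.nodes_in_use += nodes
        return "running"


# Two QOSes on one partition, each with its own GrpNodes=10 counter.
q1 = ToyQOS("Q1", max_wall_hours=24, grp_nodes=10)
q2 = ToyQOS("Q2", max_wall_hours=168, grp_nodes=10)

results = [
    q1.submit(nodes=10, wall_hours=12),   # fits within Q1
    q2.submit(nodes=10, wall_hours=100),  # fits within Q2, counted separately
    q2.submit(nodes=1, wall_hours=100),   # Q2 is already at its group limit
    q1.submit(nodes=1, wall_hours=48),    # longer than Q1's wall time limit
]
total_nodes = q1.nodes_in_use + q2.nodes_in_use
print(results, total_nodes)
```

Because the two counters are independent, the partition ends up with 20 nodes in use, which is the situation Tingyang describes. In George's setup only the one-week QOS would carry a group limit (`grp_nodes=N`), while the default one-day QOS would leave `grp_nodes` unset.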
