Hi,

To be more specific, I want to limit the number of nodes or cores that can
run jobs with a wall time of up to one week, with all other nodes defaulting
to 1 day. So I'd set, say, Q1 to have a MaxWallDurationPerJob of 24 hours and
make it the default, then add another QOS with a MaxWallDurationPerJob of one
week and GrpNodes set to N. Where I previously had two partitions, I would now
have a single partition with a max wall time of 1 week and two QOSes. In this
case I'd want users to be able to do exactly what you highlight as a bug.

For completeness I would also set the DenyOnLimit flag.
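
The two-QOS setup described above can be sketched as a toy model. This is
only an illustration under simplifying assumptions, not Slurm code: the
class QOS, the function submit, and the names "q1" and "long" are
hypothetical, N is taken to be 10, and DenyOnLimit is modelled as rejecting
at submission any job that could never fit within the QOS limits.

```python
from dataclasses import dataclass
from typing import Optional

DAY = 24 * 60  # one day in minutes


@dataclass
class QOS:
    """Hypothetical stand-in for a QOS record; not a Slurm data structure."""
    name: str
    max_wall: int                    # models MaxWallDurationPerJob, in minutes
    grp_nodes: Optional[int] = None  # models GrpNodes; None means no limit
    deny_on_limit: bool = True       # models the DenyOnLimit flag
    nodes_in_use: int = 0


def submit(qos: QOS, nodes: int, wall: int) -> str:
    """Return 'rejected', 'pending' or 'running' for a job in this toy model."""
    # A job that can never satisfy the limits is rejected when DenyOnLimit
    # is set; otherwise it is assumed to stay pending.
    never_fits = wall > qos.max_wall or (
        qos.grp_nodes is not None and nodes > qos.grp_nodes
    )
    if never_fits:
        return "rejected" if qos.deny_on_limit else "pending"
    # A job that fits in principle, but not right now, waits in the queue.
    if qos.grp_nodes is not None and qos.nodes_in_use + nodes > qos.grp_nodes:
        return "pending"
    qos.nodes_in_use += nodes
    return "running"


q1 = QOS("q1", max_wall=DAY)                      # default QOS, 1 day
long = QOS("long", max_wall=7 * DAY, grp_nodes=10)  # 1 week, N = 10 nodes

print(submit(q1, 4, 7 * DAY))    # rejected: exceeds the 1-day limit of q1
print(submit(long, 8, 7 * DAY))  # running: 8 of 10 nodes now in use
print(submit(long, 4, 3 * DAY))  # pending: 8 + 4 would exceed GrpNodes
print(submit(q1, 50, DAY))       # running: q1 has no node cap
```

In this model the week-long jobs are confined to N nodes in total, while
one-day jobs in the default QOS can use the rest of the single partition.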

In your case, perhaps the MaxNodes setting in sacctmgr may help? In 14.03,
features were added that allow finer control over which accounts can be used
with partitions as well as QOSes; this may be of interest.

Kind regards,
George
________________________________________
From: Tingyang Xu [[email protected]]
Sent: 30 October 2014 19:42
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition

Hello George,
We have the same issue now. I think the QOS solution has a bug. For
example, assume that Partition B allows two QOSes, Q1 and Q2, and that you
set GrpNodes=10 on both Q1 and Q2. Then users can actually use 20 nodes if
they submit jobs to both Q1 and Q2.
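
The arithmetic behind this example can be sketched with a toy model. It
assumes, as described above, that GrpNodes is counted separately for each
QOS; the names GRP_NODES and start_jobs are hypothetical and this is not
Slurm code.

```python
from collections import defaultdict

# Assumed limits from the example: GrpNodes=10 on each of Q1 and Q2.
GRP_NODES = {"Q1": 10, "Q2": 10}


def start_jobs(requests):
    """Start each (qos, nodes) request that fits under its own QOS limit.

    Returns a dict of nodes in use per QOS. Each QOS keeps its own
    counter, so no limit applies to the sum across QOSes.
    """
    in_use = defaultdict(int)
    for qos, nodes in requests:
        if in_use[qos] + nodes <= GRP_NODES[qos]:
            in_use[qos] += nodes
    return dict(in_use)


usage = start_jobs([("Q1", 10), ("Q2", 10), ("Q1", 1)])
print(usage)                # {'Q1': 10, 'Q2': 10}
print(sum(usage.values()))  # 20 nodes in the partition, despite GrpNodes=10
```

The third request is held back because Q1 is already at its limit, yet the
partition as a whole still has 20 nodes in use.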

Best,
Tingyang Xu

-----Original Message-----
From: Brown George Andrew
Sent: Thursday, October 30, 2014 2:36 PM
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition


Thanks for the quick replies!

Indeed, a QOS seems like what I want here. Sorry, I was stuck thinking in
terms of partitions and clearly had some tunnel vision.

Cheers,
George
________________________________________
From: [email protected] [[email protected]]
Sent: 30 October 2014 19:08
To: slurm-dev
Subject: [slurm-dev] Re: Non static partition definition

In addition to a QOS, an advanced reservation may also satisfy your needs:
http://slurm.schedmd.com/reservations.html

Quoting Ryan Cox <[email protected]>:

> George,
>
> Wouldn't a QOS with GrpNodes=10 accomplish that?
>
> Ryan
>
> On 10/30/2014 11:47 AM, Brown George Andrew wrote:
>> Hi,
>>
>> I would like to have a partition of N nodes without statically
>> defining which nodes should belong to a partition and I'm trying to
>> work out the best way to achieve this.
>>
>> Currently I have partitions which span across all the nodes in my
>> cluster with differing settings, but I would like some of these to
>> only occupy a subset of the cluster. I could say define partition A
>> which can use all nodes but partition B may only access nodes
>> 01-10. But I would like to avoid partition B being reduced in size in
>> the event of maintenance or hardware failure.
>>
>> I'm thinking the way to do this would be via a plugin. I would keep
>> all partitions spanning all nodes in the cluster but upon
>> submission check how many nodes are in use on the requested
>> partition. If there were say already 10 nodes in use in partition B
>> the job should be queued. However things then get a bit more
>> complex as to when slurm should de-queue and then run the job.
>>
>> Is there a native method to do this in slurm? Essentially I would
>> like something like the MaxNodes option that exists for partitions
>> today but have it limit the total number of nodes used by jobs
>> submitted to that partition rather than just a limit per job.
>>
>> Many thanks,
>> George


--
Morris "Moe" Jette
CTO, SchedMD LLC
