Re-reading your question, I don't think reservations are applicable to your 
problem.

Reservations isolate resources for a period of time for a group of users; they 
do not change the job scheduling policies tied to those resources.  You can 
allocate a number of dedicated resources for a long duration, say 10 nodes for 
60 days, but the partition/association/QOS/etc. limits will still apply to 
these resources -- your 24-hour runtime limit would still apply.
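
For illustration, a reservation like the "10 nodes for 60 days" example could 
be created roughly as sketched here.  This is a minimal Python sketch that only 
builds and prints an `scontrol create reservation` command line; it does not 
run anything.  The reservation name `longrun` and the users `alice` and `bob` 
are made-up placeholders, not values from this thread.

```python
import shlex


def reservation_command(name, users, node_count, duration):
    """Build (but do not run) an `scontrol create reservation` command line.

    All argument values are hypothetical placeholders chosen by the caller.
    """
    args = [
        "scontrol", "create", "reservation",
        f"ReservationName={name}",
        f"Users={','.join(users)}",   # users allowed to run in the reservation
        f"NodeCnt={node_count}",      # number of nodes to set aside
        "StartTime=now",
        f"Duration={duration}",       # days-hours:minutes:seconds
    ]
    # Quote each argument so the printed line is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


# Placeholder values: 10 nodes for 60 days for two example users.
print(reservation_command("longrun", ["alice", "bob"], 10, "60-00:00:00"))
```

Jobs would then be submitted with `sbatch --reservation=longrun`, but each job 
would still be subject to the partition's 24-hour limit.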

I think of reservations as a lightweight, virtual partition.

Our max runtime is 14 days (via QOS).  This is a major difference between our 
systems.  We use reservations to allocate resources from our shared pool for 
very long durations, but we haven't needed to adjust the max job runtime 
because it's been sufficient.  This confused my answer.  Sorry!
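
A QOS-based setup along these lines might look like the following.  This is 
only a sketch under assumptions: the QOS name `long` and the user `alice` are 
hypothetical, and the `PartitionTimeLimit` QOS flag (which lets jobs using the 
QOS override the partition's TimeLimit) is not something discussed earlier in 
this thread.  QOS limits are also only enforced if accounting enforcement is 
configured for them.  The code builds and prints `sacctmgr` command lines 
without running them.

```python
import shlex


def qos_commands(qos, max_wall, users):
    """Return sacctmgr command lines (strings) for a longer-runtime QOS.

    Nothing is executed; the names passed in are hypothetical placeholders.
    """
    cmds = [
        # Create the QOS.
        ["sacctmgr", "add", "qos", qos],
        # Set its wall-clock limit and let it override the partition TimeLimit.
        ["sacctmgr", "modify", "qos", qos, "set",
         f"MaxWall={max_wall}", "Flags=PartitionTimeLimit"],
    ]
    for user in users:
        # Add the QOS to the user's existing list of allowed QOSes.
        cmds.append(["sacctmgr", "modify", "user", "where", f"name={user}",
                     "set", f"qos+={qos}"])
    return [" ".join(shlex.quote(a) for a in cmd) for cmd in cmds]


# Placeholder values: a 14-day QOS granted to one example user.
for line in qos_commands("long", "14-00:00:00", ["alice"]):
    print(line)
```

A user granted the QOS would request it per job, for example with 
`sbatch --qos=long --time=7-00:00:00`.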

Best,

Sebastian

--

University of Nevada, Reno <http://www.unr.edu/>
Sebastian Smith
High-Performance Computing Engineer
Office of Information Technology
1664 North Virginia Street
MS 0291

work-phone: 775-682-5050
email: stsm...@unr.edu
website: http://rc.unr.edu

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Matthew 
BETTINGER <matthew.bettin...@external.total.com>
Sent: Wednesday, July 8, 2020 10:53 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Allow certain users to run over partition limit

OK, I see the resource limit hierarchy:

Partition QOS limit
Job QOS limit
User association
Account association(s), ascending the hierarchy
Root/Cluster association
Partition limit
None

Where in this list do reservations fall?  Do reservations override all of 
these if they are set to exceed the resources imposed by the partition 
configuration?  Thanks!
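
One way to read the list above is "the first level that sets a limit wins". 
The sketch below models only that reading in Python; it is a simplification 
for illustration, not Slurm's actual enforcement logic (for example, whether a 
QOS may exceed the partition's time limit also depends on QOS flags).  The 
limit values are made-up examples in minutes.  Reservations are not a level in 
this list.

```python
# Levels from the list above, highest precedence first.
PRECEDENCE = [
    "Partition QOS limit",
    "Job QOS limit",
    "User association",
    "Account association(s)",
    "Root/Cluster association",
    "Partition limit",
]


def effective_limit(limits):
    """Return (level, value) for the first level that sets a limit.

    `limits` maps a level name to a value, or omits the level if unset.
    Returns ("None", None) when no level sets the limit.
    """
    for level in PRECEDENCE:
        value = limits.get(level)
        if value is not None:
            return level, value
    return "None", None


# Hypothetical example: a 14-day job QOS limit and a 24-hour partition limit.
example = {"Job QOS limit": 14 * 24 * 60, "Partition limit": 24 * 60}
print(effective_limit(example))
```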

On 7/7/20, 6:02 PM, "slurm-users on behalf of Sebastian T Smith" 
<slurm-users-boun...@lists.schedmd.com on behalf of stsm...@unr.edu> wrote:

    Hi,

    We use Job QOS and Resource Reservations for this purpose.  QOS is a good 
    option for a "permanent" change to a user's resource limits.  We use 
    reservations, much as you currently use partitions, to "temporarily" 
    provide a resource boost without the complexities of re-partitioning or 
    mucking with associations.


    Precedence in resource limits: https://slurm.schedmd.com/resource_limits

    ________________________________________
    From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of 
Matthew BETTINGER <matthew.bettin...@external.total.com>
    Sent: Tuesday, July 7, 2020 9:40 AM
    To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
    Subject: [slurm-users] Allow certain users to run over partition limit

    Hello,

    We have a Slurm system with partitions set for a max runtime of 24 hours.  
    What would be the proper way to allow a certain set of users to run jobs 
    on the current partitions beyond the partition limits?  In the past we 
    would isolate some nodes based on their job requirements, make a new 
    partition and a reservation with the users, and push out the new 
    configuration.  This is pretty unwieldy but works; however, the nodes are 
    basically wasted whenever these special users are not using them, because 
    they are unavailable to others.

    Is there some way we can allow some users to occasionally run past the 
    partition run time more easily than by manually modifying slurm.conf?  
    Possibly with QOS?

    Thanks.

