Ryan,

I believe this is the default behavior of reservations unless the flag
"static_alloc" is specified.
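
For illustration, a floating reservation along these lines might be created roughly as shown below. This is only a sketch, not a tested recipe: the reservation name "special_float", the account "special_proj", and the one-week duration are made-up placeholders, and the exact behavior of Flags=REPLACE should be checked against the scontrol man page for the release in use. The helper only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# Sketch only: prints the scontrol command instead of executing it.
# "special_float" and "special_proj" are hypothetical names.
build_resv_cmd() {
    name=$1      # reservation name
    account=$2   # account allowed to use the reservation
    nodes=$3     # number of nodes to hold
    # STATIC_ALLOC is deliberately not set, so the node selection is not
    # frozen and a node that goes down can be swapped for another one.
    # Flags=REPLACE additionally asks for nodes that are down, drained,
    # or allocated to jobs to be replenished from idle nodes.
    printf 'scontrol create reservation ReservationName=%s Accounts=%s NodeCnt=%s StartTime=now Duration=7-00:00:00 Flags=REPLACE\n' \
        "$name" "$account" "$nodes"
}

build_resv_cmd special_float special_proj 5
```

Jobs from the project would then be submitted with --reservation=special_float to land on the reserved nodes.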

John DeSantis

2015-11-21 22:13 GMT-05:00 Novosielski, Ryan <[email protected]>:
> I could have sworn that I just heard it was possible to create a floating
> reservation for any number of nodes and that you could also cause it to
> replace nodes if any went missing with the "replace" flag. Is that not all
> in the current release?
>
> --
> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
> || \\UTGERS      |---------------------*O*---------------------
> ||_// Biomedical | Ryan Novosielski - Senior Technologist
> || \\ and Health | [email protected] 973/972.0922 (2x0922)
> ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>     `'
>
> On Nov 21, 2015, at 11:30, Daniel Letai <[email protected]> wrote:
>
>
> John,
>
> That's correct - exclusive use means the project must always have at
> least 5 nodes available to it, at all times, even if it means those
> nodes will be idle some of the time.
>
> OTOH, if some of the other nodes are idle for whatever reason (no one
> else is using the cluster), let the project use any (up to all)
> available nodes.
>
> The project is run automatically based on some data as it becomes
> available to a dispatching app. Optimally it should be preemptable on
> the other nodes but not on the exclusive ones, and must not preempt
> other jobs, but the entire preemption issue is of secondary importance.
>
> A reservation is somewhat better than hardcoded nodelist as in my first
> post, but its major drawback is that on reservation "renewal" there
> might not be enough (or any) nodes available and the project will not
> have enough nodes (since it can't preempt - unless somehow it can
> preempt, but only on those 5 nodes in the "new" reservation?).
>
> --Dani_L.
>
> On 11/19/2015 10:35 PM, John Desantis wrote:
>
> Daniel,
>
> Could you provide more information on the project's needs?
>
> A QOS could be configured with a generous priority and limits so that
> the project cannot dominate the partition; reservations could be used
> too, but you'd need to define at a minimum a start time and duration -
> and when not in use the hardware would be idle and unavailable to
> other users.
>
> John DeSantis
>
> 2015-11-19 13:31 GMT-05:00 Daniel Letai <[email protected]>:
>
> The other issue is how to define the "public" partition. It would also have
> to float, with lower priority, or else how would you achieve exclusivity of
> "special" on the 5node float?
>
> --Dani_L.
>
> On 11/19/2015 06:10 PM, Paul Edmon wrote:
>
> Yeah, I guess QoS won't really work for overflow.  I was more thinking of
> the QoS as a way to create a floating partition of 5 nodes with the rest
> being in the public queue.  They would send jobs to the QoS to hit that and
> then when it is full they would submit to public as normal.  That's at least
> my thinking, but it's less seamless to the users as they will have to
> consciously monitor what is going on.
>
> -Paul Edmon-
>
> On 11/19/2015 10:50 AM, Daniel Letai wrote:
>
> Can you elaborate a little? I'm not sure what kind of QoS will help, nor
> how to implement one that will satisfy the requirements.
>
> On 11/19/2015 04:52 PM, Paul Edmon wrote:
>
> You might consider a QoS for this.  It may not do everything you want
> but it will give you the flexibility.
>
> -Paul Edmon-
>
> On 11/19/2015 04:49 AM, Daniel Letai wrote:
>
> Hi,
>
> Suppose I have a 100 node cluster with ~5% nodes down at any given time
> (maintenance/hw failure/...).
>
> One of the projects requires exclusive use of 5 nodes, and must be able to
> use the entire cluster when available (when other projects aren't running).
>
> I can do this easily if I maintain a static list of the exclusive nodes
> in slurm.conf:
>
> PartitionName=public Nodes=tux0[01-95] Default=YES
> PartitionName=special Nodes=tux[001-100] Default=NO
>
> And allowing only that project to use partition special.
>
> However, due to the downtime of 5%, I'd like to maintain a dynamic
> exclusive 5 nodes.
> Any suggestions?
>
> The project is serial and deployed as an array of single node jobs, so I
> can run it even when the other 95 nodes are full.
>
> Thanks,
> --Dani_L.
