It might be worth looking into node features (see the "Feature" node
parameter in http://slurm.schedmd.com/slurm.conf.html).
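
A rough sketch of what I mean, assuming made-up node names and feature
labels (node[001-032], ib_a, ib_b and the partition name "global" are all
hypothetical, and real NodeName lines would also carry your CPU and memory
settings):

```
# slurm.conf fragment (hypothetical names throughout)
# Tag each host group with a feature naming its fabric
NodeName=node[001-016] Feature=ib_a
NodeName=node[017-032] Feature=ib_b

# One default partition spanning both host groups
PartitionName=global Nodes=node[001-032] Default=YES State=UP
```

Users could then pin a job to one fabric with something like
"sbatch --constraint=ib_a job.sh". If I read the sbatch man page
correctly, --constraint="[ib_a|ib_b]" should let Slurm pick either group
while keeping all allocated nodes within the same one, but please test
that on your setup before relying on it.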
Regards,
Uwe
On 03.02.2015 at 18:36, John Desantis wrote:
>
> Hello all,
>
> Even after reading the man pages and various documentation, I'm still
> unsure how to set up a single global partition for our users that
> spans several separate host groups.
>
> When I say host groups, I mean separate sets of hardware which use
> different InfiniBand fabrics, sit in different data centers, have
> different CPU architectures, etc.
>
> During initial testing, I was able to use a default partition with
> all of the nodes allocated via the "Nodes=" value. All was well until
> a later set of nodes was added which had a separate InfiniBand
> fabric. Testing showed that applications were attempting to use nodes
> across the separate fabrics, which failed miserably, and as a result
> we're using separate partitions, which most users don't mind.
>
> Now that we're getting more users converted to Slurm, we're realizing
> that some users don't know how to check for free partitions and
> available hardware (boo!) and have grown used to our previous
> scheduler configuration of 1 global queue.
>
> I'm looking into how to emulate this, and I'm not clear whether it
> can be done using multiple partition definitions with a DEFAULT
> clause. I've also looked at the topology/tree plugin and seen that
> you can specify either switches or nodes. Would that be the preferred
> method to achieve 1 "global" partition which uses all of the separate
> hardware pools and respects the separate host groups?
>
> Thank you,
> John DeSantis
>