On Thu, Apr 16, 2009 at 01:55:30PM -0500, Ling C. Ho wrote:
> We have two groups of nodes running jobs submitted to two groups of Torque
> queues. For example, node001-node100 run jobs submitted to queues a, b and c,
> while node101-node200 run jobs submitted to x, y and z only.
>
> We tried splitting them into two partitions in Maui, using NODECFG for each
> node, e.g.
> NODECFG[node001] PARTITION=a
> .
> .
> .
> NODECFG[node101] PARTITION=b
> .
> .
> .
>
> This setup works pretty well, but having to maintain a cfg file with this
> many nodes makes me uneasy.
>
> Jobs for queues a, b and c usually have higher priority, so they are always
> at the top of the priority table (based on diagnose -p). Without
> partitioning, we could never get any job for queues x, y and z started on
> node101-200, even when node001-100 are all busy. Even when Maui is not
> processing any jobs in queues a, b and c, it doesn't do anything with the
> jobs from x, y and z waiting to start. I have tried using BACKFILL like
> this and it doesn't work:
>
> BACKFILLPOLICY BESTFIT
> BACKFILLMETRIC PROCS
> BACKFILLDEPTH 0
> BFCHUNKDURATION 00:00:30
> BFCHUNKSIZE 1
> BFPRIORITYPOLICY RANDOM
>
> My questions are:
> - Is there another way to assign a node to a partition other than having to
> enter NODECFG for each and every node in the maui.cfg file? I have 800+
> nodes, so the cfg file is going to be huge.
>
> - Besides using partitions, what other features are appropriate for managing
> jobs for two non-overlapping groups of machines like ours?
>
> - Should I look into using NODESET? (Is it even a Maui feature, or just
> Moab?)
I've always thought of partitions as impenetrable boundaries in Maui and
Moab. I use them as the highest-level container, and I never want jobs or
reservations to span them.
To this end, I use NODECFG to assign my nodes to a partition, and then
CLASSCFG to assign my queues to a partition:
NODECFG[node001] PARTITION=part1
.
.
.
CLASSCFG[a] PARTITION=part1
CLASSCFG[b] PARTITION=part1
.
.
.
My users then submit to a queue, and their jobs are guaranteed to land in the
right partition.
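One way to keep the per-node NODECFG entries manageable with 800+ nodes is to
generate them instead of typing them. The following is only a minimal sketch,
assuming hostnames follow the zero-padded nodeNNN pattern from the example
above; the second partition name (part2) and the helper names are hypothetical
and not part of Maui. The output is plain text meant to be reviewed and merged
into maui.cfg by hand.

```python
def nodecfg_lines(prefix, first, last, partition, width=3):
    """Return one NODECFG line per node in the inclusive range first..last."""
    return [
        f"NODECFG[{prefix}{n:0{width}d}] PARTITION={partition}"
        for n in range(first, last + 1)
    ]


# Hypothetical layout: node001-node100 in part1, node101-node200 in part2.
# Adjust the ranges and partition names to match the real cluster.
LAYOUT = [
    ("node", 1, 100, "part1"),
    ("node", 101, 200, "part2"),
]


def render(layout):
    """Concatenate the NODECFG lines for every (prefix, first, last, partition)."""
    lines = []
    for prefix, first, last, partition in layout:
        lines.extend(nodecfg_lines(prefix, first, last, partition))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(LAYOUT))
```

Keeping the ranges in one small table means that adding or moving nodes is a
one-line change, and the generated block can be diffed against the current
maui.cfg before it is applied.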
Now, having job reservations per partition is a bit trickier, but it's
doable (I do it in Moab, so YMMV with Maui). What I would do is assign a
default QOS to each of your queues:
QOSCFG[part1qos]
QOSFLAGS[part1qos] RESERVEALWAYS
CLASSCFG[a] QDEF=part1qos
and add to that QOS a reservation profile that has its own
RESERVATIONDEPTH:
RSVPROFILE[part1rsv]
RESERVATIONDEPTH[part1rsv] 1
RESERVATIONQOSLIST[part1rsv] part1qos
Then you would also make that QOS the default for your other queues in that
partition:
CLASSCFG[b] QDEF=part1qos
CLASSCFG[c] QDEF=part1qos
.
.
.
I've been using a configuration in Moab similar to this for almost 3 years
and it's been working just fine. I did my best to map Moab's parameters to
Maui ones using the Maui docs, but you will need to do some
experimentation.
HTH,
Gabe
--
Gabe Turner [email protected]
HPC Systems Administrator,
University of Minnesota
Supercomputing Institute http://www.msi.umn.edu