On Thu, Apr 16, 2009 at 01:55:30PM -0500, Ling C. Ho wrote:
> We have two groups of nodes running jobs submitted to two groups of Torque
> queues. For example, node001-node100 run jobs submitted to queues a, b and c,
> while node101-node200 run jobs submitted to queues x, y and z only.
> 
> We tried splitting them into two partitions in Maui, using NODECFG for each
> node, e.g.
> NODECFG[node001] PARTITION=a
> .
> .
> .
> NODECFG[node101] PARTITION=b
> .
> .
> .
> 
> This setup works pretty well, but having to maintain a cfg file with this
> number of nodes makes me feel uneasy.
> 
> Jobs for queues a, b, c usually have higher priority, so they are always at
> the top of the priority table (based on diagnose -p). Without partitioning,
> we could never get any job for queues x, y, z started on node101-200, even
> when node001-100 are all busy. Even when Maui is not processing any jobs in
> queues a, b, c, it doesn't do anything with the jobs from x, y, z waiting to
> start. I have tried using BACKFILL like this and it doesn't work:
> 
> BACKFILLPOLICY        BESTFIT
> BACKFILLMETRIC          PROCS
> BACKFILLDEPTH           0
> BFCHUNKDURATION         00:00:30
> BFCHUNKSIZE             1
> BFPRIORITYPOLICY        RANDOM
> 
> My questions are:
> - Is there another way to assign a node to a partition other than having to
> enter NODECFG for each and every node in the maui.cfg file? I have 800+
> nodes, so the cfg file is going to be huge.
> 
> - Besides using partitions, what other features are appropriate for managing
> jobs for two non-overlapping groups of machines like ours?
> 
> - Should I look into using NODESET? (Is it even a Maui feature, or just
> Moab?)

I've always thought of partitions as impenetrable boundaries in Maui and
Moab.  I use them as the highest level container, and I never want jobs or
reservations to span them.

To this end, I use NODECFG to assign my nodes to a partition, and then
CLASSCFG to assign my queues to a partition:

    NODECFG[node001]    PARTITION=part1
    .
    .
    .

    CLASSCFG[a]         PARTITION=part1
    CLASSCFG[b]         PARTITION=part1
    .
    .
    .

My users then submit to a queue and their jobs are guaranteed to be in the
right partition.
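
With 800+ nodes you probably don't want to type those NODECFG lines by hand.
One option is to generate them with a small script and paste the output into
maui.cfg.  Below is a minimal sketch in Python.  It assumes your node names
are "node" plus a zero-padded three-digit number, as in your examples, and
that queues x, y and z belong to a second partition.  The partition name
"part2", the LAYOUT table and the function names are made up for
illustration, so adjust them to your site:

```python
# Sketch: generate NODECFG/CLASSCFG partition lines for maui.cfg.
# Assumptions (not from any Maui tool): node names look like node001,
# and LAYOUT below is a hypothetical description of the two partitions.

def nodecfg_lines(prefix, first, last, partition, width=3):
    """Return one NODECFG line per node in the inclusive range first..last."""
    return [
        f"NODECFG[{prefix}{n:0{width}d}]    PARTITION={partition}"
        for n in range(first, last + 1)
    ]


def classcfg_lines(queues, partition):
    """Return one CLASSCFG line per queue, pinning it to the partition."""
    return [f"CLASSCFG[{q}]         PARTITION={partition}" for q in queues]


def build_fragment(layout, prefix="node"):
    """Build the full config fragment for every partition in layout."""
    lines = []
    for partition, spec in layout.items():
        first, last = spec["nodes"]
        lines += nodecfg_lines(prefix, first, last, partition)
        lines += classcfg_lines(spec["queues"], partition)
    return "\n".join(lines) + "\n"


# Hypothetical layout matching the example in the original question.
LAYOUT = {
    "part1": {"nodes": (1, 100), "queues": ["a", "b", "c"]},
    "part2": {"nodes": (101, 200), "queues": ["x", "y", "z"]},
}

if __name__ == "__main__":
    print(build_fragment(LAYOUT), end="")
```

Redirect the output to a file, review it, and merge it into maui.cfg.  That
way the node ranges live in one short table instead of hundreds of lines you
have to keep in sync by hand.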

Now, having job reservations per partition is a bit trickier, but it's
doable (I do it in Moab, so YMMV with Maui).  What I would do is assign a
default QOS for each of your queues:

    QOSCFG[part1qos]
    QOSFLAGS[part1qos]  RESERVEALWAYS
    CLASSCFG[a]         QDEF=part1qos

and add to that QOS a reservation profile that has its own
RESERVATIONDEPTH:

    RSVPROFILE[part1rsv]
    RESERVATIONDEPTH[part1rsv]          1
    RESERVATIONQOSLIST[part1rsv]        part1qos

Then you would also make that QOS the default for your other queues in that
partition:

    CLASSCFG[b]         QDEF=part1qos
    CLASSCFG[c]         QDEF=part1qos
    .
    .
    .

I've been using a configuration in Moab similar to this for almost 3 years
and it's been working just fine.  I did my best to map Moab's parameters to
Maui ones using the Maui docs, but you will need to do some
experimentation.

HTH,

Gabe
-- 
Gabe Turner                                             [email protected]
HPC Systems Administrator,
University of Minnesota
Supercomputing Institute                          http://www.msi.umn.edu
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
