We have two groups of nodes running jobs submitted to two groups of Torque queues. For example, node001-node100 run jobs submitted to queues a, b and c, while node101-node200 run jobs submitted to queues x, y and z only.
We tried splitting them into two partitions in Maui, using NODECFG for each node, e.g.

    NODECFG[node001] PARTITION=a
    . . .
    NODECFG[node101] PARTITION=b
    . . .

This setup works pretty well, but having to maintain a cfg file with this many nodes makes me uneasy.

Jobs for queues a, b and c usually have higher priority, so they are always at the top of the priority table (based on diagnose -p). Without partitioning, we could never get any job for queues x, y and z started on node101-200, even when node001-100 were all busy. Even when Maui is not processing any jobs in queues a, b and c, it doesn't do anything with the jobs from x, y and z that are waiting to start.

I have tried using backfill like this, and it doesn't work:

    BACKFILLPOLICY BESTFIT
    BACKFILLMETRIC PROCS
    BACKFILLDEPTH 0
    BFCHUNKDURATION 00:00:30
    BFCHUNKSIZE 1
    BFPRIORITYPOLICY RANDOM

My questions are:

- Is there another way to assign a node to a partition other than entering NODECFG for each and every node in the maui.cfg file? I have 800+ nodes, so the cfg file is going to be huge.
- Besides using partitions, what other features are appropriate for managing jobs for two non-overlapping groups of machines like ours?
- Should I look into using NODESET? (Is it even a Maui feature, or just Moab?)

Thanks,
... ling
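In case it helps anyone with the same layout: until there is a better answer to the first question, the per-node NODECFG entries can at least be generated rather than typed by hand. Below is a minimal sketch, not anything Maui provides. The function names, the "node" prefix, the three-digit zero padding and the partition names a and b are assumptions taken from the examples in this post; adjust them to the real host names.

```python
# Sketch: generate the per-node NODECFG lines for maui.cfg.
# Assumptions (hypothetical, based on the examples in this post):
# host names look like node001..node200 and the partitions are "a" and "b".

def nodecfg_lines(prefix, first, last, partition, width=3):
    """Yield one NODECFG line per node in the inclusive range first..last."""
    for n in range(first, last + 1):
        yield f"NODECFG[{prefix}{n:0{width}d}] PARTITION={partition}"


def build_fragment(groups):
    """Join the lines for every (prefix, first, last, partition) group."""
    lines = []
    for prefix, first, last, partition in groups:
        lines.extend(nodecfg_lines(prefix, first, last, partition))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    groups = [
        ("node", 1, 100, "a"),    # nodes serving queues a, b and c
        ("node", 101, 200, "b"),  # nodes serving queues x, y and z
    ]
    print(build_fragment(groups), end="")
```

Redirecting the output to a file and pasting it into maui.cfg keeps the node list reproducible. It does not make the file any smaller, so the question about a more compact way to assign partitions still stands.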
