I've been having some pretty serious difficulties with Maui (3.2.6p14-snap.1138394201) and Torque (2.0.0p8) on a small Opteron cluster (SuSE 9.3). In general, when a low-priority job gets suspended through preemption by a high-priority job, the suspended job is ignored when SPACEFLEX standing reservations are calculated in later Maui cycles.
As an example, suppose there are two SPACEFLEX reservations, each taking up an entire dual-processor node, and three types of jobs:

A) preemptee, can run in reservation 1
B) neither preemptor nor preemptee, can run in reservations 1 and 2
C) preemptor, can run in reservations 1 and 2

At time 0, a node is running job A and job B, with reservation 1 assigned to it.

At time 1, job C has gained sufficient priority points to suspend job A. At that point, or shortly thereafter, the node is assigned to both reservation 1 and 2, which should not be possible.

At time 2, the node has job A suspended, with B and C running, but now only has reservation 2 assigned. Job A is now locked into a node that is not reserved for it.

At time 3, another job of type B is queued, and is scheduled to start when job C ends. This effectively continues the preemption of job A even though B is not normally able to suspend A.

How do I fix this?

Here's the relevant part of my maui.cfg. Job A is group ck with class prefinity. Job B is group cb with class long. Job C is group ck with class short.
QOSCFG[high] QFLAGS=PREEMPTOR
QOSCFG[med]
QOSCFG[low] QFLAGS=PREEMPTEE

CLASSCFG[infinity] QDEF=med
CLASSCFG[verylong] QDEF=med
CLASSCFG[long] QDEF=med
CLASSCFG[medium] WCOVERRUN=00:30:00 QDEF=high QLIST=high^
CLASSCFG[short] WCOVERRUN=00:05:00 QDEF=high QLIST=high^
CLASSCFG[prefinity] MAX.PROC=1 QDEF=low QLIST=low^

# Reservation 1
SRCFG[dayjobs] STARTTIME=8:00:00 ENDTIME=18:00:00
SRCFG[dayjobs] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI DEPTH=3
SRCFG[dayjobs] FLAGS=SPACEFLEX
SRCFG[dayjobs] CLASSLIST=short-,medium-,prefinity+
SRCFG[dayjobs] GROUPLIST=cb-
SRCFG[dayjobs] TASKCOUNT=4 RESOURCES=PROCS:1;MEM:3750
SRCFG[dayjobs] TPN=2
SRCFG[dayjobs] ACCESS=DEDICATED

# Reservation 2
SRCFG[anorgjobs] STARTTIME=00:00:00 ENDTIME=00:00:00
SRCFG[anorgjobs] PERIOD=DAY DAYS=ALL DEPTH=2
SRCFG[anorgjobs] OWNER=GROUP:cb
SRCFG[anorgjobs] FLAGS=SPACEFLEX
SRCFG[anorgjobs] CLASSLIST=short-,medium-
SRCFG[anorgjobs] GROUPLIST=cb+
SRCFG[anorgjobs] TASKCOUNT=4 RESOURCES=PROCS:1;MEM:3750
SRCFG[anorgjobs] TPN=2
SRCFG[anorgjobs] ACCESS=DEDICATED

Thanks for any help,
Nate

__________________________________
Dr. Nathan Crawford
Theoretische Chemie
Universität Karlsruhe
[EMAIL PROTECTED]

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
