There was a problem with the way Maui was calculating J->RULVTime. On each scheduling iteration it was incremented by the number of iterations Moab had processed up to that point, rather than by the elapsed time. This makes no sense, since the max violation time is time based, not iteration based. I also added code that resets RULVTime when a violation is no longer detected. Both issues have been fixed and, I believe, you should see better results without having to reset RULVTime at job start.
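To make the difference concrete, here is a minimal sketch in C of the two accounting schemes. It is not the actual Maui patch: the `job_t` struct and the functions `update_buggy`, `update_fixed`, and `limit_exceeded` are hypothetical names invented for illustration, with `rulv_time` standing in for J->RULVTime.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the job record; not Maui's real structure. */
typedef struct {
    long rulv_time; /* models J->RULVTime: seconds in continuous violation */
} job_t;

/* Behavior as described in the bug report: each pass adds the scheduler's
 * running iteration count, so the value depends on scheduler uptime. */
void update_buggy(job_t *j, int violating, long iteration)
{
    if (violating)
        j->rulv_time += iteration;
}

/* Time-based accounting: add the seconds elapsed since the last pass,
 * and reset the counter as soon as the violation clears. */
void update_fixed(job_t *j, int violating, long elapsed_sec)
{
    if (violating)
        j->rulv_time += elapsed_sec;
    else
        j->rulv_time = 0;
}

/* Nonzero once the accumulated violation time reaches the allowed maximum
 * (models the J->RULVTime < ResourceLimitMaxViolationTime check failing). */
int limit_exceeded(const job_t *j, long max_violation_sec)
{
    return j->rulv_time >= max_violation_sec;
}
```

With a 300 second limit, a job whose first violating pass happens at iteration 45426 exceeds the limit at once under `update_buggy`, which matches the logged value in the original report. Under `update_fixed` with a 30 second scheduling interval (an assumed figure), the same job would need ten consecutive violating passes before the limit is reached.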
These fixes are available in the latest snapshot and will also be included in the next Maui patch.

-- 
Joshua Butikofer
Cluster Resources, Inc.
[EMAIL PROTECTED]
Voice: (801) 717-3707
Fax: (801) 717-3738
--------------------------

Nick Sonneveld wrote:
> Hi guys,
>
> I think I've found a bug in Maui. Is this the right place to post?
>
> Maui does not wait the full extended violation time if the job has been
> idle in the queue for a while. If the job starts violating a resource
> restriction immediately when it starts, then it will be killed
> immediately instead of after the violation time. This does not happen
> if the job has not been waiting in the queue for long.
>
> The problem is that MaxViolationTime doesn't take into account the time
> the job is in the queue.
>
> To find this out, I inserted a line into maui to print out J->RULVTime
> and P->ResourceLimitMaxViolationTime[VRes]:
>
> 04/12 19:57:01 MSysRegEvent(For job '4660' , is J->RULVTime (45426) <
> P->ResourceLimitMaxViolationTime[VRes] (300) ? ,0,0,1)
>
> J->RULVTime was a very large number despite the fact that the job had
> only just started.
>
> Fix suggestion, reset J->RULVTime somewhere when the job starts?

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
