There was a problem with the way Maui was calculating J->RULVTime. It
was being incremented each scheduling iteration by the number of
iterations Moab had processed up to that point. This makes no sense,
since the max violation time is time-based, not iteration-based. I also
added code that resets RULVTime when a violation is no longer detected.
These issues have been fixed and, I believe, you should see better
results without having to reset RULVTime at job start.
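
To illustrate the difference, here is a minimal sketch in C. It is not
the actual Maui patch: job_t, update_by_iteration(), update_by_time()
and should_cancel() are hypothetical names made up for this example, and
only the RULVTime field name and the comparison against the max
violation time come from this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the job structure; not Maui's real type. */
typedef struct {
    long RULVTime;  /* accumulated resource-limit violation time */
} job_t;

/* Old behaviour as described above (assumed shape): each pass adds the
 * scheduler's iteration count, so the value depends on how long the
 * scheduler has been running rather than on how long the job has been
 * in violation. */
void update_by_iteration(job_t *J, long iteration, int in_violation)
{
    if (in_violation)
        J->RULVTime += iteration;
}

/* Time-based accounting: add the seconds elapsed since the previous
 * pass while the job is in violation, and reset the counter as soon as
 * no violation is detected. */
void update_by_time(job_t *J, long elapsed_sec, int in_violation)
{
    assert(elapsed_sec >= 0);
    if (in_violation)
        J->RULVTime += elapsed_sec;
    else
        J->RULVTime = 0;
}

/* Nonzero once the accumulated violation time reaches the limit. */
int should_cancel(const job_t *J, long max_violation_time)
{
    return J->RULVTime >= max_violation_time;
}
```

With a 300 second limit, a single update_by_iteration() call at
iteration 45426 already exceeds the limit, while update_by_time() with a
30 second pass leaves the job well under it.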

These fixes are available in the latest snapshot and will also be
included in the next Maui patch.

Get the latest snapshot at:
http://www.clusterresources.com/downloads/maui/temp/maui-3.2.6p20-snap.1176920941.tar.gz

--
Joshua Butikofer
Cluster Resources, Inc.

Voice: (801) 717-3707
Fax:   (801) 717-3738
--------------------------


Nick Sonneveld wrote:
Hi guys,

I think I've found a bug in Maui.  Is this the right place to post?

Maui does not wait the full extended violation time if the job has been
idle in the queue for a while. If the job starts violating a resource
restriction immediately when it starts, then it will be killed
immediately instead of after the violation time. This does not happen
if the job has not been waiting in the queue for long.

The problem is that MaxViolationTime doesn't take into account the time
the job is in the queue.

To find this out, I inserted a line into Maui to print out J->RULVTime
and P->ResourceLimitMaxViolationTime[VRes]:
04/12 19:57:01 MSysRegEvent(For job '4660' , is J->RULVTime (45426) <
P->ResourceLimitMaxViolationTime[VRes] (300)  ? ,0,0,1)
J->RULVTime was a very large number despite the fact that the job had
only just started.

Suggested fix: reset J->RULVTime somewhere when the job starts?



_______________________________________________
mauiusers mailing list
http://www.supercluster.org/mailman/listinfo/mauiusers
