We are running Maui-3.2.6p14 along with Torque-2.1.8 on a cluster of over 100 nodes.

We have the following "Standing Reservation" configured:


     SRCFG[development] STARTTIME=8:00:00 ENDTIME=18:00:00
     SRCFG[development] ACCESS=DEDICATED
     SRCFG[development] NODEFEATURES=prod
     SRCFG[development] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI
     SRCFG[development] PRIORITY=200
     SRCFG[development] TASKCOUNT=14
     SRCFG[development] MAXTIME=2:00:00


Sometimes we also need to reserve the nodes for maintenance. For this we
normally use Administrative Reservations, created with a command such as:


    setres -u root -s 9:00_06/25 -d 3:00 ALL


On occasion these "Administrative Reservations" simply disappear, and it is now happening repeatedly.

After perusing the log files, it appears that the Administrative Reservations
disappear when Maui decides to "re-shuffle" the Standing Reservations. I have included a "snapshot" of the relevant portion of the log file at the end of this message.

From the log files it looks almost as if the "Standing Reservations"
preempt the existing "Administrative Reservation" (see the lines marked with * in the maui.log excerpt).

This happens even to Administrative Reservations that are not due to start for days.
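
One way to check how often this pattern occurs would be to scan maui.log for every MResDestroy call that directly follows an MResPreempt call. The sketch below is only an illustration: the function name `preempted_destroys` is made up, the line format is assumed to match the excerpt in this message, and the optional leading "*" is the hand-added marker, not something Maui writes.

```python
import re

# Matches lines such as "06/21 11:10:01 MResDestroy(root.0)".
# The optional leading "*" is the marker added by hand in the excerpt.
CALL = re.compile(
    r"^\*?\s*(\d\d/\d\d \d\d:\d\d:\d\d)\s+(MResPreempt|MResDestroy)\(([^)]*)\)"
)


def preempted_destroys(lines):
    """Yield (timestamp, preemptor, destroyed) for each MResDestroy line
    that immediately follows an MResPreempt line."""
    previous = None  # (function, argument) parsed from the preceding line
    for line in lines:
        match = CALL.match(line)
        if match is None:
            previous = None  # any other log line breaks the pair
            continue
        stamp, func, arg = match.groups()
        if func == "MResDestroy" and previous and previous[0] == "MResPreempt":
            yield stamp, previous[1], arg
        previous = (func, arg)


# Lines copied from the maui.log excerpt in this message.
SAMPLE = [
    "   06/21 11:10:01 MResDestroy(development.0.0)",
    "   06/21 11:10:01 MResChargeAllocation(development.0.0,2)",
    "*  06/21 11:10:01 MResPreempt(development.0.0)",
    "*  06/21 11:10:01 MResDestroy(root.0)",
    "*  06/21 11:10:01 MResChargeAllocation(root.0,2)",
]

if __name__ == "__main__":
    for stamp, preemptor, destroyed in preempted_destroys(SAMPLE):
        print(f"{stamp}: {destroyed} destroyed right after MResPreempt({preemptor})")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `preempted_destroys(open("maui.log"))`; the path is a placeholder for wherever the log lives on the server.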


Has anyone seen this before?

(I am not sure whether this has anything to do with the disappearance, but I
 also noticed that the administrative reservations are created with their
 Flags set to "PREEMPTEE", and I can't seem to change this.

 I have lowered the priority of the standing reservation from 200 to 0
 to see if this helps.)



maui.log:

   06/21 11:10:01 WARNING:  job 'development.0' has NULL cred list
   06/21 11:10:01 INFO:     adequate tasks found for all reqs (time 00:00:00)
   06/21 11:10:01 MJobNLDistribute(development.0,SrcMNL,DstMNL)
   06/21 11:10:01 INFO:     tasks found for job development.0 (tasks requested: 14)
   06/21 11:10:01 MJobAllocMNL(development.0,MFeasibleList,NodeMap,MOutList,MINRESOURCE,1182449401)
   06/21 11:10:01 INFO:     tasks located for job development.0:  3 of 3 required (3 feasible)
   06/21 11:10:01 INFO:     allocated MNode[000]x1 'node004' to development.0:0
   06/21 11:10:01 INFO:     allocated MNode[001]x1 'node002' to development.0:0
   06/21 11:10:01 INFO:     allocated MNode[002]x1 'node001' to development.0:0
   06/21 11:10:01 MResDestroy(development.0.0)
   06/21 11:10:01 MResChargeAllocation(development.0.0,2)
   06/21 11:10:01 MSysRegEvent(RESERVATIONDESTROYED:  development.0.0 User 1182449401 1182449385 1182474000 0   ,0,0,1)
   06/21 11:10:01 MSysLaunchAction(ASList,1)
   06/21 11:10:01 MResCreate(User,ACL,NULL,514,NodeList,1182449401,1182474000,3,0,development.0,ResP,'',DRes)
   06/21 11:10:01 WARNING:  partial standing reservation development reserved 12 of 56 procs in partition   '[ALL]' to start in 00:00:00 at (1182449401) Thu Jun 21 11:10:01
*  06/21 11:10:01 MResPreempt(development.0.0)
*  06/21 11:10:01 MResDestroy(root.0)
*  06/21 11:10:01 MResChargeAllocation(root.0,2)
*  06/21 11:10:01 MSysRegEvent(RESERVATIONDESTROYED:  root.0 User 1182449401 1182787200 1182794400 0   ,0,0,1)
   06/21 11:10:01 MSysLaunchAction(ASList,1)
   06/21 11:10:01 MSRSetRes(development,0,0)
   06/21 11:10:01 MJobSetCreds(development.0,[ALL],[ALL],[ALL])
   06/21 11:10:01 MSRGetAttributes(development,0,Start,Duration)
   06/21 11:10:01 INFO:     attempting standing reservation of 56 procs in 00:00:00 for 6:49:59


