Dear Maui and Torque fans

I think I can answer my own question, after I dug out some
information from the Maui parameters list
(Appendix F, Maui Admin Guide).

Please, correct me if I am wrong.

The Maui parameter DEFERTIME defaults to one hour (01:00:00).
This is the time a deferred job waits before starting,
if I understood the Guide right.
Somehow, a job that is on hold because it depends on a previous job
seems to become deferred after the previous job finishes.
Therefore, it sits in the queue for another hour; only after that
time elapses does it become ready to run, if resources are available.
(Is the latter job state called 'idle' state?)
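
For reference, the kind of dependency chain I am talking about can be
sketched roughly as below. This is only an illustration, not my actual
submission script: submit_chain and the QSUB override are hypothetical
names, and the sketch assumes qsub prints the new job id on stdout.

```shell
#!/bin/sh
# Submit job scripts as a chain: each job waits until the previous one ends.
# QSUB is a hypothetical override so the chain can be dry-run without Torque.
QSUB="${QSUB:-qsub}"

submit_chain() {
    prev=""
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # First job in the chain has no dependency.
            prev=$($QSUB "$script") || return 1
        else
            # Later jobs start only after the previous job has ended,
            # whatever its exit status (afterany).
            prev=$($QSUB -W depend=afterany:"$prev" "$script") || return 1
        fi
        # Report the job id returned by the submit command.
        echo "$prev"
    done
}
```

Something like "submit_chain step1.sh step2.sh step3.sh" would then queue
three jobs, each held until the one before it finishes.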

Hence, setting DEFERTIME in maui.cfg to a smaller value
(I used one minute) did the trick I wanted:

DEFERTIME                       00:01:00

No more 'one hour delay'.
Just 'one minute delay'.
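
In case it helps anyone checking their own setup, here is a small sketch
for reading back what a maui.cfg file sets for a given parameter.
maui_param is a hypothetical helper of mine, not a Maui command. It
assumes one "NAME value" pair per line, matches the name exactly as
typed, and ignores lines commented out with '#'.

```shell
#!/bin/sh
# Print the last uncommented value set for a parameter in a Maui config file.
# maui_param is a hypothetical helper, not part of Maui.
maui_param() {
    # $1 = parameter name (e.g. DEFERTIME), $2 = path to maui.cfg
    awk -v key="$1" '
        # Commented lines never match: their first field starts with "#".
        $1 == key { val = $2 }
        # Exit non-zero when the parameter is not set in the file.
        END { if (val != "") print val; else exit 1 }
    ' "$2"
}
```

For example, "maui_param DEFERTIME /path/to/maui.cfg" (the path is an
assumption, use wherever your maui.cfg lives) should print 00:01:00
after the change above. If it prints nothing and returns non-zero, the
file does not set the parameter, so the default presumably applies.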

A couple of questions to the Maui experts:

1) I am not sure if this is the right way to solve the problem.
Is there a better alternative?

2) Can I expect any impact or side effect on other policies
after the change above?

Many thanks,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


Gus Correa wrote:
> Dear Torque and Maui Pros
> 
> I am using Torque 2.4.11 and Maui 2.3.6p21,
> both built from source on a CentOS 5.5 x86_64 cluster.
> 
> I queued up several jobs with the -W attribute
> (-W depend=afterany:${PREVIOUS_JOB_NUMBER}).
> However, they only start one hour after the previous job has finished.
> This 'one hour delay' happens when the nodes are idle,
> with plenty of resources available to run the job.
> 
> Is this a misconfiguration of Maui or Torque?
> Is there a fix?
> 
> ntp is running, and the clocks on the head and compute nodes are synchronized.
> 
> My maui.cfg and the output of qmgr -c 'p s'
> are enclosed below.
> 
> Thank you,
> Gus Correa
> 
> #####################
> $ qmgr -c 'p s'
> #####################
> 
> create queue batch
> set queue batch queue_type = Execution
> set queue batch resources_default.nodes = 1
> set queue batch enabled = True
> set queue batch started = True
> 
> set server scheduling = True
> set server acl_host_enable = False
> set server acl_hosts = my.computer.dot
> set server managers = [email protected]
> set server managers += [email protected]
> set server operators = [email protected]
> set server operators += [email protected]
> set server default_queue = batch
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server scheduler_iteration = 60
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server keep_completed = 1200
> set server allow_node_submit = True
> set server next_job_number = XXXXX
> 
> ############################
> # maui.cfg
> ############################
> 
> # maui.cfg 3.2.6p20
> 
> RMPOLLINTERVAL        00:00:30
> 
> SERVERHOST            my.computer.dot
> SERVERPORT            42559
> SERVERMODE            NORMAL
> 
> RMCFG[base]           TYPE=PBS
> 
> AMCFG[bank]  TYPE=NONE
> 
> ADMIN1                maui root
> 
> LOGFILE               maui.log
> LOGFILEMAXSIZE        10000000
> LOGLEVEL              3
> 
> QUEUETIMEWEIGHT       1
> 
> # BACKFILLPOLICY        FIRSTFIT
> BACKFILLPOLICY        BESTFIT
> RESERVATIONPOLICY     CURRENTHIGHEST
> 
> NODEALLOCATIONPOLICY  MINRESOURCE
> 
> JOBPRIOACCRUALPOLICY  FULLPOLICY
> USERCFG[DEFAULT]      MAXNODE=16,16
> USERCFG[DEFAULT]      MAXPROC=256,256
> 
> ENABLEMULTIREQJOBS   TRUE
> 
> JOBNODEMATCHPOLICY EXACTNODE
> 
> ####################################################
> _______________________________________________
> mauiusers mailing list
> [email protected]
> http://www.supercluster.org/mailman/listinfo/mauiusers
