Hi Bas,

> we just limit the number of jobs a user can submit in an execution queue, for
> example for a 512 node cluster. We have set for the serial queue.

This might be a valid approach. Let me try to summarize:

* Configuration (with x <= number of nodes)
  * Execution queues (max_user_queuable = x): queue_1, queue_2, ..., queue_n
  * Routing queue: queue_default
* Workflow
  * User 1 submits y >> x jobs to the default queue
  * User 2 submits z jobs to the default queue
* Scheduling
  * Maui only sees x*n jobs (so the hard limit of about 4096 jobs would be ok)
  * The user can submit as many jobs as he wants
  * Torque moves the jobs to the execution queues based on a fair scheduling configuration

Best regards, Alex

On 13.02.2011, at 00:20, Bas van der Vlies wrote:

> Alexander,
>
> On 12 feb 2011, at 13:49, Alexander Willner wrote:
>
>> Hi Roy,
>>
>> thank you for your answer.
>>
>> On 11.02.2011, at 22:03, Roy Dragseth wrote:
>>> We have upped the job limit significantly, we currently set the limit to
>>> 32000,
>>> but you need to recompile maui for this.
>>
>> How exactly have you achieved this? I already pushed the limit to 16384 by
>> following:
>>
>> On Friday, February 11, 2011 17:29:39 Alexander Willner wrote:
>>> (even though I've tested [2])
>>
>> I recompiled the sources, installed them and restarted maui. Still I only
>> have a short list of queued jobs:
>>
>>> $ qstat|wc -l
>>> 9482
>>
>>> $ /usr/local/maui/bin/showq|wc -l
>>> 3773
>>> $ qstat|tail -n1
>>> 625162.xxxx xxxx xxxx x xxxxx xxxxx
>>> $ runjob 625162
>>> ERROR: 'runjob' failed
>>> ERROR: cannot locate job '625162'
>>
>>
>> Best regards, Alex
>>
>> [2] http://www.supercluster.org/pipermail/mauiusers/2007-April/002705.html
>>
>> --
>> net.cs.bonn.edu/willner
>>
> we just limit the number of jobs a user can submit in an execution queue, for
> example for a 512 node cluster. We have set for the serial queue.
> {{{
> create queue q_serial
> set queue q_serial queue_type = Execution
> set queue q_serial max_user_queuable = 512
> set queue q_serial acl_host_enable = False
> set queue q_serial resources_max.nodect = 1
> set queue q_serial resources_default.ncpus = 1
> set queue q_serial resources_default.neednodes = q_serial
> set queue q_serial resources_default.nodes = 1
> set queue q_serial enabled = True
> set queue q_serial started = True
> }}}
>
> A user can not run more than 512 jobs; this is equal to the number of nodes in
> the cluster. The other jobs are held in the routing queue. So every time a
> job has finished, a new job can enter the execution queue. The jobs are
> held by torque so maui does not see all jobs. So this will prevent flooding
> the maui queues.
>
> regards
>
> --
> Bas van der Vlies
> [email protected]

--
net.cs.bonn.edu/willner
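
P.S. The quoted configuration only shows the execution queue; the routing queue from the summary is not part of it. A minimal sketch of how it might look in qmgr, assuming the placeholder name queue_default from the summary (not a queue confirmed to exist on either cluster) and q_serial as the only destination:

```
# Hypothetical routing queue; queue_default is the placeholder name from the summary
create queue queue_default
set queue queue_default queue_type = Route
# Destination execution queue; q_serial is the queue from the quoted configuration
set queue queue_default route_destinations = q_serial
set queue queue_default enabled = True
set queue queue_default started = True
# Jobs submitted without an explicit queue go to the routing queue
set server default_queue = queue_default
```

Further execution queues could be appended with route_destinations += <queue>. Whether jobs that exceed max_user_queuable really wait in the routing queue and are retried later probably depends on the Torque version and its settings, so this should be checked on a test system first.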
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
