Hi Bas,

> We just limit the number of jobs a user can submit to an execution queue, for 
> example on a 512-node cluster. This is what we have set for the serial queue.

This might be a valid approach. Let me try to summarize:

* Configuration (with x <= number of nodes)
  * Execution queues (max_user_queuable = x): queue_1, queue_2, ..., queue_n
  * Routing queue: queue_default
* Workflow
  * User 1 submits y >> x jobs to the default queue
  * User 2 submits z jobs to the default queue
* Scheduling
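  * Maui sees at most x*n jobs per user, since max_user_queuable applies per
    user and per queue (so the hard limit of about 4096 jobs would be OK as
    long as the total over all users stays below it)
  * Each user can still submit as many jobs as they want
  * Torque moves the jobs to the execution queues based on a fair scheduling
    configuration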
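
The routing part could be sketched roughly as follows. This is a minimal,
untested sketch: queue_default, queue_1 and queue_2 are the placeholder names
from the summary above, the execution queues are assumed to exist already with
max_user_queuable set, and the retry time is only an example value.

{{{
# Routing queue that accepts all submissions (placeholder name)
create queue queue_default
set queue queue_default queue_type = Route
# Execution queues to try, in order (placeholder names)
set queue queue_default route_destinations = queue_1
set queue queue_default route_destinations += queue_2
# Seconds between routing attempts (example value, an assumption)
set queue queue_default route_retry_time = 30
set queue queue_default enabled = True
set queue queue_default started = True
# Jobs submitted without -q go to the routing queue
set server default_queue = queue_default
}}}

Jobs that are rejected by an execution queue because of max_user_queuable
should then wait in queue_default until a slot frees up, which is the behaviour
described in the quoted message below.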

Best regards, Alex

On 13.02.2011, at 00:20, Bas van der Vlies wrote:

> Alexander,
> 
> On 12 feb 2011, at 13:49, Alexander Willner wrote:
> 
>> Hi Roy,
>> 
>> thank you for your answer. 
>> 
>> On 11.02.2011, at 22:03, Roy Dragseth wrote:
>>> We have upped the job limit significantly, we currently set the limit to 
>>> 32000, 
>>> but you need to recompile maui for this.
>> 
>> How exactly have you achieved this? I already pushed the limit to 16384 by 
>> following:
>> 
>> On Friday, February 11, 2011 17:29:39 Alexander Willner wrote:
>>> (even though I've tested [2])
>> 
>> I recompiled the sources, installed them and restarted maui. Still, I only 
>> see a short list of queued jobs:
>> 
>>> $ qstat|wc -l
>>> 9482
>> 
>>> $ /usr/local/maui/bin/showq|wc -l
>>> 3773
>>> $ qstat|tail -n1
>>> 625162.xxxx  xxxx   xxxx x xxxxx  xxxxx   
>>> $ runjob 625162
>>> ERROR:    'runjob' failed
>>> ERROR:  cannot locate job '625162'
>> 
>> 
>> Best regards, Alex
>> 
>> [2] http://www.supercluster.org/pipermail/mauiusers/2007-April/002705.html
>> 
>> --
>> net.cs.bonn.edu/willner
>> 
> We just limit the number of jobs a user can submit to an execution queue, for 
> example on a 512-node cluster. This is what we have set for the serial queue:
> {{{
> create queue q_serial
> set queue q_serial queue_type = Execution
> set queue q_serial max_user_queuable = 512
> set queue q_serial acl_host_enable = False
> set queue q_serial resources_max.nodect = 1
> set queue q_serial resources_default.ncpus = 1
> set queue q_serial resources_default.neednodes = q_serial
> set queue q_serial resources_default.nodes = 1
> set queue q_serial enabled = True
> set queue q_serial started = True
> }}}
> 
> A user cannot run more than 512 jobs, which is equal to the number of nodes in 
> the cluster. The other jobs are held in the routing queue, so every time a 
> job finishes, a new job can enter the execution queue. The jobs are 
> held by Torque, so Maui does not see all of them. This prevents flooding 
> the Maui queues.
> 
> regards
> 
> --
> Bas van der Vlies
> [email protected]
> 
> 
> 

--
net.cs.bonn.edu/willner

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
