Re: [gridengine users] understanding the slots

Reuti Sun, 07 Apr 2013 04:22:27 -0700

On 06.04.2013 at 04:54, Fan Dong wrote:

> Thanks for the good explanation.
> 
> You happened to bring up another question of mine -- you said '
> 
> For a parallel job you will need to request more than one slot by a parallel 
> environment (PE) depending on the job requirements.
> 
> '
> 
> We have a multithreaded Java app that uses at most 4 cores on a single 
> node.  Do I need a PE for it?


Although you could fool SGE and submit it as a plain serial job, the SGE way 
is to use a parallel environment and let SGE know that it's a parallel job, 
so that resources are allocated and scheduled properly.

For how to set it up, please check `man sge_pe` and 
http://docs.oracle.com/cd/E19080-01/n1.grid.eng6/817-5677/6ml49n2c0/index.html

The "allocation_rule $pe_slots" that a new PE gets by default is the best fit 
for this purpose, as it places all slots of a job on a single host (the same 
would apply to OpenMP or custom threaded programs).
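
As for an example, here is a minimal sketch. The PE name "smp", the slot 
limit of 999 and the job script run_java.sh are placeholders of mine, not 
anything that already exists in your cluster. Only the relevant lines of the 
PE are shown; the other entries can stay at their defaults:

```text
$ qconf -ap smp                          # opens an editor with a new PE template
$ qconf -sp smp
pe_name            smp
slots              999
...
allocation_rule    $pe_slots
...
$ qconf -aattr queue pe_list smp all.q   # attach the (hypothetical) PE to all.q
$ qsub -pe smp 4 run_java.sh             # request 4 slots, all on one host
```

Inside the job, the number of granted slots is available as $NSLOTS. Such a 
job can only start in a queue instance that still has 4 free slots, which 
with 6 or 8 slots per host should not be a problem.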

-- Reuti


>  If yes, can you give an example?
> 
> Thanks again!
> 
> Fan
> 
> 
> On 05/04/2013 4:41 PM, Reuti wrote:
>> Hi,
>> 
>> On 05.04.2013 at 22:21, Fan Dong wrote:
>> 
>>> Maybe someone here can clarify the concept of slots for me. I am very much 
>>> confused by the definition "The maximum number of concurrently executing 
>>> jobs allowed in the queue.  Type is number, valid values are 0 to 9999999.".
>>> 
>>> Is the queue here the cluster queue or a queue instance?  In my particular 
>>> setting, the host group @allhosts contains 4 hosts and 28 CPUs in total.  
>>> Running 'qconf -sq all.q' gives the following by default. The slots value 
>>> is 1 -- what does it really mean?
>>> 
>>> Does that mean 1 active job per queue instance?
>> Yes, in detail: one serial job. You could also phrase it as "slots" = 
>> "allowed user processes" per queue instance, often set to the number of 
>> available cores in an exechost. For a parallel job you will need to request 
>> more than one slot by a parallel environment (PE) depending on the job 
>> requirements.
>> 
>> 
>>>  I have 4 queue instances and at the most I can have 4 jobs running 
>>> concurrently?  I am really not sure....
>>> 
>>> qname                 all.q
>>> hostlist              @allhosts
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              0
>>> min_cpu_interval      00:05:00
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             NONE
>>> pe_list               make
>>> rerun                 FALSE
>>> slots                 1,[comp01=6],[comp02=6], [comp03=8],[comp04=8]
>> 1 is the default for queue instances not listed after it. I.e., if you have 
>> a uniform cluster where each machine has 16 cores, one could write:
>> 
>> $ qconf -sq all.q
>> ...
>> slots 16
>> 
>> Nevertheless:
>> 
>> $ qconf -sq all.q
>> ...
>> slots 42,[@allhosts=16]
>> 
>> would result in the same slot count when all hosts are listed in the 
>> hostgroup @allhosts.
>> 
>> -- Reuti
>> 
>> 
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
> 


