On 24.10.2012 at 09:57, Lars van der bijl wrote:

> On 23 October 2012 22:08, Reuti <[email protected]> wrote:
>> Hi,
>>
>> On 23.10.2012 at 21:41, Lars van der bijl wrote:
>>
>>> I've got 2 queues:
>>>
>>> $ qconf -sq final.q
>>> qname                 final.q
>>> hostlist              @allhosts
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:01:00
>>> pe_list               make smp
>>> rerun                 TRUE
>>>
>>> $ qconf -sq quick.q
>>> qname                 quick.q
>>> hostlist              @allhosts
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:01:00
>>> pe_list               make smp
>>> rerun                 TRUE
>>> subordinate_list      final.q=1
>>>
>>> We have about 325 procs, and both queues have access to the same machines.
>>>
>>> What I'd expect to see is that if I have 200 slots running in final.q
>>> and I submit a task to quick.q, it would suspend the task in final.q
>>> and push the new task in front.
>>> However, what I am seeing is that only 32 slots are being used, and
>>> not all tasks are being pushed in front of final.q.
>>>
>>> We only use parallel submission, in case that makes a difference.
>>>
>>> What could I change to get this behavior?
>>
>> Hard to tell from the information you posted, as I don't know how 32 is
>> in any way related to 325 procs without knowing more details. So some
>> remarks; maybe you can refine the setup or question then:
>>
>> - the subordinate_list will only work per exechost queue instance
>> - in your current setup all slots of the queue instance on a particular
>>   exechost will be suspended as soon as one slot in quick.q is used
>> - (maybe you are looking for a slot-wise subordination?)
>
> The problem I have with the slot-wise setup is that you can only set one
> slot value in the subordinate_list.
> What we have is a lot of 8-core machines, plus a few 4-, 6- and 12-core
> ones, so those would have to go into separate queues, I'd imagine.
>
> We frequently submit the same task with 4 cores or 8 cores,
> using "-pe smp 4" or "-pe smp 8". This makes a slot-wise setup
> difficult, because if it's set to 2 slots per host, then a task
> submitted with 8 procs won't get suspended.
>
>> - jobs in the quick.q don't have a higher priority
>> - it's best not to submit to queues in SGE, but to think of "requesting
>>   resources", and SGE will select an appropriate queue for your job
>>
>> For your setup this could mean defining a BOOL complex "quick" as
>> "requestable FORCED" and attaching it to the quick.q, then requesting
>> "-l quick" (which implies "-l quick=TRUE") and in addition attaching a
>> high "urgency" value to this complex. Then they should also go to the
>> top of the list. And only "quick" jobs will run in this queue.
>
> Thanks, this is a different way of thinking about the problem for me.
>
> To specify which hosts can run a type of job, we currently submit with
> hostgroups, like so:
>
> quick.q@@mantra
Assuming you have a complex "mantra" attached to the exechosts:

-l quick,mantra

or:

-l quick -q "*@@mantra"

Maybe one complex would do already: qmantra is attached only to the @mantra
exechosts, qfoobar only to @foobar.

> Now for other types of task we have a hostgroup setup, because we only
> have 10 licenses for an application. A single machine can run more than
> one of these tasks at a time, but the license is only consumed once per
> host.
> Is there a way to have this setup with a complex?

Unfortunately no. It was an RFE a long time ago to have such a type of
complex, and it came up several times on the list:

http://arc.liv.ac.uk/pipermail/gridengine-users/2010-November.txt
(please search for HOSTONCE)

or

https://arc.liv.ac.uk/trac/SGE/ticket/1318

But as you have mostly an SMP environment: what about submitting in
bunches? I mean: we have access to a remote cluster where always complete
exechosts are granted to a job, even if the job uses only 2 out of 24
cores. This is similar to your setup, as you can start additional
computations on the same machine without the need for another license. I
adjusted our submission scripts so that in one job submission several
tasks are started in the background by &, and later on, after the "wait"
in the jobscript, all results are collected to assemble one email for the
overall outcome of the individual tasks. For best usage, this implies that
all tasks in the job have around the same execution time.

-- Reuti

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
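The "bunching" pattern described above (several tasks started in the
background with &, then "wait", then one combined report) can be sketched
as a minimal jobscript. All names here (the result files, the per-task
command) are illustrative placeholders, not from the original thread;
NSLOTS is the variable SGE sets to the number of granted slots:

```shell
#!/bin/sh
# Sketch of a bunched jobscript: one submission runs several tasks on the
# same exechost, so a per-host license is only consumed once.
#$ -pe smp 4
# NSLOTS is set by SGE inside a job; default to 4 for a standalone dry run.
NSLOTS=${NSLOTS:-4}

for i in $(seq 1 "$NSLOTS"); do
    # Placeholder for the real per-task command; each task runs in the
    # background and writes its own result file.
    ( echo "task $i done" > "result.$i.txt" ) &
done

wait    # proceed only after every background task has finished

# Collect all individual results into one report (e.g. for a summary email).
cat result.*.txt > report.txt
```

As noted above, this works best when the bunched tasks have roughly the
same execution time, otherwise the slowest task delays the whole job.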
