Yes, suspending the job when all 12 slots are used on a particular host is what I want. So I tried submitting a job that uses 12 slots, but it did not work. It is still not working.

--Sangmin

On Mon, Oct 28, 2013 at 9:47 PM, Reuti <[email protected]> wrote:

> On 28.10.2013 at 13:45, Sangmin Park wrote:
>
> > This is the RQS
> >
> > limit hosts {@parallelhosts} to slots=$num_proc
> > limit queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
> > parallelhosts includes matlabhosts.
> >
> > The slots value in matlab.q means the number of cores per node.
> >
> > All hosts are included in parallelhosts, node1 ~ node30.
> > matlabhosts includes node1 ~ node7.
> > short.q, normal.q and long.q can be used on node1 ~ node7.
> >
> > What I want is this: when jobs in short.q, normal.q and long.q are
> > running and a matlab job is submitted, a running job that is not
> > using matlab.q on node1 ~ node7 is suspended and the matlab job runs.
> > This is what I want to set up.
> >
> > I don't understand why this cannot happen if I set the slots value to 12.
>
> It will suspend the job when all 12 slots are used on a particular host.
> You may want to try with 1 instead. As a refinement, you could also look
> into slotwise subordination.
>
> -- Reuti
>
> > --Sangmin
> >
> >
> > On Mon, Oct 28, 2013 at 8:58 PM, Reuti <[email protected]> wrote:
> > On 28.10.2013 at 12:30, Sangmin Park wrote:
> >
> > > I've edited the negative values in the priority section: short.q is 4,
> > > normal.q is 6 and long.q is 8, respectively.
> > > And I configured 72 cores for each queue.
> >
> > But you didn't answer the question: How do you limit the overall slot
> > count? RQS or a definition in the exechost?
> >
> > > Below are the matlab.q instance details.
> > > qname                 matlab.q
> > > hostlist              @matlabhosts
> > > seq_no                0
> > > load_thresholds       np_load_avg=1.75
> > > suspend_thresholds    NONE
> > > nsuspend              1
> > > suspend_interval      00:05:00
> > > priority              2
> > > min_cpu_interval      00:05:00
> > > processors            UNDEFINED
> > > qtype                 BATCH INTERACTIVE
> > > ckpt_list             NONE
> > > pe_list               fill_up make matlab
> > > rerun                 FALSE
> > > slots                 12
> > > tmpdir                /tmp
> > > shell                 /bin/bash
> > > prolog                NONE
> > > epilog                NONE
> > > shell_start_mode      posix_compliant
> > > starter_method        NONE
> > > suspend_method        NONE
> > > resume_method         NONE
> > > terminate_method      NONE
> > > notify                00:00:60
> > > owner_list            NONE
> > > user_lists            octausers onsiteusers
> > > xuser_lists           NONE
> > > subordinate_list      short.q=72, normal.q=72, long.q=72
> >
> > This will suspend these three queues when 72 slots per queue instance
> > in matlab.q are used. As you have only 12 defined above, this will
> > never happen.
> >
> > What behavior would you like to set up?
> >
> > -- Reuti
> >
> >
> > > complex_values        NONE
> > > projects              NONE
> > > xprojects             NONE
> > > calendar              NONE
> > > initial_state         default
> > > s_rt                  INFINITY
> > > h_rt                  168:00:00
> > > s_cpu                 INFINITY
> > > h_cpu                 INFINITY
> > > s_fsize               INFINITY
> > > h_fsize               INFINITY
> > > s_data                INFINITY
> > > h_data                INFINITY
> > > s_stack               INFINITY
> > > h_stack               INFINITY
> > > s_core                INFINITY
> > > h_core                INFINITY
> > > s_rss                 INFINITY
> > > h_rss                 INFINITY
> > > s_vmem                INFINITY
> > > h_vmem                INFINITY
> > >
> > > thanks,
> > >
> > > --Sangmin
> > >
> > >
> > > On Mon, Oct 28, 2013 at 3:51 PM, Reuti <[email protected]> wrote:
> > > Hi,
> > >
> > > On 28.10.2013 at 06:40, Sangmin Park wrote:
> > >
> > > > Thanks, Adam
> > > >
> > > > I set up the SGE queue configuration following the second link you gave.
> > > > But it does not work.
> > > >
> > > > I made 4 queues: short.q, normal.q, long.q and matlab.q.
> > > > short.q, normal.q and long.q queue instances run on all
> > > > computing nodes, node1 ~ node30.
> > > > The matlab.q instance is configured only for a few nodes,
> > > > node1 ~ node7, called matlabhosts.
> > > >
> > > > The priorities of each queue are below.
> > > > [short.q]
> > > > priority -5
> > >
> > > Don't use negative values here. This number is the "nice value" under
> > > which the Linux kernel will run the process (i.e. it is used by the
> > > scheduler in the kernel; for SGE it doesn't influence the scheduling).
> > > User processes should be in the range 0..19 [20 on Solaris]. The
> > > negative ones are reserved for kernel processes.
> > >
> > >
> > > > subordinate_list NONE
> > > > [normal.q]
> > > > priority 0
> > > > subordinate_list NONE
> > > > [long.q]
> > > > priority 5
> > > > subordinate_list NONE
> > > >
> > > > and matlab.q is
> > > > priority -10
> > > > subordinate_list short.q normal.q long.q
> > >
> > > Same here. It's also worth noting that these values are relative.
> > > I.e. with the same number of user processes and cores, it doesn't
> > > matter which values are used as nice values, as each process gets its
> > > own core anyway. Only when there are more processes than cores will it
> > > have an effect. But as these are relative values, it's the same whether
> > > (cores+1) processes all have 0 or all have 19 as nice value.
> > >
> > >
> > > > I submitted several jobs using normal.q to the matlabhosts,
> > > > and I submitted a job using matlab.q, which has the subordinate_list.
> > > > I expected one of the normal.q jobs to be suspended and the matlab.q
> > > > job to run.
> > > > But the matlab.q job waits in the queue with status qw; it is not started.
> > > >
> > > > What's the matter with this?
> > > > Please help!!
> > >
> > > http://gridengine.org/pipermail/users/2013-October/006820.html
> > >
> > > How do you limit the overall slot count?
> > >
> > > -- Reuti
> > >
> > >
> > > > Sangmin
> > > >
> > > >
> > > > On Tue, Oct 15, 2013 at 3:50 PM, Adam Brenner <[email protected]> wrote:
> > > > Sangmin,
> > > >
> > > > I believe the phrase / term you are looking for is Subordinate
> > > > Queues[1][2]. This should handle what you are looking for.
> > > >
> > > > If not ... I am sure Reuti (or someone else) will correct me on this.
> > > >
> > > > Enjoy,
> > > > -Adam
> > > >
> > > > [1]: http://docs.oracle.com/cd/E19957-01/820-0698/i998889/index.html
> > > > [2]: http://grid-gurus.blogspot.com/2011/03/using-grid-engine-subordinate-queues.html
> > > >
> > > > --
> > > > Adam Brenner
> > > > Computer Science, Undergraduate Student
> > > > Donald Bren School of Information and Computer Sciences
> > > >
> > > > Research Computing Support
> > > > Office of Information Technology
> > > > http://www.oit.uci.edu/rcs/
> > > >
> > > > University of California, Irvine
> > > > www.ics.uci.edu/~aebrenne/
> > > > [email protected]
> > > >
> > > >
> > > > On Mon, Oct 14, 2013 at 11:18 PM, Sangmin Park <[email protected]> wrote:
> > > > > Howdy,
> > > > >
> > > > > For a specific purpose in my organization,
> > > > > I want to configure something in the SGE scheduler.
> > > > >
> > > > > Imagine:
> > > > > a job is running, called A-job.
> > > > > If B-job is submitted while A-job is running,
> > > > > I want to hold A-job and run B-job first.
> > > > > And after B-job is finished, restart A-job.
> > > > >
> > > > > What do I do for this?
> > > > >
> > > > > Sangmin
> > > > >
> > > > > --
> > > > > ===========================
> > > > > Sangmin Park
> > > > > Supercomputing Center
> > > > > Ulsan National Institute of Science and Technology(UNIST)
> > > > > Ulsan, 689-798, Korea
> > > > >
> > > > > phone : +82-52-217-4201
> > > > > mobile : +82-10-5094-0405
> > > > > fax : +82-52-217-4209
> > > > > ===========================
> > > > >
> > > > > _______________________________________________
> > > > > users mailing list
> > > > > [email protected]
> > > > > https://gridengine.org/mailman/listinfo/users

--
===========================
Sangmin Park
Supercomputing Center
Ulsan National Institute of Science and Technology(UNIST)
Ulsan, 689-798, Korea

phone : +82-52-217-4201
mobile : +82-10-5094-0405
fax : +82-52-217-4209
===========================
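
As a rough illustration of the threshold rule Reuti describes in the quoted replies (subordinated queues on a host are suspended once the number of used slots in the matlab.q instance on that host reaches the value given in subordinate_list), here is a minimal Python sketch. It is only a model of that comparison, not Grid Engine code: the function names `subordinates_suspended` and `threshold_reachable` are made up for this example, and the sketch says nothing about whether the matlab.q job can be dispatched in the first place. If the job stays in qw for any reason, no matlab.q slots are used and nothing is suspended.

```python
def subordinates_suspended(used_slots: int, threshold: int) -> bool:
    """Model of the rule from the thread: the subordinated queue instances
    on a host are suspended once the superordinate queue instance on that
    host has at least `threshold` slots in use."""
    return used_slots >= threshold


def threshold_reachable(queue_slots: int, threshold: int) -> bool:
    """A threshold larger than the queue's own slot count can never be met."""
    return threshold <= queue_slots


MATLAB_Q_SLOTS = 12  # "slots 12" from the matlab.q definition quoted above

# The three thresholds discussed in the thread: 72 (original setting),
# 12 (all slots of the queue instance) and 1 (Reuti's suggestion).
for threshold in (72, 12, 1):
    reachable = threshold_reachable(MATLAB_Q_SLOTS, threshold)
    first_trigger = threshold if reachable else None
    print(f"threshold={threshold}: reachable={reachable}, "
          f"suspends at used_slots={first_trigger}")
```

Under this model, 72 can never trigger on a 12-slot queue instance, 12 triggers only when matlab.q jobs fill the whole host, and 1 triggers as soon as any matlab.q job occupies a slot there.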
