On 28.10.2013 at 13:59, Sangmin Park wrote:

> Yes, suspending the job when all 12 slots are used on a particular host is
> what I want.
> So I tried to submit a job using 12 slots, but it did not work.

Aha, it might be necessary to change the order of the rules in your RQS. The
first matching rule decides whether the job is allowed or denied. I.e. if all
slots are used, the (current) first rule matches and the job is rejected.

-- Reuti

> Still not working.
>
> --Sangmin
>
>
> On Mon, Oct 28, 2013 at 9:47 PM, Reuti <[email protected]> wrote:
> On 28.10.2013 at 13:45, Sangmin Park wrote:
>
> > This is the RQS:
> >
> > limit hosts {@parallelhosts} to slots=$num_proc
> > limit queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
> >
> > parallelhosts includes matlabhosts.
> >
> > The slots value in matlab.q means the number of cores per node.
> >
> > All hosts are included in parallelhosts, node1 ~ node30.
> > matlabhosts includes node1 ~ node7.
> > short.q, normal.q and long.q can be used on node1 ~ node7.
> >
> > What I want to set up: while jobs in short.q, normal.q and long.q are
> > running, if a matlab job is submitted, the running jobs that do not use
> > matlab.q on node1 ~ node7 are suspended and the matlab job is run.
> >
> > I don't understand why this cannot happen if I set the slots value to 12.
>
> It will suspend the job when all 12 slots are used on a particular host. You
> may want to try with 1 instead. As a refinement, you could also look into
> slotwise subordination.
>
> -- Reuti
>
> >
> > --Sangmin
> >
> >
> > On Mon, Oct 28, 2013 at 8:58 PM, Reuti <[email protected]> wrote:
> > On 28.10.2013 at 12:30, Sangmin Park wrote:
> >
> > > I've edited the negative values in the priority section: short.q is 4,
> > > normal.q is 6 and long.q is 8, respectively.
> > > And I configured 72 cores for each queue.
> >
> > But you didn't answer the question: How do you limit the overall slot
> > count? An RQS or a definition in the exechost?
> >
> > > Below are the matlab.q instance details.
> > > qname                 matlab.q
> > > hostlist              @matlabhosts
> > > seq_no                0
> > > load_thresholds       np_load_avg=1.75
> > > suspend_thresholds    NONE
> > > nsuspend              1
> > > suspend_interval      00:05:00
> > > priority              2
> > > min_cpu_interval      00:05:00
> > > processors            UNDEFINED
> > > qtype                 BATCH INTERACTIVE
> > > ckpt_list             NONE
> > > pe_list               fill_up make matlab
> > > rerun                 FALSE
> > > slots                 12
> > > tmpdir                /tmp
> > > shell                 /bin/bash
> > > prolog                NONE
> > > epilog                NONE
> > > shell_start_mode      posix_compliant
> > > starter_method        NONE
> > > suspend_method        NONE
> > > resume_method         NONE
> > > terminate_method      NONE
> > > notify                00:00:60
> > > owner_list            NONE
> > > user_lists            octausers onsiteusers
> > > xuser_lists           NONE
> > > subordinate_list      short.q=72, normal.q=72, long.q=72
> >
> > This will suspend these three queues when 72 slots per queue instance of
> > matlab.q are used. As you have only 12 defined above, this will never happen.
> >
> > What behavior would you like to set up?
> >
> > -- Reuti
> >
> > > complex_values        NONE
> > > projects              NONE
> > > xprojects             NONE
> > > calendar              NONE
> > > initial_state         default
> > > s_rt                  INFINITY
> > > h_rt                  168:00:00
> > > s_cpu                 INFINITY
> > > h_cpu                 INFINITY
> > > s_fsize               INFINITY
> > > h_fsize               INFINITY
> > > s_data                INFINITY
> > > h_data                INFINITY
> > > s_stack               INFINITY
> > > h_stack               INFINITY
> > > s_core                INFINITY
> > > h_core                INFINITY
> > > s_rss                 INFINITY
> > > h_rss                 INFINITY
> > > s_vmem                INFINITY
> > > h_vmem                INFINITY
> > >
> > > thanks,
> > >
> > > --Sangmin
> > >
> > >
> > > On Mon, Oct 28, 2013 at 3:51 PM, Reuti <[email protected]> wrote:
> > > Hi,
> > >
> > > On 28.10.2013 at 06:40, Sangmin Park wrote:
> > >
> > > > Thanks, Adam
> > > >
> > > > I set up the SGE queue configuration following the second link you
> > > > mentioned. But it does not work.
> > > >
> > > > I made 4 queues: short.q, normal.q, long.q and matlab.q.
> > > > The short.q, normal.q and long.q queue instances run on all computing
> > > > nodes, node1 ~ node30.
> > > > The matlab.q instance is configured only for a few nodes, node1 ~ node7,
> > > > called matlabhosts.
> > > >
> > > > The priorities of the queues are below.
> > > > [short.q]
> > > > priority -5
> > >
> > > Don't use negative values here. This number is the "nice value" under
> > > which the Linux kernel will run the process (i.e. it is used by the
> > > scheduler in the kernel; for SGE it doesn't influence the scheduling).
> > > User processes should be in the range 0..19 [20 on Solaris]. The negative
> > > ones are reserved for kernel processes.
> > >
> > > > subordinate_list NONE
> > > > [normal.q]
> > > > priority 0
> > > > subordinate_list NONE
> > > > [long.q]
> > > > priority 5
> > > > subordinate_list NONE
> > > >
> > > > and matlab.q is
> > > > priority -10
> > > > subordinate_list short.q normal.q long.q
> > >
> > > Same here. It's also worth noting that these values are relative. I.e.
> > > with the same number of user processes and cores, it doesn't matter
> > > which values are used as nice values, as each process gets its own core
> > > anyway. Only when there are more processes than cores will it have an
> > > effect. But as these are relative values, it's the same whether (cores+1)
> > > processes all have 0 or all have 19 as nice value.
> > >
> > > > I submitted several jobs using normal.q to the matlabhosts,
> > > > and I submitted a job using matlab.q, which has the subordinate_list.
> > > > I expected one of the normal.q jobs to be suspended and the matlab.q
> > > > job to run.
> > > > But the matlab.q job waits in the queue with status qw; it is not
> > > > started.
> > > >
> > > > What's the matter with this?
> > > > Please help!!
> > >
> > > http://gridengine.org/pipermail/users/2013-October/006820.html
> > >
> > > How do you limit the overall slot count?
> > >
> > > -- Reuti
> > >
> > > > Sangmin
> > > >
> > > >
> > > > On Tue, Oct 15, 2013 at 3:50 PM, Adam Brenner <[email protected]> wrote:
> > > > Sangmin,
> > > >
> > > > I believe the phrase / term you are looking for is Subordinate
> > > > Queues[1][2]. This should handle what you are looking for.
> > > >
> > > > If not ... I am sure Reuti (or someone else) will correct me on this.
> > > >
> > > > Enjoy,
> > > > -Adam
> > > >
> > > > [1]: http://docs.oracle.com/cd/E19957-01/820-0698/i998889/index.html
> > > > [2]: http://grid-gurus.blogspot.com/2011/03/using-grid-engine-subordinate-queues.html
> > > >
> > > > --
> > > > Adam Brenner
> > > > Computer Science, Undergraduate Student
> > > > Donald Bren School of Information and Computer Sciences
> > > >
> > > > Research Computing Support
> > > > Office of Information Technology
> > > > http://www.oit.uci.edu/rcs/
> > > >
> > > > University of California, Irvine
> > > > www.ics.uci.edu/~aebrenne/
> > > > [email protected]
> > > >
> > > >
> > > > On Mon, Oct 14, 2013 at 11:18 PM, Sangmin Park <[email protected]> wrote:
> > > > > Howdy,
> > > > >
> > > > > For a specific purpose in my organization,
> > > > > I want to configure something in the SGE scheduler.
> > > > >
> > > > > Imagine:
> > > > > a job is running, called A-job.
> > > > > If B-job is submitted while A-job is running,
> > > > > I want to hold A-job and run B-job first.
> > > > > And after B-job is finished, restart A-job.
> > > > >
> > > > > What do I do for this?
> > > > >
> > > > > Sangmin
> > > > >
> > > > > --
> > > > > ===========================
> > > > > Sangmin Park
> > > > > Supercomputing Center
> > > > > Ulsan National Institute of Science and Technology(UNIST)
> > > > > Ulsan, 689-798, Korea
> > > > >
> > > > > phone : +82-52-217-4201
> > > > > mobile : +82-10-5094-0405
> > > > > fax : +82-52-217-4209
> > > > > ===========================
> > > > >
> > > > > _______________________________________________
> > > > > users mailing list
> > > > > [email protected]
> > > > > https://gridengine.org/mailman/listinfo/users
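
For readers following the thread, here is a minimal sketch of the two changes Reuti suggests, written as configuration fragments. It assumes that both limit lines live in one resource quota set; the set name host_slots is made up for illustration, and the lines starting with # are annotations only, so they may need to be removed before the text is handed to qconf. The thread does not confirm whether these changes were sufficient on this cluster.

```
# Resource quota set (editable with "qconf -mrqs"); the name is hypothetical.
# The more specific rule now comes first, so jobs outside matlab.q on the
# matlabhosts are matched by it, and matlab.q jobs fall through to the
# generic per-host rule.
{
   name         host_slots
   description  NONE
   enabled      TRUE
   limit        queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
   limit        hosts {@parallelhosts} to slots=$num_proc
}

# matlab.q (editable with "qconf -mq matlab.q"): suspend the subordinated
# queue instances on a host as soon as 1 slot of matlab.q is in use there,
# instead of the threshold of 72, which can never be reached with 12 slots.
subordinate_list      short.q=1,normal.q=1,long.q=1
```

With a threshold of 1, a single running matlab.q job suspends short.q, normal.q and long.q on that host as a whole; the slotwise subordination Reuti mentions would be the finer-grained alternative.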
