On 28.10.2013, at 13:45, Sangmin Park wrote:
> This is the RQS
>
> limit hosts {@parallelhosts} to slots=$num_proc
> limit queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
> parallelhosts include matlabhosts.
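
For reference, those two limits would live inside a named rule set as defined in sge_resource_quota(5); the set name and description below are placeholders:

```
{
   name         slots_per_host
   description  "cap slots per host at the host's core count"
   enabled      TRUE
   limit        hosts {@parallelhosts} to slots=$num_proc
   limit        queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
}
```

Here `$num_proc` is a dynamic limit that resolves to each host's num_proc complex value.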
>
> The slots value in matlab.q is the number of cores per node.
>
> All hosts, node1 ~ node30, are included in parallelhosts.
> matlabhosts includes node1 ~ node7.
> short.q, normal.q and long.q can be used on node1 ~ node7.
>
> I want to set this up: when jobs in short.q, normal.q and long.q are
> running and a matlab job is submitted, a running job not using matlab.q
> on node1 ~ node7 is suspended and the matlab job runs.
>
> I don't understand why this does not happen when I set the slots value to 12.
It will suspend the jobs only when all 12 slots are used on a particular host.
You may want to try 1 instead. As a refinement, you could also look into
slotwise subordination.
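
A slotwise setup might look like this (untested sketch based on queue_conf(5) of SGE 6.2u5 and later; the threshold and per-queue sequence numbers are only illustrative). Instead of suspending a whole queue instance, it suspends individual tasks so that at most the threshold number of slots stays busy on the host:

```
# in matlab.q (qconf -mq matlab.q), hypothetical values:
subordinate_list  slots=12(short.q:1:sr,normal.q:2:sr,long.q:3:sr)
```

The "sr" action suspends the task with the shortest run time first ("lr" would pick the longest running one); the numbers after the queue names are the order in which the subordinate queues are drained.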
-- Reuti
> --Sangmin
>
>
> On Mon, Oct 28, 2013 at 8:58 PM, Reuti <[email protected]> wrote:
> On 28.10.2013, at 12:30, Sangmin Park wrote:
>
> > I've edited the negative values in the priority section: short.q is now 4,
> > normal.q is 6 and long.q is 8, respectively.
> > And I configured 72 slots for each queue.
>
> But you didn't answer the question: how do you limit the overall slot count?
> Via an RQS or a definition on the exechost?
>
> > Below is matlab.q instance details.
> > qname matlab.q
> > hostlist @matlabhosts
> > seq_no 0
> > load_thresholds np_load_avg=1.75
> > suspend_thresholds NONE
> > nsuspend 1
> > suspend_interval 00:05:00
> > priority 2
> > min_cpu_interval 00:05:00
> > processors UNDEFINED
> > qtype BATCH INTERACTIVE
> > ckpt_list NONE
> > pe_list fill_up make matlab
> > rerun FALSE
> > slots 12
> > tmpdir /tmp
> > shell /bin/bash
> > prolog NONE
> > epilog NONE
> > shell_start_mode posix_compliant
> > starter_method NONE
> > suspend_method NONE
> > resume_method NONE
> > terminate_method NONE
> > notify 00:00:60
> > owner_list NONE
> > user_lists octausers onsiteusers
> > xuser_lists NONE
> > subordinate_list short.q=72, normal.q=72, long.q=72
>
> This will suspend these three queues only when 72 slots of the matlab.q queue
> instance are in use on a host. As you have only 12 slots defined above, this
> will never happen.
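
To suspend the subordinate queue instances as soon as a single matlab job runs on a host, a threshold of 1 should work (queue-wise subordination suspends every job in the subordinated instance on that host); a minimal sketch:

```
# in matlab.q (qconf -mq matlab.q):
subordinate_list  short.q=1,normal.q=1,long.q=1
```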
>
> What behavior would you like to set up?
>
> -- Reuti
>
>
> > complex_values NONE
> > projects NONE
> > xprojects NONE
> > calendar NONE
> > initial_state default
> > s_rt INFINITY
> > h_rt 168:00:00
> > s_cpu INFINITY
> > h_cpu INFINITY
> > s_fsize INFINITY
> > h_fsize INFINITY
> > s_data INFINITY
> > h_data INFINITY
> > s_stack INFINITY
> > h_stack INFINITY
> > s_core INFINITY
> > h_core INFINITY
> > s_rss INFINITY
> > h_rss INFINITY
> > s_vmem INFINITY
> > h_vmem INFINITY
> >
> > thanks,
> >
> > --Sangmin
> >
> >
> > On Mon, Oct 28, 2013 at 3:51 PM, Reuti <[email protected]> wrote:
> > Hi,
> >
> > On 28.10.2013, at 06:40, Sangmin Park wrote:
> >
> > > Thanks, Adam.
> > >
> > > I configured the SGE queues following the second link you mentioned,
> > > but it does not work.
> > >
> > > I made 4 queues: short.q, normal.q, long.q and matlab.q.
> > > The short.q, normal.q and long.q queue instances run on all computing
> > > nodes, node1 ~ node30.
> > > The matlab.q instance is configured only for a few nodes, node1 ~ node7,
> > > called matlabhosts.
> > >
> > > The priority of each queue is below.
> > > [short.q]
> > > priority -5
> >
> > Don't use negative values here. This number is the "nice value" under which
> > the Linux kernel will run the process (i.e. it affects the scheduler in the
> > kernel; for SGE it doesn't influence the scheduling). User processes should
> > be in the range 0..19 [20 on Solaris]. The negative ones are reserved for
> > kernel processes.
> >
> >
> > > subordinate_list NONE
> > > [normal.q]
> > > priority 0
> > > subordinate_list NONE
> > > [long.q]
> > > priority 5
> > > subordinate_list NONE
> > >
> > > and matlab.q is
> > > priority -10
> > > subordinate_list short.q normal.q long.q
> >
> > Same here. It's also worth noting that these values are relative, i.e.
> > with the same number of user processes and cores it doesn't matter which
> > nice values are used, as each process gets its own core anyway. Only when
> > there are more processes than cores will it have an effect. But as these
> > are relative values, it's the same whether (cores+1) processes all have a
> > nice value of 0 or all have 19.
> >
> >
> > > I submitted several jobs using normal.q to the matlabhosts,
> > > and then I submitted a job using matlab.q, which has the subordinate_list.
> > > I expected one of the normal.q jobs to be suspended and the matlab.q job
> > > to run.
> > > But the matlab.q job waits in the queue with status qw; it is not started.
> > >
> > > What's wrong with this?
> > > Please help!
> >
> > http://gridengine.org/pipermail/users/2013-October/006820.html
> >
> > How do you limit the overall slot count?
> >
> > -- Reuti
> >
> >
> > > Sangmin
> > >
> > >
> > >
> > >
> > > On Tue, Oct 15, 2013 at 3:50 PM, Adam Brenner <[email protected]> wrote:
> > > Sangmin,
> > >
> > > I believe the phrase / term you are looking for is Subordinate
> > > Queues[1][2]. This should handle what you are looking for.
> > >
> > > If not ... I am sure Reuti (or someone else) will correct me on this.
> > >
> > > Enjoy,
> > > -Adam
> > >
> > > [1]: http://docs.oracle.com/cd/E19957-01/820-0698/i998889/index.html
> > > [2]:
> > > http://grid-gurus.blogspot.com/2011/03/using-grid-engine-subordinate-queues.html
> > >
> > > --
> > > Adam Brenner
> > > Computer Science, Undergraduate Student
> > > Donald Bren School of Information and Computer Sciences
> > >
> > > Research Computing Support
> > > Office of Information Technology
> > > http://www.oit.uci.edu/rcs/
> > >
> > > University of California, Irvine
> > > www.ics.uci.edu/~aebrenne/
> > > [email protected]
> > >
> > >
> > > On Mon, Oct 14, 2013 at 11:18 PM, Sangmin Park <[email protected]>
> > > wrote:
> > > > Howdy,
> > > >
> > > > For a specific purpose in my organization,
> > > > I want to configure the SGE scheduler as follows.
> > > >
> > > > Imagine:
> > > > a job is running, called A-job.
> > > > If B-job is submitted while A-job is running,
> > > > I want to suspend A-job and run B-job first.
> > > > And after B-job has finished, resume A-job.
> > > >
> > > > How can I do this?
> > > >
> > > > Sangmin
> > > >
> > > > --
> > > > ===========================
> > > > Sangmin Park
> > > > Supercomputing Center
> > > > Ulsan National Institute of Science and Technology(UNIST)
> > > > Ulsan, 689-798, Korea
> > > >
> > > > phone : +82-52-217-4201
> > > > mobile : +82-10-5094-0405
> > > > fax : +82-52-217-4209
> > > > ===========================
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > [email protected]
> > > > https://gridengine.org/mailman/listinfo/users
> > > >
> > >
> > >
> > >
> >
> >
> >
> >
>
>
>
>