This is the RQS (resource quota set):

   limit        hosts {@parallelhosts} to slots=$num_proc
   limit        queues !matlab.q hosts {@matlabhosts} to slots=$num_proc
@parallelhosts includes @matlabhosts.

The slots value in matlab.q is the number of cores per node.

All hosts, node1 ~ node30, are included in @parallelhosts.
@matlabhosts includes node1 ~ node7.
short.q, normal.q and long.q can also be used on node1 ~ node7.

What I want to set up is this: while jobs in short.q, normal.q and long.q
are running on node1 ~ node7, if a matlab job is submitted,
the running jobs on node1 ~ node7 that do not use matlab.q are suspended
and the matlab job is run.

I don't understand why this cannot happen when I set the slots value to 12.

--Sangmin


On Mon, Oct 28, 2013 at 8:58 PM, Reuti <[email protected]> wrote:

> On 28.10.2013 at 12:30, Sangmin Park wrote:
>
> > I've edited the negative values in the priority section: short.q is 4,
> > normal.q is 6 and long.q is 8.
> > And I configured 72 cores for each queue.
>
> But you didn't answer the question: How do you limit the overall slot
> count? With an RQS or a definition in the exechost?
>
> > Below are the matlab.q instance details.
> > qname                 matlab.q
> > hostlist              @matlabhosts
> > seq_no                0
> > load_thresholds       np_load_avg=1.75
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              2
> > min_cpu_interval      00:05:00
> > processors            UNDEFINED
> > qtype                 BATCH INTERACTIVE
> > ckpt_list             NONE
> > pe_list               fill_up make matlab
> > rerun                 FALSE
> > slots                 12
> > tmpdir                /tmp
> > shell                 /bin/bash
> > prolog                NONE
> > epilog                NONE
> > shell_start_mode      posix_compliant
> > starter_method        NONE
> > suspend_method        NONE
> > resume_method         NONE
> > terminate_method      NONE
> > notify                00:00:60
> > owner_list            NONE
> > user_lists            octausers onsiteusers
> > xuser_lists           NONE
> > subordinate_list      short.q=72, normal.q=72, long.q=72
>
> This will suspend these three queues when 72 slots per queue instance in
> matlab.q are used. As you have only 12 defined above, this will never happen.
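
A minimal Python sketch of the threshold rule described above. It assumes that a `queue=N` entry in subordinate_list suspends that queue on a host once N slots of the superordinate queue instance are in use on the same host. The function and constant names are illustrative only and are not part of SGE.

```python
# Hypothetical model of a subordinate_list threshold; this is not SGE code.

def is_suspended(used_slots: int, threshold: int) -> bool:
    """True once the superordinate queue instance uses `threshold` slots."""
    return used_slots >= threshold

MATLAB_SLOTS = 12   # "slots 12" in matlab.q: the maximum per queue instance
THRESHOLD = 72      # "short.q=72" in subordinate_list

# Check every possible fill level of one matlab.q queue instance (0..12).
ever_suspended = any(
    is_suspended(used, THRESHOLD) for used in range(MATLAB_SLOTS + 1)
)
print(ever_suspended)  # prints False
```

In this model, a threshold that does not exceed the slot count of the queue instance can be reached; for example, `is_suspended(1, 1)` is true as soon as the first slot is used.
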
>
> What behavior would you like to set up?
>
> -- Reuti
>
>
> > complex_values        NONE
> > projects              NONE
> > xprojects             NONE
> > calendar              NONE
> > initial_state         default
> > s_rt                  INFINITY
> > h_rt                  168:00:00
> > s_cpu                 INFINITY
> > h_cpu                 INFINITY
> > s_fsize               INFINITY
> > h_fsize               INFINITY
> > s_data                INFINITY
> > h_data                INFINITY
> > s_stack               INFINITY
> > h_stack               INFINITY
> > s_core                INFINITY
> > h_core                INFINITY
> > s_rss                 INFINITY
> > h_rss                 INFINITY
> > s_vmem                INFINITY
> > h_vmem                INFINITY
> >
> > thanks,
> >
> > --Sangmin
> >
> >
> > On Mon, Oct 28, 2013 at 3:51 PM, Reuti <[email protected]> wrote:
> > Hi,
> >
> > On 28.10.2013 at 06:40, Sangmin Park wrote:
> >
> > > Thanks, Adam.
> > >
> > > I configured the SGE queues following the second link you mentioned,
> > > but it does not work.
> > >
> > > I made 4 queues: short.q, normal.q, long.q and matlab.q.
> > > The short.q, normal.q and long.q queue instances run on all computing
> > > nodes, node1 ~ node30.
> > > The matlab.q instance is configured only for a few nodes, node1 ~ node7,
> > > called matlabhosts.
> > >
> > > The priorities of each queue are below.
> > > [short.q]
> > > priority              -5
> >
> > Don't use negative values here. This number is the "nice value" under
> > which the Linux kernel will run the process (i.e. the scheduler in the
> > kernel, for SGE it doesn't influence the scheduling). User processes should
> > be in the range 0..19 [20 on Solaris]. The negative ones are reserved for
> > kernel processes.
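
As a rough illustration of the range given above, the following Python sketch checks the queue priorities mentioned in this thread against 0..19. The helper name is made up for this example, and the upper bound of 19 is the Linux value (the text notes 20 on Solaris).

```python
# Illustrative check only; `is_user_nice_value` is a hypothetical helper.

LINUX_MAX_NICE = 19  # upper bound quoted in the thread for Linux

def is_user_nice_value(priority: int, max_nice: int = LINUX_MAX_NICE) -> bool:
    """True if `priority` lies in the range suggested for user processes."""
    return 0 <= priority <= max_nice

# Priorities as first posted, then as revised in the follow-up message.
original = {"short.q": -5, "normal.q": 0, "long.q": 5, "matlab.q": -10}
revised = {"short.q": 4, "normal.q": 6, "long.q": 8, "matlab.q": 2}

bad_original = sorted(q for q, p in original.items() if not is_user_nice_value(p))
bad_revised = sorted(q for q, p in revised.items() if not is_user_nice_value(p))
print(bad_original)  # ['matlab.q', 'short.q']
print(bad_revised)   # []
```
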
> >
> >
> > > subordinate_list      NONE
> > > [normal.q]
> > > priority              0
> > > subordinate_list      NONE
> > > [long.q]
> > > priority              5
> > > subordinate_list      NONE
> > >
> > > and matlab.q is
> > > priority              -10
> > > subordinate_list      short.q normal.q long.q
> >
> > Same here. It's also worth noting that these values are relative. I.e.
> > having the same number of user processes and cores, it doesn't matter which
> > values are used as nice values, as each process gets its own core anyway.
> > Only when there are more processes than cores will it have an effect. But
> > as these are relative values, it's the same whether (cores+1) processes
> > all have 0 or 19 as nice value.
> >
> >
> > > I submitted several jobs using normal.q to the matlabhosts,
> > > and I submitted a job using matlab.q, which has the subordinate_list.
> > > I expected one of the normal.q jobs to be suspended and the matlab.q
> > > job to run.
> > > But the matlab.q job waits in the queue with status qw; it is not started.
> > >
> > > What's the matter with this?
> > > Please help!
> >
> > http://gridengine.org/pipermail/users/2013-October/006820.html
> >
> > How do you limit the overall slot count?
> >
> > -- Reuti
> >
> >
> > > Sangmin
> > >
> > >
> > >
> > >
> > > On Tue, Oct 15, 2013 at 3:50 PM, Adam Brenner <[email protected]> wrote:
> > > Sangmin,
> > >
> > > I believe the phrase / term you are looking for is Subordinate
> > > Queues[1][2]. This should handle what you are looking for.
> > >
> > > If not ... I am sure Reuti (or someone else) will correct me on this.
> > >
> > > Enjoy,
> > > -Adam
> > >
> > > [1]: http://docs.oracle.com/cd/E19957-01/820-0698/i998889/index.html
> > > [2]: http://grid-gurus.blogspot.com/2011/03/using-grid-engine-subordinate-queues.html
> > >
> > > --
> > > Adam Brenner
> > > Computer Science, Undergraduate Student
> > > Donald Bren School of Information and Computer Sciences
> > >
> > > Research Computing Support
> > > Office of Information Technology
> > > http://www.oit.uci.edu/rcs/
> > >
> > > University of California, Irvine
> > > www.ics.uci.edu/~aebrenne/
> > > [email protected]
> > >
> > >
> > > On Mon, Oct 14, 2013 at 11:18 PM, Sangmin Park <[email protected]> wrote:
> > > > Howdy,
> > > >
> > > > For a specific purpose in my organization,
> > > > I want to configure something in the SGE scheduler.
> > > >
> > > > Imagine:
> > > > a job, called A-job, is running.
> > > > If B-job is submitted while A-job is running,
> > > > I want to hold A-job and run B-job first.
> > > > And after B-job is finished, restart A-job.
> > > >
> > > > How do I do this?
> > > >
> > > > Sangmin
> > > >
> > > > --
> > > > ===========================
> > > > Sangmin Park
> > > > Supercomputing Center
> > > > Ulsan National Institute of Science and Technology(UNIST)
> > > > Ulsan, 689-798, Korea
> > > >
> > > > phone : +82-52-217-4201
> > > > mobile : +82-10-5094-0405
> > > > fax : +82-52-217-4209
> > > > ===========================
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > [email protected]
> > > > https://gridengine.org/mailman/listinfo/users
> > > >
>
>

