On Fri, 2016-12-09 at 09:36 +0000, John_Tai wrote:
> 8 slots:
> 
> # qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@ibm021                   BIP   0/0/8          0.02     lx-amd64
> ---------------------------------------------------------------------------------
> all.q@ibm037                   BIP   0/0/8          0.00     lx-amd64
> ---------------------------------------------------------------------------------
> all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
> ---------------------------------------------------------------------------------
> pc.q@ibm021                    BIP   0/0/1          0.02     lx-amd64
> ---------------------------------------------------------------------------------
> sim.q@ibm021                   BIP   0/0/1          0.02     lx-amd64
> 
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>      89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2

Have you specified that all.q supports the cores parallel environment?
I usually do that through the qmon GUI tool, but there's probably a
way to do so through the command line.
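
For what it's worth, that command-line route would be qconf; a sketch, assuming the queue and PE names used in this thread and a working SGE installation on the PATH:

```shell
# Show which PEs all.q currently references
qconf -sq all.q | grep pe_list

# Append the "cores" PE to all.q's pe_list (as done earlier in the thread)
qconf -aattr queue pe_list cores all.q

# Or replace the pe_list attribute value outright
qconf -mattr queue pe_list cores all.q
```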

                                        Chris



> 
> 
> 
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Friday, December 09, 2016 3:46
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] CPU complex
> 
> Hi,
> 
> On 09.12.2016 at 08:20, John_Tai wrote:
> 
> > 
> > I've set up a PE but I'm having problems submitting jobs.
> > 
> > - Here's the PE I created:
> > 
> > # qconf -sp cores
> > pe_name            cores
> > slots              999
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /bin/true
> > stop_proc_args     /bin/true
> > allocation_rule    $pe_slots
> > control_slaves     FALSE
> > job_is_first_task  TRUE
> > urgency_slots      min
> > accounting_summary FALSE
> > qsort_args         NONE
> > 
> > - I've then added this to all.q:
> > 
> > qconf -aattr queue pe_list cores all.q
> 
> How many "slots" were defined in the queue definition for all.q?
> 
> -- Reuti
> 
> 
> > 
> > - Now I submit a job:
> > 
> > # qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
> > Your job 89 ("xclock") has been submitted
> > # qstat
> > job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> > --------------------------------------------------------------------------------------------------------------
> >     89 0.00000 xclock     johnt        qw    12/09/2016 15:14:25                                    2
> > # qalter -w p 89
> > Job 89 cannot run in PE "cores" because it only offers 0 slots
> > verification: no suitable queues
> > # qstat -f
> > queuename                      qtype resv/used/tot. load_avg arch          states
> > ---------------------------------------------------------------------------------
> > all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
> > 
> > ############################################################################
> > - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> > ############################################################################
> >     89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2
> > 
> > 
> > ----------------------------------------------------
> > 
> > It looks like all.q@ibm038 should have 8 free slots, so why is it
> > only offering 0?
> > 
> > Hope you can help me.
> > Thanks
> > John
> > 
> > 
> > 
> > 
> > 
> > 
> > -----Original Message-----
> > From: Reuti [mailto:re...@staff.uni-marburg.de]
> > Sent: Monday, December 05, 2016 6:32
> > To: John_Tai
> > Cc: users@gridengine.org
> > Subject: Re: [gridengine users] CPU complex
> > 
> > Hi,
> > 
> > > 
> > > On 05.12.2016 at 09:36, John_Tai <john_...@smics.com> wrote:
> > > 
> > > Thank you so much for your reply!
> > > 
> > > > 
> > > > > 
> > > > > Will you use the consumable virtual_free here instead of mem?
> > > 
> > > Yes I meant to write virtual_free, not mem. Apologies.
> > > 
> > > > 
> > > > > 
> > > > > For parallel jobs you need to configure one or more so-called
> > > > > PEs (Parallel Environments).
> > > 
> > > My jobs are actually just one process which uses multiple cores,
> > > so for example in top one process "simv" is currently using 2 cpu
> > > cores (200%).
> > 
> > Yes, then it's a parallel job as far as SGE is concerned. The entries
> > for start_proc_args and stop_proc_args can be left at their defaults,
> > but a PE is SGE's paradigm for a parallel job.
> > 
> > 
> > > 
> > >  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > > 3017 kelly     20   0 3353m 3.0g 165m R 200.0  0.6  15645:46 simv
> > > 
> > > So I'm not sure PE is suitable for my case, since it is not
> > > multiple parallel processes running at the same time. Am I
> > > correct?
> > > 
> > > If so, I am trying to find a way to get SGE to keep track of the
> > > number of cores used, but I believe it only keeps track of the
> > > total CPU usage in %. I guess I could use this and the <total
> > > num cores> to get the <num of cores in use>, but how do I
> > > integrate it into SGE?
> > 
> > You can request the number of cores your job needs with the -pe
> > parameter, which also accepts a range. The allocation SGE granted can
> > be checked inside the job script via $NHOSTS, $NSLOTS and $PE_HOSTFILE.
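> > As a minimal sketch (the PE name "cores" matches the one created
> > earlier in this thread; the program to start is just a placeholder),
> > a job script could look like:

```shell
#!/bin/sh
# Hypothetical job script: request 2 slots from the "cores" PE.
#$ -pe cores 2
#$ -cwd

# SGE sets these for the granted allocation; default to 1 outside SGE.
echo "hosts=${NHOSTS:-1} slots=${NSLOTS:-1} hostfile=${PE_HOSTFILE:-none}"

# Pass the granted core count on to a threaded program:
export OMP_NUM_THREADS=${NSLOTS:-1}
# ...then start the actual program here, e.g. ./simv
```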
> > 
> > With this setup, SGE will track the number of cores in use per
> > machine; the available cores are what you define in the queue
> > definition. If you have more than one queue per exechost, you
> > additionally need an overall limit on the cores that can be used at
> > the same time, to avoid oversubscription.
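> > One way to sketch such an overall limit is a resource quota set (the
> > set's name and the limit of 8 below are assumptions based on the
> > 8-slot hosts shown earlier in the thread):

```
# qconf -srqs would then show something like:
{
   name         cores_per_host
   description  Cap total slots per exechost across all queues
   enabled      TRUE
   limit        hosts {*} to slots=8
}
```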
> > 
> > -- Reuti
> > 
> > > 
> > > Thank you again for your help.
> > > 
> > > John
> > > 
> > > -----Original Message-----
> > > From: Reuti [mailto:re...@staff.uni-marburg.de]
> > > Sent: Monday, December 05, 2016 4:21
> > > To: John_Tai
> > > Cc: users@gridengine.org
> > > Subject: Re: [gridengine users] CPU complex
> > > 
> > > Hi,
> > > 
> > > On 05.12.2016 at 08:00, John_Tai wrote:
> > > 
> > > > 
> > > > Newbie here, hope to understand SGE usage.
> > > > 
> > > > I've successfully configured virtual_free as a complex for
> > > > telling SGE how much memory is needed when submitting a job, as
> > > > described here:
> > > > 
> > > > https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
> > > > 
> > > > How do I do the same for telling SGE how many CPU cores a job
> > > > needs? For example:
> > > > 
> > > >               qsub -l mem=24G,cpu=4 myjob
> > > 
> > > Will you use the consumable virtual_free here instead of mem?
> > > 
> > > 
> > > > 
> > > > Obviously I'd need SGE to keep track of the actual CPU
> > > > utilization on the host, just as virtual_free is tracked
> > > > independently of the SGE jobs.
> > > 
> > > For parallel jobs you need to configure one or more so-called PEs
> > > (Parallel Environments). Their purpose is to make preparations for
> > > a parallel job, such as rearranging the list of granted slots or
> > > setting up shared directories between the nodes.
> > > 
> > > These PEs were more important in former times, when parallel
> > > libraries were not programmed to integrate automatically with SGE
> > > for a tight integration. Your submissions could read:
> > > 
> > >   qsub -pe smp 4 myjob          # allocation_rule $pe_slots, control_slaves TRUE
> > >   qsub -pe orte 16 myjob        # allocation_rule $round_robin, control_slaves TRUE
> > > 
> > > where smp and orte are the parallel environments chosen for OpenMP
> > > and Open MPI respectively. The PE settings are explained in `man
> > > sge_pe`, and the "-pe" parameter of the submission command in `man
> > > qsub`.
> > > 
> > > -- Reuti
> > > ________________________________
> > > 
> > > This email (including its attachments, if any) may be
> > > confidential and proprietary information of SMIC, and intended
> > > only for the use of the named recipient(s) above. Any
> > > unauthorized use or disclosure of this email is strictly
> > > prohibited. If you are not the intended recipient(s), please
> > > notify the sender immediately and delete this email from your
> > > computer.
> > > 
> > 
> 
> 
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
