On Fri, 2016-12-09 at 09:36 +0000, John_Tai wrote:
> 8 slots:
>
> # qstat -f
> queuename                      qtype resv/used/tot. load_avg arch     states
> -----------------------------------------------------------------------------
> all.q@ibm021                   BIP   0/0/8          0.02     lx-amd64
> -----------------------------------------------------------------------------
> all.q@ibm037                   BIP   0/0/8          0.00     lx-amd64
> -----------------------------------------------------------------------------
> all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
> -----------------------------------------------------------------------------
> pc.q@ibm021                    BIP   0/0/1          0.02     lx-amd64
> -----------------------------------------------------------------------------
> sim.q@ibm021                   BIP   0/0/1          0.02     lx-amd64
>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>      89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2
Have you specified that all.q supports the cores parallel environment? I
usually do that through the qmon GUI tool, but there's probably a way to
do so through the command line.

Chris

> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Friday, December 09, 2016 3:46
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] CPU complex
>
> Hi,
>
> Am 09.12.2016 um 08:20 schrieb John_Tai:
>
> > I've set up a PE but I'm having problems submitting jobs.
> >
> > - Here's the PE I created:
> >
> > # qconf -sp cores
> > pe_name            cores
> > slots              999
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /bin/true
> > stop_proc_args     /bin/true
> > allocation_rule    $pe_slots
> > control_slaves     FALSE
> > job_is_first_task  TRUE
> > urgency_slots      min
> > accounting_summary FALSE
> > qsort_args         NONE
> >
> > - I've then added this to all.q:
> >
> > qconf -aattr queue pe_list cores all.q
>
> How many "slots" were defined in the queue definition for all.q?
>
> -- Reuti
>
> > - Now I submit a job:
> >
> > # qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
> > Your job 89 ("xclock") has been submitted
> > # qstat
> > job-ID  prior   name    user   state submit/start at      queue  slots ja-task-ID
> > ----------------------------------------------------------------------------------
> >      89 0.00000 xclock  johnt  qw    12/09/2016 15:14:25             2
> > # qalter -w p 89
> > Job 89 cannot run in PE "cores" because it only offers 0 slots
> > verification: no suitable queues
> > # qstat -f
> > queuename                      qtype resv/used/tot. load_avg arch     states
> > -----------------------------------------------------------------------------
> > all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
> >
> > ############################################################################
> >  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> > ############################################################################
> >      89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2
> >
> > ----------------------------------------------------
> >
> > It looks like all.q@ibm038 should have 8 free slots, so why is it
> > only offering 0?
> >
> > Hope you can help me.
> > Thanks
> > John
> >
> >
> > -----Original Message-----
> > From: Reuti [mailto:re...@staff.uni-marburg.de]
> > Sent: Monday, December 05, 2016 6:32
> > To: John_Tai
> > Cc: users@gridengine.org
> > Subject: Re: [gridengine users] CPU complex
> >
> > Hi,
> >
> > Am 05.12.2016 um 09:36 schrieb John_Tai <john_...@smics.com>:
> >
> > > Thank you so much for your reply!
> > >
> > > > Will you use the consumable virtual_free here instead of mem?
> > >
> > > Yes, I meant to write virtual_free, not mem. Apologies.
> > >
> > > > For parallel jobs you need to configure a (or some) so-called
> > > > PE (Parallel Environment).
> > >
> > > My jobs are actually just one process which uses multiple cores;
> > > for example in top, one process "simv" is currently using 2 CPU
> > > cores (200%).
> >
> > Yes, then it's a parallel job for SGE. Although the entries for
> > start_proc_args and stop_proc_args can be left at their defaults,
> > a PE is the paradigm in SGE for a parallel job.
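
As an aside, the check Chris suggests can also be done from the command
line. A rough sketch, using only standard qconf/qstat/qalter calls and the
queue, PE and job-ID that appear in this thread:

# qconf -sq all.q | grep -E 'pe_list|slots'
      (is "cores" listed in pe_list, and how many slots does all.q grant?)
# qconf -sp cores
      (does the PE itself still allow enough slots?)
# qstat -j 89
      (the "scheduling info" section explains why the job stays pending;
       it is only filled in when schedd_job_info is enabled, see qconf -ssconf)
# qalter -w p 89
      (the dry-run placement check already used earlier in the thread)
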
> > >   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
> > >  3017 kelly     20   0 3353m 3.0g 165m R 200.0  0.6 15645:46  simv
> > >
> > > So I'm not sure PE is suitable for my case, since it is not
> > > multiple parallel processes running at the same time. Am I
> > > correct?
> > >
> > > If so, I am trying to find a way to get SGE to keep track of the
> > > number of cores used, but I believe it only keeps track of the
> > > total CPU usage in %. I guess I could use this and the <total
> > > num cores> to get the <num of cores in use>, but how do I
> > > integrate it in SGE?
> >
> > You can specify the necessary number of cores for your job with the
> > -pe parameter, which can also be a range. The allocation granted by
> > SGE can be checked in the job script via $NHOSTS, $NSLOTS and
> > $PE_HOSTFILE.
> >
> > Having this setup, SGE will track the number of used cores per
> > machine. The available ones you define in the queue definition. In
> > case you have more than one queue per exechost, you need to set up,
> > in addition, an overall limit of cores which can be used at the same
> > time to avoid oversubscription.
> >
> > -- Reuti
> >
> > > Thank you again for your help.
> > >
> > > John
> > >
> > > -----Original Message-----
> > > From: Reuti [mailto:re...@staff.uni-marburg.de]
> > > Sent: Monday, December 05, 2016 4:21
> > > To: John_Tai
> > > Cc: users@gridengine.org
> > > Subject: Re: [gridengine users] CPU complex
> > >
> > > Hi,
> > >
> > > Am 05.12.2016 um 08:00 schrieb John_Tai:
> > >
> > > > Newbie here, hoping to understand SGE usage.
> > > >
> > > > I've successfully configured virtual_free as a complex for
> > > > telling SGE how much memory is needed when submitting a job, as
> > > > described here:
> > > >
> > > > https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
> > > >
> > > > How do I do the same for telling SGE how many CPU cores a job
> > > > needs? For example:
> > > >
> > > > qsub -l mem=24G,cpu=4 myjob
> > >
> > > Will you use the consumable virtual_free here instead of mem?
> > >
> > > > Obviously I'd need SGE to keep track of the actual CPU
> > > > utilization on the host, just as virtual_free is being tracked
> > > > independently of the SGE jobs.
> > >
> > > For parallel jobs you need to configure a (or some) so-called PE
> > > (Parallel Environment). Its purpose is to make preparations for
> > > the parallel jobs, like rearranging the list of granted slots,
> > > preparing shared directories between the nodes, ...
> > >
> > > These PEs were of higher importance in former times, when
> > > parallel libraries were not programmed to integrate automatically
> > > with SGE for a tight integration. Your submissions could read:
> > >
> > > qsub -pe smp 4 myjob    # allocation_rule $pe_slots, control_slaves true
> > > qsub -pe orte 16 myjob  # allocation_rule $round_robin, control_slaves true
> > >
> > > where smp and orte are the chosen parallel environments for OpenMP
> > > and Open MPI, respectively. Their settings are explained in `man
> > > sge_pe`, and the "-pe" parameter of the submission command in `man
> > > qsub`.
> > >
> > > -- Reuti
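
To make Reuti's last two points concrete, here is a minimal job-script
sketch. OMP_NUM_THREADS is only an example of how a threaded application
might pick up its thread count; substitute whatever switch or variable
your tool (e.g. the "simv" run mentioned above) actually reads:

#!/bin/sh
#$ -pe cores 2
#$ -l virtual_free=24G
# use the slot count SGE granted instead of hard-coding it
export OMP_NUM_THREADS=$NSLOTS
# launch the actual application here (placeholder name)
./simv

For the overall per-host limit across several queues (all.q, pc.q and
sim.q all live on ibm021 in the qstat output above), one common approach
is to set a slot count on the exec host itself, e.g. via `qconf -me
ibm021` and adding "complex_values slots=8", so that all queue instances
on that host together can never exceed 8 slots.
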
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users