Hi,

> On 22.03.2017 at 04:24, John_Tai <john_...@smics.com> wrote:
>
> I am now using sge6.1, however it doesn't have the option "JOB" for complex
> consumable value. Is there another way to NOT multiply consumable memory
> resource by number of pe slots?

Not that I'm aware of. It was a feature introduced with SGE 6.2u2:

https://arc.liv.ac.uk/trac/SGE/ticket/197

-- Reuti

>
> Thanks
> John
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Wednesday, December 21, 2016 7:05
> To: Christopher Black
> Cc: John_Tai; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
>
> On 20.12.2016 at 23:42, Christopher Black wrote:
>
>> We have found that the behavior that multiplies consumable memory resource
>> requests by number of pe slots can be confusing (and requires extra math in
>> automation scripts), so we have the complex consumable value set to "JOB"
>> rather than "YES". When this is done (at least on SoGE), the memory
>> requested is NOT multiplied by the number of slots. We also use h_vmem
>> rather than virtual_free.
>
> Correct, it's not multiplied. But only the master exechost will get its
> memory reduced in the bookkeeping. The slave exechosts might still show too
> high a value for the available memory, I fear.
>
> -- Reuti
>
>> Best,
>> Chris
>>
>> On 12/20/16, 5:11 AM, "users-boun...@gridengine.org on behalf of Reuti"
>> <users-boun...@gridengine.org on behalf of re...@staff.uni-marburg.de> wrote:
>>
>>> On 20.12.2016 at 02:45, John_Tai <john_...@smics.com> wrote:
>>>
>>> I spoke too soon. I can request PE and virtual_free separately, but I
>>> cannot request both:
>>>
>>> # qsub -V -b y -cwd -now n -pe cores 7 -l mem=10G -q all.q@ibm037 xclock
>>
>> Above you request "mem" (which is a snapshot of the actual usage and may
>> vary over the runtime of other jobs [unless they request the total amount
>> already at the beginning of the job and stay with it]).
>>
>>> Your job 180 ("xclock") has been submitted
>>> # qstat
>>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>>> -----------------------------------------------------------------------------------------------------------------
>>>     180 0.55500 xclock     johnt        qw    12/20/2016 09:43:41                                    7
>>> # qstat -j 180
>>> ==============================================================
>>> job_number:                 180
>>> exec_file:                  job_scripts/180
>>> submission_time:            Tue Dec 20 09:43:41 2016
>>> owner:                      johnt
>>> uid:                        162
>>> group:                      sa
>>> gid:                        4563
>>> sge_o_home:                 /home/johnt
>>> sge_o_log_name:             johnt
>>> sge_o_path:                 /home/sge/sge8.1.9-1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.
>>> sge_o_shell:                /bin/tcsh
>>> sge_o_workdir:              /home/johnt/sge8
>>> sge_o_host:                 ibm005
>>> account:                    sge
>>> cwd:                        /home/johnt/sge8
>>> hard resource_list:         virtual_free=10G
>>
>> 10G times 7 = 70 GB
>>
>> The node has this amount of memory installed and it is defined this way in
>> `qconf -me ibm037`?
>>
>> -- Reuti
>>
>>> mail_list:                  johnt@ibm005
>>> notify:                     FALSE
>>> job_name:                   xclock
>>> jobshare:                   0
>>> hard_queue_list:            all.q@ibm037
>>> env_list:                   TERM=xterm,DISPLAY=dsls11:3. [..]
>>> script_file:                xclock
>>> parallel environment:       cores range: 7
>>> binding:                    NONE
>>> job_type:                   binary
>>> scheduling info:            cannot run in queue "sim.q" because it is not contained in its hard queue list (-q)
>>>                             cannot run in queue "pc.q" because it is not contained in its hard queue list (-q)
>>>                             cannot run in PE "cores" because it only offers 0 slots
>>>
>>> -----Original Message-----
>>> From: Reuti [mailto:re...@staff.uni-marburg.de]
>>> Sent: Saturday, December 17, 2016 10:16
>>> To: Reuti
>>> Cc: John_Tai; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
>>> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
>>>
>>> On 17.12.2016 at 11:34, Reuti wrote:
>>>
>>>> On 17.12.2016 at 02:01, John_Tai wrote:
>>>>
>>>>> It is working!! Thank you to all who replied to me and helped me figure
>>>>> this out.
>>>>>
>>>>> I meant to set the default to 2G so that was my mistake. I changed it to:
>>>>>
>>>>> virtual_free        mem        MEMORY      <=    YES         YES        2G       0
>>>>
>>>> That's strange. A plain "2" was for me always two bytes. A "h_vmem" of 2
>>>> bytes would crash the job instantly when it got scheduled, but for
>>>> "virtual_free" (which is only a guidance for SGE how to distribute jobs)
>>>> it shouldn't hinder the scheduling at all.
>>>>
>>>> `man sge_types` also lists:
>>>>
>>>> If no multiplier is present, the value is just counted in bytes.
>>>
>>> We have set "-w e" in /usr/sge/default/common/sge_request, and then I even
>>> face an "Unable to run job: error: no suitable queues." This happens
>>> whether the low 2 byte value is specified in the complex definition `qconf
>>> -mc` or on the command line as "-l virtual_free=2".
>>>
>>> It turns out that the minimum value which is accepted is 33.
>>>
>>> -- Reuti
>>>
>>>>
>>>>> And it's working now. Although I'm not sure why it affected the PE.
>>>>>
>>>>> Also I didn't set a global one, what is the purpose of the global one?
>>>>> Should I set it?
>>>>
>>>> No, it was only one place I would have checked too. The global complexes
>>>> therein can for example be used for a limit in the number of licenses of
>>>> an application you have and which can be used floating in the cluster (one
>>>> could prefer to put such a limit in an RQS though).
>>>>
>>>> If you would have set it up there, it would have been the "overall limit
>>>> of memory which can be used in the complete cluster at the same time".
>>>>
>>>> -- Reuti
>>>>
>>>>> # qconf -se global
>>>>> hostname              global
>>>>> load_scaling          NONE
>>>>> complex_values        NONE
>>>>> load_values           NONE
>>>>> processors            0
>>>>> user_lists            NONE
>>>>> xuser_lists           NONE
>>>>> projects              NONE
>>>>> xprojects             NONE
>>>>> usage_scaling         NONE
>>>>> report_variables      NONE
>>>>>
>>>>> -----Original Message-----
>>>>> From: Reuti [mailto:re...@staff.uni-marburg.de]
>>>>> Sent: Friday, December 16, 2016 7:36
>>>>> To: John_Tai
>>>>> Cc: Christopher Heiny; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
>>>>> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
>>>>>
>>>>>> On 16.12.2016 at 09:53, John_Tai <john_...@smics.com> wrote:
>>>>>>
>>>>>> virtual_free        mem        MEMORY      <=    YES         YES        2        0
>>>>>
>>>>> This would mean that the default consumption is 2 bytes. I already
>>>>> feared that a high value was programmed here. More suitable would be a
>>>>> default of 1G or so.
>>>>>
>>>>> Is there any virtual_free complex defined on a global level: qconf
>>>>> -se global
>>>>>
>>>>> -- Reuti
>>>>> ________________________________
>>>>>
>>>>> This email (including its attachments, if any) may be confidential and
>>>>> proprietary information of SMIC, and intended only for the use of the
>>>>> named recipient(s) above. Any unauthorized use or disclosure of this
>>>>> email is strictly prohibited. If you are not the intended recipient(s),
>>>>> please notify the sender immediately and delete this email from your
>>>>> computer.
>>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users@gridengine.org
>>>> https://gridengine.org/mailman/listinfo/users
>>
>> This electronic message is intended for the use of the named recipient only,
>> and may contain information that is confidential, privileged or protected
>> from disclosure under applicable law. If you are not the intended recipient,
>> or an employee or agent responsible for delivering this message to the
>> intended recipient, you are hereby notified that any reading, disclosure,
>> dissemination, distribution, copying or use of the contents of this message
>> including any of its attachments is strictly prohibited. If you have
>> received this message in error or are not the named recipient, please notify
>> us immediately by contacting the sender at the electronic mail address noted
>> above, and destroy all copies of this message. Please note, the recipient
>> should check this email and any attachments for the presence of viruses. The
>> organization accepts no liability for any damage caused by any virus
>> transmitted by this email.
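
A footnote on the arithmetic in this thread: with the consumable column set to "YES", the per-slot request is multiplied by the number of PE slots (the "10G times 7 = 70 GB" case), while with "JOB" it is debited once per job. Below is a minimal Python sketch of that bookkeeping. The function name booked_memory_gb is made up for illustration and is not part of any SGE tool, and the sketch ignores the caveat that with "JOB" only the master exechost gets its memory reduced.

```python
def booked_memory_gb(request_gb, slots, consumable):
    """Return the memory (in GB) debited for one job.

    Hypothetical helper, not an SGE API. It only models the two
    consumable settings discussed in the thread:
      "YES" -> the request is per slot, so it is multiplied by the slots
      "JOB" -> the request is per job, so it is debited once
    """
    if consumable == "YES":
        return request_gb * slots
    if consumable == "JOB":
        return request_gb
    raise ValueError("unsupported consumable setting: %r" % consumable)


if __name__ == "__main__":
    # The request from the thread: -pe cores 7 -l mem=10G
    for setting in ("YES", "JOB"):
        print(setting, booked_memory_gb(10, 7, setting), "GB")
```

With "YES" the 7-slot job needs 70 GB to be defined on the host before the scheduler can place it, which is why Reuti asks what `qconf -me ibm037` shows.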
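
Regarding the plain "2" being counted as two bytes: here is a rough Python sketch of how a memory specifier could be converted to bytes. It assumes the multiplier letters as I recall them from `man sge_types` (lowercase k/m/g are powers of 1000, uppercase K/M/G are powers of 1024). The helper parse_memory_spec is hypothetical, and it handles decimal values only.

```python
# Multiplier letters, assumed from `man sge_types`:
# lowercase letters are decimal, uppercase letters are binary.
MULTIPLIERS = {
    "k": 1000,
    "K": 1024,
    "m": 1000 ** 2,
    "M": 1024 ** 2,
    "g": 1000 ** 3,
    "G": 1024 ** 3,
}


def parse_memory_spec(spec):
    """Convert a memory specifier such as "2G" or "2" to a byte count.

    Hypothetical helper for illustration; decimal values only.
    If no multiplier is present, the value is counted in bytes.
    """
    spec = spec.strip()
    if spec and spec[-1] in MULTIPLIERS:
        return int(spec[:-1]) * MULTIPLIERS[spec[-1]]
    return int(spec)


if __name__ == "__main__":
    for value in ("2", "33", "2G", "1g"):
        print(value, "->", parse_memory_spec(value), "bytes")
```

This makes the difference between the two complex definitions in the thread visible: a default of "2" is two bytes, while "2G" is 2 * 1024^3 bytes.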