Re: [gridengine users] can't delete an exec host
On Wed, Sep 6, 2017 at 12:42 PM, Reuti <re...@staff.uni-marburg.de> wrote:

> > Am 06.09.2017 um 17:33 schrieb Michael Stauffer <mgsta...@gmail.com>:
> >
> > On Wed, Sep 6, 2017 at 11:16 AM, Feng Zhang <prod.f...@gmail.com> wrote:
> > > It seems the SGE master did not get refreshed with the new hostgroup.
> > > Maybe you can try:
> > >
> > > 1. restart SGE master
> >
> > Is it safe to do this with jobs queued and running? I think it's not
> > reliable, i.e. jobs can get killed and de-queued?
>
> Just to mention that it's safe to restart the qmaster, or even to reboot
> the machine the qmaster is running on. Nothing will happen to the running
> jobs on the exechosts.

OK, good to know. I've done that before and seen jobs finish, although some
googling suggested people have seen jobs get killed.

Does a qmaster restart, however, empty the queue? I imagine a reboot would
too, unless the queue is stored in a file?

-M

> -- Reuti
Re: [gridengine users] can't delete an exec host
That was it, thanks! The node had failed, so I didn't think there'd be
anything running on there, but two jobs were stuck in basic.q on that node.
I've killed them and now can remove host compute-2-4.

-M

On Wed, Sep 6, 2017 at 11:41 AM, Feng Zhang <prod.f...@gmail.com> wrote:

> Are there any running jobs on the queue instance compute-2-4@basic.q?
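For anyone else who hits this: the check Feng suggested can be made directly
against the queue instance before retrying the delete. A minimal sketch,
assuming the queue and host names from this thread (the job ID is a
placeholder, not from this thread):

```shell
# Show any jobs still attached to the problem host
qstat -u '*' | grep compute-2-4

# Or show the full state of just that queue instance
qstat -f -q basic.q@compute-2-4.local

# Kill the stragglers (job ID is a placeholder), then retry the delete
qdel -f <job_id>
qconf -de compute-2-4.local
```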
Re: [gridengine users] can't delete an exec host
On Wed, Sep 6, 2017 at 11:16 AM, Feng Zhang <prod.f...@gmail.com> wrote:

> It seems the SGE master did not get refreshed with the new hostgroup.
> Maybe you can try:
>
> 1. restart SGE master

Is it safe to do this with jobs queued and running? I think it's not
reliable, i.e. jobs can get killed and de-queued?

> or
>
> 2. change basic.q's "hostlist" to any node, like "compute-1-0.local",
> wait till it gets refreshed; then change it back to "@basichosts".

I've done this, but it's not refreshing (it's been about 10 minutes now).
I'm still getting the error when I try to delete exec host compute-2-4, and
qhost is still showing basic.q on the nodes in @basichosts.

Interestingly, host compute-2-4 was removed from another queue
(qlogin.basic.q) that also uses @basichosts, so it's something about
basic.q that's stuck.

Is there some way to refresh things other than restarting qmaster?

-M

> On Wed, Sep 6, 2017 at 10:29 AM, Michael Stauffer <mgsta...@gmail.com> wrote:
> > SoGE 8.1.8
> >
> > Hi,
> >
> > I'm having trouble deleting an execution host. I've removed it from the
> > host group, but when I try to delete with qconf, it says it's still part
> > of 'basic.q'. Here's the relevant output. Anyone have any suggestions?
> >
> > [root@chead ~]# qconf -de compute-2-4.local
> > Host object "compute-2-4.local" is still referenced in cluster queue
> > "basic.q".

> --
> Best,
>
> Feng

___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
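Feng's option 2 can also be done non-interactively with qconf's
attribute-replacement flag rather than editing the queue in an editor. A
sketch, assuming the queue and hostgroup names from this thread:

```shell
# Point basic.q's hostlist at a single host to force a refresh...
qconf -rattr queue hostlist compute-1-0.local basic.q

# ...then restore the hostgroup reference
qconf -rattr queue hostlist @basichosts basic.q
```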
[gridengine users] can't delete an exec host
SoGE 8.1.8

Hi,

I'm having trouble deleting an execution host. I've removed it from the
host group, but when I try to delete with qconf, it says it's still part of
'basic.q'. Here's the relevant output. Anyone have any suggestions?

[root@chead ~]# qconf -de compute-2-4.local
Host object "compute-2-4.local" is still referenced in cluster queue
"basic.q".

[root@chead ~]# qconf -sq basic.q
qname                 basic.q
hostlist              @basichosts
seq_no                0
load_thresholds       np_load_avg=1.74
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH
ckpt_list             NONE
pe_list               make mpich mpi orte unihost serial
rerun                 FALSE
slots                 8,[compute-1-2.local=3],[compute-1-0.local=7], \
                      [compute-1-1.local=7],[compute-1-3.local=7], \
                      [compute-1-5.local=8],[compute-1-6.local=8], \
                      [compute-1-7.local=8],[compute-1-8.local=8], \
                      [compute-1-9.local=8],[compute-1-10.local=8], \
                      [compute-1-11.local=8],[compute-1-12.local=8], \
                      [compute-1-13.local=8],[compute-1-14.local=8], \
                      [compute-1-15.local=8]
tmpdir                /tmp
shell                 /bin/bash
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                19G
h_vmem                19G

[root@chead ~]# qconf -shgrp @basichosts
group_name @basichosts
hostlist compute-1-0.local compute-1-2.local compute-1-3.local \
         compute-1-5.local compute-1-6.local compute-1-7.local \
         compute-1-8.local compute-1-9.local compute-1-10.local \
         compute-1-11.local compute-1-12.local compute-1-13.local \
         compute-1-14.local compute-1-15.local compute-2-0.local \
         compute-2-2.local compute-2-5.local compute-2-7.local \
         compute-2-8.local compute-2-9.local compute-2-11.local \
         compute-2-12.local compute-2-13.local compute-2-15.local \
         compute-2-6.local

Thanks

-M
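When a single queue instance keeps a stale host reference alive, qconf also
has a purge operation that strips host-specific settings for one host out of
a cluster queue. A sketch of that approach, not verified against this
cluster:

```shell
# Remove all instance-specific attribute settings for this host from basic.q
qconf -purge queue '*' basic.q@compute-2-4.local

# The host deletion should then no longer be blocked
qconf -de compute-2-4.local
```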
Re: [gridengine users] default scheduler and global configs?
On Tue, Aug 22, 2017 at 3:26 AM, Reuti <re...@staff.uni-marburg.de> wrote:

> Am 22.08.2017 um 00:38 schrieb Michael Stauffer:
> >
> > On Thu, Aug 17, 2017 at 7:39 AM, Reuti <re...@staff.uni-marburg.de> wrote:
> >
> > > My experience is that sometimes RQS blocks execution for unknown
> > > reasons while jobs should start according to their settings.
> >
> > A mysterious bug? I hope not. :/
>
> Unfortunately my experience is that there is something odd with RQS. I had
> the effect several times, especially if one uses a load sensor, that at
> some point no more jobs were scheduled. Disabling the RQS worked
> instantly, although the jobs should have been started before this.
>
> > I'm not sure how I'd run my cluster without RQS - it'd be a free-for-all
> > unless there's another way to limit users' resource consumption?
>
> Sure, but could you disable the RQS to test it? Setting the "enabled" flag
> already shows whether there is an issue in your case.

If I disable the slot and memory RQSs, then the stuck jobs run. I've got a
workaround - see the post I'll make in a minute to my other thread.

> > Also, I don't believe I have any load sensor running except for the
> > default. Should I try disabling that? How do I do that?
>
> It's not directly an issue of a custom load sensor, but of using any value
> which is *not* computed as a consumable, e.g. a memory load.

I don't believe I use anything other than the consumables. Or rather, I
don't recall setting anything up other than the consumables. Are there
default settings (SoGE 8.1.8) that would do what you're talking about?

-M

> -- Reuti
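Toggling the flag Reuti mentions is a one-line edit via `qconf -mrqs`. The
rule set below is a hypothetical minimal example for illustration, not the
poster's actual configuration:

```
{
   name         test_rqs
   description  Temporarily disabled while debugging stuck jobs
   enabled      FALSE
   limit        users {*} queues {all.q} to slots=32
}
```

With `enabled FALSE` the rules stay in place but are ignored by the
scheduler, so flipping the flag back to TRUE restores the limits without
re-entering them.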
Re: [gridengine users] default scheduler and global configs?
On Thu, Aug 17, 2017 at 7:39 AM, Reuti wrote:

> > > My experience is that sometimes RQS blocks execution for unknown
> > > reasons while jobs should start according to their settings.
> >
> > A mysterious bug? I hope not. :/
>
> Unfortunately my experience is that there is something odd with RQS. I had
> the effect several times, especially if one uses a load sensor, that at
> some point no more jobs were scheduled. Disabling the RQS worked
> instantly, although the jobs should have been started before this.

I'm not sure how I'd run my cluster without RQS - it'd be a free-for-all
unless there's another way to limit users' resource consumption?

Also, I don't believe I have any load sensor running except for the
default. Should I try disabling that? How do I do that?

-M
Re: [gridengine users] PE offers 0 slots?
On Thu, Aug 17, 2017 at 7:49 AM, Reuti <re...@staff.uni-marburg.de> wrote:

> > Am 13.08.2017 um 18:11 schrieb Michael Stauffer <mgsta...@gmail.com>:
> >
> > Thanks for the reply Reuti, see below.
> >
> > On Fri, Aug 11, 2017 at 7:18 PM, Reuti <re...@staff.uni-marburg.de> wrote:
> >
> > > What I notice below: defining h_vmem/s_vmem on a queue level means per
> > > job. Defining it on an exechost level means across all jobs. What is
> > > different between:
> > >
> > > all.q@compute-0-13.local   BP    0/10/16        9.14     lx-amd64
> > >         qf:h_vmem=40.000G
> > >         qf:s_vmem=40.000G
> > >         hc:slots=6
> > >
> > > all.q@compute-0-14.local   BP    0/10/16        9.66     lx-amd64
> > >         hc:h_vmem=28.890G
> > >         hc:s_vmem=30.990G
> > >         hc:slots=6
> > >
> > > qf = queue fixed
> > > hc = host consumable
> > >
> > > What is the definition of h_vmem/s_vmem in `qconf -sc` and their
> > > default consumptions?
> >
> > I thought this means that when it's showing qf, it's the per-job queue
> > limit, i.e. the queue has h_vmem and s_vmem limits for the job of 40G
> > (which it does). And then hc is shown when the host resources are less
> > than the per-job queue limit.
>
> Yes, the lower limit should be shown. So it's defined on both sides:
> exechost and queue?

Yes, the queue has a 40GB per-job limit, and h_vmem and s_vmem are
consumables on the exechosts.

-M
Re: [gridengine users] PE offers 0 slots?
I have a new insight which is very helpful. Thanks to Mark Bergman, who
mentioned that the 'PE offers 0 slots' error/warning can also mean memory
limitations.

If the stuck-job problem is happening to a user, I can get jobs to run if I
make no memory request, or make a memory request (i.e., -l h_vmem=...)
that's less than the default value for the complex. If I request more than
100M above the default, the job gets stuck with the "PE offers 0 slots"
warning. Interesting! Any thoughts on this? Again, this is happening when
there are plenty of resources on the nodes and plenty of room in the users'
quotas.

I'll test more tomorrow, but this may mean I can at least get a workaround
going by having a large default request and forcing users to make an
explicit memory request.

-M
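The workaround described above (a large default request plus explicit user
requests) lives in the complex definition, editable with `qconf -mc`. A
sketch of the relevant line, with illustrative values rather than the
poster's real configuration:

```
#name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem    MEMORY  <=     FORCED       JOB         8G       0
```

Setting `requestable` to FORCED should prevent jobs that omit an explicit
`-l h_vmem=...` from being dispatched at all, which sidesteps surprises
from the default column entirely.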
Re: [gridengine users] PE offers 0 slots?
> ##
> In regard to the 'int_test' PE you created: if you set the allocation
> rule to an integer, it means the job _must_ request a number of slots
> equal to, or a multiple of, this value.
> In your case, the PE is defined to use '8' as the allocation rule, so
> your job must request 8, 16, 24, ... slots. If you request 2, the job
> will never start, as the scheduler can't allocate 2 slots with the
> allocation rule set to 8.
>
> From man sge_pe:
> "If the number of tasks specified with the "-pe" option (see qsub(1))
> does not divide without remainder by this [number], the job will not be
> scheduled."
>
> So the fact that the job in int_test never starts if it requests 2 cores
> is totally fine from the scheduler's point of view.

OK, thanks very much, that explains it. I'll test accordingly.

> ##
> In regard to this issue in general: just wondering if you, or users on
> the cluster, use the '-R y' (reservation) option for their jobs? I have
> seen such behavior when someone submits a job with a reservation defined.
> The scheduler reserves slots on the cluster for this big job and doesn't
> let new jobs in (especially if the runtime is not bounded by h_rt). In
> this case there will be no messages in the scheduler log, which can be
> confusing.

I don't think users are using '-R y', but I'm not sure. Do you know how I
can tell? I think 'qstat -g c' shows that in the RES column? I don't think
I've ever seen non-zero there, but I'll pay attention. However, the
stuck-job issue is happening right now to at least one user, and the RES
column is all zeros.

-M

> Best regards,
> Mikhail Serkov
>
> On Fri, Aug 11, 2017 at 6:41 PM, Michael Stauffer <mgsta...@gmail.com> wrote:
>
> > Hi,
> >
> > Below I've dumped relevant configurations.
> >
> > Today I created a new PE called "int_test" to test the "integer"
> > allocation rule. I set it to 16 (16 cores per node), and have also
> > tried 8. It's been added as a PE to the queues we use. When I try to
> > run on this new PE, however, it *always* fails with the same
> > "PE ... offers 0 slots" error, even if I can run the same multi-slot
> > job using the "unihost" PE at the same time. I'm not sure if this helps
> > debug or not.
> >
> > Another thought - this behavior started happening more or less when I
> > tried implementing fairshare policy. I never seemed to get fairshare
> > working right. We haven't been able to confirm, but for some users it
> > seems this "PE 0 slots" issue pops up only after they've been running
> > other jobs for a little while. So I'm wondering if I've screwed up
> > fairshare in some way that's causing this odd behavior.
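The divisibility rule Mikhail quotes from man sge_pe is easy to sanity-check
outside the scheduler. A toy sketch with allocation rule 8, as in the
int_test PE discussed above:

```shell
# With allocation_rule set to a fixed integer, only slot counts that divide
# evenly by the rule can ever be scheduled.
rule=8
for req in 2 8 16 24; do
  if [ $((req % rule)) -eq 0 ]; then
    echo "$req slots: schedulable"
  else
    echo "$req slots: never starts (not a multiple of $rule)"
  fi
done
```

So a `-pe int_test 2` request can never be satisfied, exactly as described.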
Re: [gridengine users] default scheduler and global configs?
> > Where might I find default configurations for the scheduler and global
> > config? I've looked in /opt/sge, and downloaded the source and looked
> > in there, but there are no files matching sched_* other than the one
> > that holds the current scheduler config. And I'm not sure what file
> > holds the global config.
>
> Do you have classic spooling enabled? With BDB it should only be in the
> database. And these are set by some constants in
> source/libs/sgeobj/sge_conf.c AFAICS.

I think it's classic/text-file spooling. I have "execd_spool_dir
/opt/sge/default/spool" in my global config.

> > I'm looking to restore settings to before the time I made some changes
> > to try and implement fairshare policy, to see if this may be causing
> > the problems I've described in another thread with jobs getting stuck
> > in the queue.
>
> You can look up the values there, or install a new cell in the current
> SGE installation (with a name different from the one you use right now).
> By setting $SGE_CELL you can then switch between the instances and copy &
> paste the settings between open sessions. Different cells are completely
> unrelated and share only the SGE binaries.

I've found a backup and notes from before I tried switching to fairshare,
and have restored those scheduler and global conf settings. So far it isn't
helping.

> My experience is that sometimes RQS blocks execution for unknown reasons
> while jobs should start according to their settings.

A mysterious bug? I hope not. :/

-M
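Reuti's cell-switching suggestion, sketched as shell. The cell name
"testcell" is hypothetical, and the /opt/sge root is assumed from the paths
in this thread:

```shell
export SGE_ROOT=/opt/sge
export SGE_CELL=testcell                   # the freshly installed test cell
. "$SGE_ROOT/$SGE_CELL/common/settings.sh"

# Commands in this session now talk to the test cell's configuration
qconf -ssconf
```

Since cells are unrelated, settings can be compared side by side from two
shells, one per cell.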
Re: [gridengine users] PE offers 0 slots?
#name             shortcut  type      relop  requestable  consumable  default  urgency
load_medium       lm        DOUBLE    >=     NO           NO          0        0
load_short        ls        DOUBLE    >=     NO           NO          0        0
m_core            core      INT       <=     YES          NO          0        0
m_socket          socket    INT       <=     YES          NO          0        0
m_thread          thread    INT       <=     YES          NO          0        0
m_topology        topo      RESTRING  ==     YES          NO          NONE     0
m_topology_inuse  utopo     RESTRING  ==     YES          NO          NONE     0
mem_free          mf        MEMORY    <=     YES          NO          0        0
mem_total         mt        MEMORY    <=     YES          NO          0        0
mem_used          mu        MEMORY    >=     YES          NO          0        0
min_cpu_interval  mci       TIME      <=     NO           NO          0:0:0    0
np_load_avg       nla       DOUBLE    >=     NO           NO          0        0
np_load_long      nll       DOUBLE    >=     NO           NO          0        0
np_load_medium    nlm       DOUBLE    >=     NO           NO          0        0
np_load_short     nls       DOUBLE    >=     NO           NO          0        0
num_proc          p         INT       ==     YES          NO          0        0
qname             q         RESTRING  ==     YES          NO          NONE     0
rerun             re        BOOL      ==     NO           NO          0        0
s_core            s_core    MEMORY    <=     YES          NO          0        0
s_cpu             s_cpu     TIME      <=     YES          NO          0:0:0    0
s_data            s_data    MEMORY    <=     YES          NO          0        0
s_fsize           s_fsize   MEMORY    <=     YES          NO          0        0
s_rss             s_rss     MEMORY    <=     YES          NO          0        0
s_rt              s_rt      TIME      <=     YES          NO          0:0:0    0
s_stack           s_stack   MEMORY    <=     YES          NO          0        0
s_vmem            s_vmem    MEMORY    <=     YES          JOB         3000M    0
seq_no            seq       INT       ==     NO           NO          0        0
slots             s         INT       <=     YES          YES         1        1000
swap_free         sf        MEMORY    <=     YES          NO          0        0
swap_rate         sr        MEMORY    >=     YES          NO          0        0
swap_rsvd         srsv      MEMORY    >=     YES          NO          0        0
swap_total        st        MEMORY    <=     YES          NO          0        0
swap_used         su        MEMORY    >=     YES          NO          0        0
tmpdir            tmp       RESTRING  ==     NO           NO          NONE     0
virtual_free      vf        MEMORY    <=     YES          NO          0        0
virtual_total     vt        MEMORY    <=     YES          NO          0        0
virtual_used      vu        MEMORY    >=     YES          NO          0        0

Does this info help at all in diagnosing? Any other config info that could
help understand this?

-M

On Sun, Aug 13, 2017 at 12:11 PM, Michael Stauffer <mgsta...@gmail.com> wrote:

> Thanks for the reply Reuti, see below.
>
> On Fri, Aug 11, 2017 at 7:18 PM, Reuti <re...@staff.uni-marburg.de> wrote:
>
> > What I notice below: defining h_vmem/s_vmem on a queue level means per
> > job. Defining it on an exechost level means across all jobs. What is
> > different between:
> >
> > all.q@compute-0-13.local   BP    0/10/16        9.14     lx-amd64
> >         qf:h_vmem=40.000G
> >         qf:s_vmem=40.000G
> >         hc:slots=6
> >
> > all.q@compute-0-14.local   BP    0/10/16        9.66     lx-amd64
> >         hc:h_vmem=28.890G
> >         hc:s_vmem=30.990G
> >         hc:slots=6
> >
> > qf = queue fixed
> > hc = host consumable
> >
> > What is the definition of h_vmem/s_vmem in `qconf -sc` and their
> > default consumptions?
>
> I thought this means that when it's showing qf, it's the per-job queue
> limit, i.e. the queue has h_vmem and s_vmem limits for the job of 40G
> (which it does). And then hc is shown when the host resources are less
> than the per-job queue limit.
>
> [root@chead ~]# qconf -sc | grep vmem
> h_vmem   h_vmem   MEMORY   <=   YES   JOB   3100M   0
> s_vmem   s_vmem   MEMORY   <=   YES   JOB   3000M   0
>
> 'unihost' is the only PE I use. When users request multiple slots, they
> use 'unihost':
Re: [gridengine users] PE offers 0 slots?
> al" dropped because it is full
>
> queue instance "qlogin.q@compute-0-19.local" dropped because it is full
> queue instance "qlogin.q@compute-gpu-0.local" dropped because it is full
> queue instance "qlogin.q@compute-0-7.local" dropped because it is full
> queue instance "all.q@compute-0-0.local" dropped because it is full
>
> cannot run in PE "int_test" because it only offers 0 slots
>
> [mgstauff@chead ~]$ qquota -u mgstauff
> resource quota rule   limit   filter
>
> [mgstauff@chead ~]$ qconf -srqs limit_user_slots
> {
>    name         limit_user_slots
>    description  Limit the users' batch slots
>    enabled      TRUE
>    limit        users {pcook,mgstauff} queues {allalt.q} to slots=32
>    limit        users {*} queues {allalt.q} to slots=0
>    limit        users {*} queues {himem.q} to slots=6
>    limit        users {*} queues {all.q,himem.q} to slots=32
>    limit        users {*} queues {basic.q} to slots=40
> }
>
> There are plenty of consumables available:
>
> [root@chead ~]# qstat -F h_vmem,s_vmem,slots -q all.q
> queuename                   qtype resv/used/tot. load_avg arch     states
> -------------------------------------------------------------------------
> all.q@compute-0-0.local     BP    0/4/4          5.24     lx-amd64
>         qf:h_vmem=40.000G
>         qf:s_vmem=40.000G
>         qc:slots=0
Re: [gridengine users] PE offers 0 slots?
[mgstauff@chead ~]$ qconf -srqs limit_user_slots
{
   name         limit_user_slots
   description  Limit the users' batch slots
   enabled      TRUE
   limit        users {pcook,mgstauff} queues {allalt.q} to slots=32
   limit        users {*} queues {allalt.q} to slots=0
   limit        users {*} queues {himem.q} to slots=6
   limit        users {*} queues {all.q,himem.q} to slots=32
   limit        users {*} queues {basic.q} to slots=40
}

There are plenty of consumables available:

[root@chead ~]# qstat -F h_vmem,s_vmem,slots -q all.q
queuename                   qtype resv/used/tot. load_avg arch     states
-------------------------------------------------------------------------
all.q@compute-0-0.local     BP    0/4/4          5.24     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        qc:slots=0
-------------------------------------------------------------------------
all.q@compute-0-1.local     BP    0/10/15        9.58     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        qc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-10.local    BP    0/9/16         9.80     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=7
-------------------------------------------------------------------------
all.q@compute-0-11.local    BP    0/11/16        9.18     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-12.local    BP    0/11/16        9.72     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-13.local    BP    0/10/16        9.14     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=6
-------------------------------------------------------------------------
all.q@compute-0-14.local    BP    0/10/16        9.66     lx-amd64
        hc:h_vmem=28.890G
        hc:s_vmem=30.990G
        hc:slots=6
-------------------------------------------------------------------------
all.q@compute-0-15.local    BP    0/10/16        9.54     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=6
-------------------------------------------------------------------------
all.q@compute-0-16.local    BP    0/10/16        10.01    lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=6
-------------------------------------------------------------------------
all.q@compute-0-17.local    BP    0/11/16        9.75     lx-amd64
        hc:h_vmem=29.963G
        hc:s_vmem=32.960G
        hc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-18.local    BP    0/11/16        10.29    lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-19.local    BP    0/9/14         9.01     lx-amd64
        qf:h_vmem=5.000G
        qf:s_vmem=5.000G
        qc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-2.local     BP    0/10/15        9.24     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        qc:slots=5
-------------------------------------------------------------------------
all.q@compute-0-20.local    BP    0/0/4          0.00     lx-amd64
        qf:h_vmem=3.200G
        qf:s_vmem=3.200G
        qc:slots=4
-------------------------------------------------------------------------
all.q@compute-0-3.local     BP    0/11/15        9.62     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        qc:slots=4
-------------------------------------------------------------------------
all.q@compute-0-4.local     BP    0/12/15        9.85     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        qc:slots=3
-------------------------------------------------------------------------
all.q@compute-0-5.local     BP    0/12/15        10.18    lx-amd64
        hc:h_vmem=36.490G
        hc:s_vmem=39.390G
        qc:slots=3
-------------------------------------------------------------------------
all.q@compute-0-6.local     BP    0/12/16        9.95     lx-amd64
        qf:h_vmem=40.000G
        qf:s_vmem=40.000G
        hc:slots=4
-------------------------------------------------------------------------
all.q@compute-0-7.local     BP    0/10/16        9.59     lx-amd64
        hc:h_vmem=36.935G
        qf:s_vmem=40.000G
        hc:slots=5
[gridengine users] using VM's for compute nodes
SoGE 8.1.8, Rocks 6.2 - CentOS 6.8 Hi, Does anyone have experience using VMs either as compute nodes, or launching a VM as part of a qsub or qlogin job? I'm interested in running some nodes with CentOS 7, but for now at least we're stuck with CentOS 6.8 because of Rocks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] PE offers 0 slots?
On Mon, Feb 13, 2017 at 2:32 PM, Luis Huang wrote: > Check to make sure you haven't got any rqs interfering. > I don't see any rqs as interfering. qquota for the users in question shows that they have quota available on the queues to which their jobs are submitted. And qstat on the queue shows available resources. > I just had the exact same problem and it turns out that RQS was limiting it. > > > > Also check your qconf -spl to make sure your PE has got enough slots. > The PE is assigned slots, and the cluster has 500 total. Thanks for the reply. -M
[gridengine users] PE offers 0 slots?
SoGE 8.1.8 Hi, I'm getting some queued jobs with scheduling info that includes this line at the end: cannot run in PE "unihost" because it only offers 0 slots 'unihost' is the only PE I use. When users request multiple slots, they use 'unihost': ... -binding linear:2 -pe unihost 2 ... What happens is that these jobs aren't running when it otherwise seems like they should be, or they sit waiting in the queue for a long time even when the user has plenty of quota available within the queue they've requested, and there are enough resources available on the queue's nodes (slots and vmem are consumables). Any suggestions about how I might further understand this? Thanks
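[Editor's sketch] When the scheduler reports "offers 0 slots" for a PE, the usual first checks are the PE definition itself and the per-job scheduling diagnostics. A minimal sketch, assuming the PE name from the post and using a placeholder job id:

```shell
# Inspect the PE definition; slots and allocation_rule are the fields
# that most often explain "0 slots" (PE name "unihost" is from the post):
qconf -sp unihost

# Per-job scheduler messages (requires "schedd_job_info true", which
# this cluster has; <jobid> is a placeholder):
qstat -j <jobid>

# Check whether a resource quota set is what is actually pinching:
qquota -u '*'
```

These are read-only commands, so they are safe to run on a live qmaster while debugging.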
[gridengine users] rotating accounting files
SoGE 8.1.8 Hi, What happens to the accounting file when it's rotated using /opt/sge/util/logchecker.sh? Can it still be used by qacct, or is the info lost unless it's uncompressed and somehow put back into circulation? Thanks -M
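[Editor's sketch] By default qacct reads the live accounting file under $SGE_ROOT/$SGE_CELL/common, but it accepts an alternative file with -f, so a rotated file stays queryable once uncompressed. The paths and rotation suffix below are illustrative, not taken from the post:

```shell
# Query the live accounting file for one owner:
qacct -o mgstauff

# A rotated, compressed copy stays usable after uncompressing it
# (path and ".0.gz" suffix are assumptions about the rotation scheme):
gunzip -c /opt/sge/default/common/accounting.0.gz > /tmp/accounting.0
qacct -o mgstauff -f /tmp/accounting.0
```

So rotation loses nothing; it only moves the records out of qacct's default search path.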
Re: [gridengine users] fairshare ticketing setup
> > > > > > > I'm trying to set up a fair share policy.
> > > > Do I need to set up a basic share tree like listed here:
> > >
> > > id=0
> > > name=Root
> > > type=0
> > > shares=1
> > > childnodes=1
> > > id=1
> > > name=default
> > > type=0
> > > shares=1000
>
> Correct. All users which are collected under this "default" leaf will automatically show up there (after you save the tree, open it again and click the "+").
>
Thanks Reuti! It looks like it's working now. A couple of questions though: 1) newly submitted qsub jobs are showing up with 'stckt' values in qstat -ext. However, they're sometimes in the thousands, when the share tree I set up like above has shares=1000. What might that mean? 2) qlogin jobs that were already running got share tickets assigned, as shown by qstat -ext. And they have LOTS of tickets, some up to 24k. Does that seem normal?
> ===
> BTW: I notice that you have no backfilling enabled. Is this by intention?
>
Yes, but more by ignorance than thoughtfulness. I only know the basic idea of backfilling, and figured it needs each job's expected execution duration in order to work. And since I haven't set up timed queues and told users to submit with an expected execution duration (although it's on the list), I figured backfilling wouldn't make sense yet. Am I right? Thanks again. -M
> -- Reuti
> > > > in the global config was enough. Thanks for any thoughts.
> > >
> > > # qconf -sconf
> > >
> > > enforce_user                      auto
> > > auto_user_fshare                  100
> > >
> > > # qconf -ssconf
> > >
> > > algorithm                         default
> > > schedule_interval                 0:0:5
> > > maxujobs                          200
> > > queue_sort_method                 load
> > > job_load_adjustments              np_load_avg=0.50
> > > load_adjustment_decay_time        0:7:30
> > > load_formula                      np_load_avg
> > > schedd_job_info                   true
> > > flush_submit_sec                  0
> > > flush_finish_sec                  0
> > > params                            none
> > > reprioritize_interval             0:0:0
> > > halftime                          168
> > > usage_weight_list                 cpu=1.00,mem=0.00,io=0.00
> > > compensation_factor               5.00
> > > weight_user                       0.25
> > > weight_project                    0.25
> > > weight_department                 0.25
> > > weight_job                        0.25
> > > weight_tickets_functional         1000
> > > weight_tickets_share              10
> > > share_override_tickets            TRUE
> > > share_functional_shares           TRUE
> > > max_functional_jobs_to_schedule   2000
> > > report_pjob_tickets               TRUE
> > > max_pending_tasks_per_job         100
> > > halflife_decay_list               none
> > > policy_hierarchy                  OSF
> > > weight_ticket                     1.00
> > > weight_waiting_time               0.10
> > > weight_deadline                   360.00
> > > weight_urgency                    0.10
> > > weight_priority                   1.00
> > > max_reservation                   0
> > > default_duration                  INFINITY
Re: [gridengine users] fairshare ticketing setup
On Fri, Dec 16, 2016 at 11:50 AM, Reuti <re...@staff.uni-marburg.de> wrote: > > > On 16.12.2016 at 16:15, Michael Stauffer <mgsta...@gmail.com> wrote: > > > > SoGE 8.1.8 > > > > Hi, > > > > I'm trying to set up a fair share policy. > > Only functional fair share without honoring the past? 'stckt' values are from the share tree, which honors the past usage. > No, I am hoping to honor the past. I don't see any tickets of any kind from qstat, unless I add override tickets. Can you point out what I may have done wrong in the config I included in my original post? -M > -- Reuti > > > > I've followed what I've found online from some different sources, and my config is listed below. When I use qstat -ext, however, I'm not seeing any tickets issued under 'stckt'. I've tried adding override tickets to some jobs and they show up, and look to be influencing priority. > > > > Do I need to set up a basic share tree like listed here:
> >
> > id=0
> > name=Root
> > type=0
> > shares=1
> > childnodes=1
> > id=1
> > name=default
> > type=0
> > shares=1000
> > childnodes=NONE
> >
> > I'd thought that setting 'enforce_user auto' and 'auto_user_fshare 100' in the global config was enough. Thanks for any thoughts.
> >
> > # qconf -sconf
> >
> > enforce_user                      auto
> > auto_user_fshare                  100
> >
> > # qconf -ssconf
> >
> > algorithm                         default
> > schedule_interval                 0:0:5
> > maxujobs                          200
> > queue_sort_method                 load
> > job_load_adjustments              np_load_avg=0.50
> > load_adjustment_decay_time        0:7:30
> > load_formula                      np_load_avg
> > schedd_job_info                   true
> > flush_submit_sec                  0
> > flush_finish_sec                  0
> > params                            none
> > reprioritize_interval             0:0:0
> > halftime                          168
> > usage_weight_list                 cpu=1.00,mem=0.00,io=0.00
> > compensation_factor               5.00
> > weight_user                       0.25
> > weight_project                    0.25
> > weight_department                 0.25
> > weight_job                        0.25
> > weight_tickets_functional         1000
> > weight_tickets_share              10
> > share_override_tickets            TRUE
> > share_functional_shares           TRUE
> > max_functional_jobs_to_schedule   2000
> > report_pjob_tickets               TRUE
> > max_pending_tasks_per_job         100
> > halflife_decay_list               none
> > policy_hierarchy                  OSF
> > weight_ticket                     1.00
> > weight_waiting_time               0.10
> > weight_deadline                   360.00
> > weight_urgency                    0.10
> > weight_priority                   1.00
> > max_reservation                   0
> > default_duration                  INFINITY
[gridengine users] fairshare ticketing setup
SoGE 8.1.8

Hi,

I'm trying to set up a fair share policy. I've followed what I've found online from some different sources, and my config is listed below. When I use qstat -ext, however, I'm not seeing any tickets issued under 'stckt'. I've tried adding override tickets to some jobs and they show up, and look to be influencing priority.

Do I need to set up a basic share tree like listed here:

id=0
name=Root
type=0
shares=1
childnodes=1
id=1
name=default
type=0
shares=1000
childnodes=NONE

I'd thought that setting 'enforce_user auto' and 'auto_user_fshare 100' in the global config was enough. Thanks for any thoughts.

# qconf -sconf

enforce_user                      auto
auto_user_fshare                  100

# qconf -ssconf

algorithm                         default
schedule_interval                 0:0:5
maxujobs                          200
queue_sort_method                 load
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            none
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=1.00,mem=0.00,io=0.00
compensation_factor               5.00
weight_user                       0.25
weight_project                    0.25
weight_department                 0.25
weight_job                        0.25
weight_tickets_functional         1000
weight_tickets_share              10
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   2000
report_pjob_tickets               TRUE
max_pending_tasks_per_job         100
halflife_decay_list               none
policy_hierarchy                  OSF
weight_ticket                     1.00
weight_waiting_time               0.10
weight_deadline                   360.00
weight_urgency                    0.10
weight_priority                   1.00
max_reservation                   0
default_duration                  INFINITY
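[Editor's sketch] The minimal share tree discussed in this thread can be loaded without qmon using `qconf -Astree`, which reads a tree definition from a file. A sketch, with the file contents mirroring the tree quoted above (the temp-file path is arbitrary):

```shell
# Write the minimal share tree from the post to a file and load it.
cat > /tmp/stree <<'EOF'
id=0
name=Root
type=0
shares=1
childnodes=1
id=1
name=default
type=0
shares=1000
childnodes=NONE
EOF
qconf -Astree /tmp/stree

# Verify the loaded tree:
qconf -sstree
```

With 'enforce_user auto' set, auto-created users accumulate usage under the "default" leaf, which is what makes 'stckt' values appear in qstat -ext.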
[gridengine users] stdio permissions are ignoring ACL defaults
SoGE 8.1.8 Hi, We have some group data directories that are set up with the setgid ("group sticky") bit so that new files are all owned by the Linux group assigned to the data dir, and use ACLs to make new files group-readable and group-writeable by default, like so:

# file: .
# owner: pcook
# group: detre_group
# flags: -s-
user::rwx
group::rwx
other::---
default:user::rwx
default:group::rwx
default:other::---

We've noticed that SGE stdio files written in these dirs are created without group-write permission, i.e. they're 640 instead of 660:

-rw-r----- 1 mgstauff detre_group 0 2016-10-21 11:07 my.stderr
-rw-r----- 1 mgstauff detre_group 0 2016-10-21 11:07 my.stdout

So it seems to be ignoring or otherwise overriding the ACL defaults. Does anyone have an idea why this might be? This is the same whether stdio is set via the -o and -e options like above, or just uses the default naming, fwiw. Thanks -M
Re: [gridengine users] jobs not running even though resource quotas not met
> > Maybe it would be good to tell the user not to submit into a queue at all, but to request resources and let SGE select an appropriate queue for the job. > I have two main queues: all.q, a batch queue for newer compute nodes, and basic.q, a batch queue for older, slower nodes that are used much less often. I also have a separate queue for qlogin sessions (if I remember right, I set up a separate qlogin queue a long time ago when I first set this system up so I could have a time limit on sessions). Would it make sense to have a resource that differentiates between these queues, which users would request in order for SGE to choose the appropriate queue? Or should I leave it as I have it currently, in which all.q is the default, and if a user wants to run on basic.q, they request it manually via the qsub -q option? I'll probably be adding some queues soon that have different time limits, to better corral long-running jobs. I know there's a mechanism for doing this, but haven't looked into it yet. I imagine it's what you're suggesting here? > > and I hadn't looked carefully enough to notice that. So now I'm not sure about the couple of other times I've seen this in the past; it might have been something like that. > > > > Skylar, thanks for the qstat -w tip, I'll use that in the future. > > > > Reuti, if I were to adjust the setup not to use RQS, how would I limit users' resource usage? > > It was only suggested as a test. I saw situations where a combination of consumables and limits in RQS blocks the scheduling completely, showing something like "... offers only (-l none)." > > In case you have to limit the usage per user, you have to use them for sure. > OK thanks, I thought you were maybe suggesting there's another way to limit resources by user. -M
Re: [gridengine users] jobs not running even though resource quotas not met
Thanks Reuti, Skylar. Turns out it was a false alarm, sorry. The user hadn't told me they'd submitted to a different queue and I hadn't looked carefully enough to notice that. So now I'm not sure about the couple of other times I've seen this in the past; it might have been something like that. Skylar, thanks for the qstat -w tip, I'll use that in the future. Reuti, if I were to adjust the setup not to use RQS, how would I limit users' resource usage? -M On Wed, Oct 19, 2016 at 7:37 AM, Reuti <re...@staff.uni-marburg.de> wrote: > Hi, > > > On 19.10.2016 at 03:26, Michael Stauffer <mgsta...@gmail.com> wrote: > > > > SoGE 8.1.8 > > > > Hi, > > > > I'm using the consumables h_vmem, s_vmem and slots, and have RQSs to manage these. I've noticed sometimes that a user's jobs will sit in the queue even though their qquota output shows they haven't hit their limits, and "qstat -F h_vmem,s_vmem,slots" shows one or more nodes with enough resources available to run one or more of the queued-and-waiting jobs. > > > > Tonight I tried modifying the queue on some qw'ed jobs using qalter. The default queue is all.q, and first when I did 'qalter -q all.q ', the waiting job started running right away. I tried on some more waiting jobs but no effect. Then I did 'qalter -q all.q@ ' where was a host that was reporting sufficient resources via qstat -F. The job ran immediately. This worked for a few more jobs until resources were truly insufficient. > > > > Does anyone have an idea what might be going on or how to continue debugging? Thanks. > > I noticed such a behavior when RQS are in place. Can you adjust your setup not to use RQS, or test it temporarily without them? > > -- Reuti > > > > > > -M
[gridengine users] jobs not running even though resource quotas not met
SoGE 8.1.8 Hi, I'm using the consumables h_vmem, s_vmem and slots, and have RQSs to manage these. I've noticed sometimes that a user's jobs will sit in the queue even though their qquota output shows they haven't hit their limits, and "qstat -F h_vmem,s_vmem,slots" shows one or more nodes with enough resources available to run one or more of the queued-and-waiting jobs. Tonight I tried modifying the queue on some qw'ed jobs using qalter. The default queue is all.q, and first when I did 'qalter -q all.q ', the waiting job started running right away. I tried on some more waiting jobs but no effect. Then I did 'qalter -q all.q@ ' where was a host that was reporting sufficient resources via qstat -F. The job ran immediately. This worked for a few more jobs until resources were truly insufficient. Does anyone have an idea what might be going on or how to continue debugging? Thanks. -M
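[Editor's sketch] Before forcing jobs with qalter, it can help to compare what the quota system thinks is left against what the scheduler says about the specific job. A sketch with placeholder user and job id (both are assumptions, not from the post):

```shell
# Quota headroom for the affected user ("someuser" is a placeholder):
qquota -u someuser

# The scheduler's own explanation for a waiting job (needs
# "schedd_job_info true", which this cluster has; 123456 is a placeholder):
qstat -j 123456

# Per-host availability of the consumables these jobs request:
qstat -F h_vmem,s_vmem,slots -q all.q
```

A mismatch between qquota's headroom and the qstat -j messages usually points at the RQS rule (or rule ordering) rather than at the hosts.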
Re: [gridengine users] allowing a single level of qsub recursion
Reuti, On Sat, Apr 16, 2016 at 6:08 AM, Reuti <re...@staff.uni-marburg.de> wrote: > Hi, > > On 07.04.2016 at 23:15, Michael Stauffer wrote: > > > SoGE 8.1.8 > > > > Hi, > > > > I'm trying to figure out how to allow users to submit qsub jobs from qlogin sessions and from other qsub jobs, while preventing starting qlogin sessions from within a qlogin session, and preventing poorly-written scripts from running out of control recursively and submitting a dangerous number of jobs. That is, I'd like to allow initiating a single level of qsub calls from exec hosts. > > > > I can think of some hacks to do this, but am wondering if there's an official way? > > > > For qlogins, to prevent calling qlogin from a qlogin session: I could check the job spool 'environment' file during login, and if QRSH_PORT is not the head node, I can log out of the new session. Or even more easily, I could write a wrapper for qlogin that checks that it's being called from the head node. > > For `qlogin` these settings are used: > > $ qconf -sconf > ... > qlogin_command builtin > qlogin_daemon builtin > > As these are not only global settings, they can be changed for each particular machine. For each exechost it could be changed/created to read: > > $ qconf -sconf nodeXY > qlogin_command /bin/false > Thanks for the reply. I'm finally getting back to this question. Although this would work, I think I'll simply check the QRSH_PORT var in the qlogin environment to see if it originated from the head node or not. Then I can print a clean error message. -M
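[Editor's sketch] The wrapper idea above can be made concrete. A minimal sketch, assuming the check reduces to "QRSH_PORT is only present in the environment inside an interactive qrsh/qlogin session"; the real qlogin path and the exact QRSH_PORT semantics are assumptions to verify against your installation:

```shell
#!/bin/sh
# Hypothetical qlogin wrapper. The binary path below and the reliance on
# QRSH_PORT are assumptions. SGE exports QRSH_PORT inside qrsh/qlogin
# sessions, so an inherited QRSH_PORT is treated here as "we are already
# inside an interactive session".
deny_nested_qlogin() {
    if [ -n "$QRSH_PORT" ]; then
        echo "qlogin: nested interactive sessions are not allowed (QRSH_PORT=$QRSH_PORT)" >&2
        return 1
    fi
    return 0
}

# The real wrapper would continue with:
#   deny_nested_qlogin || exit 1
#   exec /opt/sge/bin/lx-amd64/qlogin "$@"

# Demonstrate both branches of the check:
( QRSH_PORT="compute-1-2.local:40123"; export QRSH_PORT; deny_nested_qlogin 2>/dev/null ) || echo "denied"
( unset QRSH_PORT; deny_nested_qlogin ) && echo "allowed"
```

For the qsub half of the question, a server-side JSV would be the more robust place to count submission depth, since a client wrapper can be bypassed by calling the real binary directly.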
[gridengine users] allowing a single level of qsub recursion
SoGE 8.1.8 Hi, I'm trying to figure out how to allow users to submit qsub jobs from qlogin sessions and from other qsub jobs, while preventing starting qlogin sessions from within a qlogin session, and preventing poorly-written scripts from running out of control recursively and submitting a dangerous number of jobs. That is, I'd like to allow initiating a single level of qsub calls from exec hosts. I can think of some hacks to do this, but am wondering if there's an official way? For qlogins, to prevent calling qlogin from a qlogin session: I could check the job spool 'environment' file during login, and if QRSH_PORT is not the head node, I can log out of the new session. Or even more easily, I could write a wrapper for qlogin that checks that it's being called from the head node. For qsub, I already have a wrapper that I use to verify certain settings (created before I knew about JSV). In this, I could check if I'm on an exec host, and if so, create a state file in /tmp with the returned job id from qsub. Then when qsub is called again, if a state file exists with the current job id, I know I was initiated from an exec host and should deny the new qsub request. Any thoughts? Thanks. -M
Re: [gridengine users] rebooting nodes nicely - what happened?
On Tue, Mar 1, 2016 at 6:19 PM, Reuti <re...@staff.uni-marburg.de> wrote: > Hi, > > On 01.03.2016 at 23:44, Michael Stauffer wrote: > > SoGE 8.1.8 > > > > I need to reboot my compute nodes after the glibc patch, and wanted to do so nicely, i.e. wait for each node's jobs to finish before rebooting. I've done this before and it worked, but now my setup is a little more complicated and I changed my reinstall script. > > > > I have a queue for qsub jobs and one for qlogin. Each is assigned a different number of cores per node so that some nodes always have at least a couple of cores available for qlogin sessions, and some nodes are used only for qsub jobs. > > > > However my reinstall script (taken from the sge examples, listed below) does its thing by submitting a job that requests all the cores on a node, so it only runs when other jobs have completed. So I created a new queue called reboot.q that is allotted all cores on all nodes. My understanding was that the queues would cooperatively manage resources, so if a node was using, for example, 8 cores for jobs on my qsub queue, then my reboot job requesting 16 cores would wait until those jobs finish. > > Did you limit the overall slot count across all queues by a consumable complex on an exechost level ("complex_values slots=8") and/or with an RQS? Otherwise each queue can use all the slot counts defined in its particular queue definition (and essentially overload the nodes). > No, at least not knowingly. I should do this also for regular usage to avoid overloading. How do I actually do this? That is, I don't know from what you say how to actually do it. My queues look like this for 'slots' (e.g. 
for the qsub queue:)

slots            1,[compute-0-0.local=0],[compute-0-1.local=15], \
                 [compute-0-2.local=15],[compute-0-3.local=15], \
                 [compute-0-4.local=16],[compute-0-5.local=16], \
                 [compute-0-6.local=16],[compute-0-7.local=16], \
                 [compute-0-9.local=16],[compute-0-10.local=16], \
                 [compute-0-11.local=16],[compute-0-12.local=16], \
                 [compute-0-13.local=16],[compute-0-14.local=16], \
                 [compute-0-15.local=16],[compute-0-16.local=16], \
                 [compute-0-17.local=16],[compute-0-18.local=16], \
                 [compute-0-8.local=16],[compute-0-19.local=16], \
                 [compute-0-20.local=16]
complex_values   NONE

Do I do something similar for the complex_values parameter? > But when I ran my script, all nodes rebooted for reinstall immediately. I guess I don't understand things correctly? Can someone set me straight? How do I do a node reboot only after jobs have finished under these circumstances? > > What about attaching the "exclusive" complex (needs to be defined manually in `qconf -mc`) to each exechost and requesting this when submitting the reboot job? Even one slot would be enough then to get exclusive access to each node. > This sounds great. Can you give me details on how to do this? What are the values needed for the complex configuration params? Something like this?

name       shortcut  type  relop  requestable  consumable  default  urgency
exclusive  ex        BOOL  ==     YES          NO          0        0

How is it attached to each exechost? Thanks very much. -M > -- Reuti > > script: > > ME=`hostname` > > EXECHOSTS=`qconf -sel` > > for TARGETHOST in $EXECHOSTS; do > > if [ "$ME" == "$TARGETHOST" ]; then > > echo "Skipping $ME. 
This is the submission host" > > else > > numprocs=`qconf -se $TARGETHOST | awk '/^processors/ {print $2}'` > > /opt/rocks/bin/rocks set host boot $TARGETHOST action=install > > qsub -p 1024 -pe unihost $numprocs -binding linear:${numprocs} -q reboot.q@$TARGETHOST /root/admin/scripts/sge-reboot.qsub > > echo "Set $TARGETHOST for Reinstallation" > > fi > > done > > Thanks > > -M
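[Editor's sketch] Reuti's "exclusive" complex suggestion breaks into three steps: define the complex, attach it to the exec hosts, and request it from the reboot job. A sketch; the complex values shown follow the thread's BOOL proposal, and note that newer grid engines also offer an EXCL relop with a consumable for true exclusive scheduling, so check your version's complex(5) before copying this:

```shell
# 1. Define the complex: qconf -mc opens the complex table; add a line
#    shaped like the one proposed in the thread:
#    name       shortcut  type  relop  requestable  consumable  default  urgency
#    exclusive  ex        BOOL  ==     YES          NO          0        0
qconf -mc

# 2. Attach it to each exec host (hostname is a placeholder):
qconf -rattr exechost complex_values exclusive=true compute-0-0.local

# 3. Request it from the reboot job, so a single slot suffices to hold
#    the whole node (script path is from the thread):
qsub -l exclusive=true -q reboot.q@compute-0-0.local /root/admin/scripts/sge-reboot.qsub
```

Separately, capping the per-host total with "complex_values slots=16" on each exechost (step 2, with slots instead of exclusive) is what stops overlapping queues from overloading a node.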
[gridengine users] rebooting nodes nicely - what happened?
SoGE 8.1.8 Hi, I need to reboot my compute nodes after the glibc patch, and wanted to do so nicely, i.e. wait for each node's jobs to finish before rebooting. I've done this before and it worked, but now my setup is a little more complicated and I changed my reinstall script. I have a queue for qsub jobs and one for qlogin. Each is assigned a different number of cores per node so that some nodes always have at least a couple of cores available for qlogin sessions, and some nodes are used only for qsub jobs. However my reinstall script (taken from the sge examples, listed below) does its thing by submitting a job that requests all the cores on a node, so it only runs when other jobs have completed. So I created a new queue called reboot.q that is allotted all cores on all nodes. My understanding was that the queues would cooperatively manage resources, so if a node was using, for example, 8 cores for jobs on my qsub queue, then my reboot job that's requesting 16 cores would wait until those jobs finish. But when I ran my script, all nodes rebooted for reinstall immediately. I guess I don't understand things correctly? Can someone set me straight? How do I do a node reboot only after jobs have finished under these circumstances?

script:

ME=`hostname`
EXECHOSTS=`qconf -sel`

for TARGETHOST in $EXECHOSTS; do
    if [ "$ME" == "$TARGETHOST" ]; then
        echo "Skipping $ME. This is the submission host"
    else
        numprocs=`qconf -se $TARGETHOST | \
            awk '/^processors/ {print $2}'`
        /opt/rocks/bin/rocks set host boot $TARGETHOST action=install
        qsub -p 1024 -pe unihost $numprocs -binding linear:${numprocs} -q reboot.q@$TARGETHOST \
            /root/admin/scripts/sge-reboot.qsub
        echo "Set $TARGETHOST for Reinstallation"
    fi
done

Thanks -M
Re: [gridengine users] Is qacct 'maxvmem' reporting reliable?
On Mon, Oct 26, 2015 at 5:43 PM, Jesse Becker wrote: > On Mon, Oct 26, 2015 at 02:22:19PM -0700, Skylar Thompson wrote: > >> Hi Mark, >> >> IIRC maxvmem is a sampled value, so if the job experiences rapid >> fluctuations in vmem usage, it might not be entirely accurate. >> > > As you said, vmem is a polled value, and will change over time. > However, maxvmem should be a strictly non-decreasing value. > Thanks to both of you. This is good to know, but disappointing. If we get motivated enough I'll see if I can find where the polling interval is coded in the source and try decreasing it. Whether it's that straightforward, I have no idea. > > There were some versions of SGE (et al.) where the vmem reporting was > invalid above 4G of RAM because a 32-bit int was used for tracking this > value. I believe that all *current* versions of SGE derivatives (open > source and commercial) have this fixed. The low reporting I'm seeing is with values much less than 4G, so it's probably not this issue. -M
[gridengine users] Is qacct 'maxvmem' reporting reliable?
SoGE 8.1.8 Hi, I've been directing users to run qacct to look at completed jobs and check the maxvmem field in order to set the memory limit requests (h_vmem & s_vmem) when running similar jobs in the future. But a couple of users have seen maxvmem reported as a much lower value than is needed for the job to run. 'man accounting' mentions the ACCT_RESERVED_USAGE parameter set in qconf. 'qconf -sconf' doesn't show this parameter at all, so can I assume it's false, which means qacct's maxvmem should reflect the actual max vmem usage? What might I be doing wrong or misunderstanding? Thanks. -M
Re: [gridengine users] how to migrate OGS
Hi again, I've got my cluster almost back up after upgrading to Rocks 6.2. Regarding installing SoGE 8.1.8 - I'm wondering how to do this. Is it as simple as yum-installing the rpm(s?) on the head and compute nodes, then using the load_sge_config.sh script on the head node? I haven't found any documentation on this. Thanks for any suggestions. -M On Sat, Aug 1, 2015 at 2:49 PM, Michael Stauffer mgsta...@gmail.com wrote: On Sat, Aug 1, 2015 at 1:47 PM, Reuti re...@staff.uni-marburg.de wrote: On 31.07.2015 at 20:55, Michael Stauffer wrote: I haven't, I'm doing the upgrades tomorrow. I've looked at the scripts you mention and they look different from util/upgrade_modules/save_sge_config.sh and util/upgrade_modules/load_sge_config.sh, which have also been recommended to me. Seems they don't call each other. Any idea why there are two such paired scripts and which I should use? These are different types of backup/restore: save/load_sge_config.sh you can use multiple times, although no update is intended - just to save/backup and restore a configuration; this will create text files with the information gained by issuing SGE commands. With inst_sge backup/restore the classic or BDB files are saved as they are. With the update procedure all daemons on the exechosts will be updated to a new version (although often an un-tar of the binaries is sufficient). inst_sge will also load a formerly saved configuration, which was done with save_sge_config.sh by hand (depending on the command line options). === In your case I assumed you already have an empty SGE configuration, so that no update of the binaries on all machines needs to be performed. Yes. I'll be installing SoGE. I'll try the save/load_sge_config.sh method first. Fingers crossed that the config load is the same between SoGE and OGS. Thanks! -M -- Reuti Thanks. 
-M On Fri, Jul 31, 2015 at 3:33 AM, Rémy Dernat remy...@gmail.com wrote: Have you tried ./inst_sge -rst on your fresh install after a backup of the previous SGE ( ./inst_sge -bup )? On 29 Jul 2015 at 22:29, Michael Stauffer mgsta...@gmail.com wrote: On Wed, Jul 29, 2015 at 4:01 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, On 29.07.2015 at 21:22, Michael Stauffer wrote: I'm upgrading a Rocks 6.1 system to 6.2 this weekend. I'll be doing a clean install of Rocks. Both Rocks 6.1 & 6.2 have OGS/GE 2011.11p1 (at least as far as I can tell, 6.2 has the same version). Would it be possible to move to a newer version? AFAICS OGS didn't get many updates for some time now and I don't know about its current state. Thanks for the reply Reuti. Yes, OGS seems very stagnant - the current version is from July 2012. In terms of a newer version, I figure you mean using Son of GE? I probably chose OGS when I set up this cluster a couple of years ago because it came with an OGS roll. I imagine my setup should migrate easily to Son of GE? Univa GE Core looks old too; the master on github is 3 years old. I imagine other forks will work with Rocks but I've never tried and don't know about installation complications. I'll have to ask on the Rocks list. Does anyone here have experience using Son of GE on Rocks? Could someone help me with the steps to do the migration of my SGE setup? I have this so far: On the old Rocks 6.1 install: 1) run /opt/gridengine/util/upgrade_modules/save_sge_config.sh and save to an external location. On the new Rocks 6.2 install: 1) copy over the output from save_sge_config.sh 2) Now, do I run $SGE_ROOT/inst_sge -upd OR /opt/gridengine/util/upgrade_modules/load_sge_config.sh After you installed the new version (and maybe removed the usual all.q, defined hosts/hostgroups, ...) this can be used to load the old configuration into the new empty SGE installation. 
But I remember that it may be necessary to run it more than once, in case there are some mutual references (like in the list of subordinated queues). The script can fail anyway, in case there are new entries in the definition of objects when you upgrade to a much newer version (well, these could be added by hand to the text files if it's known what is missing). Yes, I saw too that load_sge_config may need to be run twice. I'm less optimistic about this working if I switch to Son of GE - but if it fails I can recreate by hand if I have to. -M -- Reuti OR /opt/gridengine/util/upgrade_modules/inst_upgrade.sh Are there any options for these commands that I haven't been able to figure out? Or do I do something else altogether? Thanks! -M
Re: [gridengine users] how to migrate OGS
I haven't, I'm doing the upgrades tomorrow. I've looked at the scripts you mention and they look different from util/upgrade_modules/save_sge_config.sh and util/upgrade_modules/load_sge_config.sh, which have also been recommended to me. Seems they don't call each other. Any idea why there are two such paired scripts and which I should use? Thanks. -M On Fri, Jul 31, 2015 at 3:33 AM, Rémy Dernat remy...@gmail.com wrote: Have you tried ./inst_sge -rst on your fresh install after a backup of the previous SGE ( ./inst_sge -bup )? On 29 Jul 2015 at 22:29, Michael Stauffer mgsta...@gmail.com wrote: On Wed, Jul 29, 2015 at 4:01 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, On 29.07.2015 at 21:22, Michael Stauffer wrote: I'm upgrading a Rocks 6.1 system to 6.2 this weekend. I'll be doing a clean install of Rocks. Both Rocks 6.1 & 6.2 have OGS/GE 2011.11p1 (at least as far as I can tell, 6.2 has the same version). Would it be possible to move to a newer version? AFAICS OGS didn't get many updates for some time now and I don't know about its current state. Thanks for the reply Reuti. Yes, OGS seems very stagnant - the current version is from July 2012. In terms of a newer version, I figure you mean using Son of GE? I probably chose OGS when I set up this cluster a couple of years ago because it came with an OGS roll. I imagine my setup should migrate easily to Son of GE? Univa GE Core looks old too; the master on github is 3 years old. I imagine other forks will work with Rocks but I've never tried and don't know about installation complications. I'll have to ask on the Rocks list. Does anyone here have experience using Son of GE on Rocks? Could someone help me with the steps to do the migration of my SGE setup? 
I have this so far: On the old Rocks 6.1 install: 1) run /opt/gridengine/util/upgrade_modules/save_sge_config.sh and save to an external location On the new Rocks 6.2 install: 1) copy over the output from save_sge_config.sh 2) Now, do I run $SGE_ROOT/inst_sge -upd OR /opt/gridengine/util/upgrade_modules/load_sge_config.sh After you installed the new version (and maybe removed the usual all.q, defined hosts/hostgroups, ...) this can be used to load the old configuration into the new empty SGE installation. But I remember that it may be necessary to run it more than once, in case there are some mutual references (like in the list of subordinated queues). The script can fail anyway, in case there are new entries in the definition of objects when you upgrade to a much newer version (well, these could be added by hand to the text files if it's known what is missing). Yes, I saw too that load_sge_config may need to be run twice. I'm less optimistic about this working if I switch to Son of GE - but if it fails I can recreate by hand if I have to. -M -- Reuti OR /opt/gridengine/util/upgrade_modules/inst_upgrade.sh Are there any options for these commands that I haven't been able to figure out? Or do I do something else altogether? Thanks! -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
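For the archives, the save/load pair discussed above is usually driven roughly like this. A sketch only: the backup directory path is my invention, and flags should be checked against each script's usage output on your version.

```
## on the old Rocks 6.1 qmaster
cd $SGE_ROOT/util/upgrade_modules
./save_sge_config.sh /state/partition1/sge-backup   # writes one text file per object
scp -r /state/partition1/sge-backup elsewhere:/safe/place

## on the freshly installed Rocks 6.2 qmaster (after removing all.q etc.)
cd $SGE_ROOT/util/upgrade_modules
./load_sge_config.sh /state/partition1/sge-backup
./load_sge_config.sh /state/partition1/sge-backup   # second pass for mutual references
```

The second load pass is per Reuti's note above that mutually referencing objects (e.g. subordinate queues) may not resolve on the first run.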
[gridengine users] how to migrate OGS
Hi, I'm upgrading a Rocks 6.1 system to 6.2 this weekend. I'll be doing a clean install of Rocks. Both Rocks 6.1 and 6.2 have OGS/GE 2011.11p1 (at least as far as I can tell, 6.2 has the same version). Could someone help me with the steps to do the migration of my SGE setup? I have this so far: On the old Rocks 6.1 install: 1) run /opt/gridengine/util/upgrade_modules/save_sge_config.sh and save to an external location On the new Rocks 6.2 install: 1) copy over the output from save_sge_config.sh 2) Now, do I run $SGE_ROOT/inst_sge -upd OR /opt/gridengine/util/upgrade_modules/load_sge_config.sh OR /opt/gridengine/util/upgrade_modules/inst_upgrade.sh Are there any options for these commands that I haven't been able to figure out? Or do I do something else altogether? Thanks! -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] squeezing in other jobs
On Thu, May 14, 2015 at 4:31 PM, Reuti re...@staff.uni-marburg.de wrote: Am 14.05.2015 um 20:22 schrieb Michael Stauffer: OGS, via Rocks 6.1 Hi, Sometimes users have jobs running that fill their quota, and want to suspend one of them in order to run a different immediate job, and then continue the suspended job. I've experimented with suspending running jobs and submitting a new job, but the new job doesn't run. Should it? Is there a different way to do this? How should SGE know which job to suspend to lower the quota temporarily? There is no look-ahead feature in SGE to make such decisions. Once resources are granted, they are used up also by suspended jobs. I was experimenting with suspending the jobs manually, so SGE wouldn't have to decide. But in any case, it seems my goal isn't possible using suspension, thanks, so I'll stick with placing holds on queued jobs, and look into checkpointing. -M -- Reuti Related to this, I've seen that if the user's quota is full and they also have a number of queued jobs, the queued jobs can have a hold placed on them, and a newly submitted job will then run once one of the running jobs has completed. This works for some cases, but if the running jobs are long, it's less attractive. Thanks -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
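For the archives, the hold-based workaround described above in command form (the job id and script name are placeholders):

```
qalter -h u 12345    # place a user hold on a queued (not yet running) job
qsub urgent_job.sh   # the new job takes the next freed slot ahead of the held job
qrls 12345           # release the hold once the urgent job is in
```

Note the hold only keeps a *queued* job from being scheduled; it does not free resources already granted to running jobs, which is exactly the limitation discussed in this thread.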
Re: [gridengine users] Spooling framework failed
Hi, So my / filesystem filled up; it seems that was the issue. I had to put my hostname back in act_qmaster, and then qmaster started up again. Thanks, Google! -M On Tue, May 5, 2015 at 5:08 PM, Michael Stauffer mgsta...@gmail.com wrote: OGS/GE 2011.11p1 (Rocks 6.1) Hi, Just a bit ago I started getting errors like this for each host when changing RQSs: error writing object qlogin.long.q/compute-0-20.local to spooling database Queue instance state of qlogin.long.q@compute-0-20.local not modified: Spooling framework failed Around the same time I was downloading a new Rocks roll, creating a roll and rebuilding the distro. It seems unlikely this would have interfered with OGS, but I don't know of anything else unusual that was going on at the same time. I couldn't find anything via Google about 'spooling framework failed', save for the SGE header files that define this error message. I tried restarting qmaster, but things have gotten worse: [root@chead ~]# /etc/init.d/sgemaster.chead softstop shutting down Grid Engine qmaster [root@chead ~]# /etc/init.d/sgemaster.chead start starting sge_qmaster sge_qmaster start problem sge_qmaster didn't start! [root@chead ~]# /etc/init.d/sgemaster.chead start sge_qmaster didn't start! This is not a qmaster host! Check your /opt/gridengine/default/common/act_qmaster file! Egads! Please help, I'm out of my league now. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
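Since the root cause turned out to be a full / filesystem, a small cron watchdog can catch this before spooling fails. A sketch only; the 90% threshold is an arbitrary choice, not from the thread:

```shell
#!/bin/sh
# Warn before / fills up and qmaster spooling starts failing silently.
use=$(df -P / | awk 'NR==2 {sub(/%/,"",$5); print $5}')
if [ "$use" -ge 90 ]; then
    echo "WARNING: / is ${use}% full on $(hostname)"
    # e.g. pipe the warning to: mail -s "qmaster disk space" root
fi
```

Drop it into /etc/cron.hourly on the qmaster host (or adapt for your mail setup).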
Re: [gridengine users] load graph
Message: 1 Date: Fri, 17 Apr 2015 10:12:10 +0200 From: Jacques Foucry jacques.fou...@novasparks.com To: users@gridengine.org Users users@gridengine.org Subject: [gridengine users] load graph Message-ID: 5530c05a.6060...@novasparks.com Content-Type: text/plain; charset=utf-8 Hello folks, Is there a way to have (maybe on the qmaster) some load graphs of the grid? Thanks for your help. Jacques I've got these in a list, haven't tried them yet though: UBMoD: Collecting Statistical Data of Grid Engine Jobs The open source UBMoD is a tool for retrieving old job data and doing some statistics on it. Their description from http://ubmod.sourceforge.net/: UBMoD (UB Metrics on Demand) is an open source tool for collecting and mining statistical data from cluster resource managers (such as TORQUE, OpenPBS, and SGE) commonly found in high-performance computing environments. It has been developed by the Center for Computational Research at the University at Buffalo, SUNY and presents resource utilization including CPU cycles consumed, total jobs, average wait time, etc. for individual users, research groups, departments, and decanal units. The web-based user interface provides a dashboard for displaying resource consumption along with fine-grained control over the time period and resources displayed. SunGrid Graphical Accounting Engine The description from http://rdlab.lsi.upc.edu/index.php/serveis/s-gae.html: s-gae is a web application designed to display accounting information generated by Oracle Grid Engine (formerly SunGrid Engine) or its free forks such as Open Grid Scheduler, Son of Grid Engine, etc. as well as non-free forks such as Univa Grid Engine. This gathered data is stored in a database in order to display eye-candy charts grouped by user, queue or full cluster. Moreover, you can use several filter options to customize the results.
Qmem: Grid Engine Memory Usage Statistics From: https://github.com/txemaheredia/qmem Qmem is a script designed to describe the memory usage of a SGE cluster. If your cluster has memory restrictions, the usage of qstat solely is not enough to monitor its state properly. Qmem attempts to solve that. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] project quota details
Ed, thanks for the reply and offer. Here's an example of what I want to do (for slot quotas; I figure setup for h_vmem and s_vmem RQSs will be similar):

project name   # slots aggregate   # slots per user   users
------------   -----------------   ----------------   -----
lab1           100                 40                 ted,ann,bob,jim
lab2           80                  15                 ann,cin,fred,jen
                                   50                 lan (lan's a power user in lab2)

For now each project will be assigned exclusively to one of two qsub queues, but that part should be straightforward if you just show me how to handle the above for one queue. And each will go to the qlogin queue with different slot limits, but that should be straightforward too if I know how to do it for a qsub queue. The FE is 2 years old, has 16 2.2 GHz Xeon cores, and 64GB RAM. Thanks! -M On Mon, Apr 13, 2015 at 9:05 PM, Ed Lauzier elauzi...@perlstar.com wrote: Hi Michael, Send some basic examples of what you want to do and I'll fire off a basic RQS config that will get you going. There is a lot to it, esp. for project-level fairshare settings. Also, remember that if you want decent response, you need a scheduler with at least 2 CPUs. Best to have 4 CPUs so that decisions can be made faster during the scheduling cycle and worker threads can do their thing. Also consider looking at using the Perl JSV for runtime limits enforcement. It may be best to get Univa to assist you for a day, even over the phone, if you can justify the expense. It is well worth it to get the new Univa Grid Engine. -Ed -Original Message- *From:* Reuti [mailto:re...@staff.uni-marburg.de] *Sent:* Monday, April 13, 2015 06:22 PM *To:* 'Michael Stauffer' *Cc:* 'Gridengine Users Group' *Subject:* Re: [gridengine users] project quota details Hi, Am 13.04.2015 um 23:24 schrieb Michael Stauffer: OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm looking to set up project-based quota management. I'm at a university and different labs will be signing up for different quotas for the users in their labs.
I understand that I can: - add users to one or more projects - assign quotas (slots and memory in my case) to projects that will limit the total concurrent resource usage by project - have users choose a particular project when they submit a job (needed for users who do work for multiple labs) I'm wondering if I can also set a per-user quota within a project quota that will limit how much of a resource any individual from the project can use at once. That is, I'd like that limit to be lower than the project's limit on all project users, so that no one user in a project can use all the project's resources at once. Could different per-user quotas be assigned for different users within a project? e.g. a power user in a project might generally need more slots than other users. Yes. You need to phrase these individual limits in a second RQS. I.e. one RQS will limit the overall consumption per project, the second one will limit the combinations of projects and users to varying limits. -- Reuti Any suggestions on strategies for this kind of resource management would be a great help. Thanks -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
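For the archives, Reuti's two-rule-set suggestion applied to the table above might look like this. An untested sketch per sge_resource_quota(5): the rule-set names are invented, `{*}` makes a limit apply per user rather than to all users combined, and within a set the first matching rule wins, so lan's rule must precede the lab2 wildcard:

```
{
   name         project_totals
   description  aggregate slots per project
   enabled      TRUE
   limit        projects lab1 to slots=100
   limit        projects lab2 to slots=80
}
{
   name         per_user_in_project
   description  per-user slots within each project
   enabled      TRUE
   limit        projects lab1 users {*} to slots=40
   limit        projects lab2 users lan to slots=50
   limit        projects lab2 users {*} to slots=15
}
```

Each rule set is evaluated independently and the most restrictive result applies, which is what makes the per-user caps sit inside the project totals.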
[gridengine users] project quota details
OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm looking to setup project-based quota management. I'm at a university and different labs will be signing up for different quotas for the users in their labs. I understand that I can: - add users to one or more projects - assign quotas (slots and memory in my case) to projects that will limit the total concurrent resource usage by project - have users choose a particular project when they submit a job (needed for users who do work for multiple labs) I'm wondering if I can also set a per-user quota within a project quota that will limit how much of a resource any individual from the project can use at once. That is, I'd like that limit to be lower than the project's limit on all project users, so that no one user in a project can use all the project's resources at once. Could different per-user quotas be assigned for different users within a project? e.g. a power user in a project might generally need more slots than other users. Any suggestions on strategies for this kind of resource management would be a great help. Thanks -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] no memory usage for qlogin jobs
Yes we use ssh, and I don't know what 'tight integration' means. Could you clue me in or point me somewhere? Thanks. -M On Fri, Apr 10, 2015 at 4:00 PM, Brendan Moloney molo...@ohsu.edu wrote: No, that is not normal. I am guessing you use SSH for qlogin and don't have tight integration setup? -- *From:* users-boun...@gridengine.org [users-boun...@gridengine.org] on behalf of Michael Stauffer [mgsta...@gmail.com] *Sent:* Friday, April 10, 2015 12:31 PM *To:* Gridengine Users Group *Subject:* [gridengine users] no memory usage for qlogin jobs Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, When I run qacct, I don't get any memory info for qlogin jobs, just 0.000. Is this normal? qsub jobs return memory usage information. Thanks for any help.

[mgstauff@chead ~]$ qacct -d 1 -o mgstauff -j
qname        qlogin.q
hostname     compute-0-9.local
group        mgstauff
owner        mgstauff
project      NONE
department   defaultdepartment
jobname      QLOGIN
jobnumber    836367
taskid       undefined
account      sge
priority     0
qsub_time    Fri Apr 10 15:19:00 2015
start_time   Fri Apr 10 15:19:00 2015
end_time     Fri Apr 10 15:25:06 2015
granted_pe   NONE
slots        1
failed       0
exit_status  255
ru_wallclock 366
ru_utime     0.014
ru_stime     0.004
ru_maxrss    3744
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    1926
ru_majflt    0
ru_nswap     0
ru_inblock   0
ru_oublock   32
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     67
ru_nivcsw    38
cpu          0.018
mem          0.000
io           0.000
iow          0.000
maxvmem      0.000

-M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] no memory usage for qlogin jobs
Thanks! -M On Fri, Apr 10, 2015 at 4:32 PM, Brendan Moloney molo...@ohsu.edu wrote: Reuti posted this link yesterday: https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html section SSH TIGHT INTEGRATION Brendan Moloney Research Associate Advanced Imaging Research Center Oregon Health Science University -- *From:* Michael Stauffer [mgsta...@gmail.com] *Sent:* Friday, April 10, 2015 1:12 PM *To:* Brendan Moloney *Cc:* Gridengine Users Group *Subject:* Re: [gridengine users] no memory usage for qlogin jobs Yes we use ssh, and I don't know what 'tight integration' means. Could you clue me in or point me somewhere? Thanks. -M On Fri, Apr 10, 2015 at 4:00 PM, Brendan Moloney molo...@ohsu.edu wrote: No, that is not normal. I am guessing you use SSH for qlogin and don't have tight integration setup? -- *From:* users-boun...@gridengine.org [users-boun...@gridengine.org] on behalf of Michael Stauffer [mgsta...@gmail.com] *Sent:* Friday, April 10, 2015 12:31 PM *To:* Gridengine Users Group *Subject:* [gridengine users] no memory usage for qlogin jobs Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, When I run qacct, I don't get any memory info for qlogin jobs, just 0.000. Is this normal? qsub jobs return memory usage information. Thanks for any help. 
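For the archives, the knobs involved live in the global configuration (`qconf -mconf`). A sketch only: the exact values depend on how your SGE was built (see the remote_startup page linked above), and full per-job accounting additionally requires sshd to be started under the shepherd:

```
# ssh-related entries typically changed for SSH tight integration
qlogin_command   /usr/bin/ssh
qlogin_daemon    /usr/sbin/sshd -i
rlogin_command   /usr/bin/ssh
rlogin_daemon    /usr/sbin/sshd -i
rsh_command      /usr/bin/ssh
rsh_daemon       /usr/sbin/sshd -i
```

Without this, the ssh session's processes run outside the shepherd's process tree, which is why qacct shows 0.000 for the memory fields.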
___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] method for further job process info?
Hi, Is there a way to easily query if a job is idle or otherwise stuck even though a queue state says it's running? I've seen some old jobs that are listed as running in the queue, but upon investigation on their compute node there is no cpu activity associated with the processes, there are no error messages in output files. I can devise a script to do this, but if there's already something for this I'd just use that. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] lots of jobs for one user
OGS/GE 2011.11p1 Hi again, I've got a user who's got 240+ running jobs (single slot) in the default queue (and 1400 queued and waiting), when the usual slot quota is about 50. I say 'usual' because I'm running a simple script that modifies everyone's slot quota depending on the overall cluster usage. When lots of slots are available, the quota goes up to a max of 100. I checked the logs from the script (it runs every minute) and over the time period that these 240+ jobs were submitted, the max slot quota never went above 97. My script examines the current cluster state, then dumps out a new rqs file, which then gets loaded via 'qconf -Mrqs'. The script gets called every minute. The queue schedule interval is one second: schedule_interval 0:0:1 Anyone have an idea how this might have happened? If the user submits a lot of jobs in the split-second when 'qconf -Mrqs' is updating, could the scheduler get confused and start more jobs than it should? Any suggestions on how to dig around to see what happened? Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] method for further job process info?
On Wed, Feb 11, 2015 at 2:02 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, Am 11.02.2015 um 19:28 schrieb Michael Stauffer mgsta...@gmail.com: Hi, Is there a way to easily query if a job is idle or otherwise stuck even though a queue state says it's running? I've seen some old jobs that are listed as running in the queue, but upon investigation on their compute node there is no cpu activity associated with the processes, and there are no error messages in output files. The used CPU time you can check by looking at the usage line in the `qstat -j job_id` output. Any logic to give a safe indication whether a job is stuck in an infinite loop or still computing won't be easy to implement and will most likely depend on each particular application, whether there are any output or scratch files which can be checked too. But even then the same output may repeatedly be written thereto. We even have jobs which compute (apparently) fine, but only by manual investigation can one say that the computed values converge to a wrong state or are oscillating between states and won't ever stop. -- Reuti Thanks Reuti. I can see how this would be difficult. I may use the 'usage' line from qstat. I could check every N hours, writing the usage output for each running job to a file, then check the current usage stats against the previous run's file and look for lines that haven't changed at all. To be safe I'd then just email the user to suggest they take a look. This won't catch instances of jobs that are stuck in loops of course, but at least it'll catch completely hung jobs. How often are a job's stats updated? Looks like every 40 seconds? -M I can devise a script to do this, but if there's already something for this I'd just use that. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
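The snapshot-and-compare idea described above could be sketched like this. The qstat parsing is an untested assumption (the exact `usage` line format varies by SGE version); the awk comparison at the core is the part that does the work:

```shell
#!/bin/bash
# Sketch: snapshot per-job cpu usage and flag jobs whose cpu time hasn't
# moved since the last run. Assumes `qstat -j <id>` prints a "usage" line
# containing "cpu=..." -- check the format on your installation.
SNAP=/var/tmp/sge-usage.snap
NEW=$(mktemp)
for id in $(qstat -s r | awk 'NR>2 {print $1}' | sort -u); do
    cpu=$(qstat -j "$id" | awk -F'cpu=' '/usage/ {split($2,a,","); print a[1]}')
    echo "$id $cpu"
done > "$NEW"
if [ -f "$SNAP" ]; then
    # print job ids whose cpu value is identical to the previous snapshot
    awk 'NR==FNR {prev[$1]=$2; next} ($1 in prev) && prev[$1]==$2 {print "possibly hung:", $1}' \
        "$SNAP" "$NEW"
fi
mv "$NEW" "$SNAP"
```

Run it from cron every few hours and mail yourself the output; the "possibly hung" list is only a hint for the user to investigate, per the caveats above.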
Re: [gridengine users] lots of jobs for one user
On Wed, Feb 11, 2015 at 2:30 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, Am 11.02.2015 um 20:03 schrieb Michael Stauffer mgsta...@gmail.com: OGS/GE 2011.11p1 Hi again, I've got a user who's got 240+ running jobs (single slot) in the default queue (and 1400 queued and waiting), when the usual slot quota is about 50. I say 'usual' because I'm running a simple script that modifies everyone's slot quota depending on the overall cluster usage. When lots of slots are available, the quota goes up to a max of 100. I checked the logs from the script (it runs every minute) and over the time period that these 240+ jobs were submitted, the max slot quota never went above 97. My script examines the current cluster state, then dumps out a new rqs file, which then gets loaded via 'qconf -Mrqs'. The script gets called every minute. The queue schedule interval is one second: schedule_interval 0:0:1 Are the jobs so short that such a short interval is necessary? It will put some load on the scheduler. No, they're not so short. I had this just to give the user the fastest response possible. I don't notice any overhead on my system; usually there's at most a few hundred jobs in the queue and we have an overpowered head node. But I'll change it to 2 sec for good measure. Anyone have an idea how this might have happened? If the user submits a lot of jobs in the split-second when 'qconf -Mrqs' is updating, could the scheduler get confused and start more jobs than it should? Any suggestions on how to dig around to see what happened? Thanks. I can't say for sure, but instead of creating an altered file of the output, it's also possible to change individual lines like: $ qconf -mattr resource_quota limit slots=4 general/3 $ qconf -mattr resource_quota limit slots=4 general/short # here the limit got a name $ qconf -mattr resource_quota enabled TRUE general for an RQS called general. OK, seems like a great idea.
By 'can't say for sure' do you mean you don't know for sure if this will avoid the problem? Seems very likely. A safety net could be setup in addition in the scheduler configuration with maxujobs. Yes, good idea. I had that set once but removed it for some reason, can't remember. Also I figure I could disable all queues before I make the changes, then reenable. -M -- Reuti ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
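For reference, the in-place variant Reuti describes would slot into the quota-adjusting script roughly like this (the rule-set name "general" and limit name "short" are from his example, not a real cluster):

```
# update one named limit in place, instead of rewriting and
# reloading the whole rqs file with qconf -Mrqs
qconf -mattr resource_quota limit slots=4 general/short
# (re)enable the rule set if it was toggled off
qconf -mattr resource_quota enabled TRUE general
```

Touching a single attribute avoids the brief window during `qconf -Mrqs` where the whole rule set is being replaced, which is the suspected race above.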
[gridengine users] how to implement 'floating' slot resource quotas?
OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm looking to implement some 'floating' resource quotas - not sure what they're properly called. I'd like individual users to have per-user slot quotas, in our case 32, and then to have access to a per-group slot quota that will augment their individual quota. The per-group quota should apply to a defined set of users, and should be shared by the group. That is, if user A is using 32 slots of their individual quota, and in addition 16 slots of a 24-slot group quota, then other users of the group should only have 8 group slots available to them once they've filled their individual 32 slots. Can someone tell me the proper terminology for this kind of setup and how to proceed? I know I can create project groups, but I don't know how to have the group's quotas be available in addition to any individual quotas. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
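One caveat from the RQS model: rule sets apply independently and the most restrictive limit wins, so a shared "extra pool on top" isn't directly expressible. A rough, untested approximation (numbers assume a three-member group behind a hypothetical access list @lab1) caps each member at individual+pool and the group aggregate at N*individual+pool:

```
{
   name      per_user_cap
   enabled   TRUE
   limit     users {@lab1} to slots=56    # 32 individual + at most 24 from the pool
}
{
   name      group_cap
   enabled   TRUE
   limit     users @lab1 to slots=120     # 3 members x 32 individual + 24 shared
}
```

This isn't exactly the semantics asked for (a single user could take all 24 pool slots even while peers are under their 32), but it bounds both the individual and the aggregate usage.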
Re: [gridengine users] How to clear possibly-corrupted queue state?
Thanks Reuti. The group has restored operations, using a snapshot to restore previous spools. -M On Wed, Sep 3, 2014 at 6:52 AM, Reuti re...@staff.uni-marburg.de wrote: Hi, Am 02.09.2014 um 22:30 schrieb Michael Stauffer: I'm trying to help users of another cluster whose admin is on vacation - a bit of Murphy's Law at work here, it seems. Their queue keeps failing, and after restarting qmaster it fails again after about a minute. The suspicion is some bad job files, judging from these log entries:

Also, the last few lines in the qmaster logfile $SGE_ROOT/$SGE_CELL/spool/qmaster/messages:

09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2729 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2726 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2727 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2728 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0003/2326 has zero size
09/02/2014 14:15:02| main|cbica-cluster|E|wrong cull version, read 0x, but expected actual version 0x1002
09/02/2014 14:15:02| main|cbica-cluster|E|error in init_packbuffer: wrong cull version

The qmaster and commands are working, it's just the exechost which keeps failing? You could stop the execd thereon, and remove the complete spool directory for the node. The starting execd will recreate the directory structure for the particular node. If it's the structure of the qmaster instead: do you use classic spooling then? -- Reuti How can we clear any state files and get a fresh start? Thanks. In the meantime I'll look more online for answers. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
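Reuti's spool-reset recipe for a failing exec host, in command form. A sketch only: the service name and spool location vary by install, and this assumes local (not qmaster-side) spooling:

```
# on the failing exec host
/etc/init.d/sgeexecd stop                      # service name may carry a host suffix
rm -rf $SGE_ROOT/default/spool/$(hostname -s)  # remove the node's spool directory
/etc/init.d/sgeexecd start                     # execd recreates the directory tree
```

As noted above, this only applies when it is the exechost's spool that is corrupt; a corrupt qmaster spool (as in this thread) needs a restore instead.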
[gridengine users] How to clear possibly-corrupted queue state?
Hi, I'm trying to help users of another cluster whose admin is on vacation - a bit of Murphy's Law at work here, it seems. Their queue keeps failing, and after restarting qmaster it fails again after about a minute. The suspicion is some bad job files, judging from these log entries:

Also, the last few lines in the qmaster logfile $SGE_ROOT/$SGE_CELL/spool/qmaster/messages:

09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2729 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2726 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2727 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0005/2728 has zero size
09/02/2014 14:15:02| main|cbica-cluster|C|job file jobs/00/0003/2326 has zero size
09/02/2014 14:15:02| main|cbica-cluster|E|wrong cull version, read 0x, but expected actual version 0x1002
09/02/2014 14:15:02| main|cbica-cluster|E|error in init_packbuffer: wrong cull version

How can we clear any state files and get a fresh start? Thanks. In the meantime I'll look more online for answers. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] Notify when h_vmem is hit?
Thanks Reuti, see below Am 25.08.2014 um 22:27 schrieb Michael Stauffer mgsta...@gmail.com: Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm using h_vmem and s_vmem to limit memory usage for qsub and qlogin jobs. A user's got some analyses running on nearly identical data sets that are hitting memory limits and being killed, which is fine, but the messages are inconsistent. Some instances report an exception from the app in question saying that memory can't be allocated. This app (an in-house tool) sends exceptions to stdout. Other instances just dump core and there's no message about memory problems in either stdout or stderr logs. h_vmem is 6000M and s_vmem is 5900M. It might be that the instances are right up against the s_vmem limit when the failing memory allocation occurs, and in some cases the requested amount triggers only the soft limit, and in others it triggers both. So perhaps the instances where it triggers the hard limit are the ones without the exception messages? Unfortunately the stderr and stdout log filenames don't contain job ids. But you can include the job id in the filename of the generated stdout/-err file, or dump a `ps -e f` to stdout in the jobscript. The shepherd will also contain the job id as argument. Yes sorry, I wasn't clear. I just meant that the output files I had to work with from the user did not have the job ids included. In further tests, I can include the job id. Do you catch the sigxcpu in the job script? No. Is this relevant for h_vmem and s_vmem limits? When the loglevel in SGE is set to log_info, it will also record the passed limits in the messages file of the execd on the node. This is another place to look at then. Great. qconf shows the level is currently log_warning, yet I still see messages about catching s_vmem and h_vmem, which is very helpful. I've run some more tests with both a modified analysis script and a simple bash script that eats memory.
Each script has some commands to print to stdout that run after the command that runs out of memory, so I can monitor if the script keeps running after the mem limit is reached. I ran 100 iterations of each of these, one run with h_vmem set higher than s_vmem, and the other run vice versa. In both cases, I get about 90% of the iterations with a 'clean' exit, in which I see an exception message from the offending command that memory could not be allocated, and the script finishes running after the offending command. In the remaining cases, the output shows neither a memory exception message nor that the script finishes running. Does this seem normal? -M However, in my first tests anyway, a qsub script that runs out of memory shows an exception message, even when s_vmem is higher than h_vmem. So I'm not sure about this line of reasoning. We're trying to figure it out and will run more tests, but I thought I'd check here first to see if anyone's had this kind of experience. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] Notify when h_vmem is hit?
Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm using h_vmem and s_vmem to limit memory usage for qsub and qlogin jobs. A user's got some analyses running on nearly identical data sets that are hitting memory limits and being killed, which is fine, but the messages are inconsistent. Some instances report an exception from the app in question saying that memory can't be allocated. This app (an in-house tool) sends exceptions to stdout. Other instances just dump core and there's no message about memory problems in either stdout or stderr logs. h_vmem is 6000M and s_vmem is 5900M. It might be that the instances are right up against the s_vmem limit when the failing memory allocation occurs, and in some cases the requested amount triggers only the soft limit, and in others it triggers both. So perhaps the instances where it triggers the hard limit are the ones without the exception messages? Unfortunately the stderr and stdout log filenames don't contain job ids. But you can include the job id in the filename of the generated stdout/-err file, or dump a `ps -e f` to stdout in the jobscript. The shepherd will also contain the job id as argument. Yes sorry, I wasn't clear. I just meant that the output files I had to work with from the user did not have the job ids included. In further tests, I can include the job id. Do you catch the sigxcpu in the job script? No. Is this relevant for h_vmem and s_vmem limits? Passing the s_vmem limit will send a signal to the job. If you are not acting on it, it will either abort the job (as the default action for sigxcpu), or if it's ignored, pass later on to the h_vmem. This action for the limits is described in `man queue_conf`. What behavior did you expect when s_vmem is passed? Seems I hadn't thought it through enough. I'd figured that the signal would be caught by the application that's run via the job script, and if the application handles it, it will quit gracefully.
But what you're saying is that the signal goes to the job script process? I guess that makes sense. I'm experimenting with 'trap' to catch sigxcpu, but it's not working to trap it. I've tried having the script reach both vmem and cpu-time ulimits. I've tried in a simple script running both on the FE and via qsub. In the same script if I trap SIGINT, it works to trap it and execute my trap command. Anyone had this issue before? And if I can get sigxcpu to work in a trap, is there a way to add a trap command to every script that gets submitted via qsub, or to run each qsub command in a wrapper script that includes a trap? When the loglevel in SGE is set to log_info, it will also record the passed limits in the messages file of the execd on the node. This is another place to look at then. Great. qconf shows the level is currently log_warning, yet I still see messages about catching s_vmem and h_vmem, which is very helpful. However, now that I test this more, sometimes I do NOT see messages on a node about a job that was just terminated due to memory limits. Would this be because of the situation you describe below, when the kernel acts before SGE to kill the job? I've run some more tests with both a modified analysis script and a simple bash script that eats memory. Each script has some commands to print to stdout that run after the command that runs out of memory, so I can monitor if the script keeps running after the mem limit is reached. I ran 100 iterations of each of these, one run with h_vmem set higher than s_vmem, and the other run vice versa. In both cases, I get about 90% of the iterations with a 'clean' exit, in which I see an exception message from the offending command that memory could not be allocated, and the script finishes running after the offending command. In the remaining cases, the output shows neither a memory exception message nor that the script finishes running. Does this seem normal? AFAIK yes.
The overall h_vmem consumption is observed by SGE and it will act when it's passed. But h_vmem will also set a kernel limit. Whichever of the two notices it first will take action. The difference is that SGE will accumulate all processes belonging to a job, while the kernel limit is per process. OK, so you're saying when SGE acts on h_vmem, I'm getting a clean exit, but when the kernel catches it first, I'm getting the 'no message' exit? Thanks. -M -- Reuti -M However, in my first tests anyway, a qsub script that runs out of memory shows an exception message, even when s_vmem is higher than h_vmem. So I'm not sure about this line of reasoning. We're trying to figure it out and will run more tests, but I thought I'd check here first to see if anyone's had this kind of experience. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] Notify when h_vmem is hit?
Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, I'm using h_vmem and s_vmem to limit memory usage for qsub and qlogin jobs. A user's got some analyses running on nearly identical data sets that are hitting memory limits and being killed, which is fine, but the messages are inconsistent. Some instances report an exception from the app in question saying that memory can't be allocated. This app (an in-house tool) sends exceptions to stdout. Other instances just dump core and there's no message about memory problems in either stdout or stderr logs. h_vmem is 6000M and s_vmem is 5900M. It might be that the instances are right up against the s_vmem limit when the failing memory allocation occurs, and in some cases the requested amount triggers only the soft limit, and in others it triggers both. So perhaps the instances where it triggers the hard limit are the ones without the exception messages? Unfortunately the stderr and stdout log filenames don't contain job ids. However, in my first tests anyway, a qsub script that runs out of memory shows an exception message, even when s_vmem is higher than h_vmem. So I'm not sure about this line of reasoning. We're trying to figure it out and will run more tests, but I thought I'd check here first to see if anyone's had this kind of experience. Thanks. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] users Digest, Vol 42, Issue 7
On Fri, Jun 6, 2014 at 8:30 PM, Michael Stauffer mgsta...@gmail.com wrote: OGS/GE 2011.11p1 Hi, Is there an alternative to h_vmem that checks resident memory rather than virtual? I'd like a consumable that kills a job when it oversteps its requested memory quota, so h_vmem is great. But I'm having trouble with Matlab (and maybe other apps, haven't checked fully yet) which at startup, allocates a somewhat-variable large amount of VIRT memory (~4G) but little RES memory (~300M) (according to 'top'). Matlab's java environment is allocating most of the virtual memory, but they can't tell me a way to limit this. In any case this large VIRT value makes it impossible to set a smaller default h_vmem value because Matlab won't launch. Thanks for any thoughts. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users I've got a workaround/solution from Mathworks about the issue above. Turns out Redhat 6 introduced a new memory tool called arenas. These are behind (at least most of) the large allocations of VIRT memory in Java-based programs. They were added to reduce 'false sharing' with multi-threaded apps, not that I understand the details. The env var MALLOC_ARENA_MAX can be set to a low number to limit these arenas and thus the VIRT allocations. It seems the recommended value is 4, which limits arenas without completely losing out on their advantages. When I set it to 4, matlab (R2013a) now allocates 1350MB of VIRT instead of ~4GB. Some discussion online: Hadoop says set to 4: https://issues.apache.org/jira/browse/HADOOP-7154 IBM: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
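A sketch of how this fix is applied in practice: glibc reads MALLOC_ARENA_MAX at process startup, so it must be exported before the application launches — e.g. at the top of the qsub job script or in a cluster-wide profile. The MATLAB invocation shown in the comment is illustrative only, not taken from the thread:

```shell
# Cap glibc malloc arenas for everything started from this script.
# Value 4 follows the Hadoop/Mathworks recommendation cited above.
export MALLOC_ARENA_MAX=4

# matlab -nodisplay -r "disp(version); exit"   # hypothetical launch; would now use far less VIRT

# verify the variable is in the environment inherited by children
env | grep MALLOC_ARENA_MAX
```

Because the variable only affects processes that inherit it, setting it in an interactive shell after MATLAB is already running has no effect — the app must be restarted.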
Re: [gridengine users] Enforce users to use specific amount of memory/slot
Message: 4 Date: Mon, 30 Jun 2014 11:53:12 +0200 From: Txema Heredia txema.llis...@gmail.com To: Derrick Lin klin...@gmail.com, SGE Mailing List users@gridengine.org Subject: Re: [gridengine users] Enforce users to use specific amount of memory/slot Message-ID: 53b13388.5060...@gmail.com Content-Type: text/plain; charset=iso-8859-1; Format=flowed Hi Derrick, You could either set h_vmem as a consumable (consumable=yes) attribute and set a default value of 8GB for it. This way, whenever a job doesn't request any amount of h_vmem, it will automatically request 8GB per slot. This will affect all types of jobs. You could also define a JSV script that checks the username, and forces a -l h_vmem=8G for his/her jobs ( jsv_sub_add_param('l_hard','h_vmem','8G') ). This will affect all jobs for that user, but could turn into a pain to manage. Or, you could set a different policy and allow all users to request the amount of memory they really need, trying to best fit the node. What is the point of forcing the user to reserve 63 additional cores when they only need 1 core and 500GB of memory? You could fit in that node one job like this, and, say, two 30-core-6GB-memory jobs. Txema El 30/06/14 08:55, Derrick Lin escribió: Hi guys, A typical node on our cluster has 64 cores and 512GB memory. So it's about 8GB/core. Occasionally, we have some jobs that utilize only 1 core but 400-500GB of memory, which annoys lots of users. So I am seeking a way that can force jobs to run strictly below the 8GB/core ratio or be killed. For example, the above job should ask for 64 cores in order to use 500GB of memory (we have user quota for slots). I have been trying to play around with h_vmem, set it to consumable and configure an RQS:

{
   name         max_user_vmem
   enabled      true
   description  Each user can utilize more than 8GB/slot
   limit        users {bad_user} to h_vmem=8g
}

but it seems to be setting a total vmem bad_user can use per job.
I would love to set it on users instead of queue or hosts because we have applications that utilize the same set of nodes and the app should be unlimited. Thanks Derrick I've been dealing with this too. I'm using h_vmem to kill processes that go above the limit, and s_vmem set slightly lower by default to give well-behaved processes a chance to exit gracefully first. The issue is that these use virtual memory, which is (always, more or less) greater than resident memory, i.e. the actual RAM usage. And with java apps like Matlab, the amount of virtual memory reserved/used is HUGE compared to resident, by 10x give or take. So it makes it really impractical actually. However so far I've just set the default h_vmem and s_vmem values high enough to accommodate JVM apps, and increased the per-host consumable appropriately. We don't get fine-grained memory control, but it definitely controls out-of-control users/procs that otherwise might gobble up enough RAM to slow down the entire node. We may switch to UVE just for this reason, to get memory limits based on resident memory, if it seems worth it enough in the end. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
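For concreteness, Txema's first suggestion (h_vmem as a per-job consumable with a default) corresponds to a complex definition edited via `qconf -mc`. The line below is a sketch with the column layout from `man complex`; the 8G default is the value discussed above, and whether you want `JOB` or `YES` (per-slot) in the consumable column depends on whether the limit should scale with slots:

```
#name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem    MEMORY  <=     YES          YES         8G       0
```

With this in place, a job submitted with no `-l h_vmem=...` request is charged 8G per slot against the host's `complex_values` for h_vmem.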
Re: [gridengine users] Enforce users to use specific amount of memory/slot
If I remember right, h_vmem amount applies to the job and is not scaled by # of slots like some other resources. Just did a simple test with an 8-slot job (pe_serial) and it only used one 'unit' of h_vmem, i.e. the default amount assigned as consumable. 40GB VIRT vs 100MB RES is a huge difference! I thought I had it bad with matlab using 4GB VIRT for 100MB RES. -M On Mon, Jun 30, 2014 at 4:47 PM, Feng Zhang prod.f...@gmail.com wrote: Guys, Just curious, how does the h_vmem work on processes of MPI jobs(or OPENMP, multi-threading)? I have some parallel jobs, the top command shows VIRT of 40GB, while the RES is only 100MB. On Mon, Jun 30, 2014 at 3:01 PM, Michael Stauffer mgsta...@gmail.com wrote: Message: 4 Date: Mon, 30 Jun 2014 11:53:12 +0200 From: Txema Heredia txema.llis...@gmail.com To: Derrick Lin klin...@gmail.com, SGE Mailing List users@gridengine.org Subject: Re: [gridengine users] Enforce users to use specific amount of memory/slot Message-ID: 53b13388.5060...@gmail.com Content-Type: text/plain; charset=iso-8859-1; Format=flowed Hi Derrick, You could either set h_vmem as a consumable (consumable=yes) attribute and set a default value of 8GB for it. This way, whenever a job doesn't request any amount of h_vmem, it will automatically request 8GB per slot. This will affect all types of jobs. You could also define a JSV script that checks the username, and forces a -l h_vmem=8G for his/her jobs ( jsv_sub_add_param('l_hard','h_vmem','8G') ). This will affect all jobs for that user, but could turn into a pain to manage. Or, you could set a different policy and allow all users to request the amount of memory they really need, trying to fit best the node. What is the point of forcing the user to reserve 63 additional cores when they only need 1 core and 500GB of memory? You could fit in that node one job like this, and, say, two 30-core-6GB-memory jobs.
Txema El 30/06/14 08:55, Derrick Lin escribió: Hi guys, A typical node on our cluster has 64 cores and 512GB memory. So it's about 8GB/core. Occasionally, we have some jobs that utilize only 1 core but 400-500GB of memory, which annoys lots of users. So I am seeking a way that can force jobs to run strictly below the 8GB/core ratio or be killed. For example, the above job should ask for 64 cores in order to use 500GB of memory (we have user quota for slots). I have been trying to play around with h_vmem, set it to consumable and configure RQS { name max_user_vmem enabled true description Each user can utilize more than 8GB/slot limit users {bad_user} to h_vmem=8g } but it seems to be setting a total vmem bad_user can use per job. I would love to set it on users instead of queue or hosts because we have applications that utilize the same set of nodes and the app should be unlimited. Thanks Derrick I've been dealing with this too. I'm using h_vmem to kill processes that go above the limit, and s_vmem set slightly lower by default to give well-behaved processes a chance to exit gracefully first. The issue is that these use virtual memory, which is (always, more or less) greater than resident memory, i.e. the actual RAM usage. And with java apps like Matlab, the amount of virtual memory reserved/used is HUGE compared to resident, by 10x give or take. So it makes it really impractical actually. However so far I've just set the default h_vmem and s_vmem values high enough to accommodate JVM apps, and increased the per-host consumable appropriately. We don't get fine-grained memory control, but it definitely controls out-of-control users/procs that otherwise might gobble up enough RAM to slow down the entire node. We may switch to UVE just for this reason, to get memory limits based on resident memory, if it seems worth it enough in the end.
-M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] Alternative to h_vmem?
OGS/GE 2011.11p1 Hi, Is there an alternative to h_vmem that checks resident memory rather than virtual? I'd like a consumable that kills a job when it oversteps its requested memory quota, so h_vmem is great. But I'm having trouble with Matlab (and maybe other apps, haven't checked fully yet) which at startup, allocates a somewhat-variable large amount of VIRT memory (~4G) but little RES memory (~300M) (according to 'top'). Matlab's java environment is allocating most of the virtual memory, but they can't tell me a way to limit this. In any case this large VIRT value makes it impossible to set a smaller default h_vmem value because Matlab won't launch. Thanks for any thoughts. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] method to more fully utilize idle cores?
Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, Is there a mechanism for dynamically changing RQSs when the cluster's resources are below some overall usage threshold, in order to allow more jobs to run? I've got 300 cores, and sometimes only a few users are running anything, with lots of queued jobs. With a quota of 32 cores each, there are a lot of unused cores. If the scheduler could recognize this and allow enough queued jobs to fill maybe 75% of the cores, that'd be great. If there's no direct mechanism to do this, would a subordinate queue be best? One that suspends running jobs when the main queue fills up? However, this seems less ideal, since the suspended jobs might stay suspended for a long time if the main queue fills up with a lot of jobs all of a sudden. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] Sun Admin Guide 6.2 - found
Hi, I've found a 6.2 admin guide: http://beowulf.rutgers.edu/info-user/pdf/ge62u5-admin.pdf I know at least one other person out there was looking, can't find the original post about that. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] jobs going to wrong queue type
Version OGS/GE 2011.11p1 (Rocks 6.1) Hi, I've got a situation where a user's qsub jobs are being run in interactive-only queues, as well as the expected batch-only queue. Here's his command, and below that are some specs of my setup. Anyone have any idea how this might happen or what I'm missing? I saw this happen yesterday for a job by another user, but otherwise qsub jobs seem to be properly going to only all.q. Thanks. -M qsub -v ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=${slots},ANTSPATH=${ANTSPATH} -pe unihost $slots -S /bin/bash -cwd -o ${outputTP_Dir}/${subj}_${tp}.stdout -e ${outputTP_Dir}/${subj}_${tp}.stderr ${ANTSPATH}antsCorticalThickness.sh -d 3 -a $t1 -e ${templateDir}/template.nii.gz -m ${templateDir}/masks/ templateBrainMaskProbability.nii.gz -f ${templateDir}/masks/ templateBrainExtractionRegistrationMask.nii.gz -p ${templateDir}/priors/prior%02d.nii.gz -t ${templateDir}/templateBrain.nii.gz -o ${outputTP_Dir}/${subj}_${tp}_ [root@chead ~]# qstat -u pcook 35282 0.60500 antsCortic pcookr 05/20/2014 09:56:42 all.q@compute-0-4.local2 35283 0.60500 antsCortic pcookr 05/20/2014 09:57:27 all.q@compute-0-4.local2 35284 0.60500 antsCortic pcookr 05/20/2014 10:19:36 qlogin.q@compute-0-1.local 2 35285 0.60500 antsCortic pcookr 05/20/2014 10:19:50 all.q@compute-0-5.local2 35286 0.60500 antsCortic pcookr 05/20/2014 10:43:01 qlogin.lon...@compute-0-5.loca 2 35287 0.60500 antsCortic pcookr 05/20/2014 10:51:37 all.q@compute-0-5.local2 35960 0.60500 antsCortic pcookr 05/20/2014 11:28:11 qlogin.q@compute-0-5.local 2 35961 0.60500 antsCortic pcookr 05/20/2014 11:28:12 qlogin.q@compute-0-15.local2 35962 0.60500 antsCortic pcookr 05/20/2014 11:28:12 all.q@compute-0-15.local 2 35963 0.60500 antsCortic pcookr 05/20/2014 12:31:56 all.q@compute-0-6.local2 35964 0.60500 antsCortic pcookr 05/20/2014 12:34:17 all.q@compute-0-6.local2 35965 0.60500 antsCortic pcookr 05/20/2014 12:53:47 all.q@compute-0-11.local 2 35966 0.60500 antsCortic pcookr 05/20/2014 12:54:45
all.q@compute-0-11.local 2 35967 0.60500 antsCortic pcookr 05/20/2014 13:04:26 qlogin.lon...@compute-0-11.loc 2 35968 0.60500 antsCortic pcookr 05/20/2014 13:05:13 all.q@compute-0-11.local 2 35969 0.60500 antsCortic pcookr 05/20/2014 13:10:23 all.q@compute-0-11.local 2 35970 0.60500 antsCortic pcookr 05/20/2014 13:19:27 all.q@compute-0-6.local2 35971 0.60500 antsCortic pcookr 05/20/2014 13:20:57 qlogin.lon...@compute-0-11.loc 2 35972 0.60500 antsCortic pcookr 05/20/2014 13:28:03 all.q@compute-0-11.local 2 35973 0.60500 antsCortic pcookr 05/20/2014 13:32:47 all.q@compute-0-8.local2 35974 0.60500 antsCortic pcookr 05/20/2014 13:35:31 all.q@compute-0-11.local 2 35975 0.60500 antsCortic pcookr 05/20/2014 14:21:19 all.q@compute-0-11.local 2 [root@chead ~]# qconf -sql all.q qlogin.long.q qlogin.q [root@chead ~]# qconf -sq all.q | grep -e type -e pe_list qtype BATCH pe_list make mpich mpi orte unihost serial [root@chead ~]# qconf -sq qlogin.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -sq qlogin.long.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -srqs { name limit_user_interactive_slots description Limit the users' interactive slots based enabled TRUE limitusers {*} queues {qlogin.q,qlogin.long.q} to slots=6 } { name limit_user_slots description Limit the users' batch slots enabled TRUE limitusers {*} queues {all.q} to slots=32 } [root@chead ~]# qconf -sp unihost pe_nameunihost slots user_lists NONE xuser_listsNONE start_proc_args/bin/true stop_proc_args /bin/true allocation_rule$pe_slots control_slaves FALSE job_is_first_task TRUE urgency_slots min accounting_summary FALSE ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] jobs going to wrong queue type
On Tue, May 20, 2014 at 4:20 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, Am 20.05.2014 um 21:42 schrieb Michael Stauffer: Version OGS/GE 2011.11p1 (Rocks 6.1) I've got a situation where a user's qsub jobs are being run in interactive-only queues, an attached PE makes a queue also a BATCH queue for jobs requesting this PE. There is no parallel-interactive-only setting. Maybe a JSV could route a parallel batch job to dedicated queues only. -- Reuti Thanks. So if I have a separate PE that's used for qlogin's only and assigned that one only to the qlogin queues and the original PE only to the batch queue, it should be ok? Seems easier than using JSV's (haven't tried one yet). -M as well as the expected batch-only queue. Here's his command, and below that are some specs of my setup. Anyone have any idea how this might happen or what I'm missing? I saw this happen yesterday for a job by another user, but otherwise qsub jobs seem to be properly going to only all.q. Thanks. -M qsub -v ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=${slots},ANTSPATH=${ANTSPATH} -pe unihost $slots -S /bin/bash -cwd -o ${outputTP_Dir}/${subj}_${tp}.stdout -e ${outputTP_Dir}/${subj}_${tp}.stderr ${ANTSPATH}antsCorticalThickness.sh -d 3 -a $t1 -e ${templateDir}/template.nii.gz -m ${templateDir}/masks/ templateBrainMaskProbability.nii.gz -f ${templateDir}/masks/ templateBrainExtractionRegistrationMask.nii.gz -p ${templateDir}/priors/prior%02d.nii.gz -t ${templateDir}/templateBrain.nii.gz -o ${outputTP_Dir}/${subj}_${tp}_ [root@chead ~]# qstat -u pcook 35282 0.60500 antsCortic pcookr 05/20/2014 09:56:42 all.q@compute-0-4.local2 35283 0.60500 antsCortic pcookr 05/20/2014 09:57:27 all.q@compute-0-4.local2 35284 0.60500 antsCortic pcookr 05/20/2014 10:19:36 qlogin.q@compute-0-1.local 2 35285 0.60500 antsCortic pcookr 05/20/2014 10:19:50 all.q@compute-0-5.local2 35286 0.60500 antsCortic pcookr 05/20/2014 10:43:01 qlogin.lon...@compute-0-5.loca 2 35287 0.60500 antsCortic pcookr 05/20/2014 10:51:37
all.q@compute-0-5.local2 35960 0.60500 antsCortic pcookr 05/20/2014 11:28:11 qlogin.q@compute-0-5.local 2 35961 0.60500 antsCortic pcookr 05/20/2014 11:28:12 qlogin.q@compute-0-15.local2 35962 0.60500 antsCortic pcookr 05/20/2014 11:28:12 all.q@compute-0-15.local 2 35963 0.60500 antsCortic pcookr 05/20/2014 12:31:56 all.q@compute-0-6.local2 35964 0.60500 antsCortic pcookr 05/20/2014 12:34:17 all.q@compute-0-6.local2 35965 0.60500 antsCortic pcookr 05/20/2014 12:53:47 all.q@compute-0-11.local 2 35966 0.60500 antsCortic pcookr 05/20/2014 12:54:45 all.q@compute-0-11.local 2 35967 0.60500 antsCortic pcookr 05/20/2014 13:04:26 qlogin.lon...@compute-0-11.loc 2 35968 0.60500 antsCortic pcookr 05/20/2014 13:05:13 all.q@compute-0-11.local 2 35969 0.60500 antsCortic pcookr 05/20/2014 13:10:23 all.q@compute-0-11.local 2 35970 0.60500 antsCortic pcookr 05/20/2014 13:19:27 all.q@compute-0-6.local2 35971 0.60500 antsCortic pcookr 05/20/2014 13:20:57 qlogin.lon...@compute-0-11.loc 2 35972 0.60500 antsCortic pcookr 05/20/2014 13:28:03 all.q@compute-0-11.local 2 35973 0.60500 antsCortic pcookr 05/20/2014 13:32:47 all.q@compute-0-8.local2 35974 0.60500 antsCortic pcookr 05/20/2014 13:35:31 all.q@compute-0-11.local 2 35975 0.60500 antsCortic pcookr 05/20/2014 14:21:19 all.q@compute-0-11.local 2 [root@chead ~]# qconf -sql all.q qlogin.long.q qlogin.q [root@chead ~]# qconf -sq all.q | grep -e type -e pe_list qtype BATCH pe_list make mpich mpi orte unihost serial [root@chead ~]# qconf -sq qlogin.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -sq qlogin.long.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -srqs { name limit_user_interactive_slots description Limit the users' interactive slots based enabled TRUE limitusers {*} queues {qlogin.q,qlogin.long.q} to slots=6 } { name limit_user_slots description Limit the users' batch slots enabled TRUE limitusers {*} queues {all.q} to slots=32 } 
[root@chead ~]# qconf -sp unihost pe_nameunihost slots user_lists NONE xuser_listsNONE start_proc_args/bin/true stop_proc_args /bin/true allocation_rule
Re: [gridengine users] jobs going to wrong queue type
On Tue, May 20, 2014 at 4:29 PM, Reuti re...@staff.uni-marburg.de wrote: Am 20.05.2014 um 22:25 schrieb Michael Stauffer: On Tue, May 20, 2014 at 4:20 PM, Reuti re...@staff.uni-marburg.de wrote: Hi, Am 20.05.2014 um 21:42 schrieb Michael Stauffer: Version OGS/GE 2011.11p1 (Rocks 6.1) I've got a situation where a user's qsub jobs are being run in interactive-only queues, an attached PE makes a queue also a BATCH queue for jobs requesting this PE. There is no parallel-interactive-only setting. Maybe a JSV could route a parallel batch job to dedicated queues only. -- Reuti Thanks. So if I have a separate PE that's used for qlogin's only and assigned that one only to the qlogin queues and the original PE only to the batch queue, it should be ok? Yes, but could be abused by users - or even interchanged by accident. -- Reuti Yes, I see. I have qsub and qlogin aliased for all users with additional aliases for launching with pe's. But still, could be trouble. I'll take a look at JSV's, thanks. -M Seems easier than using JSV's (haven't tried one yet). -M as well as the expected batch-only queue. Here's his command, and below that are some specs of my setup. Anyone have any idea how this might happen or what I'm missing? I saw this happen yesterday for a job by another user, but otherwise qsub jobs seem to be properly going to only all.q. Thanks.
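Reuti's JSV suggestion might look roughly like the following server-side script. This is a sketch, not from the thread: it assumes the stock `jsv_include.sh` shipped with SGE/OGS, the function and parameter names from `man jsv` (CLIENT, pe_name, q_hard), and the queue names used in this setup. It would be activated via the `jsv_url` setting in `qconf -mconf`:

```shell
#!/bin/bash
# Sketch of a JSV that pins parallel batch jobs to all.q, so a qsub with
# -pe cannot land in the interactive qlogin queues via the shared PE.
. "$SGE_ROOT/util/resources/jsv/jsv_include.sh"

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   # CLIENT is "qsub" for batch submissions; pe_name is set when -pe was used
   if [ "$(jsv_get_param CLIENT)" = "qsub" ] && [ -n "$(jsv_get_param pe_name)" ]; then
      jsv_set_param q_hard "all.q"
      jsv_correct "parallel batch job routed to all.q"
   else
      jsv_accept "job accepted unchanged"
   fi
   return
}

jsv_main
```

Unlike the separate-PE approach, this cannot be bypassed by users picking the "wrong" PE, since the qmaster applies a server-side JSV to every submission.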
-M qsub -v ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=${slots},ANTSPATH=${ANTSPATH} -pe unihost $slots -S /bin/bash -cwd -o ${outputTP_Dir}/${subj}_${tp}.stdout -e ${outputTP_Dir}/${subj}_${tp}.stderr ${ANTSPATH}antsCorticalThickness.sh -d 3 -a $t1 -e ${templateDir}/template.nii.gz -m ${templateDir}/masks/ templateBrainMaskProbability.nii.gz -f ${templateDir}/masks/ templateBrainExtractionRegistrationMask.nii.gz -p ${templateDir}/priors/prior%02d.nii.gz -t ${templateDir}/templateBrain.nii.gz -o ${outputTP_Dir}/${subj}_${tp}_ [root@chead ~]# qstat -u pcook 35282 0.60500 antsCortic pcookr 05/20/2014 09:56:42 all.q@compute-0-4.local2 35283 0.60500 antsCortic pcookr 05/20/2014 09:57:27 all.q@compute-0-4.local2 35284 0.60500 antsCortic pcookr 05/20/2014 10:19:36 qlogin.q@compute-0-1.local 2 35285 0.60500 antsCortic pcookr 05/20/2014 10:19:50 all.q@compute-0-5.local2 35286 0.60500 antsCortic pcookr 05/20/2014 10:43:01 qlogin.lon...@compute-0-5.loca 2 35287 0.60500 antsCortic pcookr 05/20/2014 10:51:37 all.q@compute-0-5.local2 35960 0.60500 antsCortic pcookr 05/20/2014 11:28:11 qlogin.q@compute-0-5.local 2 35961 0.60500 antsCortic pcookr 05/20/2014 11:28:12 qlogin.q@compute-0-15.local2 35962 0.60500 antsCortic pcookr 05/20/2014 11:28:12 all.q@compute-0-15.local 2 35963 0.60500 antsCortic pcookr 05/20/2014 12:31:56 all.q@compute-0-6.local2 35964 0.60500 antsCortic pcookr 05/20/2014 12:34:17 all.q@compute-0-6.local2 35965 0.60500 antsCortic pcookr 05/20/2014 12:53:47 all.q@compute-0-11.local 2 35966 0.60500 antsCortic pcookr 05/20/2014 12:54:45 all.q@compute-0-11.local 2 35967 0.60500 antsCortic pcookr 05/20/2014 13:04:26 qlogin.lon...@compute-0-11.loc 2 35968 0.60500 antsCortic pcookr 05/20/2014 13:05:13 all.q@compute-0-11.local 2 35969 0.60500 antsCortic pcookr 05/20/2014 13:10:23 all.q@compute-0-11.local 2 35970 0.60500 antsCortic pcookr 05/20/2014 13:19:27 all.q@compute-0-6.local2 35971 0.60500 antsCortic pcookr 05/20/2014 13:20:57 qlogin.lon...@compute-0-11.loc 2 35972 
0.60500 antsCortic pcookr 05/20/2014 13:28:03 all.q@compute-0-11.local 2 35973 0.60500 antsCortic pcookr 05/20/2014 13:32:47 all.q@compute-0-8.local2 35974 0.60500 antsCortic pcookr 05/20/2014 13:35:31 all.q@compute-0-11.local 2 35975 0.60500 antsCortic pcookr 05/20/2014 14:21:19 all.q@compute-0-11.local 2 [root@chead ~]# qconf -sql all.q qlogin.long.q qlogin.q [root@chead ~]# qconf -sq all.q | grep -e type -e pe_list qtype BATCH pe_list make mpich mpi orte unihost serial [root@chead ~]# qconf -sq qlogin.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -sq qlogin.long.q | grep -e type -e pe_list qtype INTERACTIVE pe_list make unihost serial [root@chead ~]# qconf -srqs { name limit_user_interactive_slots description Limit the users' interactive slots based enabled TRUE limitusers {*} queues {qlogin.q,qlogin.long.q
Re: [gridengine users] SunN1Grid Engine 6.2 Administration Guide
Date: Mon, 12 May 2014 10:21:10 -0400 From: patrick aestheticmaca...@gmail.com To: users@gridengine.org Subject: [gridengine users] SunN1Grid Engine 6.2 Administration Guide Message-ID: cakss31hwptn3yd+hxdzfkv2g1_tzwlcmwfp-qk0tyeaqa6m...@mail.gmail.com Content-Type: text/plain; charset=utf-8 Anyone have the SunN1Grid Engine 6.2 Administration Guide? I was able to find the SunN1Grid Engine 6.1 Administration Guide but had no luck finding 6.2. Thanks I've been looking for this myself. Here's the Beginner's guide for 6.2, but you've probably found it already: http://mjrutherford.org/files/2009-Spring-COMP-4704-Sun_GRID.pdf -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] users Digest, Vol 41, Issue 15
Date: Mon, 12 May 2014 10:21:10 -0400 From: patrick aestheticmaca...@gmail.com To: users@gridengine.org Subject: [gridengine users] SunN1Grid Engine 6.2 Administration Guide Message-ID: cakss31hwptn3yd+hxdzfkv2g1_tzwlcmwfp-qk0tyeaqa6m...@mail.gmail.com Content-Type: text/plain; charset=utf-8 Anyone have the SunN1Grid Engine 6.2 Administration Guide? I was able to find the SunN1Grid Engine 6.1 Administration Guide but had no luck finding 6.2. Thanks I've been looking for this myself. Here's the Beginner's guide for 6.2, but you've probably found it already: http://mjrutherford.org/files/2009-Spring-COMP-4704-Sun_GRID.pdf -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] users Digest, Vol 41, Issue 4
I'm trying to understand the output from qstat after setting up resource limits for queue memory requests. I've got h_vmem and s_vmem set as consumables, with the default set to 3.9G per job, e.g.:

[root@compute-0-18 ~]# qconf -sc | grep h_vmem
h_vmem  h_vmem  MEMORY  <=  YES  JOB  3900M  0

Previously, using 'qstat -F h_vmem', I was seeing the amount of this resource remaining after whatever running jobs had claimed either the default or requested amount. But now after setting up the following queue limits, the all.q output shows only the queue per-job limit, i.e. 'qf'. Is that intentional? qhost still shows the remaining consumable resource amounts. Just curious about the rationale, really. As the limit is defined on a job and a host level, the tighter one will be displayed - the prefix will either be qf: or hc:. Meaning: you can submit a new job which requests up to the displayed value - either the limit of the queue or the remaining free memory on the host. You could in addition define a limit under complex_values for h_vmem and the output will change to qc: if it's the actual constraint. -- Reuti OK, thanks for the explanation, that makes sense. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] understanding qstat output for consumables (Reuti)
I'm trying to understand the output from qstat after setting up resource limits for queue memory requests. I've got h_vmem and s_vmem set as consumables, with the default set to 3.9G per job, e.g.:

[root@compute-0-18 ~]# qconf -sc | grep h_vmem
h_vmem  h_vmem  MEMORY  <=  YES  JOB  3900M  0

Previously, using 'qstat -F h_vmem', I was seeing the amount of this resource remaining after whatever running jobs had claimed either the default or requested amount. But now after setting up the following queue limits, the all.q output shows only the queue per-job limit, i.e. 'qf'. Is that intentional? qhost still shows the remaining consumable resource amounts. Just curious about the rationale, really. As the limit is defined on a job and a host level, the tighter one will be displayed - the prefix will either be qf: or hc:. Meaning: you can submit a new job which requests up to the displayed value - either the limit of the queue or the remaining free memory on the host. You could in addition define a limit under complex_values for h_vmem and the output will change to qc: if it's the actual constraint. -- Reuti OK, thanks for the explanation, that makes sense. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] qquota shows unexpected usage
Am 05.05.2014 um 20:01 schrieb Michael Stauffer: OGS/GE 2011.11p1 (Rocks 6.1) Hi, qquota is showing resource usage even when a user has no jobs running. Also, the increments and values are funny. Here's the output. The same thing is shown if run as root or the user:

[root@chead users]# qquota -u mgstauff
resource quota rule    limit                  filter
limit_user_h_vmem/1    h_vmem=-104.400M/125   users mgstauff queues all.q
limit_user_h_vmem/2    h_vmem=-1000.000M/23   users mgstauff queues qlogin.q
limit_user_s_vmem/1    s_vmem=4.195G/125.77   users mgstauff queues all.q
limit_user_s_vmem/2    s_vmem=-1000.000M/23   users mgstauff queues qlogin.q

What does the definition of the RQS look like? -- Reuti I've got this:

{
   name         limit_user_h_vmem
   description  Limit total h_vmem memory usage to num slots * 3.93G
   enabled      TRUE
   limit        users {*} queues {all.q} to h_vmem=125.77G
   limit        users {*} queues {qlogin.q,qlogin.long.q} to h_vmem=23.59G
}
{
   name         limit_user_s_vmem
   description  Limit total s_vmem memory usage to num slots * 3.93G
   enabled      TRUE
   limit        users {*} queues {all.q} to s_vmem=125.77G
   limit        users {*} queues {qlogin.q,qlogin.long.q} to s_vmem=23.59G
}

BTW, I think there's no way to make these limits dynamic? Dynamic limits are only for hosts? I'd like to limit to N * max-num-slots-for-user-on-queue -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] qquota shows unexpected usage
Am 05.05.2014 um 20:01 schrieb Michael Stauffer: OGS/GE 2011.11p1 (Rocks 6.1) Hi, qquota is showing resource usage even when a user has no jobs running. Also, the increments and values are funny. Here's the output. The same thing is shown if run as root or the user:

[root@chead users]# qquota -u mgstauff
resource quota rule    limit                  filter
limit_user_h_vmem/1    h_vmem=-104.400M/125   users mgstauff queues all.q
limit_user_h_vmem/2    h_vmem=-1000.000M/23   users mgstauff queues qlogin.q
limit_user_s_vmem/1    s_vmem=4.195G/125.77   users mgstauff queues all.q
limit_user_s_vmem/2    s_vmem=-1000.000M/23   users mgstauff queues qlogin.q

What does the definition of the RQS look like? -- Reuti I've got this:

{
   name         limit_user_h_vmem
   description  Limit total h_vmem memory usage to num slots * 3.93G
   enabled      TRUE
   limit        users {*} queues {all.q} to h_vmem=125.77G
   limit        users {*} queues {qlogin.q,qlogin.long.q} to h_vmem=23.59G
}
{
   name         limit_user_s_vmem
   description  Limit total s_vmem memory usage to num slots * 3.93G
   enabled      TRUE
   limit        users {*} queues {all.q} to s_vmem=125.77G
   limit        users {*} queues {qlogin.q,qlogin.long.q} to s_vmem=23.59G
}

BTW, I think there's no way to make these limits dynamic? Dynamic limits are only for hosts? I'd like to limit to N * max-num-slots-for-user-on-queue -M I changed some values in the RQSs and now things are working normally again. I guess the re-config must have reset something. -M ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users
[gridengine users] Configurations during kickstart
Hi,

I'm trying to get some resource configurations in place during kickstart.
I have the following in my kickstart file replace-partition.xml. The file
is run during kickstart: I can see output to text files when I add
debugging info.

This code runs correctly if I run it in a shell once the node is up. The
issue seems to be that qhost and qconf aren't outputting anything when
they run. Is that to be expected?

Here's what I have added:

<post>
snipped the default stuff for this post...

# Here's the code as I'd like it to work:
# This code gets reached. I can output these env vars and the
# values are correct.
export SGEBIN=$SGE_ROOT/bin/$SGE_ARCH
export NODE=$(/bin/hostname -s)
export MEMFREE=`$SGEBIN/qhost -F mem_total -h $NODE | tail -n 1 | cut -d: -f3 | cut -d= -f2`
$SGEBIN/qconf -mattr exechost complex_values h_vmem=$MEMFREE $NODE 2>&1 > /root/qconf_complex_setup.log
$SGEBIN/qconf -mattr exechost complex_values s_vmem=$MEMFREE $NODE 2>&1 >> /root/qconf_complex_setup.log
</post>

Thanks!

-M
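[As an aside, the MEMFREE pipeline above can be exercised without a running qmaster. The sample line below is an assumption about the last line `qhost -F mem_total -h <host>` prints, with an illustrative value:

```shell
# Stand-in for the final line of `qhost -F mem_total -h <host>` output
# (assumed format; the 63.000G value is illustrative).
sample='    Host Resource(s):   hl:mem_total=63.000G'

# Same pipeline as in the %post script: last line, third colon-field,
# then the value after the equals sign.
MEMFREE=$(printf '%s\n' "$sample" | tail -n 1 | cut -d: -f3 | cut -d= -f2)
echo "$MEMFREE"
```

If the live qhost output matches that shape, MEMFREE ends up as just the memory value (here `63.000G`).]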
Re: [gridengine users] Configurations during kickstart
> Random ideas:
>
> 1. Try disabling the log redirects to see if anything ends up in the
> standard kickstart log?

OK I'll try this. Have to wait for a host to free up to try a reinstall
again.

> 2. SGE is unusually sensitive to hostname and DNS resolution. Is your
> kickstart environment giving the node the same IP address during
> provisioning as it has when running? Does your kickstart environment
> have reverse DNS lookup working, so that a lookup on the IP returns the
> proper hostname?

I'll dump tests in the kickstart file and check. Don't know how to check
the last bit - you mean a lookup on the IP by the execute host as it's
booting?

> 3. qconf requires communication with the qmaster. It looks like you are
> defining ENV vars that point only to the bin directory rather than
> setting up the full SGE environment during the kickstart. Consider
> sourcing the SGE init scripts, or at least setting SGE_ROOT and SGE_CELL,
> so that the SGE binaries can navigate to $SGE_ROOT/$SGE_CELL/act_qmaster
> and know which host to communicate with.

I source /etc/profile.d/sge-binaries.sh at the beginning of my code.
Should I need anything else than that? In any case I'm dumping relevant
env vars in the kickstart now to check them.

Thanks

-M

> Regards,
> Chris
>
> Michael Stauffer wrote:
>> Hi,
>> I'm trying to get some resource configurations in place during
>> kickstart.
>
> snip
Re: [gridengine users] Configurations during kickstart
On Thu, May 1, 2014 at 2:26 PM, Reuti <re...@staff.uni-marburg.de> wrote:

> Hi,
>
> Am 01.05.2014 um 19:58 schrieb Michael Stauffer:
>
>> I'm trying to get some resource configurations in place during
>> kickstart.
>>
>> snip
>>
>> $SGEBIN/qconf -mattr exechost complex_values h_vmem=$MEMFREE $NODE 2>&1 > /root/qconf_complex_setup.log
>> $SGEBIN/qconf -mattr exechost complex_values s_vmem=$MEMFREE $NODE 2>&1 >>
>
> Might be intended, but this syntax will put the error to the default
> output and only the default output in the logfile. In case you want to
> capture both it needs to be written as:
>
>     qconf ... > /root/qconf_complex_setup.log 2>&1
>
> -- Reuti

Thanks, that was a mistake.

-M
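[Reuti's point is plain POSIX shell behavior and can be checked without SGE. The `emit` helper and the log file names below are made up for the demo:

```shell
# A toy command that writes one line to stdout and one to stderr.
emit() { echo out; echo err 1>&2; }

# Wrong order (as in the original %post script): 2>&1 duplicates stderr
# onto the *current* stdout before stdout is redirected, so "err" escapes
# to the caller and only "out" lands in the file.
stray=$(emit 2>&1 > wrong.log)

# Right order (Reuti's fix): stdout goes to the file first, then stderr
# is duplicated onto that same descriptor, so both lines land in the file.
emit > right.log 2>&1

echo "escaped with wrong order: $stray"
```

After running this, `wrong.log` holds only `out` while `right.log` holds both `out` and `err`, matching Reuti's description.]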
Re: [gridengine users] Configurations during kickstart
On Thu, May 1, 2014 at 2:27 PM, Jesse Becker <becke...@mail.nih.gov> wrote:

> On Thu, May 01, 2014 at 01:58:04PM -0400, Michael Stauffer wrote:
>> I'm trying to get some resource configurations in place during
>> kickstart. I have the following in my kickstart file
>> replace-partition.xml. The file is run during kickstart: I can see
>> output to text files when I add debugging info.
>
> I've recently been doing something similar with our system provisioning,
> although not directly in kickstart (we aren't using Rocks either, but I
> don't think that's the problem).
>
>> This code runs correctly if I run it in a shell once the node is up.
>> The issue seems to be that qhost and qconf aren't outputting anything
>> when they run. Is that to be expected? Here's what I have added:
>
> I think the reason is one of timing. Working backwards, you want to do
> this:
>
> 4. configure exechost settings with information reported by qhost
> 3. for qhost to report info, sge_execd must be running on the node
> 2. for sge_execd to start, the node must be added via 'qconf -ae'
> 1. something needs to watch for new nodes, and trigger 'qconf -ae'
>
> I forget exactly when Rocks automagically adds nodes to SGE (the
> 'qconf -ae' bit), but I bet it hasn't happened yet. Thus, sge_execd
> can't start, so qhost can't report host info, so qconf -mattr fails.
>
> A few possible solutions:
>
> 1. You might be able to somehow force this part of the %post script to
>    run after the master adds the new node. Maybe part of the firstboot
>    service?
> 2. Create a service that watches for new nodes, and configures them
>    accordingly.
> 3. Have a cronjob that periodically configures *all* hosts (even old
>    nodes, to catch HW changes).
>
> (We've opted for something between options 2 and 3 -- we look at all
> nodes, all the time, but only update new ones.)

Thanks. I've implemented option 3 for the time being. New hosts are
rarely added or rebooted here, so a periodic cron job will probably be
just fine.

-M

> snip
>
> --
> Jesse Becker (Contractor)
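[A minimal sketch of option 3. The two functions below are hypothetical stubs that mimic the real `qconf`/`qhost` output so the loop and parsing can be shown self-contained; on a real cluster you would delete them and call the binaries in `$SGE_ROOT/bin/$SGE_ARCH` instead, and the mem_total line format and value are assumptions:

```shell
#!/bin/sh
# Periodically (re)set h_vmem/s_vmem complex_values on every exec host
# from its reported mem_total. Intended to be run from cron.

# --- STUBS (remove on a real cluster) -------------------------------
qconf() {
  if [ "$1" = "-sel" ]; then
    printf 'compute-0-0\ncompute-0-1\n'     # stub exec host list
  else
    echo "would run: qconf $*"              # stub: echo instead of modifying
  fi
}
qhost() {
  # Mimics the last line of `qhost -F mem_total -h <host>` (assumed format).
  printf 'HOSTNAME ...\n%s ...\n    Host Resource(s):   hl:mem_total=63.000G\n' "$4"
}
# --------------------------------------------------------------------

for host in $(qconf -sel); do
  memtotal=$(qhost -F mem_total -h "$host" | tail -n 1 | cut -d: -f3 | cut -d= -f2)
  qconf -mattr exechost complex_values h_vmem="$memtotal" "$host"
  qconf -mattr exechost complex_values s_vmem="$memtotal" "$host"
done
```

Re-running this on unchanged hosts just rewrites the same values, which is harmless; filtering to only-new hosts (Jesse's hybrid of options 2 and 3) would need a record of hosts already configured.]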
[gridengine users] understanding qstat output for consumables
OGS/GE 2011.11p1 (Rocks 6.1)

Hi,

I'm trying to understand the output from qstat after setting up resource
limits for queue memory requests. I've got h_vmem and s_vmem set as
consumables, with the default at 3.9G per job, e.g.:

[root@compute-0-18 ~]# qconf -sc | grep h_vmem
h_vmem    h_vmem    MEMORY    <=    YES    JOB    3900M    0

Previously, using 'qstat -F h_vmem', I was seeing the amount of this
resource remaining after whatever running jobs had claimed either the
default or requested amount. But now, after setting up the following
queue limits, the all.q output shows only the queue per-job limit, i.e.
'qf'. Is that intentional? qhost still shows the remaining consumable
resource amounts. Just curious about the rationale, really.

[root@compute-0-18 ~]# qconf -sq all.q
snip
h_rss      INFINITY
s_vmem     7.6G
h_vmem     7.8G

[root@compute-0-18 ~]# qstat -F h_vmem,s_vmem
all.q@compute-0-0.local    BP    0/3/16    2.72    linux-x64
        qf:h_vmem=7.800G
        qf:s_vmem=7.600G
snip

[root@compute-0-18 ~]# qhost -F h_vmem -h compute-0-0
HOSTNAME     ARCH       NCPU  LOAD   MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------
global       -             -     -        -       -       -       -
compute-0-0  linux-x64    16  2.75    63.0G    2.2G   31.2G     0.0
    Host Resource(s):   hc:h_vmem=51.573G

-M
[gridengine users] limit jobs to N core?
Hi,

I'm setting up a new cluster. I'd like to limit jobs to a set number of
cores on each host. Jobs default to 1 slot currently, and I'm using
pe_serial to keep parallel jobs restricted to single hosts, because we
use shared-memory parallel apps here. But on an older cluster here, jobs
will spill out onto more cores when the app's number of threads isn't
limited by the user.

I found a post that mentions the '-binding' option for qsub:

  "Here, we are also binding each job to a single core. -binding linear:1"

Does this really do that? I can't quite tell from the qsub documentation.
If someone uses '-pe my_pe 4 -binding linear:4' to request four cores,
will their job placement be limited if no host has 4 consecutive cores
to allocate?

Thanks

-M
Re: [gridengine users] shutdown and preserve queues?
Thanks Reuti, this is great. I'll stop qmaster and let the running jobs
finish, then do my thing.

-M

On Wed, Feb 5, 2014 at 7:58 PM, Reuti <re...@staff.uni-marburg.de> wrote:

> Hi,
>
> Am 06.02.2014 um 01:27 schrieb Michael Stauffer:
>
>> I need to shutdown my FE for maintenance, and am hoping to be able to
>> preserve queued GE jobs for the convenience of my users. Any
>> suggestions on how to do that?
>
> Nothing to worry about - just do it. You can stop/restart the qmaster
> (or even reboot the complete qmaster machine) - running jobs will
> continue to run*, and queued ones will stay in the queue.
>
> *) unless you stop the execd with its script [but there is a softstop
> implemented in the script to allow jobs to continue even when the execd
> needs to be restarted for any reason]
>
>> The consensus here is that shutting down GE will kill queued jobs too
>> by default,
>
> Shutting down the qmaster won't kill anything; shutting down the execd
> will be handled as outlined above. In your case you might want to
> disable the queues (to drain them), then restart the qmaster after the
> maintenance and enable the queues again.
>
> -- Reuti
>
>> although none of us have tried it. Sorry if this is a silly question,
>> am just getting started as admin on this system.
>>
>> Thanks
>>
>> -M
Re: [gridengine users] submit qsub within qlogin?
Thanks a bunch, Hugh and Reuti. We'll try it out different ways - I'm
actually asking this question for someone else.

-M

On Thu, Jan 23, 2014 at 2:07 PM, Reuti <re...@staff.uni-marburg.de> wrote:

> Hi,
>
> Am 23.01.2014 um 19:15 schrieb Michael Stauffer:
>
>> Is it possible to allow qsub submissions while in a qlogin session?
>> The goal is to allow users to run scripts that cycle through sending
>> out a number of jobs, collecting and processing the results, then
>> iterating. We'd like to avoid these master scripts having to be run on
>> the FE. Thanks.
>
> Yes, you can submit jobs inside `qlogin` or also in a batch script
> itself. Prerequisite is that the nodes are also submit hosts in SGE's
> configuration.
>
> Depending on the workload of the master script: it might be advisable
> to run them all in a special queue (selected by a BOOL complex) and
> allow an almost unlimited number of them, as they produce no load on
> the machine itself. I can't judge whether this is suitable for your
> scripts. In case they use `qlogin` only to start the master script,
> they could also submit the script itself to the mentioned queue.
>
> -- Reuti