If I remember right, the h_vmem amount applies to the job as a whole and is not scaled by the number of slots like some other resources are. I just did a simple test with an 8-slot job (pe_serial), and it only used one 'unit' of h_vmem, i.e. the default amount assigned to the consumable.
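For anyone who wants to repeat that check, it goes roughly like this. A sketch only: it assumes a PE named pe_serial, that h_vmem is already defined as a consumable with a default value, and <exec_host> stands in for whichever node the job lands on:

  # Show the h_vmem definition: "consumable" is YES (per-slot) or JOB
  # (per-job), and "default" is what a job is charged when it makes no
  # explicit request.
  qconf -sc | grep h_vmem

  # Submit an 8-slot job with no explicit h_vmem request...
  qsub -pe pe_serial 8 -b y sleep 300

  # ...then check how much of the host's h_vmem consumable it debits.
  qhost -h <exec_host> -F h_vmem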
40GB VIRT vs 100MB RES is a huge difference! I thought I had it bad with Matlab using 4GB VIRT for 100MB RES.

-M

On Mon, Jun 30, 2014 at 4:47 PM, Feng Zhang <prod.f...@gmail.com> wrote:
> Guys,
>
> Just curious, how does h_vmem work on the processes of MPI jobs (or
> OpenMP / multi-threaded jobs)? I have some parallel jobs where the top
> command shows a VIRT of 40GB while the RES is only 100MB.
>
> On Mon, Jun 30, 2014 at 3:01 PM, Michael Stauffer <mgsta...@gmail.com>
> wrote:
> >> Date: Mon, 30 Jun 2014 11:53:12 +0200
> >> From: Txema Heredia <txema.llis...@gmail.com>
> >> To: Derrick Lin <klin...@gmail.com>, SGE Mailing List <users@gridengine.org>
> >> Subject: Re: [gridengine users] Enforce users to use specific amount
> >> of memory/slot
> >>
> >> Hi Derrick,
> >>
> >> You could either set h_vmem as a consumable (consumable=yes) attribute
> >> and set a default value of 8GB for it. This way, whenever a job doesn't
> >> request any amount of h_vmem, it will automatically request 8GB per
> >> slot. This will affect all types of jobs.
> >>
> >> You could also define a JSV script that checks the username and forces
> >> -l h_vmem=8G for that user's jobs
> >> ( jsv_sub_add_param('l_hard','h_vmem','8G') ). This will affect all
> >> jobs for that user, but could turn into a pain to manage.
> >>
> >> Or, you could set a different policy and allow all users to request
> >> the amount of memory they really need, trying to best fit the node.
> >> What is the point of forcing a user to reserve 63 additional cores
> >> when they only need 1 core and 500GB of memory? You could fit one job
> >> like that on the node and, say, two 30-core/6GB-memory jobs.
> >>
> >> Txema
> >>
> >> On 30/06/14 08:55, Derrick Lin wrote:
> >> > Hi guys,
> >> >
> >> > A typical node on our cluster has 64 cores and 512GB memory, so it's
> >> > about 8GB/core. Occasionally we have jobs that utilize only 1 core
> >> > but 400-500GB of memory, which annoys lots of users. So I am seeking
> >> > a way to force jobs to stay strictly below the 8GB/core ratio, or be
> >> > killed.
> >> >
> >> > For example, the above job should ask for 64 cores in order to use
> >> > 500GB of memory (we have user quotas for slots).
> >> >
> >> > I have been trying to play around with h_vmem, setting it to
> >> > consumable and configuring an RQS:
> >> >
> >> > {
> >> >    name         max_user_vmem
> >> >    enabled      true
> >> >    description  "Each user cannot utilize more than 8GB/slot"
> >> >    limit        users {bad_user} to h_vmem=8g
> >> > }
> >> >
> >> > but it seems to set the total vmem bad_user can use per job.
> >> >
> >> > I would love to set it on users instead of queues or hosts, because
> >> > we have applications that utilize the same set of nodes, and those
> >> > apps should be unlimited.
> >> >
> >> > Thanks
> >> > Derrick
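As an aside, Txema's JSV suggestion above might look roughly like this as a server-side Bash JSV. A sketch only: 'bad_user' is a placeholder username, and it assumes the stock jsv_include.sh helpers shipped under $SGE_ROOT:

  #!/bin/bash
  # Rough JSV sketch: force a hard h_vmem=8G request onto one user's jobs.
  . ${SGE_ROOT}/util/resources/jsv/jsv_include.sh

  jsv_on_start()
  {
      return
  }

  jsv_on_verify()
  {
      # USER is a JSV pseudo-parameter holding the submitting user.
      if [ "$(jsv_get_param USER)" = "bad_user" ]; then
          # Add (or overwrite) the hard resource request h_vmem=8G.
          jsv_sub_add_param l_hard h_vmem 8G
          jsv_correct "h_vmem=8G forced for bad_user"
          return
      fi
      jsv_accept "job accepted unchanged"
  }

  jsv_main

If I understand the mechanism right, you'd hook it in cluster-wide via the jsv_url parameter (qconf -mconf) or client-side with qsub -jsv.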
> > I've been dealing with this too. I'm using h_vmem to kill processes
> > that go above the limit, with s_vmem set slightly lower by default to
> > give well-behaved processes a chance to exit gracefully first.
> >
> > The issue is that these limits act on virtual memory, which is
> > (always, more or less) greater than resident memory, i.e. the actual
> > RAM usage. And with Java apps like Matlab, the amount of virtual
> > memory reserved/used is HUGE compared to resident, by 10x give or
> > take. So it makes things really impractical, actually.
> >
> > However, so far I've just set the default h_vmem and s_vmem values
> > high enough to accommodate JVM apps, and increased the per-host
> > consumable appropriately. We don't get fine-grained memory control,
> > but it definitely reins in out-of-control users/procs that otherwise
> > might gobble up enough RAM to slow down the entire node.
> >
> > We may switch to UVE just for this reason, to get memory limits based
> > on resident memory, if it seems worth it in the end.
> >
> > -M
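For what it's worth, the setup Michael describes would look something like this; the values are purely illustrative, not a recommendation:

  # Complex definitions (edit with qconf -mc): consumables with defaults
  # sized for JVM-style virtual-memory overhead, and s_vmem a notch below
  # h_vmem so a job gets the catchable soft-limit signal before SIGKILL.
  #name    shortcut  type    relop  requestable  consumable  default  urgency
  h_vmem   h_vmem    MEMORY  <=     YES          YES         16G      0
  s_vmem   s_vmem    MEMORY  <=     YES          YES         15G      0

  # Per-host capacity (edit with qconf -me <hostname>), sized to what
  # the node can actually tolerate:
  complex_values  h_vmem=512G,s_vmem=512G

Jobs that need more than the default then override it explicitly, e.g. qsub -l h_vmem=20G,s_vmem=19G.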
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users