Re: [gridengine users] orphan tmp directories

Reuti Wed, 10 Apr 2013 16:40:44 -0700

On 10.04.2013 at 23:39, Adam Brenner wrote:

> >> On Wed, Apr 10, 2013 at 2:21 AM, Arnau Bria <[email protected]> wrote:
> >> Extra question:
> >>
> >> how do other admins control the disk space used per job under $TMPDIR ?
> 
> > On Wed, Apr 10, 2013 at 3:33 AM, Reuti <[email protected]> wrote:
> > On the exechost? I don't do it at all on a per job basis. In case your 
> > users fight for the disk space you can implement a consumable for the 
> > disk space in combination with a load sensor:
> 
> > http://gridengine.org/pipermail/users/2012-February/002914.html
> >(there are some other points in the thread too, like mounting a limited loop 
> >device on $TMPDIR)
> 
> 
> We use the same setup on our cluster. A load sensor to monitor $TMPDIR and 
> update the consumable resource. Of course, this gets more complicated as
> 
> 1) Writing to $TMPDIR does not require a consumable resource...so how do you 
> keep the consumable resource value for the exechost up to date with GE? (load 
> sensor)


Not at all. The requested size can only be an estimate anyway, and it is not 
enforced as a limit. If users are fair, it will work. If the value of the load 
sensor falls below the consumable bookkeeping, the tighter constraint will be 
used, and a new job may not start because some earlier users underestimated 
their required disk space and there is not enough left on a node.


> 2) If the user requests x amount of $TMPDIR as a consumable resource, then 
> when you use a load sensor you need to take this into account so as not to 
> report an incorrect value to GE!

No. SGE will use both values ("bookkeeping consumable" and "load sensor") at 
the same time automatically. The lower value will be used in the end.
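
As a rough illustration of the load sensor side, here is a minimal sketch. The complex name "tmpfree", the scratch path /tmp, and the script name tmpfree_sensor.sh are assumptions made up for this example; the complex would have to be defined as a consumable in your own configuration. It follows the usual load sensor protocol: sge_execd writes a line to the sensor's stdin at each load interval, the sensor answers with a begin/end block of host:complex:value lines, and it exits when it reads "quit".

```shell
#!/bin/sh
# Sketch of a load sensor reporting free scratch space.
# Assumptions (not from the thread): complex name "tmpfree", scratch
# filesystem /tmp, and the installed script name tmpfree_sensor.sh.

COMPLEX_NAME="tmpfree"               # hypothetical consumable complex
SCRATCH_DIR="${SCRATCH_DIR:-/tmp}"   # assumed filesystem backing $TMPDIR

# Print the free space, in whole MB, of the filesystem holding $1.
free_mb() {
    # df -Pk: POSIX output format, column 4 is available 1024-byte blocks
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Emit one report block in the load sensor format.
# The host name must match the name SGE uses for this exechost.
report() {
    echo "begin"
    echo "$(uname -n):${COMPLEX_NAME}:$(free_mb "$SCRATCH_DIR")M"
    echo "end"
}

# Answer every line read from stdin with a report; stop on "quit" or EOF.
sensor_loop() {
    while read -r line; do
        [ "$line" = "quit" ] && return 0
        report
    done
}

# Start the loop only when run under the (hypothetical) installed name,
# so the functions above can also be sourced and tried by hand.
case "$0" in
    *tmpfree_sensor.sh) sensor_loop ;;
esac
```

The sensor only reports what is physically free. The bookkeeping of requested amounts stays inside SGE, which is why the sensor does not need to know what each job asked for.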

-- Reuti


>    a) However, getting the requested consumable resource for each job is 
> "taxing" on the SGE master as described here: 
> http://gridengine.org/pipermail/users/2013-April/005873.html
>    b) If you read William's response here: 
> http://gridengine.org/pipermail/users/2013-April/005874.html he offers 
> another solution which, again, may not work depending on your setup.
> 
> 3) So in short, there is no easy way. You need to determine whether the 
> trade-off is worth it. 
> 
> >> On Wed, Apr 10, 2013 at 2:21 AM, Arnau Bria <[email protected]> wrote:
> >> any way for limiting the amount of space per job (SGE or OS level)?
> 
> Somewhat. Using prolog/epilog scripts you could "mount" $TMPDIR with a 
> specific size that the user requests. So in your prolog script you create 
> a unique mount point and give it a specific size, with something like
>     mount -t ext4 -o size=${SIZE}G,mode=755,uid=${user},gid=users ext4
> then in your epilog you simply force an unmount (make sure you run lsof and 
> kill -9 any processes that still have files open on it).
> 
> However, we again run into the issue of grabbing the ${SIZE} variable from 
> the consumable resource as mentioned in #2. 
> 
> --
> Adam Brenner
> Computer Science, Undergraduate Student
> Donald Bren School of Information and Computer Sciences
> 
> Research Computing Support
> Office of Information Technology
> http://www.oit.uci.edu/rcs/
> 
> University of California, Irvine
> www.ics.uci.edu/~aebrenne/
> [email protected]
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
