> PS: Just for academic interest, how do I get the SGE-GID in the prolog script?
$SGE_JOB_SPOOL_DIR/addgrpid

> PPS: Sorry, I misclicked and sent the previous mail.
>
> On 01/03/12 17:58, Txema Heredia Genestar wrote:
>>
>> On 29/02/12 19:47, Rayson Ho wrote:
>>> On Wed, Feb 29, 2012 at 1:15 PM, Reuti <[email protected]> wrote:
>>>> Aha, I found this:
>>>>
>>>> http://arc.liv.ac.uk/pipermail/gridengine-users/2006-November/012125.html
>>>>
>>>> as the group is already there as Rayson mentions, creating the quota is
>>>> the easiest.
>>>
>>> I was thinking about suggesting that when I first read Txema's
>>> email... but turns out that it's what I suggested 5.5 years ago. (I
>>> guess most of the questions here are similar. One day we can hire IBM
>>> Watson and feed it the list archive, the manpages, and the admin guide
>>> and it will answer all questions on this list for us!)
>>>
>>> The prolog & epilog should be just a few lines of shell script -
>>> configure a 1-node test cluster and you should be able to implement &
>>> test it in less than a few hours. Another minor improvement: if the
>>> node crashes, then the startup process needs to clean up the job
>>> $TMPDIR directories when it comes back up.
>>>
>>> BTW, if you use a modern filesystem, then it takes almost no time to
>>> format a disk. Oracle's BtrFS takes a few seconds to format a disk,
>>> and can easily apply quota, and even snapshots (which is useful when
>>> checkpointing a job - the data is consistent with the job progress):
>>>
>>> "I Can't Believe This is Butter! A tour of btrfs. - Avi Miller"
>>>
>>> https://www.youtube.com/watch?v=hxWuaozpe2I
>>>
>>> (Note: I exchanged a few emails with Avi so he is my "e-friend", but I
>>> am suggesting his presentation not because I know him but only because
>>> it is a great talk)
>>>
>>> Rayson
>>>
>>>> -- Reuti
>>>>
>>>>> And then terminates the job?
>>>>>
>>>>>> But that would be much more complicated and could add some unwanted
>>>>>> complexity to the whole system.
>>>>> Do your users stay in $TMPDIR? Then it would be easier I think to have a
>>>>> `du -s *.all.q` and check whether any is above the request.
>>>>>
>>>>> NB: There is a suspend_threshold for queues, but unfortunately not for
>>>>> each individual job on its own.
>>>>>
>>>>> ===
>>>>>
>>>>> Another approach, if the jobs stay in one node:
>>>>>
>>>>> - in the job prolog create a file with the requested space
>>>>> - format and mount it on $TMPDIR as loop device
>>>>> - in the epilog it can be removed again
>>>>>
>>>>> Well, creating and formatting will take some time, but they can never
>>>>> pass the requested space and it's guaranteed to be available.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> [email protected]
>>>>>> https://gridengine.org/mailman/listinfo/users
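
For the archive, here is a minimal sketch of how a prolog might pick up the value from `$SGE_JOB_SPOOL_DIR/addgrpid`, the file named in the reply at the top of this message. It assumes the file holds a single numeric GID on its first line; the function name `job_gid` is made up for illustration, and the quota step is left as a comment because the right command depends on the filesystem.

```shell
#!/bin/sh
# Prolog sketch: look up the additional group ID assigned to this job.
# Assumption: addgrpid contains one plain number on its first line.

# Print the job's additional GID, or return non-zero if the file is
# missing, unreadable, or does not contain a plain number.
job_gid() {
    spool_dir=$1
    gid_file="$spool_dir/addgrpid"
    [ -r "$gid_file" ] || return 1
    # Tolerate a file without a trailing newline.
    read -r gid < "$gid_file" || [ -n "$gid" ]
    case $gid in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$gid"
}

# Only act when SGE_JOB_SPOOL_DIR is set, i.e. when run as a prolog.
if [ -n "${SGE_JOB_SPOOL_DIR:-}" ]; then
    if gid=$(job_gid "$SGE_JOB_SPOOL_DIR"); then
        echo "job ${JOB_ID:-unknown}: additional GID is $gid"
        # Hypothetical next step: apply a group quota for $gid on the
        # filesystem that holds $TMPDIR.
    else
        echo "could not read a GID from $SGE_JOB_SPOOL_DIR/addgrpid" >&2
    fi
fi
```

Keeping the lookup in a function makes it easy to try on a 1-node test cluster before wiring in any quota command.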

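
The `du -s *.all.q` check quoted above could look roughly like the sketch below. It is a simplification: `SCRATCH_ROOT` and `LIMIT_KB` are hypothetical names, and a single limit is used for every job directory, whereas a real check would compare each directory against that job's own request.

```shell
#!/bin/sh
# Sketch of a periodic scratch-usage check built on du.

# Print the disk usage of a directory in KiB (empty output on error).
dir_usage_kb() {
    du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# Succeed (exit status 0) when the directory uses more than limit_kb KiB.
over_limit() {
    dir=$1
    limit_kb=$2
    used_kb=$(dir_usage_kb "$dir")
    [ -n "$used_kb" ] && [ "$used_kb" -gt "$limit_kb" ]
}

# Hypothetical driver: scan the per-job directories under SCRATCH_ROOT
# and report any that exceed LIMIT_KB (default 1 GiB).
if [ -n "${SCRATCH_ROOT:-}" ]; then
    for d in "$SCRATCH_ROOT"/*.all.q; do
        [ -d "$d" ] || continue
        if over_limit "$d" "${LIMIT_KB:-1048576}"; then
            echo "over limit: $d"
        fi
    done
fi
```

Run from cron or a load sensor, this only reports offenders; what to do with them (suspend, kill, mail the user) is a separate policy decision.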