On Fri, May 18, 2012 at 10:27 AM, Rayson Ho <[email protected]> wrote:
> There is a checkpoint & cleanup script in the source tree, and your
> Grid Engine installation should also have it.
>
> https://gridscheduler.svn.sourceforge.net/svnroot/gridscheduler/trunk/source/dist/util/bdb_checkpoint.sh
>
> The BDB transaction logs are cleaned up by this script.
>

Thanks for pointing this out to me.

The documentation says that it should be used every minute if the
configuration uses a BDB server. I don't use a BDB server, but the storage
method I use is BDB (not flat files). If I should use this checkpoint
script, how often should I run it, and should I shut down the qmaster to
run it?

> Rayson
>
> On Fri, May 18, 2012 at 1:17 PM, Simon Matthews
> <[email protected]> wrote:
> > After SGE was killed by the OOM killer, the file (a Berkeley DB file) in my
> > cluster was 1.4GB. I did a db_dump and db_load on this file, resulting in a
> > much smaller file.
> >
> > However, this then raised the question -- how is this file maintained?
> > Presumably, it holds the information on jobs in all states (queued, running
> > and finished). How do the finished jobs get removed from this file?
> > Obviously, I don't want the file to grow without limit.
> >
> > We are now putting about 50k jobs into our small cluster every day (many
> > finish running in a fraction of a second).
> >
> > Simon
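
For reference, the dump-and-reload mentioned in the quoted message can be sketched roughly as below. This is only a minimal sketch, not the procedure actually used on the cluster: the function names `run` and `compact_bdb`, the `.dump`/`.new`/`.bak` suffixes, and the `DRY_RUN` switch are hypothetical, and it assumes the qmaster is stopped and that the `db_dump`/`db_load` utilities match the Berkeley DB version Grid Engine was built against.

```shell
#!/bin/sh
# Sketch: compact a Berkeley DB file by dumping it and reloading it into a
# fresh file. Assumes nothing else (e.g. sge_qmaster) has the file open.

# Helper (hypothetical): with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# compact_bdb FILE: dump FILE, load the dump into FILE.new, then swap the
# files, keeping the original as FILE.bak. Stops at the first failing step.
compact_bdb() {
    db_file=$1
    dump_file="$db_file.dump"
    new_file="$db_file.new"

    run db_dump -f "$dump_file" "$db_file" &&
    run db_load -f "$dump_file" "$new_file" &&
    run mv "$db_file" "$db_file.bak" &&
    run mv "$new_file" "$db_file"
}
```

Running it first with `DRY_RUN=1` set, e.g. `compact_bdb /path/to/spool/file`, only prints the four commands, which makes it easy to check the paths before touching the real spool file.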
_______________________________________________ users mailing list [email protected] https://gridengine.org/mailman/listinfo/users
