On Fri, May 18, 2012 at 10:27 AM, Rayson Ho <[email protected]> wrote:

> There is a checkpoint & cleanup script in the source tree, and your
> Grid Engine installation should also have it.
>
>
> https://gridscheduler.svn.sourceforge.net/svnroot/gridscheduler/trunk/source/dist/util/bdb_checkpoint.sh
>
> The BDB transaction logs are cleaned up by this script.
>


Thanks for pointing this out to me.

The documentation says that it should be run every minute if the
configuration uses a BDB server. I don't use a BDB server, but the storage
method I use is BDB (not flat files). If I should use this checkpoint
script, how often should I run it, and do I need to shut down the qmaster
to run it?


>
> Rayson
>
>
>
> On Fri, May 18, 2012 at 1:17 PM, Simon Matthews
> <[email protected]> wrote:
> > After SGE was killed by the OOM killer, the file (a Berkeley DB file) in
> > my cluster was 1.4 GB. I did a db_dump and db_load on this file,
> > resulting in a much smaller file.
> >
> > However, this then raised the question -- how is this file maintained?
> > Presumably, it holds the information on jobs in all states (queued,
> > running and finished). How do the finished jobs get removed from this
> > file? Obviously, I don't want the file to grow without limit.
> >
> > We are now putting about 50k jobs into our small cluster every day (many
> > finish running in a fraction of a second).
> >
> > Simon
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
> >
>
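
For reference, the db_dump / db_load round trip mentioned in the original
message could be scripted roughly as below. This is only a sketch, not
something taken from the Grid Engine tree: the function names and the
".dump" / ".new" suffixes are made up for illustration, and it assumes the
Berkeley DB utilities are on PATH and that sge_qmaster is stopped while the
file is copied (as it was in the case described, after the OOM kill). It does
not touch the transaction logs; those are what bdb_checkpoint.sh is for.

```python
import shutil
import subprocess
from pathlib import Path


def compaction_commands(db_file, dump_file, new_db_file):
    """Build the db_dump and db_load command lines without running them."""
    return [
        # db_dump -f <output> <database>: write a portable text dump
        ["db_dump", "-f", str(dump_file), str(db_file)],
        # db_load -f <input> <database>: rebuild a fresh database from the dump
        ["db_load", "-f", str(dump_file), str(new_db_file)],
    ]


def compact(db_file, work_dir):
    """Dump db_file and reload it into a new file under work_dir.

    Assumption: nothing is writing to db_file while this runs
    (i.e. sge_qmaster has been shut down first).
    The original file is left untouched; swapping the new file in
    is a separate, manual step.
    """
    db_file = Path(db_file)
    work_dir = Path(work_dir)
    dump_file = work_dir / (db_file.name + ".dump")    # hypothetical name
    new_db_file = work_dir / (db_file.name + ".new")   # hypothetical name
    for cmd in compaction_commands(db_file, dump_file, new_db_file):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)  # raises if the utility fails
    return new_db_file
```

Keeping the command construction separate from the execution makes it easy
to print and review the two commands before running anything against a live
spool directory.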