>>>>> "RR" == Raymond Raymond <[EMAIL PROTECTED]> writes:
RR> Hi, everyone. I am a graduate student trying to do some research
RR> with Derby.
RR> I am interested in the "Autonomic checkpointing timing and log file
RR> size" issue on the to-do list. I would like to know whether anyone
RR> else is interested in that, or whether anyone can give me some
RR> suggestions or direction about that issue?
Currently, you can configure the checkpoint interval and log file size
of Derby by setting the properties:
derby.storage.logSwitchInterval (default 1 MB)
derby.storage.checkpointInterval (default 10 MB)
(Neither of these seems to be documented in the manuals, and the JavaDoc
for LogFactory.recover() gives wrong (outdated?) defaults.)
This means that by default all log files will be 1 MB, and a checkpoint
is made for every tenth log file.
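If you want to experiment with these settings, here is a minimal sketch
of how one could set them from an embedded application before the engine
boots. The database name is just a placeholder, and the values simply
restate the defaults above in bytes; the same properties can also go in
the derby.properties file in the system directory:

import java.sql.Connection;
import java.sql.DriverManager;

public class CheckpointConfig {
    public static void main(String[] args) throws Exception {
        // Both values are in bytes; these just restate the current defaults
        // (1 MB log files, a checkpoint after every 10 MB of log).
        System.setProperty("derby.storage.logSwitchInterval", "1048576");
        System.setProperty("derby.storage.checkpointInterval", "10485760");

        // The properties must be set before the engine boots.
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        Connection c =
            DriverManager.getConnection("jdbc:derby:testdb;create=true");
        c.close();
    }
}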
In order to know when it is useful to change the defaults, one has to
consider the purpose of a checkpoint:
1) Reduced recovery times. Only log created after the penultimate
   checkpoint needs to be redone at recovery. This also means that
   older log files may be garbage-collected (as long as they do not
   contain log records for transactions that have not yet terminated).
   To get short recovery times, one should keep the checkpoint
   interval low. The trade-off is that frequent checkpoints increase
   I/O, since there will be fewer updates to the same page between two
   checkpoints. Hence, you get more I/O per database operation.
2) Flush dirty pages to disk. A checkpoint is a much more efficient
   way to clean dirty pages in the db cache than doing it on demand,
   one page at a time, when a page needs to be replaced with another.
   Hence, one should make sure to do checkpoints often enough to
   avoid the whole cache becoming dirty.
Based on 2), one could initiate a new checkpoint when many pages in
the cache are dirty (e.g., 50% of the pages) and postpone a checkpoint
when few pages are dirty (see the sketch below). The difficult part
would be to determine how long a checkpoint interval is acceptable
with respect to its impact on recovery times.
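Just to make the idea concrete, here is a rough sketch of that kind of
policy. DirtyPageCache and CheckpointDaemon are made-up interfaces used
purely for illustration, not actual Derby classes:

// Rough sketch only: these interfaces are not actual Derby classes.
interface DirtyPageCache {
    int totalPages();
    int dirtyPages();
}

interface CheckpointDaemon {
    void requestCheckpoint();
}

class DirtyPagePolicy {
    private static final double DIRTY_THRESHOLD = 0.5; // e.g., 50% of the cache

    private final DirtyPageCache cache;
    private final CheckpointDaemon daemon;

    DirtyPagePolicy(DirtyPageCache cache, CheckpointDaemon daemon) {
        this.cache = cache;
        this.daemon = daemon;
    }

    // Called periodically, or whenever a page is marked dirty.
    void maybeCheckpoint() {
        double dirtyFraction = (double) cache.dirtyPages() / cache.totalPages();
        if (dirtyFraction >= DIRTY_THRESHOLD) {
            // Many dirty pages: checkpoint now to clean them efficiently.
            daemon.requestCheckpoint();
        }
        // Otherwise postpone; some upper bound on the amount of log written
        // since the last checkpoint would still be needed to cap recovery time.
    }
}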
I guess one could argue that for recovery times, it is the wall-clock
time that matters. Hence, one could automatically increase the value of
derby.storage.checkpointInterval on faster machines, since they will be
able to process more log per time unit.
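A back-of-the-envelope version of that idea: pick the interval so that
the amount of log to redo roughly matches a target recovery time. The
log-apply rate would have to be measured somehow (e.g., during a
previous recovery or a calibration run); nothing like this is exposed
by Derby today:

class RecoveryTarget {
    // logApplyBytesPerSecond would have to be measured; Derby does not
    // expose such a figure today.
    static long checkpointIntervalFor(double targetRecoverySeconds,
                                      double logApplyBytesPerSecond) {
        return (long) (targetRecoverySeconds * logApplyBytesPerSecond);
    }

    public static void main(String[] args) {
        // Example: redoing 5 MB/s of log with a 10 second recovery target
        // gives an interval of about 50 MB (52428800 bytes).
        System.out.println(checkpointIntervalFor(10.0, 5.0 * 1024 * 1024));
    }
}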
When would one want to change the log switch interval? I think few
would care, but since the log files are preallocated by default, space
will be wasted if an operation that performs a log switch (e.g.,
backup) is performed while the current log file is nearly empty. On
the other hand, a small log file size combined with a very large
checkpoint interval will result in many log files on disk at the same
time (e.g., 1 MB log files with a 100 MB checkpoint interval gives on
the order of a hundred files).
Hope this helps a little,
--
Øystein