Raymond Raymond wrote:
From: Mike Matrigali <[EMAIL PROTECTED]>
Reply-To: "Derby Development" <[email protected]>
To: Derby Development <[email protected]>
Subject: Re: Can anyone give me some suggestions?
Date: Tue, 23 Aug 2005 10:03:35 -0700
Yes, the key is that in normal operation the call to these routines
is always done by a background process which is a different thread
than normal client threads. So database client threads proceed
while the checkpoint proceeds. The buffer cache is not locked for
the duration of the checkpoint, only individual pages for the time
it takes to copy the page from the buffer cache into the OS cache.
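In rough Java, the idea is something like this (just a sketch with
made-up names, not the actual Derby code):

    import java.util.concurrent.locks.ReentrantLock;

    // Sketch: the checkpoint thread walks the cache and holds each
    // page latch only for the time it takes to copy that page out.
    class Page {
        final ReentrantLock latch = new ReentrantLock();
        volatile boolean dirty;
        void writeToOsCache() { /* copy the page bytes into the OS cache */ }
    }

    class Checkpointer {
        void checkpoint(Page[] cache) {
            for (Page p : cache) {
                if (!p.dirty) continue;
                p.latch.lock();          // blocks updates to this page only
                try {
                    p.writeToOsCache();  // (real code forces the log first, see below)
                    p.dirty = false;
                } finally {
                    p.latch.unlock();    // clients can update p again
                }
            }
        }
    }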
Thanks a lot, I see what you mean. But I still have some questions about
that.
1. Suppose a buffer page, call it B1 here, is updated before a checkpoint
is taken, and the log record of that update is generated as L1. During the
following checkpoint, according to the WAL policy the log buffer will first
be forced to disk, so L1 will be flushed to disk. Then the checkpoint
process will try to write the buffer cache to disk. My question is: since
the checkpoint thread and the database client threads run asynchronously,
it is possible that another update to B1 happens before B1 is latched and
written out; B1 will later be written out, but the log record of the second
update has not been written out yet. How does Derby ensure the WAL policy
in this case?
Derby maintains in memory the last log record that changed the page.
Before writing the page it always asks the log to flush up to that record;
usually this request is a no-op, as the log has already been flushed up to
that point. See CachedPage.java!writePage():
// force WAL - and check to see if database is corrupt or is frozen.
// last log Instant may be null if the page is being forced
// to disk on a createPage (which violates the WAL protocol actually).
// See FileContainer.newPage
LogInstant flushLogTo = getLastLogInstant();
dataFactory.flush(flushLogTo);
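So in your scenario the second update to B1 happens under the page latch
and also advances B1's last log instant, so the flush above covers the
second log record as well before the page goes out. A bare sketch of the
invariant, again with made-up names rather than Derby's actual classes:

    // Sketch: every update remembers the instant of its log record on
    // the page; the page write forces the log up to that instant first.
    interface Log { void flush(long upToInstant); }

    class WalPage {
        private long lastLogInstant;     // only changed under the page latch

        // client thread, page latch held
        void applyUpdate(long logInstant) {
            // ... modify the page bytes ...
            lastLogInstant = logInstant; // a second update moves this forward
        }

        // checkpoint thread, page latch held
        void write(Log log) {
            log.flush(lastLogInstant);   // WAL: log records first ...
            // ... then write the page itself to disk
        }
    }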
2. During a checkpoint, Derby currently searches the whole buffer cache
for dirty pages and writes them out. Why don't we keep a dirty page list?
Only IDs that identify the corresponding dirty pages would be stored in
the list, so it would not take much space. The first time a buffer page is
updated, its ID would be appended to the list, and after the dirty page
has been written out, its ID would be removed from the list. During a
checkpoint, we would just walk the list from head to tail and write the
corresponding dirty pages out, as in the sketch below.
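Roughly, I am thinking of something like this (just a sketch, with long
IDs standing in for whatever identifies a page):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.LinkedHashSet;
    import java.util.List;
    import java.util.Set;

    // Sketch of the proposed dirty page list: only page IDs are kept,
    // so the list itself stays small.
    class DirtyPageList {
        private final Set<Long> ids =
            Collections.synchronizedSet(new LinkedHashSet<>());

        void pageFirstDirtied(long pageId) { ids.add(pageId); }    // first update
        void pageWritten(long pageId)      { ids.remove(pageId); } // after flush

        // a checkpoint walks only the dirty pages, head to tail
        List<Long> checkpointSnapshot() {
            synchronized (ids) { return new ArrayList<>(ids); }
        }
    }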
The current cache has no lists; lists tend to be a point of contention
among users of the cache. The current cache design was mostly picked to be
correct and as simple as possible. Walking an array once for the
checkpoint does not seem like much overhead compared to maintaining a
dirty page list. I guess it would be an issue if, in the usual case, a
checkpoint found no dirty pages, but at least the current checkpoints are
driven by the amount of log written, which in most cases will indicate
lots of dirty pages.
The tradeoff here is that the performance of a checkpoint is not that
important; if anything, making it go faster just slows down the "real"
work of the system. If I did any work on checkpoints I would actually try
to figure out how to make them slower, i.e. spread the I/O out over
non-active times in the server, so that they would have little effect on
user queries. It is not as simple as thread priority, as what is mostly
affected is the I/O load of the system rather than CPU.
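For instance, something as crude as pausing between page writes would
spread the load out (a hypothetical sketch, reusing the Page class from
the sketch above, not actual Derby code):

    // Sketch: pace the checkpoint so it never saturates the disk.
    class SlowCheckpointer {
        void checkpoint(Page[] cache, long pauseMillis)
                throws InterruptedException {
            for (Page p : cache) {
                if (!p.dirty) continue;
                // latch, force the log, write, unlatch as before ...
                Thread.sleep(pauseMillis); // yield I/O bandwidth to user queries
            }
        }
    }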
Raymond