At the end of last year, I discussed with Oystein and Mike how to decide
the checkpoint interval automatically. I am now trying to implement it.
Following that discussion, here is an outline of what I plan to do.
1. Let users set an acceptable recovery time that Derby should try to
satisfy. (We will provide an appropriate default value.)
2. During Derby initialization, run a measurement that determines the
performance of the system and maps the recovery time into some X
megabytes of log.
3. Establish a dirty page list in which dirty pages are sorted in
ascending order of the time they were first updated. When a dirty page
is flushed to disk, it is removed from the list. (This step needs
further discussion: do we need such a list at all?)
4. Make and control checkpoints based on a combination of:
- the acceptable log length obtained in step 2
- the current I/O performance
5. Do incremental checkpoints. That means: starting from the beginning
of the dirty page list established in step 3 (the earliest updated dirty
page) and working to the end of the list (the latest updated dirty
page), we write pages out as part of the checkpoint. If data reads or
log writes (if the log is in the default location) start to have longer
response times than an appropriate value, we pause the checkpoint
process and update the log control file to let Derby know where we are.
When the data read or log write times return to an acceptable value, we
continue the checkpoint.
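
To make steps 3 and 5 concrete, here is a rough sketch of how the dirty
page list and a pausable checkpoint pass might fit together. Everything
in it is hypothetical: DirtyPageList, markDirty, markFlushed and
checkpointStep are illustration names, not existing Derby classes or
methods, and the sketch assumes that log positions grow with each call,
so insertion order equals first-update order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names exist in Derby.
class DirtyPageList {
    // Insertion-ordered map: page id -> log position of the first
    // update that dirtied the page since it was last flushed.
    private final Map<Long, Long> firstUpdateLsn = new LinkedHashMap<>();

    // Only the first update after a flush fixes the page's position.
    synchronized void markDirty(long pageId, long lsn) {
        firstUpdateLsn.putIfAbsent(pageId, lsn);
    }

    // Called whenever a page reaches disk, whether the checkpoint or
    // the page cache wrote it.
    synchronized void markFlushed(long pageId) {
        firstUpdateLsn.remove(pageId);
    }

    // Log position where redo would have to start; -1 if nothing is dirty.
    synchronized long oldestDirtyLsn() {
        for (long lsn : firstUpdateLsn.values()) {
            return lsn;
        }
        return -1L;
    }

    // Flush pages oldest first until the list is empty or ioIsSlow
    // reports true (the pause condition from step 5). Returns the redo
    // start position that would be recorded in the log control file.
    long checkpointStep(BooleanSupplier ioIsSlow, List<Long> flushedOut) {
        while (!ioIsSlow.getAsBoolean()) {
            Long pageId;
            synchronized (this) {
                if (firstUpdateLsn.isEmpty()) {
                    break;
                }
                pageId = firstUpdateLsn.keySet().iterator().next();
            }
            flushedOut.add(pageId); // stand-in for the real page write
            markFlushed(pageId);
        }
        return oldestDirtyLsn();
    }
}
```

Calling checkpointStep again after a pause simply resumes from whatever
page is now oldest, which is why the list ordering matters: the value it
returns only ever moves forward.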
This is just an outline. I would like to discuss the details with
everyone later. If anyone has any suggestions, please let me know.
I am going to design step 2 first: mapping the recovery time into some
X megabytes of log. A simple approach is to design a test log file. In
the log file, we let Derby create a temporary database, run a batch of
tests to gather the necessary disk I/O information, and then delete the
temporary database. When Derby boots, we let it run recovery from the
test log file. Does anyone have other suggestions?
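
The arithmetic behind step 2 could look roughly like the sketch below.
It assumes the calibration run reports how many bytes of test log were
replayed and how long that took, and that redo time grows linearly with
log size; both are assumptions on my part, and RecoveryBudget and its
methods are hypothetical names.

```java
// Hypothetical sketch; not an existing Derby class.
class RecoveryBudget {
    // Redo rate seen in the calibration run, in megabytes of log
    // per second.
    static double redoRateMbPerSec(long logBytesReplayed, long elapsedNanos) {
        double megabytes = logBytesReplayed / (1024.0 * 1024.0);
        double seconds = elapsedNanos / 1e9;
        return megabytes / seconds;
    }

    // X megabytes of log that can be replayed within the acceptable
    // recovery time, assuming redo time is linear in log size.
    static double maxLogMegabytes(double acceptableRecoverySeconds,
                                  double redoRateMbPerSec) {
        return acceptableRecoverySeconds * redoRateMbPerSec;
    }
}
```

For example, if the test log is 64 MB and recovery from it takes 4
seconds, the rate is 16 MB/s, so a 30 second recovery target maps to
about 480 MB of log.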
I am also wondering whether I need to establish a relationship between
data read time and data write time; that is, for a given average data
read time, approximately how long the average data write time would be.
What we get from step 2 is valid only under a certain system condition;
when the system condition changes (becomes busier), the value should
change too. If I can establish such a relationship, I can make accurate
adjustments to the checkpoint process.
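
One possible way to make that adjustment, purely as a starting point for
discussion: keep a smoothed average of the observed I/O latency and
scale the log budget from step 2 by the ratio of the baseline latency to
the current one. The inverse-proportional scaling is an untested
assumption, not a measured relationship, and LoadAdjustedBudget is a
hypothetical name.

```java
// Hypothetical sketch; the scaling rule is an assumption, not a
// measured relationship.
class LoadAdjustedBudget {
    private final double baselineLatencyMs; // latency seen during calibration
    private final double baselineLogMb;     // X megabytes from step 2
    private final double alpha;             // smoothing factor, 0 < alpha <= 1
    private double smoothedLatencyMs;

    LoadAdjustedBudget(double baselineLatencyMs, double baselineLogMb,
                       double alpha) {
        this.baselineLatencyMs = baselineLatencyMs;
        this.baselineLogMb = baselineLogMb;
        this.alpha = alpha;
        this.smoothedLatencyMs = baselineLatencyMs;
    }

    // Fold one observed read or write latency into the moving average.
    void observe(double latencyMs) {
        smoothedLatencyMs = alpha * latencyMs + (1 - alpha) * smoothedLatencyMs;
    }

    // A busier system (higher latency) gets a smaller log budget, so
    // checkpoints happen after less log has accumulated.
    double currentLogMb() {
        return baselineLogMb * baselineLatencyMs / smoothedLatencyMs;
    }
}
```

With a 5 ms baseline, a 480 MB budget and alpha = 0.5, one observation
of 15 ms moves the average to 10 ms and halves the budget to 240 MB.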
Raymond