You are seeing the same results as Oystein reported as part of
DERBY-799.  If no one submits a patch for DERBY-799 I will probably
submit something simple before the next candidate release is
cut.  Something on the order of sleeping in between I/Os to keep the
checkpoint from flooding the I/O system.
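
Roughly the kind of thing I have in mind (an untested sketch with
made-up names, not an actual patch):

    import java.util.List;

    // Untested sketch of throttling the checkpoint writer: spread the page
    // writes out over time instead of issuing them back to back, and keep
    // the single sync at the end.  Page, writePageToOs() and syncDataFiles()
    // are placeholders, not real Derby classes/methods.
    class ThrottledCheckpointSketch {
        interface Page { }

        void checkpoint(List<Page> dirtyPages, long sleepMillisBetweenWrites)
                throws InterruptedException {
            for (Page p : dirtyPages) {
                writePageToOs(p);                        // unsynced write, as today
                Thread.sleep(sleepMillisBetweenWrites);  // let user I/O get in between
            }
            syncDataFiles();                             // sync once at the end, as today
        }

        private void writePageToOs(Page p) { /* hand the page to the OS, no sync */ }
        private void syncDataFiles() { /* sync the data files */ }
    }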

Kristian Waagan wrote:
Mike Matrigali wrote:

Long answer above, some comments inline below.

I think runtime performance would be optimal in this case; runtime
performance is in no way "helped" by having checkpoints - it is only
either unaffected or hindered.  As has been noted, checkpoints can cause
drastic downward spikes in some disk-bound applications; hopefully we
will get some changes into 10.2 to smooth those spikes out.  But the
reality is that the more checkpoints there are on a system that is disk
I/O bound, the more the app is going to slow down.  If you are not disk
I/O bound, then the checkpoints may have little effect.

Thank you for the explanations, Mike. I ran a TPC-B-like load against Derby and plotted some performance metrics for two different configurations: one where the default checkpointing interval was used, and one where it was set to the maximum. I ran for 1 hour, and in the second case I don't think a checkpoint was ever started (the test took a long time to exit when the database was shut down, as almost 100 MB of log had to be handled). Please have a look at the attached figures, and see if they are as you expected.
Yes, TPC-B on a fast/multi-processor machine is probably the worst-case
scenario for the current default checkpoint settings.  The default aims
more at typical xact-throughput apps rather than the extreme high end
you are measuring.  The hope was that anyone trying to run such an
extreme app would be able to set the checkpoint interval to a reasonable
value.  Discussion on the list and in the above bug continues about ways
to avoid the spikes.
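
For example, I believe the amount of log written between checkpoints can
be raised with the derby.storage.checkpointInterval property (in
derby.properties or as a database property); something along these lines
(the database name and value are just illustrative, check the tuning
guide for the legal range and when the setting takes effect):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Raise the checkpoint interval (bytes of log between checkpoints) for a
    // benchmark database.  The database name is made up.
    public class RaiseCheckpointInterval {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:derby:testdb");
                 Statement s = conn.createStatement()) {
                s.execute("CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY("
                        + "'derby.storage.checkpointInterval', '100000000')");
            }
        }
    }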

The amount of log probably was not much of a factor for shutdown speed.
Assuming you were doing a clean shutdown, I expect the time was just
waiting for all the pages in the cache to go to disk and be sync'd.

What bothers me in particular are the spikes for the run with the default checkpointing interval. As you can see, the throughput drops to (nearly) zero for 10-second periods, which is pretty bad. The checkpoint should not interfere with user activity in such a way. I have talked to some people about this, and we suspect there might be some kind of OS/filesystem issue that we're running into. This might be caused by the way the checkpoint writes pages to disk - write all dirty pages to disk, then sync at the very end (roughly as sketched below). Depending on the underlying OS/filesystem/caches, the effects may vary. My runs were done on Solaris with the UFS filesystem. I also attached a second graph where I used directio (option 'forcedirectio' when mounting).
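
A simplified sketch of that pattern, just to be explicit about what I
mean (this is not the actual Derby code; names and types are made up):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.Map;

    // Simplified illustration of "write all dirty pages, sync once at the end":
    // every page is handed to the OS first, and only then is the file forced to
    // disk, so the OS/filesystem may buffer a large burst of writes before the sync.
    class WriteThenSyncSketch {
        void checkpoint(Map<Long, ByteBuffer> dirtyPages, FileChannel dataFile,
                        int pageSize) throws IOException {
            for (Map.Entry<Long, ByteBuffer> e : dirtyPages.entrySet()) {
                ByteBuffer page = e.getValue();
                page.rewind();
                dataFile.write(page, e.getKey() * pageSize); // no sync per page
            }
            dataFile.force(false); // single sync at the very end
        }
    }
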
I believe there was some info posted to the list from a previous
analysis of what was causing the 10-second drops.  I seem to remember it
was even somewhat JVM specific.
Unfortunately I do not have logs of disk I/O activity for these runs. The data and the transaction log were stored on different physical disks (using 'logDevice'). The database was approx 17 GB, the page cache 0.5 GB. Embedded Derby, 16 clients/connections.
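
(For reference, the log was placed on the second disk by giving the
logDevice attribute when the database was created, roughly like this;
the paths are made up:)

    import java.sql.Connection;
    import java.sql.DriverManager;

    // Create a database with the transaction log on a separate disk by using
    // the logDevice connection attribute (illustrative paths only).
    public class CreateDbWithSeparateLog {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby:/disk1/testdb;create=true;logDevice=/disk2/testdblog");
            conn.close();
        }
    }
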
With any performance analysis, the following kinds of info are interesting:
o sysinfo output (gives the JVM, Derby version, and some other stuff; see the command below)
o description of the machine (mostly I look for the number of processors and the speed/type of the processors)
o description of the disks (SCSI vs. IDE, is the write cache enabled?, are multiple disks raided to look like one disk to Derby?)
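
The sysinfo output can be captured by running the standalone tool with
derbytools.jar on the classpath:

    java org.apache.derby.tools.sysinfo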



Any comments on the graphs attached?



--
Kristian

There are only 2 reasons for checkpoints:
1) decrease recovery time after a system crash.
2) make it possible to delete log file information (if you don't have
  rollforward recovery backups).  Without a checkpoint Derby must
  keep all log files, so the space needed in the log directory will
  always grow.

The background writer thread should handle this; it should not consider
this an extreme case.  If there were no background writer and no
checkpoints, then the following would happen:

1) the page cache grows to whatever maximum size it has gotten to
2) requests for a new page then use the clock to determine which page to
  throw out.
3) if the page picked to throw out is dirty, then it is first written
 to the OS with no sync requested.  It is up to the OS whether this
 is handled asynchronously or not.  Most modern OSes will make this an
 async operation, unless the OS cache is full, in which case it will
 turn into a wait for some I/O (maybe some other I/O to free an OS
 resource).  The downside is that a user select at this point may end
 up waiting on a synchronous write of some page.
4) if the page to throw out is not dirty, then it can just be thrown
  out without any possible I/O wait.
5) In both cases 3 and 4 the user thread of course has to wait on the
  I/O to read the page into the cache.  Depending on the OS cache this
  may or may not be a "real" I/O.

The job of the background writer is to make case 3 less likely; that's
it.  Note that if you try to keep the whole cache clean, you may flood
the I/O system unnecessarily; if the app tends to write the same page
over and over again, it is better to leave it dirty in the cache until
needed.  The clock tends to do this by throwing out less-used pages
rather than more-used pages (a rough sketch of the general scheme is
below).
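
This is not Derby's Clock.java, just an illustration of the general
clock idea with made-up names:

    import java.util.ArrayList;
    import java.util.List;

    // Minimal clock page-replacement sketch: the "hand" sweeps a circular list,
    // recently used pages get a second chance, and a dirty victim is handed to
    // the OS (no sync) before its slot is reused.
    class ClockSketch {
        static class Page {
            long id;
            boolean referenced; // set on every hit, cleared by the sweeping hand
            boolean dirty;      // set when the page is modified
        }

        private final List<Page> pages = new ArrayList<>();
        private int hand = 0;

        // Find a slot to reuse (assumes the cache is full and non-empty).
        Page evict() {
            while (true) {
                Page p = pages.get(hand);
                hand = (hand + 1) % pages.size();
                if (p.referenced) {
                    p.referenced = false;   // second chance
                } else {
                    if (p.dirty) {
                        writeToOs(p);       // case 3: unsynced write, OS decides when it hits disk
                        p.dirty = false;
                    }
                    return p;               // case 4: clean page can be reused with no write
                }
            }
        }

        private void writeToOs(Page p) {
            // Placeholder: in a real cache this would hand the page to the OS
            // without requesting a sync.
        }
    }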

Kristian Waagan wrote:

Hi Mike,

A question totally on the side of this discussion: Do you, or anyone
else, have any opinion about how the "runtime performance" of Derby
would be affected by not having checkpoints at all, say for a large
database (around 20 GB) and 0.5 GB of page cache in a disk-bound
application load?

Is the Derby background-writer (and Clock.java) written/designed to
handle such "extreme cases" without major performance degradation?
Any information on the goal/function of the background-writer?
What mechanisms would kick in when the page-cache is full and Derby
needs slots for new pages?
The mechanism is described above; in particular it comes down to whether
the page to throw out is dirty vs. clean.  There isn't really a
dependency on the cache being full.  In a busy "normal" system the cache
is always full, and I don't think we do anything special about weighting
dirty vs. clean pages.  More work could be done in this area, as has
been discussed.
I do know this is not a smart way to handle things, I'm just curious
what people think about this! And I am not seeking answers about long
recovery times and log disk space usage ;)
Hey, in my benchmark days with other db products, it was standard
procedure to configure the test system to either have no checkpoints or,
if required, ONE checkpoint during the run.  Derby is no different in
this respect.

I almost always try to separate the checkpoint effect from the
performance/throughput I am trying to measure (unless optimizing the
checkpoint is what I am trying to measure).  My guess is that the
default checkpoint interval is producing WAY too many checkpoints for
your throughput.

--
Kristian


[snip - map recovery time into Xmb of log stuff]


