On Thu, 2006-12-21 at 18:46 +0900, ITAGAKI Takahiro wrote:
> "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:
> > > If you use Linux, it has very unpleased behavior in fsync(); It locks all
> > > metadata of the file being fsync-ed. We have to wait for the completion of
> > > fsync when we do read(), write(), and even lseek().
> > 
> > Oh, really, what an evil fsync is!  Yes, I sometimes saw a backend
> > waiting for lseek() to complete when it committed.  But why does the
> > backend which is syncing WAL/pg_control have to wait for syncing the
> > data file?  They are, not to mention, different files, and WAL and
> > data files are stored on separate disks.
> Backends call lseek() in planning, so they have to wait fsync() to
> the table that they will access. Even if all of data in the file is in
> the cache, lseek() conflict with fsync(). You can see a lot of backends
> are waiting in planning phase in checkpoints, not executing phase.

It isn't clear to me why you are doing planning during a test at all.

If you are doing replanning during test execution then the real
performance problem will be the planning, not the fact that the fsync
stops planning from happening.

Prepared queries are only replanned manually, so the chances of
replanning during a checkpoint are fairly low. So although it sounds
worrying, I'm not sure that we'll want to alter the use of lseek during
planning - though there may be other arguments also.

I have also seen cases where the WAL drive, even when separated, appears
to spike upwards during a checkpoint. My best current theory, so far
untested, is that the WAL and data drives are using the same CFQ
scheduler and that the scheduler actively slows down WAL requests when
it need not. Mounting the drives as separate block drives with separate
schedulers, CFQ for data and Deadline for WAL should help.

  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Reply via email to