Re: [HACKERS] Controlling Load Distributed Checkpoints

Heikki Linnakangas Thu, 07 Jun 2007 11:03:33 -0700

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
I don't think it's a historical artifact at all: it's a valid reflection
of the fact that we don't know enough about disk layout to do low-level
I/O scheduling.  Issuing more fsyncs than necessary will do little
except guarantee a less-than-optimal scheduling of the writes.
I'm not proposing to issue any more fsyncs. I'm proposing to change the
ordering so that instead of first writing all dirty buffers and then
fsyncing all files, we'd write all buffers belonging to a file, fsync
that file only, then write all buffers belonging to next file, fsync,
and so forth.
But that means that the I/O to different files cannot be overlapped by
the kernel, even if it would be more efficient to do so.

True. On the other hand, if we issue writes in essentially random order,
we might fill the kernel buffers with random blocks, and the kernel then
needs to flush them to disk as almost random I/O. If we did the writes in
groups, the kernel would have a better chance of coalescing them.
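
To make "writes in groups" concrete, here is a minimal sketch in Python
(not PostgreSQL code). The function name and the (file, block) tuples are
made up for illustration; the only point is that the dirty list is
grouped per file and sorted by block number before anything is written.

```python
from collections import defaultdict


def group_dirty_buffers(dirty):
    """Group (file_name, block_no) pairs by file, blocks in ascending order.

    Hypothetical helper: real buffer tags carry more than a name and a
    block number, this only shows the ordering idea.
    """
    by_file = defaultdict(list)
    for file_name, block_no in dirty:
        by_file[file_name].append(block_no)
    # Files in a stable order, each file's blocks sorted so that
    # neighbouring blocks are issued next to each other.
    return {name: sorted(blocks) for name, blocks in sorted(by_file.items())}


# Dirty buffers in the "essentially random" order they might be found in.
dirty = [("rel_b", 7), ("rel_a", 3), ("rel_b", 2), ("rel_a", 1)]
grouped = group_dirty_buffers(dirty)
```

Whether the kernel actually coalesces the sorted writes depends on the
filesystem and I/O scheduler, which is exactly the part we can't see.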

I tend to agree that if the goal is to finish the checkpoint as quickly
as possible, the current approach is better. In the context of load
distributed checkpoints, however, it's unlikely the kernel can do any
significant overlapping since we're trickling the writes anyway.
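
For reference, the two orderings under discussion could be sketched like
this. This is an illustration in Python, not the actual checkpoint code:
the function names, the file names, and the zero-filled blocks are all
assumptions, and it uses os.pwrite, so it is Unix-only. Both variants
issue the same number of fsyncs; only the interleaving differs.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def checkpoint_write_all_then_fsync(fds, dirty):
    """Current ordering: write every dirty block, then fsync every file."""
    ops = []
    for name, block_no in dirty:
        os.pwrite(fds[name], b"\0" * BLOCK_SIZE, block_no * BLOCK_SIZE)
        ops.append(("write", name, block_no))
    for name in sorted(fds):
        os.fsync(fds[name])
        ops.append(("fsync", name))
    return ops


def checkpoint_per_file(fds, dirty):
    """Proposed ordering: write one file's blocks, fsync it, then move on."""
    ops = []
    for name in sorted(fds):
        for block_no in sorted(b for n, b in dirty if n == name):
            os.pwrite(fds[name], b"\0" * BLOCK_SIZE, block_no * BLOCK_SIZE)
            ops.append(("write", name, block_no))
        os.fsync(fds[name])
        ops.append(("fsync", name))
    return ops


def demo():
    """Run both orderings against two scratch files and return the op logs."""
    dirty = [("rel_b", 1), ("rel_a", 0), ("rel_b", 0), ("rel_a", 2)]
    with tempfile.TemporaryDirectory() as tmp:
        fds = {
            name: os.open(os.path.join(tmp, name), os.O_RDWR | os.O_CREAT, 0o600)
            for name in ("rel_a", "rel_b")
        }
        try:
            batch = checkpoint_write_all_then_fsync(fds, dirty)
            per_file = checkpoint_per_file(fds, dirty)
        finally:
            for fd in fds.values():
                os.close(fd)
    return batch, per_file
```

In the first variant the kernel sees all the writes before any fsync, so
it is free to overlap I/O across files; in the second, each fsync acts as
a barrier between files.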


Do we need both strategies?

I'm starting to feel we should give up on smoothing the fsyncs and
distribute the writes only, for 8.3. As we get more experience with that
and its shortcomings, we can enhance our checkpoints further in 8.4.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
