Tom Lane wrote:
To my mind the problem with fsync is not that it gives us too little
control but that it gives too much: we have to specify a particular
order of writing out files.  What we'd really like is a version of
sync(2) that tells us when it's done but doesn't constrain the I/O
scheduler's choices at all.  Unfortunately there's no such API ...

The problem I see with fsync is that it causes an immediate I/O storm as the OS tries to flush everything out as quickly as possible. But we're not in a hurry. What we'd need is a lazy fsync that would tell the operating system "let me know when all these dirty buffers are written to disk, but I'm not in a hurry, take your time". It wouldn't change the scheduling of the writes; it would just inform the caller when they're done.

If we wanted more precise control of the flushing, we could use sync_file_range on Linux, but that's not portable. Nevertheless, I think it would be OK to have an ifdef and use it on platforms that support it, if it gave a benefit.
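
To make the ifdef idea concrete, here is a minimal sketch, not actual PostgreSQL code. The helper names hint_writeback() and flush_file() are hypothetical. It assumes Linux with glibc, where sync_file_range() and SYNC_FILE_RANGE_WRITE are available once _GNU_SOURCE is defined; on other platforms the hint compiles to a no-op. Because sync_file_range() does not flush file metadata, the sketch still calls fsync() for durability.

```c
#define _GNU_SOURCE            /* exposes sync_file_range() in <fcntl.h> on glibc */
#include <fcntl.h>
#include <unistd.h>

/*
 * hint_writeback() is a hypothetical helper, not a PostgreSQL function.
 * It asks the kernel to start writing out a file's dirty pages without
 * waiting for the writes to finish. Returns 0 on success or when no hint
 * is available on this platform, -1 on error.
 */
static int
hint_writeback(int fd)
{
#if defined(__linux__) && defined(SYNC_FILE_RANGE_WRITE)
    /* offset 0 with nbytes 0 means "from the start to the end of the file" */
    return sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
#else
    (void) fd;                 /* no portable equivalent: do nothing */
    return 0;
#endif
}

/*
 * flush_file() is also hypothetical. The hint is only an optimization,
 * so its result is ignored; fsync() remains the durability guarantee.
 * Returns whatever fsync() returns (0 on success, -1 on error).
 */
static int
flush_file(int fd)
{
    (void) hint_writeback(fd);
    return fsync(fd);
}
```

A caller could issue hint_writeback() on each file well before the checkpoint needs to complete and call fsync() only at the end, so that most of the data has already reached disk by then. Whether that actually gives a benefit would have to be measured.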

As a side note, with full_page_writes on, a checkpoint wouldn't actually need to fsync those pages that have been written to WAL after the checkpoint started. That doesn't make much difference in most cases, but we could take it into account if we start taking more control of the flushing.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
