Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Bruce Momjian Sat, 07 Apr 2018 19:34:53 -0700

On Sun, Apr  8, 2018 at 02:16:07PM +1200, Thomas Munro wrote:
> So, what can we actually do about this new Linux behaviour?
> 
> Idea 1:
> 
> * whenever you open a file, either tell the checkpointer so it can
> open it too (and wait for it to tell you that it has done so, because
> it's not safe to write() until then), or send it a copy of the file
> descriptor via IPC (since duplicated file descriptors share the same
> f_wb_err)
> 
> * if the checkpointer can't take any more file descriptors (how would
> that limit even work in the IPC case?), then it somehow needs to tell
> you that so that you know that you're responsible for fsyncing that
> file yourself, both on close (due to fd cache recycling) and also when
> the checkpointer tells you to
> 
> Maybe it could be made to work, but sheesh, that seems horrible.  Is
> there some simpler idea along these lines that could make sure that
> fsync() is only ever called on file descriptors that were opened
> before all unflushed writes, or file descriptors cloned from such file
> descriptors?
> 
> Idea 2:
> 
> Give up, complain that this implementation is defective and
> unworkable, both on POSIX-compliance grounds and on POLA grounds, and
> campaign to get it fixed more fundamentally (actual details left to
> the experts, no point in speculating here, but we've seen a few
> approaches that work on other operating systems including keeping
> buffers dirty and marking the whole filesystem broken/read-only).
> 
> Idea 3:
> 
> Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.
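
The descriptor-passing variant of Idea 1 quoted above can be sketched roughly as follows. This is only an illustration, not PostgreSQL code: it runs in a single process, the names pass_fd_to_peer, backend_fd, checkpointer_fd and the file name are made up, and it assumes Linux with Python 3.9 or later for socket.send_fds()/recv_fds(). It shows that a descriptor received over a Unix socket refers to the same open file as the sender's descriptor (the file offset is shared), which is the property the idea relies on for error reporting.

```python
import os
import socket
import tempfile


def pass_fd_to_peer(fd):
    """Send fd over a Unix socket pair and return the copy the peer receives.

    Both ends live in this process for brevity; in the quoted idea the
    receiving end would be the checkpointer (an assumption of this sketch).
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # SCM_RIGHTS ancillary data must travel with at least one payload byte.
        socket.send_fds(sender, [b"F"], [fd])
        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
        return fds[0]
    finally:
        sender.close()
        receiver.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "relation_segment")  # hypothetical file name
        backend_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Hand the descriptor over BEFORE the first write(), as Idea 1 requires.
            checkpointer_fd = pass_fd_to_peer(backend_fd)
            try:
                os.write(backend_fd, b"dirty page")
                # The fsync() is issued on the descriptor that predates the write.
                os.fsync(checkpointer_fd)
                # The received descriptor shares the file offset with the original,
                # so it already sits at the end of the data just written.
                return os.lseek(checkpointer_fd, 0, os.SEEK_CUR)
            finally:
                os.close(checkpointer_fd)
        finally:
            os.close(backend_fd)
```

Calling demo() returns the offset seen through the received descriptor. The sketch says nothing about the descriptor-limit problem raised in the second bullet, which is the hard part.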


Idea 4 would be for people to assume their database is corrupt if their
server logs report any I/O error on the file systems Postgres uses.
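
For what it's worth, a crude version of that check could look like the sketch below. The regular expression and the sample log lines are assumptions for illustration only; real kernel and server messages vary by version and storage stack, so a match list like this would need tuning, and a clean result proves nothing.

```python
import re

# Assumed pattern: any line that mentions "I/O error", in any letter case.
IO_ERROR = re.compile(r"I/O error", re.IGNORECASE)


def suspect_lines(log_lines):
    """Return the log lines that mention an I/O error, without trailing newlines."""
    return [line.rstrip("\n") for line in log_lines if IO_ERROR.search(line)]


# Made-up sample input, not taken from a real system.
sample = [
    "kernel: Buffer I/O error on dev sdb1, logical block 12345\n",
    "LOG:  checkpoint complete\n",
]
hits = suspect_lines(sample)
```

Any non-empty result would, under Idea 4, be grounds for treating the cluster as suspect.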

-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
