Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Peter Geoghegan Mon, 09 Apr 2018 12:26:46 -0700

On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote:
> Let's lower the pitchforks a bit here.  Obviously a grand rewrite is
> absurd, as is some of the proposed ways this is all supposed to
> work. But I think the case we're discussing is much closer to a near
> irresolvable corner case than anything else.


+1

> We're talking about the storage layer returning an irresolvable
> error. You're hosed even if we report it properly.  Yes, it'd be nice if
> we could report it reliably.  But that doesn't change the fact that what
> we're doing is ensuring that data is safely fsynced unless storage
> fails, in which case it's not safely fsynced anyway.

Right. We seem to be implicitly assuming that there is a big
difference between a problem in the storage layer that we could in
principle detect, but don't, and any other problem in the storage
layer. I've read articles claiming that technologies like SMART are
not really reliable in a practical sense [1], so it seems to me that
there is reason to doubt that this gap is all that big.

That said, I suspect that the problems with running out of disk space
are serious practical problems. I have personally scoffed at stories
involving Postgres database corruption that gets attributed to
running out of disk space. Looks like I was dead wrong.

[1] https://danluu.com/file-consistency/ -- "Filesystem correctness"
-- 
Peter Geoghegan
