On 2014-01-22 10:14:27 -0500, Robert Haas wrote:
> On Wed, Jan 22, 2014 at 9:48 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> > On 2014-01-18 08:35:47 -0500, Robert Haas wrote:
> >> > I am not sure I understand that point. We can either update the
> >> > in-memory bit before performing the on-disk operations or
> >> > afterwards. Either way, there's a way to be inconsistent if the disk
> >> > operation fails somewhere in between (it might fail but still have
> >> > deleted the file/directory!). The normal way to handle that in other
> >> > places is PANICing when we don't know so we recover from the on-disk
> >> > state.
> >> > I really don't see the problem here? Code doesn't get more robust by
> >> > doing s/PANIC/ERROR/, rather the contrary. It takes extra smarts to only
> >> > ERROR, often that's not warranted.
> >>
> >> People get cranky when the database PANICs because of a filesystem
> >> failure.  We should avoid that, especially when it's trivial to do so.
> >>  The update to shared memory should be done second and should be set
> >> up to be no-fail.
> >
> > I don't see how that would help. If we fail during unlink/rmdir, we
> > don't really know at which point we failed.
> This doesn't make sense to me.  unlink/rmdir are atomic operations.

Yes, individual operations should be, but you cannot be sure whether a
rename()/unlink() will survive a crash until the directory is
fsync()ed. So, what is one going to do if the unlink succeeded, but the
fsync didn't?
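
To make that question concrete, here is a minimal plain-POSIX sketch, not PostgreSQL code. The names remove_result, fsync_dir and unlink_durably are hypothetical and exist only for illustration. It shows that a caller ends up with three outcomes rather than two, and the third one is the case that cannot be reported as a simple "it failed, nothing changed":

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result type: the three states a caller has to distinguish. */
typedef enum
{
    REMOVE_DURABLE,     /* unlink and directory fsync both succeeded */
    REMOVE_NOT_DONE,    /* unlink failed; the entry is still present */
    REMOVE_UNCERTAIN    /* unlink succeeded, but it may not survive a crash */
} remove_result;

/* fsync a directory so that changes to its entries reach stable storage. */
static int
fsync_dir(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);
    int rc;
    int saved_errno;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    saved_errno = errno;    /* close() must not clobber fsync's errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/* Remove path, then try to make the removal durable in parent_dir. */
static remove_result
unlink_durably(const char *path, const char *parent_dir)
{
    assert(path != NULL && parent_dir != NULL);

    if (unlink(path) != 0)
        return REMOVE_NOT_DONE;
    if (fsync_dir(parent_dir) != 0)
        return REMOVE_UNCERTAIN;    /* on-disk state is unknown here */
    return REMOVE_DURABLE;
}
```

In the REMOVE_UNCERTAIN case the in-memory state cannot safely be set to either "slot exists" or "slot gone", which is the situation described above.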

Deletion currently works like:
    if (rename(path, tmppath) != 0)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not rename \"%s\" to \"%s\": %m",
                        path, tmppath)));

    /* make sure no partial state is visible after a crash */
    fsync_fname(tmppath, false);
    fsync_fname("pg_replslot", true);

    if (!rmtree(tmppath, true))
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not remove directory \"%s\": %m",
                        tmppath)));

If we fail between the rename() and the fsync_fname() we don't really
know which state we are in. We'd also have to add code to handle
incomplete slot directories, which currently only exists for startup, to
other places.
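
For illustration, the kind of startup-time handling of incomplete directories referred to above could look roughly like the following. This is a simplified plain-POSIX sketch, not the actual PostgreSQL startup code: the function names are hypothetical, it assumes leftover directories carry a ".tmp" suffix, and it only handles directories that contain plain files (no recursion).

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* True if name ends in ".tmp" and has something before the suffix. */
static int
has_tmp_suffix(const char *name)
{
    size_t len = strlen(name);

    return len > 4 && strcmp(name + len - 4, ".tmp") == 0;
}

/* Remove a directory that contains only plain files (no recursion). */
static int
remove_flat_dir(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    struct dirent *de;
    char path[4096];
    int rc = 0;

    if (dir == NULL)
        return -1;
    while (rc == 0 && (de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        rc = unlink(path);
    }
    closedir(dir);
    return rc == 0 ? rmdir(dirpath) : -1;
}

/*
 * Remove every "*.tmp" directory directly below parent.
 * Returns the number of directories removed, or -1 on the first error.
 */
static int
cleanup_tmp_dirs(const char *parent)
{
    DIR *dir = opendir(parent);
    struct dirent *de;
    struct stat st;
    char path[4096];
    int removed = 0;

    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL)
    {
        if (!has_tmp_suffix(de->d_name))
            continue;
        snprintf(path, sizeof(path), "%s/%s", parent, de->d_name);
        /* skip anything that is not a directory */
        if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
            continue;
        if (remove_flat_dir(path) != 0)
        {
            closedir(dir);
            return -1;
        }
        removed++;
    }
    closedir(dir);
    return removed;
}
```

Running something like this once at startup is simple because nothing else is using the directory yet. Doing the same from other code paths, while the server is running, is the extra work meant above.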


Andres Freund

 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
