On 07/12/2018 02:25 AM, David Pacheco wrote:
On Tue, Jul 10, 2018 at 1:34 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:

    On 2018-Jul-10, Jerry Jelinek wrote:

    > 2) Disabling WAL recycling reduces reliability, even on COW filesystems.

    I think the problem here is that WAL recycling in normal filesystems
    helps protect the case where the filesystem gets full.  If you remove it,
    that protection goes out the window.  You can claim that people need to
    make sure to have available disk space, but this does become a problem
    in practice.  I think the thing to do is verify what happens with
    recycling off when the disk gets full; is it possible to recover
    afterwards?  Is there any corrupt data?  What happens if the disk gets
    full just as the new WAL file is being created -- is there a Postgres
    PANIC or something?  As I understand, with recycling on it is easy (?)
    to recover, there is no PANIC crash, and no data corruption results.

If the result of hitting ENOSPC when creating or writing to a WAL file was that the database could become corrupted, then wouldn't that risk already be present (a) on any system, for the whole period from database init until the maximum number of WAL files was created, and (b) all the time on any copy-on-write filesystem?

I don't follow Alvaro's reasoning, TBH. There's a couple of things that confuse me ...

I don't quite see how reusing WAL segments actually protects against a full filesystem? On "traditional" filesystems I would not expect any difference between "unlink+create" and reusing an existing file. On CoW filesystems (like ZFS or btrfs) the space management works very differently, and reusing an existing file is unlikely to save anything.
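
To make the two strategies concrete, here is a minimal file-level sketch in Python. It is not PostgreSQL's actual code: the function names and the small SEGMENT_SIZE are made up for illustration. It only shows where new space is requested in each case. Note that on a CoW filesystem, overwriting the renamed file later still allocates new blocks, so the rename by itself does not reserve anything.

```python
import os
import tempfile

# Hypothetical size for illustration; real WAL segments default to 16MB.
SEGMENT_SIZE = 1024 * 1024


def create_segment(path, size=SEGMENT_SIZE):
    """Create a new zero-filled segment; ENOSPC would surface here as OSError."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())


def recycle_segment(old_path, new_path):
    """'Recycling': keep the old file and just give it the next segment name."""
    os.rename(old_path, new_path)


def replace_segment(old_path, new_path, size=SEGMENT_SIZE):
    """'unlink+create': drop the old segment, then allocate a fresh one."""
    os.unlink(old_path)
    create_segment(new_path, size)


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    seg1 = os.path.join(d, "seg1")
    seg2 = os.path.join(d, "seg2")
    seg3 = os.path.join(d, "seg3")

    create_segment(seg1)
    recycle_segment(seg1, seg2)   # no new file is created at this step
    replace_segment(seg2, seg3)   # old file removed, new one written out
    print(sorted(os.listdir(d)))
```

In both cases exactly one segment-sized file exists afterwards; the difference is only whether the allocation happens at create time or (on CoW filesystems) again at overwrite time.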

But even if it reduces the likelihood of ENOSPC, it does not eliminate it entirely. max_wal_size is not a hard limit, and the disk may be filled by something else (when WAL is not on a separate device, when there is thin provisioning, etc.). So it's not a protection against data corruption we could rely on. (And as was discussed in the recent fsync thread, ENOSPC is a likely source of past data corruption issues on NFS and possibly other filesystems.)

I might be missing something, of course.

AFAICS the original reason for reusing WAL segments was the belief that overwriting an existing file is faster than writing a new file. That might have been true in the past, but the question is if it's still true on current filesystems. The results posted here suggest it's not true on ZFS, at least.


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
