On 07/12/2018 02:25 AM, David Pacheco wrote:
> On Tue, Jul 10, 2018 at 1:34 PM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
>
>> On 2018-Jul-10, Jerry Jelinek wrote:
>>
>>> 2) Disabling WAL recycling reduces reliability, even on COW filesystems.
>>
>> I think the problem here is that WAL recycling in normal filesystems
>> helps protect the case where the filesystem gets full. If you remove it,
>> that protection goes out the window. You can claim that people need to
>> make sure to have available disk space, but this does become a problem
>> in practice. I think the thing to do is verify what happens with
>> recycling off when the disk gets full; is it possible to recover
>> afterwards? Is there any corrupt data? What happens if the disk gets
>> full just as the new WAL file is being created -- is there a Postgres
>> PANIC or something? As I understand, with recycling on it is easy (?)
>> to recover, there is no PANIC crash, and no data corruption results.
>
> If the result of hitting ENOSPC when creating or writing to a WAL file
> was that the database could become corrupted, then wouldn't that risk
> already be present (a) on any system, for the whole period from database
> init until the maximum number of WAL files was created, and (b) all the
> time on any copy-on-write filesystem?

I don't follow Alvaro's reasoning, TBH. There are a couple of things that
confuse me ...

I don't quite see how reusing WAL segments actually protects against
a full filesystem. On "traditional" filesystems I would not expect any
difference between "unlink+create" and reusing an existing file. On CoW
filesystems (like ZFS or btrfs) the space management works very
differently and reusing an existing file is unlikely to save anything.
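
To make the two strategies being compared concrete, here is a minimal
sketch in Python. It is not PostgreSQL's actual code: the function names
and file names are hypothetical, and the zero-fill loop is only my
assumption about what "create" involves for a preallocated segment.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB, the default WAL segment size


def create_fresh(path, size=SEGMENT_SIZE):
    """Create a new zero-filled file of `size` bytes and fsync it."""
    block = b"\0" * 8192
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            chunk = block[: min(len(block), remaining)]
            f.write(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())


def recycle(old_path, new_path):
    """Reuse an old segment under a new name.

    The file keeps its already-allocated blocks. Whether later overwrites
    land on those same blocks depends on the filesystem; a CoW filesystem
    such as ZFS or btrfs writes new blocks either way.
    """
    os.rename(old_path, new_path)


def unlink_and_create(old_path, new_path, size=SEGMENT_SIZE):
    """Drop the old segment and allocate a brand-new one."""
    os.unlink(old_path)
    create_fresh(new_path, size)
```

Both paths end with a full-size file under the new name; the only
difference is whether space is released and then allocated again.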

But even if it reduces the likelihood of ENOSPC, it does not eliminate
it entirely. max_wal_size is not a hard limit, and the disk may be
filled by something else (when WAL is not on a separate device, when
there is thin provisioning, etc.). So it's not a protection against
data corruption we could rely on. (And as was discussed in the recent
fsync thread, ENOSPC is a likely source of past data corruption issues
on NFS and possibly other filesystems.)

I might be missing something, of course.
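
As an illustration of the ENOSPC case, the sketch below shows one way a
preallocation step could fail cleanly when the disk fills up mid-write.
This is an assumption-laden example, not what PostgreSQL does: the
function name and the injectable `write` parameter are hypothetical and
exist only so the failure can be simulated.

```python
import errno
import os


def preallocate(path, size, write=os.write):
    """Zero-fill a new file; on ENOSPC remove the partial file.

    Returns True on success, False if the disk filled up. Any other
    OSError is re-raised unchanged.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        block = b"\0" * 8192
        written = 0
        while written < size:
            written += write(fd, block[: min(len(block), size - written)])
        os.fsync(fd)
        return True
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise
        # Out of space: close and remove the half-written file.
        os.close(fd)
        fd = -1
        os.unlink(path)
        return False
    finally:
        if fd >= 0:
            os.close(fd)
```

Note that a successful preallocation only proves there was space at that
moment. On a CoW filesystem later overwrites allocate new blocks, so
ENOSPC can still show up when the segment is actually written.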

AFAICS the original reason for reusing WAL segments was the belief that
overwriting an existing file is faster than writing a new file. That
might have been true in the past, but the question is whether it's still
true on current filesystems. The results posted here suggest it's not
true on ZFS, at least.
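
For anyone who wants to check the "overwrite is faster than create"
assumption on their own filesystem, a rough timing sketch follows. It is
a toy, not the benchmark used in this thread: the function and file names
are made up, it does a single run with no cache control, and the numbers
will vary a lot between systems.

```python
import os
import time


def timed_write(path, size, flags):
    """Write `size` bytes to `path`, fsync, and return elapsed seconds."""
    block = b"x" * 8192
    start = time.perf_counter()
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        while written < size:
            written += os.write(fd, block[: min(len(block), size - written)])
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(directory, size=16 * 1024 * 1024):
    """Return (overwrite_seconds, create_seconds) for one segment-sized file."""
    existing = os.path.join(directory, "existing_segment")
    fresh = os.path.join(directory, "fresh_segment")
    # Set up a file to overwrite; this first write is not measured.
    timed_write(existing, size, os.O_WRONLY | os.O_CREAT)
    # Overwrite in place (no O_TRUNC, so the old blocks are not freed first).
    overwrite = timed_write(existing, size, os.O_WRONLY)
    # Create and fill a brand-new file.
    create = timed_write(fresh, size, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    return overwrite, create
```

Running compare() against a directory on the filesystem that holds
pg_wal, several times, gives at least a first impression of whether
the two cases differ there.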
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services