On 3/26/20 11:37 AM, Robert Haas wrote:
On Wed, Mar 25, 2020 at 4:54 PM Stephen Frost <sfr...@snowman.net> wrote:
This is where I feel like I'm trying to make decisions in a vacuum. If
we had a few more people weighing in on the thread on this point, I'd
be happy to go with whatever the consensus was. If most people think
having both --no-manifest (suppressing the manifest completely) and
--manifest-checksums=none (suppressing only the checksums) is useless
and confusing, then sure, let's rip the latter one out. If most people
like the flexibility, let's keep it: it's already implemented and
tested. But I hate to base the decision on what one or two people
think.

I'm not sure I see a lot of value in being able to build a manifest with no checksums, especially if the overhead of the default checksum algorithm is negligible.

However, I'd still prefer that the default be something more robust and allow users to tune it down rather than the other way around. But I've made that pretty clear up-thread and I consider that argument lost at this point.

As for folks who are that close to the edge on their backup timing that
they can't have it slow down- chances are pretty darn good that they're
not far from ending up needing to find a better solution than
pg_basebackup anyway.  Or they don't need to generate a manifest (or, I
suppose, they could have one but not have checksums..).

40-50% is a lot more than "if you were on the edge."

For the record I think this is a very misleading number. Sure, if you are doing your backup to a local SSD on a powerful development laptop it makes sense.

But backups are generally placed on slower storage, remotely, with compression. Even without compression the first two are going to bring this percentage down by a lot.

When you get to page-level incremental backups, which is where this all started, I'd still recommend using a stronger checksum algorithm to verify that the file was reconstructed correctly on restore. That much I believe we have agreed on.

Even pg_basebackup (in both fetch and stream modes...) checks that we at
least got all the WAL that's needed for the backup from the server
before considering the backup to be valid and telling the user that
there was a successful backup.  With what you're proposing here, we
could have someone do a pg_basebackup, get back an ERROR saying the
backup wasn't valid, and then run pg_validatebackup and be told that the
backup is valid.  I don't get how that's sensible.

I'm sorry that you can't see how that's sensible, but it doesn't mean
that it isn't sensible. It is totally unrealistic to expect that any
backup verification tool can verify that you won't get an error when
trying to use the backup. That would require that the
validation tool try to do everything that PostgreSQL will try to do
when the backup is used, including running recovery and updating the
data files. Anything less than that creates a real possibility that
the backup will verify good but fail when used. This tool has a much
narrower purpose, which is to try to verify that we (still) have the
files the server sent as part of the backup and that, to the best of
our ability to detect such things, they have not been modified. As you
know, or should know, the WAL files are not sent as part of the
backup, and so are not verified. Other things that would also be
useful to check are also not verified. It would be fantastic to have
more verification tools in the future, but it is difficult to see why
anyone would bother trying if an attempt to get the first one
committed gets blocked because it does not yet do everything. Very few
patches try to do everything, and those that do usually get blocked
because, by trying to do too much, they get some of it badly wrong.
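To make the narrower purpose described above concrete, here is a minimal sketch of that kind of check: confirm that every file listed in a manifest is still present and that its checksum still matches. This is not the code from the patch. The manifest layout (a plain dict mapping relative path to a SHA-256 hex digest, or None when no checksum was recorded) and the function names are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(backup_dir, manifest):
    """Check a backup directory against a hypothetical manifest.

    manifest is assumed to be {relative_path: sha256_hex or None}.
    A None checksum means only the file's presence is checked.
    Returns a list of problem descriptions; empty means nothing was detected.
    """
    root = Path(backup_dir)
    problems = []
    for rel_path, expected in manifest.items():
        path = root / rel_path
        if not path.is_file():
            problems.append(f"missing: {rel_path}")
        elif expected is not None and file_sha256(path) != expected:
            problems.append(f"checksum mismatch: {rel_path}")
    return problems
```

Note that a clean result here says nothing about files that are not listed in the manifest, which is exactly the limitation being discussed for WAL.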

I agree with Stephen that this should be done, but I agree with you that it can wait for a future commit. However, I do think:

1) It should be called out rather plainly in the documentation.
2) If there are files in pg_wal then pg_validatebackup should inform the user that those files have not been validated.
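As a rough illustration of point 2, the notice could be as simple as the sketch below. The function names and the message wording are hypothetical and are not taken from the patch; the only thing assumed about the backup is that WAL, if present, sits in a pg_wal directory under the backup root.

```python
from pathlib import Path


def unvalidated_wal_files(backup_dir):
    """List the names of files found in pg_wal under the backup directory."""
    wal_dir = Path(backup_dir) / "pg_wal"
    if not wal_dir.is_dir():
        return []
    return sorted(p.name for p in wal_dir.iterdir() if p.is_file())


def wal_notice(backup_dir):
    """Return a notice if pg_wal contains files that were not checked, else None."""
    names = unvalidated_wal_files(backup_dir)
    if not names:
        return None
    return f"notice: {len(names)} file(s) in pg_wal were not validated"
```

A caller would print the returned string after the manifest check finishes, so that a passing result is not read as covering the WAL.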

I know you and Stephen have agreed on a number of doc changes. Would it be possible to get a new patch with those included? I finally have time to do a review of this tomorrow. I saw some mistakes in the docs in the current patch, but I know those patches are not current.

Regards,
--
-David
da...@pgmasters.net

