On Thu, Jan 26, 2017 at 9:32 AM, Stephen Frost <sfr...@snowman.net> wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Wed, Jan 25, 2017 at 7:19 PM, Michael Paquier
>> <michael.paqu...@gmail.com> wrote:
>> > On Thu, Jan 26, 2017 at 9:14 AM, Peter Geoghegan <p...@heroku.com> wrote:
>> >> On Wed, Jan 25, 2017 at 3:30 PM, Stephen Frost <sfr...@snowman.net> wrote:
>> >>> As it is, there are backup solutions which *do* check the checksum when
>> >>> backing up PG.  This is no longer, thankfully, some hypothetical thing,
>> >>> but something which really exists and will hopefully keep users from
>> >>> losing data.
>> >>
>> >> Wouldn't that have issues with torn pages?
>> >
>> > Why? What do you foresee here? I would think such backup solutions are
>> > careful enough about page durability to ensure that pages are not read
>> > while partially written.
>>
>> Well, you'd have to keep a read(fd, buf, 8192) performed by the backup
>> tool from overlapping with a write(fd, buf, 8192) performed by the
>> backend.
> As Michael mentioned, that'd depend on if things are atomic from a
> user's perspective at certain sizes (perhaps 4k, which wouldn't be too
> surprising, but may also be system-dependent), in which case verifying
> that the page is in the WAL would be sufficient.

That would be enough. It should also be rare enough that there would
not be many pages to track when scanning records from the backup start
position to the minimum recovery point. It could also be simpler,
though more time-consuming, to just let a backup recover up to the
minimum recovery point (recovery_target = 'immediate') and then run
the checksum sanity checks. There are other checks usually needed on a
backup anyway, like making sure that index pages are in good shape
even when their checksums are correct, etc.
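As a concrete illustration of that recovery-then-verify approach, a
minimal recovery.conf sketch could look like this (assuming a
9.x-style setup; the archive path here is hypothetical):

```
# Restore archived WAL from a hypothetical archive location
restore_command = 'cp /path/to/wal_archive/%f %p'
# Stop as soon as a consistent state (minimum recovery point) is reached
recovery_target = 'immediate'
```

Once the server reports consistency, the checksum checks can be run
against the recovered cluster rather than the raw backup files.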

But here I am really hijacking the thread, so I'll stop...
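For what it's worth, the torn-page handling discussed above (re-read
on checksum failure, then fall back on comparing the page LSN against
the backup start position) could be sketched roughly like this. All
helper names are hypothetical and passed in as parameters; this is
not the code of any existing backup tool:

```python
def page_is_ok(read_page, verify_checksum, page_lsn,
               backup_start_lsn, retries=2):
    """Decide whether a page read during a base backup is acceptable.

    read_page:       callable returning the raw page bytes (hypothetical)
    verify_checksum: callable checking the page checksum (hypothetical)
    page_lsn:        callable extracting the page's LSN (hypothetical)
    backup_start_lsn: LSN recorded at backup start
    """
    for _ in range(retries):
        page = read_page()
        if verify_checksum(page):
            return True
        # A concurrent backend write may have torn the page. If its LSN
        # is at or past the backup start, WAL replay will rewrite the
        # page during recovery, so the torn copy is harmless.
        if page_lsn(page) >= backup_start_lsn:
            return True
    # Checksum failure on an old page: genuine corruption candidate.
    return False
```

The key point is that a checksum failure alone is not proof of
corruption while the cluster is live; only a failure on a page that
WAL replay will not overwrite is worth reporting.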

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)