On Sat, Mar 3, 2018 at 5:06 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> On 03/03/2018 01:38 PM, Robert Haas wrote: > > On Sat, Mar 3, 2018 at 7:32 AM, Robert Haas <robertmh...@gmail.com> > wrote: > >> On Fri, Mar 2, 2018 at 6:26 PM, Tomas Vondra > >> <tomas.von...@2ndquadrant.com> wrote: > >>> Hmmm, OK. So we need to have a valid checksum on a page, disable > >>> checksums, set some hint bits on the page (which won't be > >>> WAL-logged), enable checksums again and still get a valid > >>> checksum even with the new hint bits? That's possible, albeit > >>> unlikely. > >> > >> No, the problem is if - as is much more likely - the checksum is > >> not still valid. > > > > Hmm, on second thought ... maybe I didn't think this through > > carefully enough. If the checksum matches on the master by chance, > > and the page is the same on the standby, then we're fine, right? It's > > a weird accident, but nothing is actually broken. The failure > > scenario is where the standby has a version of the page with a bad > > checksum, but the master has a good checksum. So for example: > > checksums disabled, master modifies the page (which is replicated), > > master sets some hint bits (coincidentally making the checksum > > match), now we try to turn checksums on and don't re-replicate the > > page because the checksum already looks correct. > > > > Yeah. Doesn't that pretty much mean we can't skip any pages that have > correct checksum, because we can't rely on standby having the same page > data? That is, this block in ProcessSingleRelationFork: > > /* > * If checksum was not set or was invalid, mark the buffer as dirty > * and force a full page write. If the checksum was already valid, we > * can leave it since we know that any other process writing the > * buffer will update the checksum. > */ > if (checksum != pagehdr->pd_checksum) > { > START_CRIT_SECTION(); > MarkBufferDirty(buf); > log_newpage_buffer(buf, false); > END_CRIT_SECTION(); > } > > That would mean this optimization - only doing the write when the > checksum does not match - is broken. > Yes. I think that was the conclusion of this, as posted in https://www.postgresql.org/message-id/CABUevExDZu__5KweT8fr3Ox45YcuvTDEEu%3DaDpGBT8Sk0RQE_g%40mail.gmail.com :) If that's the case, it probably makes restarts/resume more expensive, > because this optimization was why after restart the already processed > data was only read (and the checksums verified) but not written. > > Yes, it definitely does. It's not a dealbreaker, but it's certainly a bit painful not to be able to resume as cheap. -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>