Re: Online checksums patch - once again

Heikki Linnakangas Tue, 09 Feb 2021 00:55:08 -0800

(I may have said this before, but) My overall high-level impression ofthis patch is that it's really cmmplex for a feature that you use maybeonce in the lifetime of a cluster. I'm happy to review but I'm notplanning to commit this myself. I don't object if some other committerpicks this up (Magnus?).


Now to the latest patch version:


On 03/02/2021 18:15, Daniel Gustafsson wrote:

The previous v35 had a tiny conflict in pg_class.h, the attached v36 (which is
a squash of the 2 commits in v35) fixes that.  No other changes are introduced
in this version.

        /*
         * Check to see if my copy of RedoRecPtr is out of date. If so, may have
         * to go back and have the caller recompute everything. This can only
         * happen just after a checkpoint, so it's better to be slow in this 
case
         * and fast otherwise.
         *
         * Also check to see if fullPageWrites or forcePageWrites was just 
turned
         * on, or if we are in the process of enabling checksums in the cluster;
         * if we weren't already doing full-page writes then go back and 
recompute.
         *
         * If we aren't doing full-page writes then RedoRecPtr doesn't actually
         * affect the contents of the XLOG record, so we'll update our local 
copy
         * but not force a recomputation.  (If doPageWrites was just turned off,
         * we could recompute the record without full pages, but we choose not 
to
         * bother.)
         */
        if (RedoRecPtr != Insert->RedoRecPtr)
        {
                Assert(RedoRecPtr < Insert->RedoRecPtr);
                RedoRecPtr = Insert->RedoRecPtr;
        }
        doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites || 
DataChecksumsOnInProgress());

Why does this use DataChecksumsOnInProgress() instead ofDataChecksumsNeedWrite()? If checksums are enabled, you always needfull-page writes, don't you? If not, then why is it needed in theinprogress-on state?

We also set doPageWrites in InitXLOGAccess(). That should match thecondition above (although it doesn't matter for correctness).

/*
 * DataChecksumsNeedVerify
 *              Returns whether data checksums must be verified or not
 *
 * Data checksums are only verified if they are fully enabled in the cluster.
 * During the "inprogress-on" and "inprogress-off" states they are only
 * updated, not verified (see datachecksumsworker.c for a longer discussion).
 *
 * This function is intended for callsites which have read data and are about
 * to perform checksum validation based on the result of this. To avoid the
 * the risk of the checksum state changing between reading and performing the
 * validation (or not), interrupts must be held off. This implies that calling
 * this function must be performed as close to the validation call as possible
 * to keep the critical section short. This is in order to protect against
 * time of check/time of use situations around data checksum validation.
 */
bool
DataChecksumsNeedVerify(void)
{
        Assert(InterruptHoldoffCount > 0);
        return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
}


What exactly is the intended call pattern here? Something like this?

smgrread() a data page
HOLD_INTERRUPTS();
if (DataChecksumsNeedVerify())
{
  if (pg_checksum_page((char *) page, blkno) != expected)
    elog(ERROR, "bad checksum");
}
RESUME_INTERRUPTS();

That seems to be what the code currently does. What good does holdinginterrupts do here? If checksums were not fully enabled at thesmgrread() call, the page might have incorrect checksums, and if thestate transitions from inprogress-on to on between the smggread() calland the DataChecksumsNeedVerify() call, you'll get an error. I think youneed to hold the interrupts *before* the smgrread() call.

/*
 * Set checksum for a page in private memory.
 *
 * This must only be used when we know that no other process can be modifying
 * the page buffer.
 */
void
PageSetChecksumInplace(Page page, BlockNumber blkno)
{
        HOLD_INTERRUPTS();
        /* If we don't need a checksum, just return */
        if (PageIsNew(page) || !DataChecksumsNeedWrite())
        {
                RESUME_INTERRUPTS();
                return;
        }

        ((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, 
blkno);
        RESUME_INTERRUPTS();
}

The checksums state might change just after this call, before the callerhas actually performed the smgrwrite() or smgrextend() call. The callerneeds to hold interrupts across this function and thesmgrwrite/smgrextend() call. It is a bad idea to HOLD_INTERRUPTS() here,because that just masks bugs where the caller isn't holding theinterrupts. Same in PageSetChecksumCopy().


- Heikki

Re: Online checksums patch - once again

Reply via email to