Hi

> This doesn't test the consequences of the restart being skipped, nor
> does it review on a code level the correctness.

I check more than one thing during review; I look at the code too. The bgworker in checksumhelper.c is registered with:

> bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;

It then processes the whole cluster (even if checksumhelper ran before but exited before it completed). Or does BgWorkerStart_RecoveryFinished not guarantee that the worker starts only after recovery has finished?

Before starting any real work (and after recovery ends), checksumhelper checks the current cluster status again:
> + * If a standby was restarted when in pending state, a background worker
> + * was registered to start. If it's later promoted after the master has
> + * completed enabling checksums, we need to terminate immediately and not
> + * do anything. If the cluster is still in pending state when promoted,
> + * the background worker should start to complete the job.

> What if your replicas are delayed (e.g. recovery_min_apply_delay)?
> What if you later need to do PITR?

If we start after replaying pg_enable_data_checksums but before it has completed, we plan to start the bgworker when recovery finishes.
If we have replayed the checksumhelper finish, we _can_ start checksumhelper again, and this case is handled during checksumhelper start.

The behavior seems correct to me. Am I missing something badly wrong?

regards, Sergei