On Fri, Mar 09, 2018 at 01:42:04PM +0530, Dilip Kumar wrote:
> While setting full_page_writes with SIGHUP I hit an assert in the checkpointer
> process. That is because inside a CRITICAL section we are calling
> RecoveryInProgress, which in turn allocates memory.  So I have moved the
> RecoveryInProgress call out of the CRITICAL section and the problem got
> solved.

Indeed, I can see how this is possible.
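
To make the failure mode concrete, here is a toy model of it.  This is
not PostgreSQL code: every toy_* name and every variable below is made
up for illustration, and the allocator counts violations instead of
asserting so that the sketch can run to completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; none of this is PostgreSQL code. */
static int   crit_section_count = 0;      /* plays the role of CritSectionCount */
static bool  shared_in_recovery = true;   /* shared "still in recovery" flag */
static bool  local_in_recovery = true;    /* per-process cached copy */
static void *insert_workspace = NULL;     /* WAL-insert working area */
static int   allocs_in_crit_section = 0;  /* allocations made while count > 0 */

/* Allocator that records, rather than asserts on, critical-section use. */
void *toy_alloc(size_t size)
{
    if (crit_section_count > 0)
        allocs_in_crit_section++;
    return malloc(size);
}

/* Lazily set up the WAL-insert working area (allocates on first use). */
void toy_init_xlog_insert(void)
{
    if (insert_workspace == NULL)
        insert_workspace = toy_alloc(64);
}

/* The first call that notices the end of recovery triggers the lazy setup. */
bool toy_recovery_in_progress(void)
{
    if (!local_in_recovery)
        return false;
    local_in_recovery = shared_in_recovery;
    if (!local_in_recovery)
        toy_init_xlog_insert();
    return local_in_recovery;
}

/* Recovery ends, then the first check happens inside a critical section. */
int toy_scenario(void)
{
    shared_in_recovery = false;
    crit_section_count++;               /* think START_CRIT_SECTION() */
    (void) toy_recovery_in_progress();
    crit_section_count--;               /* think END_CRIT_SECTION() */
    assert(crit_section_count == 0);
    return allocs_in_crit_section;
}
```

In this model toy_scenario() reports one allocation made inside the
critical section, which is the situation the real assertion complains
about.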

If you apply the following you can also have way more fun, but that's
overdoing it:
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7918,6 +7918,8 @@ CheckRecoveryConsistency(void)
bool
RecoveryInProgress(void)
{
+   Assert(CritSectionCount == 0);

Anyway, it seems to me that you are not taking care of all possible race
conditions here.  RecoveryInProgress() could as well be called in
XLogFlush(), and that's a code path taken during redo.

Instead of doing what you are suggesting, why not move
InitXLogInsert() out of InitXLOGAccess() and change InitPostgres() so
that the allocations for WAL inserts are done unconditionally?  This has
the cost of also making this allocation for backends which are
started during recovery, but we are talking about allocating a couple
of bytes in exchange for completely addressing all race conditions in
this area.  InitXLogInsert() does not depend on any post-recovery data
like ThisTimeLineID, so a split is possible.
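
A rough sketch of the shape I have in mind, again as a toy model rather
than a patch: the toy_* functions and the variables are hypothetical
stand-ins, with toy_init_backend() playing the role of InitPostgres().
The working area is allocated once at backend start, so the later
recovery check has nothing left to allocate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; none of this is PostgreSQL code. */
static int   crit_section_count = 0;      /* plays the role of CritSectionCount */
static bool  shared_in_recovery = true;   /* shared "still in recovery" flag */
static bool  local_in_recovery = true;    /* per-process cached copy */
static void *insert_workspace = NULL;     /* WAL-insert working area */
static int   allocs_in_crit_section = 0;  /* allocations made while count > 0 */

/* Allocator that records any use inside a critical section. */
void *toy_alloc(size_t size)
{
    if (crit_section_count > 0)
        allocs_in_crit_section++;
    return malloc(size);
}

/* Set up the WAL-insert working area; safe to call more than once. */
void toy_init_xlog_insert(void)
{
    if (insert_workspace == NULL)
        insert_workspace = toy_alloc(64);
}

/* Backend start: allocate unconditionally, even during recovery. */
void toy_init_backend(void)
{
    toy_init_xlog_insert();
}

/* The recovery check only refreshes the cached flag; it never allocates. */
bool toy_recovery_in_progress(void)
{
    if (!local_in_recovery)
        return false;
    local_in_recovery = shared_in_recovery;
    return local_in_recovery;
}

/* Backend starts in recovery, recovery ends, check runs in a critical section. */
int toy_scenario(void)
{
    toy_init_backend();
    shared_in_recovery = false;
    crit_section_count++;               /* think START_CRIT_SECTION() */
    (void) toy_recovery_in_progress();
    crit_section_count--;               /* think END_CRIT_SECTION() */
    assert(insert_workspace != NULL);
    return allocs_in_crit_section;
}
```

Here toy_scenario() reports zero allocations inside the critical
section, whichever caller happens to be first to notice that recovery
has ended.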

Thoughts?
--
Michael

