Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2021-11-11 09:38:07 -0300, Alvaro Herrera wrote: > > I just noticed that this commit (dcfff74fb16) didn't revert the change of > > lock > > level in ReplicationSlotRelease(). Was that intentional? > > Hmm, no, that seems to have been a mistake. I'll restore it. Thanks
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2021-Nov-10, Andres Freund wrote: > > Reverts 27838981be9d (some comments are kept). Per discussion, it does > > not seem safe to relax the lock level used for this; in order for it to > > be safe, there would have to be memory barriers between the point we set > > the flag and the point we set the trasaction Xid, which perhaps would > > not be so bad; but there would also have to be barriers at the readers' > > side, which from a performance perspective might be bad. > > > > Now maybe this analysis is wrong and it *is* safe for some reason, but > > proof of that is not trivial. > > I just noticed that this commit (dcfff74fb16) didn't revert the change of lock > level in ReplicationSlotRelease(). Was that intentional? Hmm, no, that seems to have been a mistake. I'll restore it. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Hi, On 2020-11-25 17:03:58 -0300, Alvaro Herrera wrote: > On 2020-Nov-23, Andres Freund wrote: > > > On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote: > > > > In other words, my conclusion is that there definitely is a bug here and > > > I am going to restore the use of exclusive lock for setting the > > > statusFlags. > > > > Cool. > > Here's a patch. > > Note it also moves the computation of vacuum's Xmin (per > GetTransactionSnapshot) to *after* the bit has been set in statusFlags. > From b813c67a4abe2127b8bd13db7e920f958db15d59 Mon Sep 17 00:00:00 2001 > From: Alvaro Herrera > Date: Tue, 24 Nov 2020 18:10:42 -0300 > Subject: [PATCH] Restore lock level to update statusFlags > > Reverts 27838981be9d (some comments are kept). Per discussion, it does > not seem safe to relax the lock level used for this; in order for it to > be safe, there would have to be memory barriers between the point we set > the flag and the point we set the trasaction Xid, which perhaps would > not be so bad; but there would also have to be barriers at the readers' > side, which from a performance perspective might be bad. > > Now maybe this analysis is wrong and it *is* safe for some reason, but > proof of that is not trivial. I just noticed that this commit (dcfff74fb16) didn't revert the change of lock level in ReplicationSlotRelease(). Was that intentional? Greetings, Andres Freund
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Nov 30, 2020 at 04:15:27PM -0300, Alvaro Herrera wrote: > In the interest of showing progress, I'm going to mark this CF item as > committed, and I'll submit the remaining pieces in a new thread. Thanks for splitting! -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
In the interest of showing progress, I'm going to mark this CF item as committed, and I'll submit the remaining pieces in a new thread. Thanks!
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Actually, I noticed two things. The first of them, addressed in this new version of the patch, is that REINDEX CONCURRENTLY is doing a lot of repetitive work by reopening each index and table in the build/validate loops, so that they can report progress. This is easy to remedy by adding a couple more members to the new struct (which I also renamed to ReindexIndexInfo), for tableId and amId. The code seems a bit simpler this way. The other thing is that ReindexRelationConcurrenty seems to always be called with the relations already locked by RangeVarGetRelidExtended. So claiming to acquire locks on the relations over and over is pointless. (I only noticed this because there was an obvious deadlock hazard in one of the loops, that locked index before table.) I think we should reduce all those to NoLock. My patch does not do that. diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index ca24620fd0..a52cb16b5e 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -114,6 +114,17 @@ typedef struct ReindexErrorInfo char relkind; } ReindexErrorInfo; +/* + * Index to process in ReindexRelationConcurrently + */ +typedef struct ReindexIndexInfo +{ + Oid indexId; + Oid tableId; + Oid amId; + bool safe; /* for set_indexsafe_procflags */ +} ReindexIndexInfo; + /* * CheckIndexCompatible * Determine whether an existing index definition is compatible with a @@ -385,7 +396,7 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) Processes running CREATE INDEX CONCURRENTLY + * later.) Processes running CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY * on indexes that are neither expressional nor partial are also safe to * ignore, since we know that those processes won't examine any data * outside the table they're indexing. @@ -1564,9 +1575,11 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); - /* Tell concurrent index builds to ignore us, if index qualifies */ - if (safe_index) - set_indexsafe_procflags(); + /* + * This transaction doesn't need to set the PROC_IN_SAFE_IC flag, because + * it only takes a snapshot to do some catalog manipulations, after the + * wait is over. + */ /* We should now definitely not be advertising any xmin. */ Assert(MyProc->xmin == InvalidTransactionId); @@ -3132,10 +3145,16 @@ ReindexRelationConcurrently(Oid relationOid, int options) get_rel_name(cellOid; else { + ReindexIndexInfo *idx; + /* Save the list of relation OIDs in private context */ oldcontext = MemoryContextSwitchTo(private_context); - indexIds = lappend_oid(indexIds, cellOid); + idx = palloc(sizeof(ReindexIndexInfo)); + idx->indexId = cellOid; + /* other fields set later */ + + indexIds = lappend(indexIds, idx); MemoryContextSwitchTo(oldcontext); } @@ -3172,13 +3191,18 @@ ReindexRelationConcurrently(Oid relationOid, int options) get_rel_name(cellOid; else { + ReindexIndexInfo *idx; + /* * Save the list of relation OIDs in private * context */ oldcontext = MemoryContextSwitchTo(private_context); - indexIds = lappend_oid(indexIds, cellOid); + idx = palloc(sizeof(ReindexIndexInfo)); + idx->indexId = cellOid; + indexIds = lappend(indexIds, idx); + /* other fields set later */ MemoryContextSwitchTo(oldcontext); } @@ -3197,6 +3221,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) Oid heapId = IndexGetRelation(relationOid, (options & REINDEXOPT_MISSING_OK) != 0); Relation heapRelation; +ReindexIndexInfo *idx; /* if relation is missing, leave */ if (!OidIsValid(heapId)) @@ -3247,7 +3272,10 @@ ReindexRelationConcurrently(Oid relationOid, int options) * Save the list of relation OIDs in private context. Note * that invalid indexes are allowed here. */ -indexIds = lappend_oid(indexIds, relationOid); +idx = palloc(sizeof(ReindexIndexInfo)); +idx->indexId = relationOid; +indexIds = lappend(indexIds, idx); +/* other fields set later */ MemoryContextSwitchTo(oldcontext); break; @@ -3306,31 +3334,40 @@ ReindexRelationConcurrently(Oid relationOid, int options) foreach(lc, indexIds) { char *concurrentName; - Oid indexId = lfirst_oid(lc); + ReindexIndexInfo *idx = lfirst(lc); + ReindexIndexInfo *newidx; Oid newIndexId; Relation indexRel; Relation heapRel; Relation newIndexRel; LockRelId *lockrelid; - indexRel = index_open(indexId, ShareUpdateExclusiveLock); + /* XXX pointless lock acquisition */ + indexRel =
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-26, Alvaro Herrera wrote: > So let's discuss the next step in this series: what to do about REINDEX > CONCURRENTLY. > [...] ... as in the attached. diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index ca24620fd0..9c1c0aad3e 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -385,7 +385,7 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) Processes running CREATE INDEX CONCURRENTLY + * later.) Processes running CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY * on indexes that are neither expressional nor partial are also safe to * ignore, since we know that those processes won't examine any data * outside the table they're indexing. @@ -1564,9 +1564,11 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); - /* Tell concurrent index builds to ignore us, if index qualifies */ - if (safe_index) - set_indexsafe_procflags(); + /* + * This transaction doesn't need to set the PROC_IN_SAFE_IC flag, because + * it only takes a snapshot to do some catalog manipulations, after the + * wait is over. + */ /* We should now definitely not be advertising any xmin. */ Assert(MyProc->xmin == InvalidTransactionId); @@ -3020,6 +3022,13 @@ ReindexMultipleInternal(List *relids, int options) * concerns, so there's no need for the more complicated concurrent build, * anyway, and a non-concurrent reindex is more efficient. */ + +typedef struct reindex_idx +{ + Oid indexId; + bool safe; +} reindex_idx; + static bool ReindexRelationConcurrently(Oid relationOid, int options) { @@ -3132,10 +3141,15 @@ ReindexRelationConcurrently(Oid relationOid, int options) get_rel_name(cellOid; else { + reindex_idx *idx; + /* Save the list of relation OIDs in private context */ oldcontext = MemoryContextSwitchTo(private_context); - indexIds = lappend_oid(indexIds, cellOid); + idx = palloc(sizeof(reindex_idx)); + idx->indexId = cellOid; + + indexIds = lappend(indexIds, idx); MemoryContextSwitchTo(oldcontext); } @@ -3172,13 +3186,17 @@ ReindexRelationConcurrently(Oid relationOid, int options) get_rel_name(cellOid; else { + reindex_idx *idx; + /* * Save the list of relation OIDs in private * context */ oldcontext = MemoryContextSwitchTo(private_context); - indexIds = lappend_oid(indexIds, cellOid); + idx = palloc(sizeof(reindex_idx)); + idx->indexId = cellOid; + indexIds = lappend(indexIds, idx); MemoryContextSwitchTo(oldcontext); } @@ -3197,6 +3215,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) Oid heapId = IndexGetRelation(relationOid, (options & REINDEXOPT_MISSING_OK) != 0); Relation heapRelation; +reindex_idx *idx; /* if relation is missing, leave */ if (!OidIsValid(heapId)) @@ -3247,7 +3266,9 @@ ReindexRelationConcurrently(Oid relationOid, int options) * Save the list of relation OIDs in private context. Note * that invalid indexes are allowed here. */ -indexIds = lappend_oid(indexIds, relationOid); +idx = palloc(sizeof(reindex_idx)); +idx->indexId = relationOid; +indexIds = lappend(indexIds, idx); MemoryContextSwitchTo(oldcontext); break; @@ -3306,8 +3327,10 @@ ReindexRelationConcurrently(Oid relationOid, int options) foreach(lc, indexIds) { char *concurrentName; - Oid indexId = lfirst_oid(lc); + reindex_idx *index = lfirst(lc); + Oid indexId = index->indexId; Oid newIndexId; + reindex_idx *newidx; Relation indexRel; Relation heapRel; Relation newIndexRel; @@ -3347,12 +3370,20 @@ ReindexRelationConcurrently(Oid relationOid, int options) */ newIndexRel = index_open(newIndexId, ShareUpdateExclusiveLock); + /* determine safety of this index for set_indexsafe_procflags */ + index->safe = (newIndexRel->rd_indexprs == NIL && + newIndexRel->rd_indpred == NIL); + /* * Save the list of OIDs and locks in private context */ oldcontext = MemoryContextSwitchTo(private_context); - newIndexIds = lappend_oid(newIndexIds, newIndexId); + newidx = palloc(sizeof(reindex_idx)); + newidx->indexId = newIndexId; + newidx->safe = index->safe; + + newIndexIds = lappend(newIndexIds, newidx); /* * Save lockrelid to protect each relation from drop then close @@ -3416,6 +3447,11 @@ ReindexRelationConcurrently(Oid relationOid, int options) CommitTransactionCommand(); StartTransactionCommand(); + /* + * Because we don't take a snapshot in this transaction, there's no + * need to set the
Re: remove spurious CREATE INDEX CONCURRENTLY wait
So let's discuss the next step in this series: what to do about REINDEX CONCURRENTLY. I started with Dmitry's patch (an updated version of which I already posted in [1]). However, Dmitry missed (and I hadn't noticed) that some of the per-index loops are starting transactions also, and that we need to set the flag in those. And what's more, in a couple of the function-scope transactions we do set the flag pointlessly: the transactions there do not acquire a snapshot, so there's no reason to set the flag at all, because WaitForOlderSnapshots ignores sessions whose Xmin is 0. There are also transactions that wait first, without setting a snapshot, and then do some catalog manipulations. I think it's prett much useless to set the flag for those, because they're going to be very short anyway. (There's also one case of this in CREATE INDEX CONCURRENTLY.) But there's a more interesting point also. In Dmitry's patch, we determine safety for *all* indexes being processed as a set, and then apply the flag only if they're all deemed safe. But we can optimize this, and set the flag for each index' transaction individually, and only skip it for those specific indexes that are unsafe. So I propose to change the data structure used in ReindexRelationConcurrently from the current list of OIDs to a list of (oid,boolean) pairs, to be able to track setting the flag individually. There's one more useful observation: in the function-scope transactions (outside the per-index loops), we don't touch the contents of any indexes; we just wait or do some catalog manipulation. So we can set the flag *regardless of the safety of any indexes*. We only need to care about the safety of the indexes in the phases where we build the indexes and when we validate them. [1] https://postgr.es/m/20201118175804.GA23027@alvherre.pgsql
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-25, Alvaro Herrera wrote: > On 2020-Nov-23, Andres Freund wrote: > > > On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote: > > > > In other words, my conclusion is that there definitely is a bug here and > > > I am going to restore the use of exclusive lock for setting the > > > statusFlags. > > > > Cool. > > Here's a patch. Pushed, thanks.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-23, Andres Freund wrote: > On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote: > > In other words, my conclusion is that there definitely is a bug here and > > I am going to restore the use of exclusive lock for setting the > > statusFlags. > > Cool. Here's a patch. Note it also moves the computation of vacuum's Xmin (per GetTransactionSnapshot) to *after* the bit has been set in statusFlags. >From b813c67a4abe2127b8bd13db7e920f958db15d59 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 24 Nov 2020 18:10:42 -0300 Subject: [PATCH] Restore lock level to update statusFlags Reverts 27838981be9d (some comments are kept). Per discussion, it does not seem safe to relax the lock level used for this; in order for it to be safe, there would have to be memory barriers between the point we set the flag and the point we set the trasaction Xid, which perhaps would not be so bad; but there would also have to be barriers at the readers' side, which from a performance perspective might be bad. Now maybe this analysis is wrong and it *is* safe for some reason, but proof of that is not trivial. Discussion: https://postgr.es/m/20201118190928.vnztes7c2sldu...@alap3.anarazel.de --- src/backend/commands/vacuum.c | 20 +++- src/backend/replication/logical/logical.c | 2 +- src/backend/storage/ipc/procarray.c | 4 +--- src/include/storage/proc.h| 6 +++--- 4 files changed, 16 insertions(+), 16 deletions(-) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 395e75f768..f1112111de 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1712,13 +1712,6 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) /* Begin a transaction for vacuuming this relation */ StartTransactionCommand(); - /* - * Need to acquire a snapshot to prevent pg_subtrans from being truncated, - * cutoff xids in local memory wrapping around, and to have updated xmin - * horizons. - */ - PushActiveSnapshot(GetTransactionSnapshot()); - if (!(params->options & VACOPT_FULL)) { /* @@ -1739,9 +1732,11 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) * Note: these flags remain set until CommitTransaction or * AbortTransaction. We don't want to clear them until we reset * MyProc->xid/xmin, otherwise GetOldestNonRemovableTransactionId() - * might appear to go backwards, which is probably Not Good. + * might appear to go backwards, which is probably Not Good. (We also + * set PROC_IN_VACUUM *before* taking our own snapshot, so that our + * xmin doesn't become visible ahead of setting the flag.) */ - LWLockAcquire(ProcArrayLock, LW_SHARED); + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); MyProc->statusFlags |= PROC_IN_VACUUM; if (params->is_wraparound) MyProc->statusFlags |= PROC_VACUUM_FOR_WRAPAROUND; @@ -1749,6 +1744,13 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) LWLockRelease(ProcArrayLock); } + /* + * Need to acquire a snapshot to prevent pg_subtrans from being truncated, + * cutoff xids in local memory wrapping around, and to have updated xmin + * horizons. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Check for user-requested abort. Note we want this to be inside a * transaction, so xact.c doesn't issue useless WARNING. diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c index 4324e32656..f1f4df7d70 100644 --- a/src/backend/replication/logical/logical.c +++ b/src/backend/replication/logical/logical.c @@ -181,7 +181,7 @@ StartupDecodingContext(List *output_plugin_options, */ if (!IsTransactionOrTransactionBlock()) { - LWLockAcquire(ProcArrayLock, LW_SHARED); + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); MyProc->statusFlags |= PROC_IN_LOGICAL_DECODING; ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags; LWLockRelease(ProcArrayLock); diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 94edb24b22..c7848c0b69 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -662,10 +662,8 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid) /* avoid unnecessarily dirtying shared cachelines */ if (proc->statusFlags & PROC_VACUUM_STATE_MASK) { - /* Only safe to change my own flags with just share lock */ - Assert(proc == MyProc); Assert(!LWLockHeldByMe(ProcArrayLock)); - LWLockAcquire(ProcArrayLock, LW_SHARED); + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); Assert(proc->statusFlags == ProcGlobal->statusFlags[proc->pgxactoff]); proc->statusFlags &= ~PROC_VACUUM_STATE_MASK; ProcGlobal->statusFlags[proc->pgxactoff] = proc->statusFlags; diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h index 00bb244340..22046e4e36 100644 --- a/src/include/storage/proc.h +++ b/src/include/storage/proc.h @@
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-23, Tom Lane wrote: > Alvaro Herrera writes: > > GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData: > > > In these cases, what we need is that the code computes some xmin (or > > equivalent computation) based on a set of transactions that exclude > > those marked with the flags. The behavior we want is that if some > > transaction is marked as vacuum, we ignore the Xid/Xmin *if there is > > one*. In other words, if there's no Xid/Xmin, then the flag is not > > important. So if we can ensure that the flag is set first, and the > > Xid/xmin is installed later, that's sufficient correctness and we don't > > need to hold exclusive lock. But if we can't ensure that, then we must > > use exclusive lock, because otherwise we risk another process seeing our > > Xid first and not our flag, which would be bad. > > I don't buy this either. You get the same result if someone looks just > before you take the ProcArrayLock to set the flag. So if there's a > problem, it's inherent in the way that the flags are defined or used; > the strength of lock used in this stanza won't affect it. The problem is that the writes could be reordered in a way that makes the Xid appear set to an onlooker before PROC_IN_VACUUM appears set. Vacuum always sets the bit first, and *then* the xid. If the reader always reads it like that then it's not a problem. But in order to guarantee that, we would have to have a read barrier for each pass through the loop. With the LW_EXCLUSIVE lock, we block the readers so that the bit is known set by the time they examine it. As I understand, getting the lock is itself a barrier, so there's no danger that we'll set the bit and they won't see it. ... at least, that how I *imagine* the argument to be. In practice, vacuum_rel() calls GetSnapshotData() before installing the PROC_IN_VACUUM bit, and therefore there *is* a risk that reader 1 will get MyProc->xmin included in their snapshot (because bit not yet set), and reader 2 won't. If my understanding is correct, then we should move the PushActiveSnapshot(GetTransactionSnapshot()) call to after we have the PROC_IN_VACUUM bit set.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-23, Andres Freund wrote: > On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote: > > PROC_IN_LOGICAL_DECODING: > > Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in > > ReplicationSlotRelease might be the most problematic one of the lot. > > That's because a proc's xmin that had been ignored all along by > > ComputeXidHorizons, will now be included in the computation. Adding > > asserts that proc->xmin and proc->xid are InvalidXid by the time we > > reset the flag, I got hits in pg_basebackup, test_decoding and > > subscription tests. I think it's OK for ComputeXidHorizons (since it > > just means that a vacuum that reads a later will remove less rows.) But > > in GetSnapshotData it is just not correct to have the Xmin go backwards. > > I don't think there's a problem. PROC_IN_LOGICAL_DECODING can only be > set when outside a transaction block, i.e. walsender. In which case > there shouldn't be an xid/xmin, I think? Or did you gate your assert on > PROC_IN_LOGICAL_DECODING being set? Ah, you're right about this one -- I missed the significance of setting the flag only "when outside of a transaction block" at the time we call StartupDecodingContext.
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On Mon, Nov 23, 2020 at 06:13:17PM -0500, Tom Lane wrote: > Alvaro Herrera writes: >> Please feel free to go ahead, including the change to ProcSleep. > > Will do. Thank you both for 450c823 and 789b938. -- Michael signature.asc Description: PGP signature
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Tom Lane wrote: > I never cared that much for "is_log_level_output" either. Thinking > about renaming it to "should_output_to_log()", and then the new function > would be "should_output_to_client()". Ah, that sounds nicely symmetric and grammatical. Thanks!
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Hi, On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote: > ProcSleep: no bug here. > The flags are checked to see if we should kill() the process (used when > autovac blocks some other process). The lock here appears to be > inconsequential, since we release it before we do kill(); so strictly > speaking, there is still a race condition where the autovac process > could reset the flag after we read it and before we get a chance to > signal it. The lock level autovac uses to set the flag is not relevant > either. Yea. Even before the recent changes building the lock message under the lock didn't make sense. Now that the flags are mirrored in pgproc, we probably should just make this use READ_ONCE(autovac->statusFlags), without any other use of ProcArrayLock. As you say the race condition is between the flag test, the kill, and the signal being processed, which wasn't previously protected either. > PROC_IN_LOGICAL_DECODING: > Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in > ReplicationSlotRelease might be the most problematic one of the lot. > That's because a proc's xmin that had been ignored all along by > ComputeXidHorizons, will now be included in the computation. Adding > asserts that proc->xmin and proc->xid are InvalidXid by the time we > reset the flag, I got hits in pg_basebackup, test_decoding and > subscription tests. I think it's OK for ComputeXidHorizons (since it > just means that a vacuum that reads a later will remove less rows.) But > in GetSnapshotData it is just not correct to have the Xmin go backwards. I don't think there's a problem. PROC_IN_LOGICAL_DECODING can only be set when outside a transaction block, i.e. walsender. In which case there shouldn't be an xid/xmin, I think? Or did you gate your assert on PROC_IN_LOGICAL_DECODING being set? > GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData: > > In these cases, what we need is that the code computes some xmin (or > equivalent computation) based on a set of transactions that exclude > those marked with the flags. The behavior we want is that if some > transaction is marked as vacuum, we ignore the Xid/Xmin *if there is > one*. In other words, if there's no Xid/Xmin, then the flag is not > important. So if we can ensure that the flag is set first, and the > Xid/xmin is installed later, that's sufficient correctness and we don't > need to hold exclusive lock. That'd require at least memory barriers in GetSnapshotData()'s loop, which I'd really like to avoid. Otherwise the order in which memory gets written in one process doesn't guarantee the order of visibility in another process... > In other words, my conclusion is that there definitely is a bug here and > I am going to restore the use of exclusive lock for setting the > statusFlags. Cool. > GetSnapshotData has an additional difficulty -- we do the > UINT32_ACCESS_ONCE(ProcGlobal->xid[i]) read *before* testing the bit. > So it's not valid to set the bit without locking out GSD, regardless of > any barriers we had; if we want this to be safe, we'd have to change > this so that the flag is read first, and we read the xid only > afterwards, with a read barrier. Which we don't want, because that'd mean slowing down the *extremely* common case of the majority of backends neither having an xid assigned nor doing logical decoding / vacuum. We'd be reading two cachelines instead of one. Greetings, Andres Freund
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Alvaro Herrera writes: > On 2020-Nov-23, Tom Lane wrote: >> I'm not too fussed about whether we invent is_log_level_output_client, >> although that name doesn't seem well-chosen compared to >> is_log_level_output. > Just replacing "log" for "client" in that seemed strictly worse, and I > didn't (don't) have any other ideas. I never cared that much for "is_log_level_output" either. Thinking about renaming it to "should_output_to_log()", and then the new function would be "should_output_to_client()". >> Shall I press forward with this, or do you want to? > Please feel free to go ahead, including the change to ProcSleep. Will do. regards, tom lane
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Tom Lane wrote: > Ah, I see I didn't cover the case in ProcSleep that you were originally on > about ... I'd just looked for existing references to log_min_messages > and client_min_messages. Yeah, it seemed bad form to add that when you had just argued against it :-) > I think it's important to have the explicit check for elevel >= ERROR. > I'm not too fussed about whether we invent is_log_level_output_client, > although that name doesn't seem well-chosen compared to > is_log_level_output. Just replacing "log" for "client" in that seemed strictly worse, and I didn't (don't) have any other ideas. > Shall I press forward with this, or do you want to? Please feel free to go ahead, including the change to ProcSleep.
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Alvaro Herrera writes: > Your version has the advantage that errstart() doesn't get a new > function call. I'm +1 for going with that ... we could avoid the > duplicate code with some additional contortions but this changes so > rarely that it's probably not worth the trouble. I was considering adding that factorization, but marking the function inline to avoid adding overhead. Most of elog.c predates our use of inline, so it wasn't considered when this code was written. regards, tom lane
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Alvaro Herrera writes: > On 2020-Nov-23, Tom Lane wrote: >> Here's a draft patch. > Here's another of my own. Outside of elog.c it seems identical. Ah, I see I didn't cover the case in ProcSleep that you were originally on about ... I'd just looked for existing references to log_min_messages and client_min_messages. I think it's important to have the explicit check for elevel >= ERROR. I'm not too fussed about whether we invent is_log_level_output_client, although that name doesn't seem well-chosen compared to is_log_level_output. Shall I press forward with this, or do you want to? regards, tom lane
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Alvaro Herrera wrote: > On 2020-Nov-23, Tom Lane wrote: > > > Here's a draft patch. > > Here's another of my own. Outside of elog.c it seems identical. Your version has the advantage that errstart() doesn't get a new function call. I'm +1 for going with that ... we could avoid the duplicate code with some additional contortions but this changes so rarely that it's probably not worth the trouble.
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Tom Lane wrote: > Here's a draft patch. Here's another of my own. Outside of elog.c it seems identical. diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 03c553e7ea..a4ab9090f9 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -5344,7 +5344,7 @@ static void ShowTransactionState(const char *str) { /* skip work if message will definitely not be printed */ - if (log_min_messages <= DEBUG5 || client_min_messages <= DEBUG5) + if (message_level_is_interesting(DEBUG5)) ShowTransactionStateRec(str, CurrentTransactionState); } diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 7e915bcadf..32a3099c1f 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -105,7 +105,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno, * tracing of the cause (note the elog context mechanism will tell us * something about the XLOG record that generated the reference). */ - if (log_min_messages <= DEBUG1 || client_min_messages <= DEBUG1) + if (message_level_is_interesting(DEBUG1)) report_invalid_page(DEBUG1, node, forkno, blkno, present); if (invalid_page_tab == NULL) @@ -159,7 +159,7 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno) hentry->key.forkno == forkno && hentry->key.blkno >= minblkno) { - if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *path = relpathperm(hentry->key.node, forkno); @@ -192,7 +192,7 @@ forget_invalid_pages_db(Oid dbid) { if (hentry->key.node.dbNode == dbid) { - if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *path = relpathperm(hentry->key.node, hentry->key.forkno); diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index b0d037600e..0ad647ed9c 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -1146,15 +1146,8 @@ reportDependentObjects(const ObjectAddresses *targetObjects, * If no error is to be thrown, and the msglevel is too low to be shown to * either client or server log, there's no need to do any of the rest of * the work. - * - * Note: this code doesn't know all there is to be known about elog - * levels, but it works for NOTICE and DEBUG2, which are the only values - * msglevel can currently have. We also assume we are running in a normal - * operating environment. */ - if (behavior == DROP_CASCADE && - msglevel < client_min_messages && - (msglevel < log_min_messages || log_min_messages == LOG)) + if (behavior == DROP_CASCADE && !message_level_is_interesting(msglevel)) return; /* diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index babee386c4..87c3ea450e 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -1215,7 +1215,7 @@ ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime) walrcv->lastMsgReceiptTime = lastMsgReceiptTime; SpinLockRelease(>mutex); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *sendtime; char *receipttime; diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 5d1b1a16be..2eb19ad293 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1900,7 +1900,7 @@ ProcessStandbyReplyMessage(void) replyTime = pq_getmsgint64(_message); replyRequested = pq_getmsgbyte(_message); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *replyTimeStr; @@ -2082,7 +2082,7 @@ ProcessStandbyHSFeedbackMessage(void) feedbackCatalogXmin = pq_getmsgint(_message, 4); feedbackCatalogEpoch = pq_getmsgint(_message, 4); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *replyTimeStr; diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 40eac704dc..eb2309ed76 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -1341,22 +1341,25 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) StringInfoData logbuf; /* errdetail for server log */ /* report the case, if configured to do so */ -initStringInfo(); -initStringInfo(); -DescribeLockTag(, _copy); -appendStringInfo(, - _("Process %d waits for %s on %s."), - MyProcPid, - GetLockmodeName(lockmethod_copy, lockmode), - locktagbuf.data); +if (message_level_is_interesting(DEBUG1)) +{ + initStringInfo(); + initStringInfo(); + DescribeLockTag(, _copy); + appendStringInfo(, + _("Process %d waits for %s on %s."), + MyProcPid, +
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Here's a draft patch. regards, tom lane diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 03c553e7ea..9cd0b7c11b 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -5344,7 +5344,7 @@ static void ShowTransactionState(const char *str) { /* skip work if message will definitely not be printed */ - if (log_min_messages <= DEBUG5 || client_min_messages <= DEBUG5) + if (message_level_is_interesting(DEBUG5)) ShowTransactionStateRec(str, CurrentTransactionState); } @@ -5371,7 +5371,6 @@ ShowTransactionStateRec(const char *str, TransactionState s) if (s->parent) ShowTransactionStateRec(str, s->parent); - /* use ereport to suppress computation if msg will not be printed */ ereport(DEBUG5, (errmsg_internal("%s(%d) name: %s; blockState: %s; state: %s, xid/subid/cid: %u/%u/%u%s%s", str, s->nestingLevel, diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 7e915bcadf..32a3099c1f 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -105,7 +105,7 @@ log_invalid_page(RelFileNode node, ForkNumber forkno, BlockNumber blkno, * tracing of the cause (note the elog context mechanism will tell us * something about the XLOG record that generated the reference). */ - if (log_min_messages <= DEBUG1 || client_min_messages <= DEBUG1) + if (message_level_is_interesting(DEBUG1)) report_invalid_page(DEBUG1, node, forkno, blkno, present); if (invalid_page_tab == NULL) @@ -159,7 +159,7 @@ forget_invalid_pages(RelFileNode node, ForkNumber forkno, BlockNumber minblkno) hentry->key.forkno == forkno && hentry->key.blkno >= minblkno) { - if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *path = relpathperm(hentry->key.node, forkno); @@ -192,7 +192,7 @@ forget_invalid_pages_db(Oid dbid) { if (hentry->key.node.dbNode == dbid) { - if (log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *path = relpathperm(hentry->key.node, hentry->key.forkno); diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index b0d037600e..245c2f4fc8 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -1146,15 +1146,9 @@ reportDependentObjects(const ObjectAddresses *targetObjects, * If no error is to be thrown, and the msglevel is too low to be shown to * either client or server log, there's no need to do any of the rest of * the work. - * - * Note: this code doesn't know all there is to be known about elog - * levels, but it works for NOTICE and DEBUG2, which are the only values - * msglevel can currently have. We also assume we are running in a normal - * operating environment. */ if (behavior == DROP_CASCADE && - msglevel < client_min_messages && - (msglevel < log_min_messages || log_min_messages == LOG)) + !message_level_is_interesting(msglevel)) return; /* diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index babee386c4..87c3ea450e 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -1215,7 +1215,7 @@ ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime) walrcv->lastMsgReceiptTime = lastMsgReceiptTime; SpinLockRelease(>mutex); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *sendtime; char *receipttime; diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 5d1b1a16be..2eb19ad293 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1900,7 +1900,7 @@ ProcessStandbyReplyMessage(void) replyTime = pq_getmsgint64(_message); replyRequested = pq_getmsgbyte(_message); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *replyTimeStr; @@ -2082,7 +2082,7 @@ ProcessStandbyHSFeedbackMessage(void) feedbackCatalogXmin = pq_getmsgint(_message, 4); feedbackCatalogEpoch = pq_getmsgint(_message, 4); - if (log_min_messages <= DEBUG2) + if (message_level_is_interesting(DEBUG2)) { char *replyTimeStr; diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 1ba47c194b..04c2245338 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -600,6 +600,42 @@ errfinish(const char *filename, int lineno, const char *funcname) CHECK_FOR_INTERRUPTS(); } +/* + * message_level_is_interesting --- would ereport/elog do anything? + * + * Returns true if ereport/elog with this elevel will not be a no-op. + * This is useful to short-circuit any expensive preparatory work that + * might be needed for a logging message. There is no point in + *
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Tom Lane wrote: > Alvaro Herrera writes: > > Well, we already do this in a number of places. But I can get behind > > this: > > >> Maybe it'd be a good idea to have elog.c expose a new function > >> along the lines of "bool message_level_is_interesting(int elevel)" > >> to support this and similar future optimizations in a less fragile way. > > I'll see about a patch for that. Looking at that now ...
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Alvaro Herrera writes: > Well, we already do this in a number of places. But I can get behind > this: >> Maybe it'd be a good idea to have elog.c expose a new function >> along the lines of "bool message_level_is_interesting(int elevel)" >> to support this and similar future optimizations in a less fragile way. I'll see about a patch for that. regards, tom lane
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-23, Tom Lane wrote: > Alvaro Herrera writes: > > On 2020-Nov-19, Michael Paquier wrote: > >> By the way, it strikes me that you could just do nothing as long as > >> (log_min_messages > DEBUG1), so you could encapsulate most of the > >> logic that plays with the lock tag using that. > > > Good idea, done. > > I'm less sure that that's a good idea. It embeds knowledge here that > should not exist outside elog.c; moreover, I'm not entirely sure that > it's even correct, given the nonlinear ranking of log_min_messages. Well, we already do this in a number of places. But I can get behind this: > Maybe it'd be a good idea to have elog.c expose a new function > along the lines of "bool message_level_is_interesting(int elevel)" > to support this and similar future optimizations in a less fragile way.
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Alvaro Herrera writes: > On 2020-Nov-19, Michael Paquier wrote: >> By the way, it strikes me that you could just do nothing as long as >> (log_min_messages > DEBUG1), so you could encapsulate most of the >> logic that plays with the lock tag using that. > Good idea, done. I'm less sure that that's a good idea. It embeds knowledge here that should not exist outside elog.c; moreover, I'm not entirely sure that it's even correct, given the nonlinear ranking of log_min_messages. Maybe it'd be a good idea to have elog.c expose a new function along the lines of "bool message_level_is_interesting(int elevel)" to support this and similar future optimizations in a less fragile way. regards, tom lane
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On 2020-Nov-19, Michael Paquier wrote: > On Thu, Nov 19, 2020 at 12:13:44PM +0900, Michael Paquier wrote: > > That still looks useful for debugging, so DEBUG1 sounds fine to me. > > By the way, it strikes me that you could just do nothing as long as > (log_min_messages > DEBUG1), so you could encapsulate most of the > logic that plays with the lock tag using that. Good idea, done. I also noticed that if we're going to accept a race (which BTW already exists) we may as well simplify the code about it. I think the attached is the final form of this. >From cab95d8813cd79adb7b2a1b5501714c2dd87bec2 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 17 Nov 2020 21:09:22 -0300 Subject: [PATCH] Don't hold ProcArrayLock longer than needed in rare cases While cancelling an autovacuum worker, we hold ProcArrayLock while formatting a debugging log string. We can make this shorter by saving the data we need to produce the message and doing the formatting outside the locked region. This isn't terribly critical, as it only occurs pretty rarely: when a backend runs deadlock detection and it happens to be blocked by a autovacuum running autovacuum. Still, there's no need to cause a hiccup in ProcArrayLock processing, which can be very high-traffic in some cases. While at it, rework code so that we only print the string when it is really going to be used, as suggested by Michael Paquier. Discussion: https://postgr.es/m/20201118214127.GA3179@alvherre.pgsql Reviewed-by: Michael Paquier --- src/backend/storage/lmgr/proc.c | 63 - 1 file changed, 38 insertions(+), 25 deletions(-) diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index d1738c65f5..dbfdd7b3c7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -1311,40 +1311,58 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) { PGPROC *autovac = GetBlockingAutoVacuumPgproc(); uint8 statusFlags; + uint8 lockmethod_copy; + LOCKTAG locktag_copy; - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + /* + * Grab info we need, then release lock immediately. Note this + * coding means that there is a tiny chance that the process + * terminates its current transaction and starts a different one + * before we have a change to send the signal; the worst possible + * consequence is that a for-wraparound vacuum is cancelled. But + * that could happen in any case unless we were to do kill() with + * the lock held, which is much more undesirable. + */ + LWLockAcquire(ProcArrayLock, LW_SHARED); + statusFlags = ProcGlobal->statusFlags[autovac->pgxactoff]; + lockmethod_copy = lock->tag.locktag_lockmethodid; + locktag_copy = lock->tag; + LWLockRelease(ProcArrayLock); /* * Only do it if the worker is not working to protect against Xid * wraparound. */ - statusFlags = ProcGlobal->statusFlags[autovac->pgxactoff]; if ((statusFlags & PROC_IS_AUTOVACUUM) && !(statusFlags & PROC_VACUUM_FOR_WRAPAROUND)) { int pid = autovac->pid; -StringInfoData locktagbuf; -StringInfoData logbuf; /* errdetail for server log */ -initStringInfo(); -initStringInfo(); -DescribeLockTag(, >tag); -appendStringInfo(, - _("Process %d waits for %s on %s."), - MyProcPid, - GetLockmodeName(lock->tag.locktag_lockmethodid, - lockmode), - locktagbuf.data); +/* report the case, if configured to do so */ +if (log_min_messages <= DEBUG1 || client_min_messages <= DEBUG1) +{ + StringInfoData locktagbuf; + StringInfoData logbuf; /* errdetail for server log */ -/* release lock as quickly as possible */ -LWLockRelease(ProcArrayLock); + initStringInfo(); + initStringInfo(); + DescribeLockTag(, _copy); + appendStringInfo(, + _("Process %d waits for %s on %s."), + MyProcPid, + GetLockmodeName(lockmethod_copy, lockmode), + locktagbuf.data); + + ereport(DEBUG1, + (errmsg("sending cancel to blocking autovacuum PID %d", + pid), + errdetail_log("%s", logbuf.data))); + + pfree(logbuf.data); + pfree(locktagbuf.data); +} /* send the autovacuum worker Back to Old Kent Road */ -ereport(DEBUG1, - (errmsg("sending cancel to blocking autovacuum PID %d", -pid), - errdetail_log("%s", logbuf.data))); - if (kill(pid, SIGINT) < 0) { /* @@ -1362,12 +1380,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) (errmsg("could not send signal to process %d: %m", pid))); } - -pfree(logbuf.data); -pfree(locktagbuf.data); } - else -LWLockRelease(ProcArrayLock); /* prevent signal from being sent again more than once */ allow_autovacuum_cancel = false; -- 2.20.1
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Alvaro Herrera writes: > PROC_IN_LOGICAL_DECODING: > Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in > ReplicationSlotRelease might be the most problematic one of the lot. > That's because a proc's xmin that had been ignored all along by > ComputeXidHorizons, will now be included in the computation. Adding > asserts that proc->xmin and proc->xid are InvalidXid by the time we > reset the flag, I got hits in pg_basebackup, test_decoding and > subscription tests. I think it's OK for ComputeXidHorizons (since it > just means that a vacuum that reads a later will remove less rows.) But > in GetSnapshotData it is just not correct to have the Xmin go backwards. > Therefore it seems to me that this code has a bug independently of the > lock level used. That is only a bug if the flags are *cleared* in a way that's not atomic with clearing the transaction's xid/xmin, no? I agree that once set, the flag had better stay set till transaction end, but that's not what's at stake here. > GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData: > In these cases, what we need is that the code computes some xmin (or > equivalent computation) based on a set of transactions that exclude > those marked with the flags. The behavior we want is that if some > transaction is marked as vacuum, we ignore the Xid/Xmin *if there is > one*. In other words, if there's no Xid/Xmin, then the flag is not > important. So if we can ensure that the flag is set first, and the > Xid/xmin is installed later, that's sufficient correctness and we don't > need to hold exclusive lock. But if we can't ensure that, then we must > use exclusive lock, because otherwise we risk another process seeing our > Xid first and not our flag, which would be bad. I don't buy this either. You get the same result if someone looks just before you take the ProcArrayLock to set the flag. So if there's a problem, it's inherent in the way that the flags are defined or used; the strength of lock used in this stanza won't affect it. regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-18, Andres Freund wrote: > In 13 this is: > LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > MyPgXact->vacuumFlags |= PROC_IN_VACUUM; > if (params->is_wraparound) > MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND; > LWLockRelease(ProcArrayLock); > > Lowering this to a shared lock doesn't seem right, at least without a > detailed comment explaining why it's safe. Because GetSnapshotData() etc > look at all procs with just an LW_SHARED ProcArrayLock, changing > vacuumFlags without a lock means that two concurrent horizon > computations could come to a different result. > > I'm not saying it's definitely wrong to relax things here, but I'm not > sure we've evaluated it sufficiently. True. Let's evaluate it. I changed three spots where statusFlags bits are written: a) StartupDecodingContext: sets PROC_IN_LOGICAL_DECODING b) ReplicationSlotRelease: removes PROC_IN_LOGICAL_DECODING c) vacuum_rel: sets PROC_IN_VACUUM and potentially PROC_VACUUM_FOR_WRAPAROUND Who reads these flags? PROC_IN_LOGICAL_DECODING is read by: * ComputeXidHorizons * GetSnapshotData PROC_IN_VACUUM is read by: * GetCurrentVirtualXIDs * ComputeXidHorizons * GetSnapshotData * ProcArrayInstallImportedXmin PROC_VACUUM_FOR_WRAPAROUND is read by: * ProcSleep ProcSleep: no bug here. The flags are checked to see if we should kill() the process (used when autovac blocks some other process). The lock here appears to be inconsequential, since we release it before we do kill(); so strictly speaking, there is still a race condition where the autovac process could reset the flag after we read it and before we get a chance to signal it. The lock level autovac uses to set the flag is not relevant either. ProcArrayInstallImportedXmin: This one is just searching for a matching backend; not affected by the flags. PROC_IN_LOGICAL_DECODING: Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in ReplicationSlotRelease might be the most problematic one of the lot. That's because a proc's xmin that had been ignored all along by ComputeXidHorizons, will now be included in the computation. Adding asserts that proc->xmin and proc->xid are InvalidXid by the time we reset the flag, I got hits in pg_basebackup, test_decoding and subscription tests. I think it's OK for ComputeXidHorizons (since it just means that a vacuum that reads a later will remove less rows.) But in GetSnapshotData it is just not correct to have the Xmin go backwards. Therefore it seems to me that this code has a bug independently of the lock level used. GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData: In these cases, what we need is that the code computes some xmin (or equivalent computation) based on a set of transactions that exclude those marked with the flags. The behavior we want is that if some transaction is marked as vacuum, we ignore the Xid/Xmin *if there is one*. In other words, if there's no Xid/Xmin, then the flag is not important. So if we can ensure that the flag is set first, and the Xid/xmin is installed later, that's sufficient correctness and we don't need to hold exclusive lock. But if we can't ensure that, then we must use exclusive lock, because otherwise we risk another process seeing our Xid first and not our flag, which would be bad. In other words, my conclusion is that there definitely is a bug here and I am going to restore the use of exclusive lock for setting the statusFlags. GetSnapshotData has an additional difficulty -- we do the UINT32_ACCESS_ONCE(ProcGlobal->xid[i]) read *before* testing the bit. So it's not valid to set the bit without locking out GSD, regardless of any barriers we had; if we want this to be safe, we'd have to change this so that the flag is read first, and we read the xid only afterwards, with a read barrier. I *think* we could relax the lock if we had a write barrier in between: set the flag first, issue a write barrier, set the Xid. (I have to admit I'm confused about what needs to happen in the read case: read the bit first, potentially skip the PGPROC entry; but can we read the Xid ahead of reading the flag, and if we do read the xid afterwards, do we need a read barrier in between?) Given this uncertainty, I'm not proposing to relax the lock from exclusive to shared. I didn't get around to reading ComputeXidHorizons, but it seems like it'd have the same problem as GSD.
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On Thu, Nov 19, 2020 at 12:13:44PM +0900, Michael Paquier wrote: > That still looks useful for debugging, so DEBUG1 sounds fine to me. By the way, it strikes me that you could just do nothing as long as (log_min_messages > DEBUG1), so you could encapsulate most of the logic that plays with the lock tag using that. -- Michael signature.asc Description: PGP signature
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
On Wed, Nov 18, 2020 at 02:48:40PM -0800, Andres Freund wrote: > On 2020-11-18 18:41:27 -0300, Alvaro Herrera wrote: >> We could make this more concurrent by copying lock->tag to a local >> variable, releasing the lock, then doing all the string formatting and >> printing. See attached quickly.patch. > > Sounds like a plan. +1. >> Now, when this code was written (d7318d43d, 2012), this was a LOG >> message; it was demoted to DEBUG1 later (d8f15c95bec, 2015). I think it >> would be fair to ... remove the message? Or go back to Simon's original >> formulation from commit acac68b2bca, which had this message as DEBUG2 >> without any string formatting. > > I don't really have an opinion on this. That still looks useful for debugging, so DEBUG1 sounds fine to me. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Wed, Nov 18, 2020 at 11:09:28AM -0800, Andres Freund wrote: > Uh, wait a second. The acquisition of this lock hasn't been affected by > the snapshot scalability changes, and therefore are unrelated to > ->pgxactoff changing or not. > > In 13 this is: > LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > MyPgXact->vacuumFlags |= PROC_IN_VACUUM; > if (params->is_wraparound) > MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND; > LWLockRelease(ProcArrayLock); > > Lowering this to a shared lock doesn't seem right, at least without a > detailed comment explaining why it's safe. Because GetSnapshotData() etc > look at all procs with just an LW_SHARED ProcArrayLock, changing > vacuumFlags without a lock means that two concurrent horizon > computations could come to a different result. > > I'm not saying it's definitely wrong to relax things here, but I'm not > sure we've evaluated it sufficiently. Yeah. While I do like the new assertion that 27838981 has added in ProcArrayEndTransactionInternal(), this commit feels a bit rushed to me. Echoing with my comment from upthread, I am not sure that we still document enough why a shared lock would be completely fine in the case of statusFlags. We have no hints that this could be fine before 2783898, and 2783898 does not make that look better. FWIW, I think that just using LW_EXCLUSIVE for the CIC change would have been fine enough first, after evaluating the performance impact. We could evaluate if it is possible to lower the ProcArrayLock acquisition in those code paths on a separate thread. -- Michael signature.asc Description: PGP signature
Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Hi, On 2020-11-18 18:41:27 -0300, Alvaro Herrera wrote: > The amount of stuff that this is doing with ProcArrayLock held > exclusively seems a bit excessive; it sounds like we could copy the > values we need first, release the lock, and *then* do all that memory > allocation and string printing -- it's a lock of code for such a > contended lock. Yea, that's a good point. > Anytime a process sees itself as blocked by autovacuum > and wants to signal it, there's a ProcArrayLock hiccup (I didn't > actually measure it, but it's at least five function calls). I'm a bit doubtful it's that important - it's limited in frequency by deadlock_timeout. But worth improving anyway. > We could make this more concurrent by copying lock->tag to a local > variable, releasing the lock, then doing all the string formatting and > printing. See attached quickly.patch. Sounds like a plan. > Now, when this code was written (d7318d43d, 2012), this was a LOG > message; it was demoted to DEBUG1 later (d8f15c95bec, 2015). I think it > would be fair to ... remove the message? Or go back to Simon's original > formulation from commit acac68b2bca, which had this message as DEBUG2 > without any string formatting. I don't really have an opinion on this. Greetings, Andres Freund
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-18, Michael Paquier wrote: > On Tue, Nov 17, 2020 at 02:14:53PM -0300, Alvaro Herrera wrote: > > diff --git a/src/backend/replication/logical/logical.c > > b/src/backend/replication/logical/logical.c > > index f1f4df7d70..4324e32656 100644 > > --- a/src/backend/replication/logical/logical.c > > +++ b/src/backend/replication/logical/logical.c > > @@ -181,7 +181,7 @@ StartupDecodingContext(List *output_plugin_options, > > */ > > if (!IsTransactionOrTransactionBlock()) > > { > > - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > > + LWLockAcquire(ProcArrayLock, LW_SHARED); > > MyProc->statusFlags |= PROC_IN_LOGICAL_DECODING; > > ProcGlobal->statusFlags[MyProc->pgxactoff] = > > MyProc->statusFlags; > > LWLockRelease(ProcArrayLock); > > We don't really document that it is safe to update statusFlags while > holding a shared lock on ProcArrayLock, right? Could this be > clarified at least in proc.h? Pushed that part with a comment addition. This stuff is completely undocumented ... > > + /* Determine whether we can call set_safe_index_flag */ > > + safe_index = indexInfo->ii_Expressions == NIL && > > + indexInfo->ii_Predicate == NIL; > > This should tell why we check after expressions and predicates, in > short giving an explanation about the snapshot issues with build and > validation when executing those expressions and/or predicates. Fair. It seems good to add a comment to the new function, and reference that in each place where we set the flag. > > + * Set the PROC_IN_SAFE_IC flag in my PGPROC entry. > Would it be better to document that this function had better be called > after beginning a transaction? I am wondering if we should not > enforce some conditions here, say this one or similar: > Assert(GetTopTransactionIdIfAny() == InvalidTransactionId); I went with checking MyProc->xid and MyProc->xmin, which is the same in practice but notionally closer to what we're doing. > > +static inline void > > +set_safe_index_flag(void) > > This routine name is rather close to index_set_state_flags(), could it > be possible to finish with a better name? I went with set_indexsafe_procflags(). Ugly ... >From a1c9fb94a71f0bf9ec492e1715809315e38084b3 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 17 Nov 2020 11:41:19 -0300 Subject: [PATCH v7 1/2] CREATE INDEX CONCURRENTLY: don't wait for its kin MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the various waiting phases of CREATE INDEX CONCURRENTLY, we wait for other processes to release their snapshots; in general terms this is necessary for correctness. However, processes doing CIC for other tables cannot possibly affect CIC done on "this" table, so we don't need to wait for *those*. This commit adds a flag in MyProc->statusFlags to indicate that the current process is doing CIC, so that other processes doing CIC can ignore it when waiting. This flag can potentially also be used by processes doing REINDEX CONCURRENTLY also to avoid waiting, and by VACUUM to ignore the process in CIC for the purposes of computing an Xmin. That's left for future commits. Author: Álvaro Herrera Author: Dimitry Dolgov <9erthali...@gmail.com> Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/20200810233815.GA18970@alvherre.pgsql --- src/backend/commands/indexcmds.c | 63 ++-- src/include/storage/proc.h | 5 ++- 2 files changed, 64 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 02c7a0c7e1..9efb6b5420 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -93,6 +93,7 @@ static void ReindexPartitions(Oid relid, int options, bool isTopLevel); static void ReindexMultipleInternal(List *relids, int options); static bool ReindexRelationConcurrently(Oid relationOid, int options); static void update_relispartition(Oid relationId, bool newval); +static inline void set_indexsafe_procflags(void); /* * callback argument type for RangeVarCallbackForReindexIndex() @@ -384,7 +385,10 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) + * later.) Processes running CREATE INDEX CONCURRENTLY + * on indexes that are neither expressional nor partial are also safe to + * ignore, since we know that those processes won't examine any data + * outside the table they're indexing. * * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not * check for that. @@ -405,7 +409,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) VirtualTransactionId *old_snapshots; old_snapshots = GetCurrentVirtualXIDs(limitXmin, true,
"as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
> On 2020-11-17 12:55:01 -0300, Alvaro Herrera wrote: > > ... ah, but I realize now that this means that we can use shared lock > > here, not exclusive, which is already an enormous improvement. That's > > because ->pgxactoff can only be changed with exclusive lock held; so as > > long as we hold shared, the array item cannot move. > > Uh, wait a second. The acquisition of this lock hasn't been affected by > the snapshot scalability changes, and therefore are unrelated to > ->pgxactoff changing or not. I'm writing a response trying to thoroughly analyze this, but I also wanted to report that ProcSleep is being a bit generous with what it calls "as quickly as possible" here: LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); /* * Only do it if the worker is not working to protect against Xid * wraparound. */ statusFlags = ProcGlobal->statusFlags[autovac->pgxactoff]; if ((statusFlags & PROC_IS_AUTOVACUUM) && !(statusFlags & PROC_VACUUM_FOR_WRAPAROUND)) { intpid = autovac->pid; StringInfoData locktagbuf; StringInfoData logbuf;/* errdetail for server log */ initStringInfo(); initStringInfo(); DescribeLockTag(, >tag); appendStringInfo(, _("Process %d waits for %s on %s."), MyProcPid, GetLockmodeName(lock->tag.locktag_lockmethodid, lockmode), locktagbuf.data); /* release lock as quickly as possible */ LWLockRelease(ProcArrayLock); The amount of stuff that this is doing with ProcArrayLock held exclusively seems a bit excessive; it sounds like we could copy the values we need first, release the lock, and *then* do all that memory allocation and string printing -- it's a lock of code for such a contended lock. Anytime a process sees itself as blocked by autovacuum and wants to signal it, there's a ProcArrayLock hiccup (I didn't actually measure it, but it's at least five function calls). We could make this more concurrent by copying lock->tag to a local variable, releasing the lock, then doing all the string formatting and printing. See attached quickly.patch. Now, when this code was written (d7318d43d, 2012), this was a LOG message; it was demoted to DEBUG1 later (d8f15c95bec, 2015). I think it would be fair to ... remove the message? Or go back to Simon's original formulation from commit acac68b2bca, which had this message as DEBUG2 without any string formatting. Thoughts? diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index d1738c65f5..caefff41e2 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -1325,20 +1325,22 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) int pid = autovac->pid; StringInfoData locktagbuf; StringInfoData logbuf; /* errdetail for server log */ +LOCKTAG locktag_copy; + +/* release lock as quickly as possible */ +locktag_copy = lock->tag; +LWLockRelease(ProcArrayLock); initStringInfo(); initStringInfo(); -DescribeLockTag(, >tag); +DescribeLockTag(, _copy); appendStringInfo(, _("Process %d waits for %s on %s."), MyProcPid, - GetLockmodeName(lock->tag.locktag_lockmethodid, + GetLockmodeName(locktag_copy.locktag_lockmethodid, lockmode), locktagbuf.data); -/* release lock as quickly as possible */ -LWLockRelease(ProcArrayLock); - /* send the autovacuum worker Back to Old Kent Road */ ereport(DEBUG1, (errmsg("sending cancel to blocking autovacuum PID %d",
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Hi, On 2020-11-17 12:55:01 -0300, Alvaro Herrera wrote: > ... ah, but I realize now that this means that we can use shared lock > here, not exclusive, which is already an enormous improvement. That's > because ->pgxactoff can only be changed with exclusive lock held; so as > long as we hold shared, the array item cannot move. Uh, wait a second. The acquisition of this lock hasn't been affected by the snapshot scalability changes, and therefore are unrelated to ->pgxactoff changing or not. In 13 this is: LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); MyPgXact->vacuumFlags |= PROC_IN_VACUUM; if (params->is_wraparound) MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND; LWLockRelease(ProcArrayLock); Lowering this to a shared lock doesn't seem right, at least without a detailed comment explaining why it's safe. Because GetSnapshotData() etc look at all procs with just an LW_SHARED ProcArrayLock, changing vacuumFlags without a lock means that two concurrent horizon computations could come to a different result. I'm not saying it's definitely wrong to relax things here, but I'm not sure we've evaluated it sufficiently. Greetings, Andres Freund
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Tue, Nov 17, 2020 at 02:14:53PM -0300, Alvaro Herrera wrote: > diff --git a/src/backend/replication/logical/logical.c > b/src/backend/replication/logical/logical.c > index f1f4df7d70..4324e32656 100644 > --- a/src/backend/replication/logical/logical.c > +++ b/src/backend/replication/logical/logical.c > @@ -181,7 +181,7 @@ StartupDecodingContext(List *output_plugin_options, >*/ > if (!IsTransactionOrTransactionBlock()) > { > - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > + LWLockAcquire(ProcArrayLock, LW_SHARED); > MyProc->statusFlags |= PROC_IN_LOGICAL_DECODING; > ProcGlobal->statusFlags[MyProc->pgxactoff] = > MyProc->statusFlags; > LWLockRelease(ProcArrayLock); We don't really document that it is safe to update statusFlags while holding a shared lock on ProcArrayLock, right? Could this be clarified at least in proc.h? > + /* Determine whether we can call set_safe_index_flag */ > + safe_index = indexInfo->ii_Expressions == NIL && > + indexInfo->ii_Predicate == NIL; This should tell why we check after expressions and predicates, in short giving an explanation about the snapshot issues with build and validation when executing those expressions and/or predicates. > + * Set the PROC_IN_SAFE_IC flag in my PGPROC entry. > + * > + * When doing concurrent index builds, we can set this flag > + * to tell other processes concurrently running CREATE > + * INDEX CONCURRENTLY to ignore us when > + * doing their waits for concurrent snapshots. On one hand it > + * avoids pointlessly waiting for a process that's not interesting > + * anyway, but more importantly it avoids deadlocks in some cases. > + * > + * This can only be done for indexes that don't execute any expressions. > + * Caller is responsible for only calling this routine when that > + * assumption holds true. > + * > + * (The flag is reset automatically at transaction end, so it must be > + * set for each transaction.) Would it be better to document that this function had better be called after beginning a transaction? I am wondering if we should not enforce some conditions here, say this one or similar: Assert(GetTopTransactionIdIfAny() == InvalidTransactionId); > + */ > +static inline void > +set_safe_index_flag(void) This routine name is rather close to index_set_state_flags(), could it be possible to finish with a better name? -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
So I made the change to set the statusFlags with only LW_SHARED -- both in existing uses (see 0002) and in the new CIC code (0003). I don't think 0002 is going to have a tremendous performance impact, because it changes operations that are very infrequent. But even so, it would be weird to leave code around that uses exclusive lock when we're going to add new code that uses shared lock for the same thing. I still left the REINDEX CONCURRENTLY support in split out in 0004; I intend to get the first three patches pushed now, and look into 0004 again later. >From 5e2af4e082df8f839bf257d933d70bb9645f6cfe Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 17 Nov 2020 11:37:36 -0300 Subject: [PATCH v6 1/4] indexcmds.c: reorder function prototypes ... out of an overabundance of neatnikism, perhaps. --- src/backend/commands/indexcmds.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 35696f9f75..02c7a0c7e1 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -68,6 +68,7 @@ /* non-export function prototypes */ +static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts); static void CheckPredicate(Expr *predicate); static void ComputeIndexAttrs(IndexInfo *indexInfo, Oid *typeOidP, @@ -87,13 +88,11 @@ static char *ChooseIndexNameAddition(List *colnames); static List *ChooseIndexColumnNames(List *indexElems); static void RangeVarCallbackForReindexIndex(const RangeVar *relation, Oid relId, Oid oldRelId, void *arg); -static bool ReindexRelationConcurrently(Oid relationOid, int options); - +static void reindex_error_callback(void *args); static void ReindexPartitions(Oid relid, int options, bool isTopLevel); static void ReindexMultipleInternal(List *relids, int options); -static void reindex_error_callback(void *args); +static bool ReindexRelationConcurrently(Oid relationOid, int options); static void update_relispartition(Oid relationId, bool newval); -static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts); /* * callback argument type for RangeVarCallbackForReindexIndex() -- 2.20.1 >From 3e2778d553448c67e956bd385c4a94039933d3c6 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 17 Nov 2020 13:48:16 -0300 Subject: [PATCH v6 2/4] relax lock level for setting statusFlags --- src/backend/commands/vacuum.c | 2 +- src/backend/replication/logical/logical.c | 2 +- src/backend/replication/slot.c| 2 +- src/backend/storage/ipc/procarray.c | 2 ++ 4 files changed, 5 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index d89ba3031f..395e75f768 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1741,7 +1741,7 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) * MyProc->xid/xmin, otherwise GetOldestNonRemovableTransactionId() * might appear to go backwards, which is probably Not Good. */ - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + LWLockAcquire(ProcArrayLock, LW_SHARED); MyProc->statusFlags |= PROC_IN_VACUUM; if (params->is_wraparound) MyProc->statusFlags |= PROC_VACUUM_FOR_WRAPAROUND; diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c index f1f4df7d70..4324e32656 100644 --- a/src/backend/replication/logical/logical.c +++ b/src/backend/replication/logical/logical.c @@ -181,7 +181,7 @@ StartupDecodingContext(List *output_plugin_options, */ if (!IsTransactionOrTransactionBlock()) { - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + LWLockAcquire(ProcArrayLock, LW_SHARED); MyProc->statusFlags |= PROC_IN_LOGICAL_DECODING; ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags; LWLockRelease(ProcArrayLock); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 5ed955ba0b..9c7cf13d4d 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -527,7 +527,7 @@ ReplicationSlotRelease(void) MyReplicationSlot = NULL; /* might not have been set when we've been a plain slot */ - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + LWLockAcquire(ProcArrayLock, LW_SHARED); MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING; ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags; LWLockRelease(ProcArrayLock); diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index e2cb6e990d..ebe91907d6 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -662,6 +662,8 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid) /* avoid unnecessarily dirtying shared cachelines */ if (proc->statusFlags & PROC_VACUUM_STATE_MASK) { + /* Only safe to change my own flags with only share lock */ + Assert(proc == MyProc);
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-16, Alvaro Herrera wrote: > On 2020-Nov-09, Tom Lane wrote: > > > Michael Paquier writes: > > >> +LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > > >> +MyProc->vacuumFlags |= PROC_IN_SAFE_IC; > > >> +ProcGlobal->vacuumFlags[MyProc->pgxactoff] = > > >> MyProc->vacuumFlags; > > >> +LWLockRelease(ProcArrayLock); > > > > > I can't help noticing that you are repeating the same code pattern > > > eight times. I think that this should be in its own routine, and that > > > we had better document that this should be called just after starting > > > a transaction, with an assertion enforcing that. > > > > Do we really need exclusive lock on the ProcArray to make this flag > > change? That seems pretty bad from a concurrency standpoint. > > BTW I now know that the reason for taking ProcArrayLock is not the > vacuumFlags itself, but rather MyProc->pgxactoff, which can move. ... ah, but I realize now that this means that we can use shared lock here, not exclusive, which is already an enormous improvement. That's because ->pgxactoff can only be changed with exclusive lock held; so as long as we hold shared, the array item cannot move.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Nov 16, 2020 at 09:23:41PM -0300, Alvaro Herrera wrote: > I've been wondering if it would be sane/safe to do the WaitForFoo stuff > outside of any transaction. This needs to remain within a transaction as CIC holds a session lock on the parent table, which would not be cleaned up without a transaction context. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
I am really unsure about the REINDEX CONCURRENTLY part of this, for two reasons: 1. It is not as good when reindexing multiple indexes, because we can only apply the flag if *all* indexes are "safe". Any unsafe index means we step down from it for the whole thing. This is probably not worth worrying much about, but still. 2. In some of the waiting transactions, we actually do more things than what we do in CREATE INDEX CONCURRENTLY transactions --- some catalog updates, but we also do the whole index validation phase. Is that OK? It's not as clear to me that it is safe to set the flag in all those places. I moved the comments to the new function and made it inline. I also changed the way we determine how the function is safe; there's no reason to build an IndexInfo if we can simply look at rel->rd_indexprs and rel->indpred. I've been wondering if it would be sane/safe to do the WaitForFoo stuff outside of any transaction. I'll have a look again tomorrow. >From 1e5560d4a3f3e3b4cb83803f90985701fb4dee8c Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 4 Aug 2020 22:04:57 -0400 Subject: [PATCH v5] Avoid spurious CREATE INDEX CONCURRENTLY waits --- src/backend/commands/indexcmds.c | 85 +--- src/include/storage/proc.h | 6 ++- 2 files changed, 83 insertions(+), 8 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 35696f9f75..44ea84c54d 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -68,6 +68,7 @@ /* non-export function prototypes */ +static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts); static void CheckPredicate(Expr *predicate); static void ComputeIndexAttrs(IndexInfo *indexInfo, Oid *typeOidP, @@ -87,13 +88,12 @@ static char *ChooseIndexNameAddition(List *colnames); static List *ChooseIndexColumnNames(List *indexElems); static void RangeVarCallbackForReindexIndex(const RangeVar *relation, Oid relId, Oid oldRelId, void *arg); -static bool ReindexRelationConcurrently(Oid relationOid, int options); - +static void reindex_error_callback(void *args); static void ReindexPartitions(Oid relid, int options, bool isTopLevel); static void ReindexMultipleInternal(List *relids, int options); -static void reindex_error_callback(void *args); +static bool ReindexRelationConcurrently(Oid relationOid, int options); static void update_relispartition(Oid relationId, bool newval); -static bool CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts); +static inline void set_safe_index_flag(void); /* * callback argument type for RangeVarCallbackForReindexIndex() @@ -385,7 +385,10 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) + * later.) Processes running CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + * on indexes that are neither expressional nor partial are also safe to + * ignore, since we know that those processes won't examine any data + * outside the table they're indexing. * * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not * check for that. @@ -406,7 +409,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) VirtualTransactionId *old_snapshots; old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_IC, _old_snapshots); if (progress) pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots); @@ -426,7 +430,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) newer_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_IC, _newer_snapshots); for (j = i; j < n_old_snapshots; j++) { @@ -519,6 +524,7 @@ DefineIndex(Oid relationId, bool amcanorder; amoptions_function amoptions; bool partitioned; + bool safe_index; Datum reloptions; int16 *coloptions; IndexInfo *indexInfo; @@ -1045,6 +1051,10 @@ DefineIndex(Oid relationId, } } + /* Determine whether we can call set_safe_index_flag */ + safe_index = indexInfo->ii_Expressions == NIL && + indexInfo->ii_Predicate == NIL; + /* * Report index creation if appropriate (delay this till after most of the * error checks) @@ -1431,6 +1441,10 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Tell concurrent index builds to ignore us, if index qualifies */ + if (safe_index) + set_safe_index_flag(); + /* * The index is now visible, so we can report the OID.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-09, Tom Lane wrote: > Michael Paquier writes: > >> + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > >> + MyProc->vacuumFlags |= PROC_IN_SAFE_IC; > >> + ProcGlobal->vacuumFlags[MyProc->pgxactoff] = > >> MyProc->vacuumFlags; > >> + LWLockRelease(ProcArrayLock); > > > I can't help noticing that you are repeating the same code pattern > > eight times. I think that this should be in its own routine, and that > > we had better document that this should be called just after starting > > a transaction, with an assertion enforcing that. > > Do we really need exclusive lock on the ProcArray to make this flag > change? That seems pretty bad from a concurrency standpoint. BTW I now know that the reason for taking ProcArrayLock is not the vacuumFlags itself, but rather MyProc->pgxactoff, which can move. On the other hand, if we stopped mirroring the flags in ProcGlobal, it would mean we would have to access all procs' PGPROC entries in GetSnapshotData, which is undesirable for performance reason (per commit 5788e258bb26).
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-16, Dmitry Dolgov wrote: > The same with reindex without locks: > > nsecs : count distribution >512 -> 1023 : 0|| > 1024 -> 2047 : 111345 || > 2048 -> 4095 : 6997627 || > 4096 -> 8191 : 18575|| > 8192 -> 16383 : 586 || > 16384 -> 32767 : 312 || > 32768 -> 65535 : 18 || > > The same with reindex with locks: > > nsecs : count distribution >512 -> 1023 : 0|| > 1024 -> 2047 : 59438|| > 2048 -> 4095 : 6901187 || > 4096 -> 8191 : 18584|| > 8192 -> 16383 : 581 || > 16384 -> 32767 : 280 || > 32768 -> 65535 : 84 || > > Looks like with reindex without locks is indeed faster (there are mode > samples in lower time section), but not particularly significant to the > whole distribution, especially taking into account extremity of the > test. I didn't analyze these numbers super carefully, but yeah it doesn't look significant. I'm looking at these patches now, with intention to push.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Tue, 10 Nov 2020 at 03:02, Tom Lane wrote: > > Alvaro Herrera writes: > > Yeah ... it would be much better if we can make it use atomics instead. > > I was thinking more like "do we need any locking at all". > > Assuming that a proc's vacuumFlags can be set by only the process itself, > there's no write conflicts to worry about. On the read side, there's a > hazard that onlookers will not see the PROC_IN_SAFE_IC flag set; but > that's not any different from what the outcome would be if they looked > just before this stanza executes. And even if they don't see it, at worst > we lose the optimization being proposed. > > There is a question of whether it's important that both copies of the flag > appear to update atomically ... but that just begs the question "why in > heaven's name are there two copies?" Agreed to all of the above, but I think the issue is miniscule because ProcArrayLock is acquired in LW_EXCLUSIVE mode at transaction commit. So it doesn't seem like there is much need to optimize this particular aspect of the patch. -- Simon Riggshttp://www.EnterpriseDB.com/
Re: remove spurious CREATE INDEX CONCURRENTLY wait
> On Fri, Nov 13, 2020 at 09:25:40AM +0900, Michael Paquier wrote: > On Thu, Nov 12, 2020 at 04:36:32PM +0100, Dmitry Dolgov wrote: > > Interesting enough, similar discussion happened about vaccumFlags before > > with the same conclusion that theoretically it's fine to update without > > holding the lock, but this assumption could change one day and it's > > better to avoid such risks. Having said that I believe it makes sense to > > continue with locking. Are there any other opinions? I'll try to > > benchmark it in the meantime. > > Thanks for planning some benchmarking for this specific patch. I have > to admit that the possibility of switching vacuumFlags to use atomics > is very appealing in the long term, with or without considering this > patch, even if we had better be sure that this patch has no actual > effect on concurrency first if atomics are not used in worst-case > scenarios. I've tried first to test scenarios where GetSnapshotData produces significant lock contention and "reindex concurrently" implementation with locks interferes with it. The idea I had is to create a test function that constantly calls GetSnapshotData (perf indeed shows significant portion of time spent on contended lock), and clash it with a stream of "reindex concurrently" of an empty relation (which still reaches safe_index check). I guess it could be considered as an artificial extreme case. Measuring GetSnapshotData (or rather the surrounding wrapper, to distinguish calls from the test function from everything else) latency without reindex, with reindex and locks, with reindex without locks should produce different "modes" and comparing them we can make some conclusions. Latency histograms without reindex (nanoseconds): nsecs : count distribution 512 -> 1023 : 0|| 1024 -> 2047 : 10001209 || 2048 -> 4095 : 76936|| 4096 -> 8191 : 1468 || 8192 -> 16383 : 98 || 16384 -> 32767 : 39 || 32768 -> 65535 : 6|| The same with reindex without locks: nsecs : count distribution 512 -> 1023 : 0|| 1024 -> 2047 : 111345 || 2048 -> 4095 : 6997627 || 4096 -> 8191 : 18575|| 8192 -> 16383 : 586 || 16384 -> 32767 : 312 || 32768 -> 65535 : 18 || The same with reindex with locks: nsecs : count distribution 512 -> 1023 : 0|| 1024 -> 2047 : 59438|| 2048 -> 4095 : 6901187 || 4096 -> 8191 : 18584|| 8192 -> 16383 : 581 || 16384 -> 32767 : 280 || 32768 -> 65535 : 84 || Looks like with reindex without locks is indeed faster (there are mode samples in lower time section), but not particularly significant to the whole distribution, especially taking into account extremity of the test. I'll take a look at benchmarking of switching vacuumFlags to use atomics, but as it's probably a bit off topic I'm going to attach another version of the patch with locks and suggested changes. To which I have one question: > Michael Paquier writes: > I think that this should be in its own routine, and that we had better > document that this should be called just after starting a transaction, > with an assertion enforcing that. I'm not sure which exactly assertion condition do you mean? >From 07c4705a22e6cdd5717df46a974ce00a69fc901f Mon Sep 17 00:00:00 2001 From: Dmitrii Dolgov <9erthali...@gmail.com> Date: Wed, 11 Nov 2020 15:19:48 +0100 Subject: [PATCH v4 1/2] Rename vaccumFlags to statusFlags With more flags associated to a PGPROC entry that are not related to vacuum (currently existing or planned), the name "statusFlags" describes its purpose better. --- src/backend/access/transam/twophase.c | 2 +- src/backend/commands/vacuum.c | 6 +-- src/backend/postmaster/autovacuum.c | 6 +-- src/backend/replication/logical/logical.c | 4 +- src/backend/replication/slot.c| 4 +-
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Thu, Nov 12, 2020 at 04:36:32PM +0100, Dmitry Dolgov wrote: > Interesting enough, similar discussion happened about vaccumFlags before > with the same conclusion that theoretically it's fine to update without > holding the lock, but this assumption could change one day and it's > better to avoid such risks. Having said that I believe it makes sense to > continue with locking. Are there any other opinions? I'll try to > benchmark it in the meantime. Thanks for planning some benchmarking for this specific patch. I have to admit that the possibility of switching vacuumFlags to use atomics is very appealing in the long term, with or without considering this patch, even if we had better be sure that this patch has no actual effect on concurrency first if atomics are not used in worst-case scenarios. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
> On Mon, Nov 09, 2020 at 10:02:27PM -0500, Tom Lane wrote: > > Alvaro Herrera writes: > > Yeah ... it would be much better if we can make it use atomics instead. > > I was thinking more like "do we need any locking at all". > > Assuming that a proc's vacuumFlags can be set by only the process itself, > there's no write conflicts to worry about. On the read side, there's a > hazard that onlookers will not see the PROC_IN_SAFE_IC flag set; but > that's not any different from what the outcome would be if they looked > just before this stanza executes. And even if they don't see it, at worst > we lose the optimization being proposed. > > There is a question of whether it's important that both copies of the flag > appear to update atomically ... but that just begs the question "why in > heaven's name are there two copies?" Sounds right, but after reading the thread about GetSnapshotData scalability more thoroughly there seem to be an assumption that those copies have to be updated at the same time under the same lock, and claims that in some cases justification for correctness around not taking ProcArrayLock is too complicated, at least for now. Interesting enough, similar discussion happened about vaccumFlags before with the same conclusion that theoretically it's fine to update without holding the lock, but this assumption could change one day and it's better to avoid such risks. Having said that I believe it makes sense to continue with locking. Are there any other opinions? I'll try to benchmark it in the meantime.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Alvaro Herrera writes: > Yeah ... it would be much better if we can make it use atomics instead. I was thinking more like "do we need any locking at all". Assuming that a proc's vacuumFlags can be set by only the process itself, there's no write conflicts to worry about. On the read side, there's a hazard that onlookers will not see the PROC_IN_SAFE_IC flag set; but that's not any different from what the outcome would be if they looked just before this stanza executes. And even if they don't see it, at worst we lose the optimization being proposed. There is a question of whether it's important that both copies of the flag appear to update atomically ... but that just begs the question "why in heaven's name are there two copies?" regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Nov 09, 2020 at 11:31:15PM -0300, Alvaro Herrera wrote: > Yeah ... it would be much better if we can make it use atomics instead. > Currently it's an uint8, and in PGPROC itself it's probably not a big > deal to enlarge that, but I fear that quadrupling the size of the > mirroring array in PROC_HDR might be bad for performance. However, > maybe if we use atomics to access it, then we don't need to mirror it > anymore? That would need some benchmarking of GetSnapshotData. Hmm. If you worry about the performance impact here, it is possible to do a small performance test without this patch. vacuum_rel() sets the flag for a non-full VACUUM, so with one backend running a manual VACUUM in loop on an empty relation could apply some pressure on any benchmark, even a simple pgbench. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Nov-09, Tom Lane wrote: > Michael Paquier writes: > > On Mon, Nov 09, 2020 at 08:32:13PM -0500, Tom Lane wrote: > >> Do we really need exclusive lock on the ProcArray to make this flag > >> change? That seems pretty bad from a concurrency standpoint. > > > Any place where we update vacuumFlags acquires an exclusive LWLock on > > ProcArrayLock. That's held for a very short time, so IMO it won't > > matter much in practice, particularly if you compare that with the > > potential gains related to the existing wait phases. > > Not sure I believe that it doesn't matter much in practice. If there's > a steady stream of shared ProcArrayLock acquisitions (for snapshot > acquisition) then somebody wanting exclusive lock will create a big > hiccup, whether they hold it for a short time or not. Yeah ... it would be much better if we can make it use atomics instead. Currently it's an uint8, and in PGPROC itself it's probably not a big deal to enlarge that, but I fear that quadrupling the size of the mirroring array in PROC_HDR might be bad for performance. However, maybe if we use atomics to access it, then we don't need to mirror it anymore? That would need some benchmarking of GetSnapshotData.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Michael Paquier writes: > On Mon, Nov 09, 2020 at 08:32:13PM -0500, Tom Lane wrote: >> Do we really need exclusive lock on the ProcArray to make this flag >> change? That seems pretty bad from a concurrency standpoint. > Any place where we update vacuumFlags acquires an exclusive LWLock on > ProcArrayLock. That's held for a very short time, so IMO it won't > matter much in practice, particularly if you compare that with the > potential gains related to the existing wait phases. Not sure I believe that it doesn't matter much in practice. If there's a steady stream of shared ProcArrayLock acquisitions (for snapshot acquisition) then somebody wanting exclusive lock will create a big hiccup, whether they hold it for a short time or not. regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Nov 09, 2020 at 08:32:13PM -0500, Tom Lane wrote: > Do we really need exclusive lock on the ProcArray to make this flag > change? That seems pretty bad from a concurrency standpoint. Any place where we update vacuumFlags acquires an exclusive LWLock on ProcArrayLock. That's held for a very short time, so IMO it won't matter much in practice, particularly if you compare that with the potential gains related to the existing wait phases. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Michael Paquier writes: >> +LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); >> +MyProc->vacuumFlags |= PROC_IN_SAFE_IC; >> +ProcGlobal->vacuumFlags[MyProc->pgxactoff] = >> MyProc->vacuumFlags; >> +LWLockRelease(ProcArrayLock); > I can't help noticing that you are repeating the same code pattern > eight times. I think that this should be in its own routine, and that > we had better document that this should be called just after starting > a transaction, with an assertion enforcing that. Do we really need exclusive lock on the ProcArray to make this flag change? That seems pretty bad from a concurrency standpoint. regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Nov 09, 2020 at 04:47:43PM +0100, Dmitry Dolgov wrote: > > On Tue, Nov 03, 2020 at 07:14:47PM +0100, Dmitry Dolgov wrote: > > > On Thu, Aug 20, 2020 at 03:11:19PM +0900, Michael Paquier wrote: > > > On Wed, Aug 19, 2020 at 02:16:46PM -0400, Alvaro Herrera wrote: > > > > I did not set the flag in REINDEX CONCURRENTLY, but as I understand it > > > > can be done too, since in essence it's the same thing as a CIC from a > > > > snapshot management point of view. > > > > > > Yes, I see no problems for REINDEX CONCURRENTLY as well as long as > > > there are no predicates and expressions involved. The transactions > > > that should be patched are all started in ReindexRelationConcurrently. > > > The transaction of index_concurrently_swap() cannot set up that > > > though. Only thing to be careful is to make sure that safe_flag is > > > correct depending on the list of indexes worked on. > > > > Hi, > > > > After looking through the thread and reading the patch it seems good, > > and there are only few minor questions: > > > > * Doing the same for REINDEX CONCURRENTLY, which does make sense. In > > fact it's already mentioned in the commentaries as done, which a bit > > confusing. > > Just to give it a shot, would the attached change be enough? Could it be possible to rename vacuumFlags to statusFlags first? I did not see any objection to do this suggestion. > + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > + MyProc->vacuumFlags |= PROC_IN_SAFE_IC; > + ProcGlobal->vacuumFlags[MyProc->pgxactoff] = > MyProc->vacuumFlags; > + LWLockRelease(ProcArrayLock); I can't help noticing that you are repeating the same code pattern eight times. I think that this should be in its own routine, and that we had better document that this should be called just after starting a transaction, with an assertion enforcing that. -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
> On Tue, Nov 03, 2020 at 07:14:47PM +0100, Dmitry Dolgov wrote: > > On Thu, Aug 20, 2020 at 03:11:19PM +0900, Michael Paquier wrote: > > On Wed, Aug 19, 2020 at 02:16:46PM -0400, Alvaro Herrera wrote: > > > I did not set the flag in REINDEX CONCURRENTLY, but as I understand it > > > can be done too, since in essence it's the same thing as a CIC from a > > > snapshot management point of view. > > > > Yes, I see no problems for REINDEX CONCURRENTLY as well as long as > > there are no predicates and expressions involved. The transactions > > that should be patched are all started in ReindexRelationConcurrently. > > The transaction of index_concurrently_swap() cannot set up that > > though. Only thing to be careful is to make sure that safe_flag is > > correct depending on the list of indexes worked on. > > Hi, > > After looking through the thread and reading the patch it seems good, > and there are only few minor questions: > > * Doing the same for REINDEX CONCURRENTLY, which does make sense. In > fact it's already mentioned in the commentaries as done, which a bit > confusing. Just to give it a shot, would the attached change be enough? >From d30d8acf91679985970334069ab7f1f8f7fc3ec5 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 4 Aug 2020 22:04:57 -0400 Subject: [PATCH v3] Avoid spurious CREATE INDEX CONCURRENTLY waits --- src/backend/commands/indexcmds.c | 122 ++- src/include/storage/proc.h | 6 +- 2 files changed, 124 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 75552c64ed..5019397d50 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -385,7 +385,10 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) + * later.) Processes running CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + * on indexes that are neither expressional nor partial are also safe to + * ignore, since we know that those processes won't examine any data + * outside the table they're indexing. * * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not * check for that. @@ -406,7 +409,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) VirtualTransactionId *old_snapshots; old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_IC, _old_snapshots); if (progress) pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots); @@ -426,7 +430,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) newer_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_IC, _newer_snapshots); for (j = i; j < n_old_snapshots; j++) { @@ -519,6 +524,7 @@ DefineIndex(Oid relationId, bool amcanorder; amoptions_function amoptions; bool partitioned; + bool safe_index; Datum reloptions; int16 *coloptions; IndexInfo *indexInfo; @@ -1045,6 +1051,17 @@ DefineIndex(Oid relationId, } } + /* + * When doing concurrent index builds, we can set a PGPROC flag to tell + * concurrent VACUUM, CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY + * to ignore us when waiting for concurrent snapshots. That can only be + * done for indexes that don't execute any expressions. Determine that. + * (The flag is reset automatically at transaction end, so it must be + * set for each transaction.) + */ + safe_index = indexInfo->ii_Expressions == NIL && + indexInfo->ii_Predicate == NIL; + /* * Report index creation if appropriate (delay this till after most of the * error checks) @@ -1431,6 +1448,15 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Tell concurrent index builds to ignore us, if index qualifies */ + if (safe_index) + { + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + MyProc->vacuumFlags |= PROC_IN_SAFE_IC; + ProcGlobal->vacuumFlags[MyProc->pgxactoff] = MyProc->vacuumFlags; + LWLockRelease(ProcArrayLock); + } + /* * The index is now visible, so we can report the OID. */ @@ -1490,6 +1516,15 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Tell concurrent index builds to ignore us, if index qualifies */ + if (safe_index) + { + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + MyProc->vacuumFlags |= PROC_IN_SAFE_IC; + ProcGlobal->vacuumFlags[MyProc->pgxactoff] = MyProc->vacuumFlags; + LWLockRelease(ProcArrayLock); + } + /* * Phase 3 of
Re: remove spurious CREATE INDEX CONCURRENTLY wait
> On Thu, Aug 20, 2020 at 03:11:19PM +0900, Michael Paquier wrote: > On Wed, Aug 19, 2020 at 02:16:46PM -0400, Alvaro Herrera wrote: > > I did not set the flag in REINDEX CONCURRENTLY, but as I understand it > > can be done too, since in essence it's the same thing as a CIC from a > > snapshot management point of view. > > Yes, I see no problems for REINDEX CONCURRENTLY as well as long as > there are no predicates and expressions involved. The transactions > that should be patched are all started in ReindexRelationConcurrently. > The transaction of index_concurrently_swap() cannot set up that > though. Only thing to be careful is to make sure that safe_flag is > correct depending on the list of indexes worked on. Hi, After looking through the thread and reading the patch it seems good, and there are only few minor questions: * Doing the same for REINDEX CONCURRENTLY, which does make sense. In fact it's already mentioned in the commentaries as done, which a bit confusing. * Naming, to be more precise what suggested Michael: > Could we consider renaming vacuumFlags? With more flags associated to > a PGPROC entry that are not related to vacuum, the current naming > makes things confusing. Something like statusFlags could fit better > in the picture? which sounds reasonable, and similar one about flag name PROC_IN_SAFE_CIC - if it covers both CREATE INDEX/REINDEX CONCURRENTLY maybe just PROC_IN_SAFE_IC? Any plans about those questions? I can imagine that are the only missing parts.
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Wed, Aug 19, 2020 at 02:16:46PM -0400, Alvaro Herrera wrote: > I did not set the flag in REINDEX CONCURRENTLY, but as I understand it > can be done too, since in essence it's the same thing as a CIC from a > snapshot management point of view. Yes, I see no problems for REINDEX CONCURRENTLY as well as long as there are no predicates and expressions involved. The transactions that should be patched are all started in ReindexRelationConcurrently. The transaction of index_concurrently_swap() cannot set up that though. Only thing to be careful is to make sure that safe_flag is correct depending on the list of indexes worked on. > Also, per [1], ISTM this flag could be used to tell lazy VACUUM to > ignore the Xmin of this process too, which the previous formulation > (where all CICs were so marked) could not. This patch doesn't do that > yet, but it seems the natural next step to take. > > [1] https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql Could we consider renaming vacuumFlags? With more flags associated to a PGPROC entry that are not related to vacuum, the current naming makes things confusing. Something like statusFlags could fit better in the picture? -- Michael signature.asc Description: PGP signature
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On 2020-Aug-11, James Coleman wrote: > In [3] I'd brought up that a function index can do arbitrary > operations (including, terribly, a query of another table), and Andres > (in [4]) noted that: > > > Well, even if we consider this an actual problem, we could still use > > PROC_IN_CIC for non-expression non-partial indexes (index operator > > themselves better ensure this isn't a problem, or they're ridiculously > > broken already - they can get called during vacuum). > > But went on to describe how this is a problem with all expression > indexes (even if those expressions don't do dangerous things) because > of syscache lookups, but that ideally for expression indexes we'd have > a way to use a different (or more frequently taken) snapshot for the > purpose of computing those expressions. That's a significantly more > involved patch though. So the easy first patch here is to add the flag as PROC_IN_SAFE_CIC, which is set only for CIC when it's used for indexes that are neither on expressions nor partial. Then ignore those in WaitForOlderSnapshots. The attached patch does it that way. (Also updated per recent conflicts). I did not set the flag in REINDEX CONCURRENTLY, but as I understand it can be done too, since in essence it's the same thing as a CIC from a snapshot management point of view. Also, per [1], ISTM this flag could be used to tell lazy VACUUM to ignore the Xmin of this process too, which the previous formulation (where all CICs were so marked) could not. This patch doesn't do that yet, but it seems the natural next step to take. [1] https://postgr.es/m/20191101203310.GA12239@alvherre.pgsql -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From 2a5088dfa35cbc800a87dc2154b6ebfa22837a66 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 4 Aug 2020 22:04:57 -0400 Subject: [PATCH v2] Avoid spurious CREATE INDEX CONCURRENTLY waits --- src/backend/commands/indexcmds.c | 50 ++-- src/include/storage/proc.h | 6 +++- 2 files changed, 52 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 254dbcdce5..459f6fa5db 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -372,7 +372,10 @@ CompareOpclassOptions(Datum *opts1, Datum *opts2, int natts) * lazy VACUUMs, because they won't be fazed by missing index entries * either. (Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations - * later.) + * later.) Processes running CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + * on indexes that are neither expressional nor partial are also safe to + * ignore, since we know that those processes won't examine any data + * outside the table they're indexing. * * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not * check for that. @@ -393,7 +396,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) VirtualTransactionId *old_snapshots; old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_CIC, _old_snapshots); if (progress) pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots); @@ -413,7 +417,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) newer_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM + | PROC_IN_SAFE_CIC, _newer_snapshots); for (j = i; j < n_old_snapshots; j++) { @@ -506,6 +511,7 @@ DefineIndex(Oid relationId, bool amcanorder; amoptions_function amoptions; bool partitioned; + bool safe_index; Datum reloptions; int16 *coloptions; IndexInfo *indexInfo; @@ -1033,6 +1039,17 @@ DefineIndex(Oid relationId, } } + /* + * When doing concurrent index builds, we can set a PGPROC flag to tell + * concurrent VACUUM, CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY + * to ignore us when waiting for concurrent snapshots. That can only be + * done for indexes that don't execute any expressions. Determine that. + * (The flag is reset automatically at transaction end, so it must be + * set for each transaction.) + */ + safe_index = indexInfo->ii_Expressions == NIL && + indexInfo->ii_Predicate == NIL; + /* * Report index creation if appropriate (delay this till after most of the * error checks) @@ -1419,6 +1436,15 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Tell concurrent index builds to ignore us, if index qualifies */ + if (safe_index) + { + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + MyProc->vacuumFlags |=
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Tue, Aug 11, 2020 at 11:14 AM Tom Lane wrote: > > James Coleman writes: > > Why is a CIC in active index-building something we need to wait for? > > Wouldn't it fall under a similar kind of logic to the other snapshot > > types we can explicitly ignore? CIC can't be run in a manual > > transaction, so the snapshot it holds won't be used to perform > > arbitrary operations (i.e., the reason why a manual ANALYZE can't be > > ignored). > > Expression indexes that call user-defined functions seem like a > pretty serious risk factor for that argument. Those are exactly > the same expressions that ANALYZE will evaluate, as a result of > which we judge it unsafe to ignore. Why would CIC be different? The comments for WaitForOlderSnapshots() don't discuss that problem at all; as far as ANALYZE goes they only say: * Manual ANALYZEs, however, can't be excluded because they * might be within transactions that are going to do arbitrary operations * later. But nonetheless over in the thread Álvaro linked to we'd discussed these issues already. Andres in [1] and [2] believed that: > For the snapshot waits we can add a procarray flag > (alongside PROCARRAY_VACUUM_FLAG) indicating that the backend is > doing. Which WaitForOlderSnapshots() can then use to ignore those CICs, > which is safe, because those transactions definitely don't insert into > relations targeted by CIC. The change to WaitForOlderSnapshots() would > just be to pass the new flag to GetCurrentVirtualXIDs, I think. and > What I was thinking of was a new flag, with a distinct value from > PROC_IN_VACUUM. It'd currently just be specified in the > GetCurrentVirtualXIDs() calls in WaitForOlderSnapshots(). That'd avoid > needing to wait for other CICs on different relations. Since CIC is not > permitted on system tables, and CIC doesn't do DML on normal tables, it > seems fairly obviously correct to exclude other CICs. In [3] I'd brought up that a function index can do arbitrary operations (including, terribly, a query of another table), and Andres (in [4]) noted that: > Well, even if we consider this an actual problem, we could still use > PROC_IN_CIC for non-expression non-partial indexes (index operator > themselves better ensure this isn't a problem, or they're ridiculously > broken already - they can get called during vacuum). But went on to describe how this is a problem with all expression indexes (even if those expressions don't do dangerous things) because of syscache lookups, but that ideally for expression indexes we'd have a way to use a different (or more frequently taken) snapshot for the purpose of computing those expressions. That's a significantly more involved patch though. So from what I understand, everything that I'd claimed in my previous message still holds true for non-expression/non-partial indexes. Is there something else I'm missing? James 1: https://www.postgresql.org/message-id/20200325191935.jjhdg2zy5k7ath5v%40alap3.anarazel.de 2: https://www.postgresql.org/message-id/20200325195841.gq4hpl25t6pxv3gl%40alap3.anarazel.de 3: https://www.postgresql.org/message-id/CAAaqYe_fveT_tvKonVt1imujOURUUMrydGeaBGahqLLy9D-REw%40mail.gmail.com 4: https://www.postgresql.org/message-id/20200416221207.wmnzbxvjqqpazeob%40alap3.anarazel.de
Re: remove spurious CREATE INDEX CONCURRENTLY wait
James Coleman writes: > Why is a CIC in active index-building something we need to wait for? > Wouldn't it fall under a similar kind of logic to the other snapshot > types we can explicitly ignore? CIC can't be run in a manual > transaction, so the snapshot it holds won't be used to perform > arbitrary operations (i.e., the reason why a manual ANALYZE can't be > ignored). Expression indexes that call user-defined functions seem like a pretty serious risk factor for that argument. Those are exactly the same expressions that ANALYZE will evaluate, as a result of which we judge it unsafe to ignore. Why would CIC be different? regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
On Mon, Aug 10, 2020 at 8:37 PM Tom Lane wrote: > > Alvaro Herrera writes: > > To recap: currently, any CREATE INDEX CONCURRENTLY will wait for all > > other CICs running concurrently to finish, because they can't be > > distinguished amidst other old snapshots. We can change things by > > having CIC set a special flag in PGPROC (like PROC_IN_VACUUM) indicating > > that it's doing CIC; other CICs will see that flag and will know that > > they don't need to wait for those processes. With this, CIC on small > > tables don't have to wait for CIC on large tables to complete. > > Hm. +1 for improving this, if we can, but ... > > It seems clearly unsafe to ignore a CIC that is in active index-building; > a snapshot held for that purpose is just as real as any other. It *might* > be all right to ignore a CIC that is just waiting, but you haven't made > any argument in the patch comments as to why that's safe either. > (Moreover, at the points where we're just waiting, I don't think we have > a snapshot, so another CIC's WaitForOlderSnapshots shouldn't wait for us > anyway.) Why is a CIC in active index-building something we need to wait for? Wouldn't it fall under a similar kind of logic to the other snapshot types we can explicitly ignore? CIC can't be run in a manual transaction, so the snapshot it holds won't be used to perform arbitrary operations (i.e., the reason why a manual ANALYZE can't be ignored). > Actually, it doesn't look like you've touched the comments at all. > WaitForOlderSnapshots' header comment has a long explanation of why > it's safe to ignore certain processes. That certainly needs to be > updated by any patch that's going to change the rules. Agreed that the comment needs to be updated to discuss the (im)possibility of arbitrary operations within a snapshot held by CIC. James
Re: remove spurious CREATE INDEX CONCURRENTLY wait
Alvaro Herrera writes: > To recap: currently, any CREATE INDEX CONCURRENTLY will wait for all > other CICs running concurrently to finish, because they can't be > distinguished amidst other old snapshots. We can change things by > having CIC set a special flag in PGPROC (like PROC_IN_VACUUM) indicating > that it's doing CIC; other CICs will see that flag and will know that > they don't need to wait for those processes. With this, CIC on small > tables don't have to wait for CIC on large tables to complete. Hm. +1 for improving this, if we can, but ... It seems clearly unsafe to ignore a CIC that is in active index-building; a snapshot held for that purpose is just as real as any other. It *might* be all right to ignore a CIC that is just waiting, but you haven't made any argument in the patch comments as to why that's safe either. (Moreover, at the points where we're just waiting, I don't think we have a snapshot, so another CIC's WaitForOlderSnapshots shouldn't wait for us anyway.) Actually, it doesn't look like you've touched the comments at all. WaitForOlderSnapshots' header comment has a long explanation of why it's safe to ignore certain processes. That certainly needs to be updated by any patch that's going to change the rules. BTW, what about REINDEX CONCURRENTLY? regards, tom lane
Re: remove spurious CREATE INDEX CONCURRENTLY wait
+ James Coleman -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
remove spurious CREATE INDEX CONCURRENTLY wait
I previously[1] posted a patch to have multiple CREATE INDEX CONCURRENTLY not wait for the slowest of them. This is an update of that, with minor conflicts fixed and a fresh thread. To recap: currently, any CREATE INDEX CONCURRENTLY will wait for all other CICs running concurrently to finish, because they can't be distinguished amidst other old snapshots. We can change things by having CIC set a special flag in PGPROC (like PROC_IN_VACUUM) indicating that it's doing CIC; other CICs will see that flag and will know that they don't need to wait for those processes. With this, CIC on small tables don't have to wait for CIC on large tables to complete. [1] https://postgr.es/m/20200805021109.GA9079@alvherre.pgsql -- Álvaro Herrerahttp://www.linkedin.com/in/alvherre "Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio) >From 2596c3033aacceb021463f58b50e2c4eed8a5ab2 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 4 Aug 2020 22:04:57 -0400 Subject: [PATCH] Flag CREATE INDEX CONCURRENTLY to avoid spurious waiting --- src/backend/commands/indexcmds.c | 13 +++-- src/include/storage/proc.h | 4 +++- src/include/storage/procarray.h | 4 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index a3cb3cd47f..241c8a337e 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -393,7 +393,7 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) VirtualTransactionId *old_snapshots; old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM | PROC_IN_CIC, _old_snapshots); if (progress) pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots); @@ -413,7 +413,7 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress) newer_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false, - PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, + PROC_IS_AUTOVACUUM | PROC_IN_VACUUM | PROC_IN_CIC, _newer_snapshots); for (j = i; j < n_old_snapshots; j++) { @@ -1420,6 +1420,9 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Set flag for other concurrent index builds to ignore us */ + MyPgXact->vacuumFlags |= PROC_IN_CIC; + /* * The index is now visible, so we can report the OID. */ @@ -1482,6 +1485,9 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Set flag for other concurrent index builds to ignore us */ + MyPgXact->vacuumFlags |= PROC_IN_CIC; + /* * Phase 3 of concurrent index build * @@ -1541,6 +1547,9 @@ DefineIndex(Oid relationId, CommitTransactionCommand(); StartTransactionCommand(); + /* Set flag for other concurrent index builds to ignore us */ + MyPgXact->vacuumFlags |= PROC_IN_CIC; + /* We should now definitely not be advertising any xmin. */ Assert(MyPgXact->xmin == InvalidTransactionId); diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h index 5ceb2494ba..b030dcde6c 100644 --- a/src/include/storage/proc.h +++ b/src/include/storage/proc.h @@ -52,6 +52,8 @@ struct XidCache */ #define PROC_IS_AUTOVACUUM 0x01 /* is it an autovac worker? */ #define PROC_IN_VACUUM 0x02 /* currently running lazy vacuum */ +#define PROC_IN_CIC 0x04 /* currently running CREATE INDEX + CONCURRENTLY */ #define PROC_VACUUM_FOR_WRAPAROUND 0x08 /* set by autovac only */ #define PROC_IN_LOGICAL_DECODING 0x10 /* currently doing logical * decoding outside xact */ @@ -59,7 +61,7 @@ struct XidCache /* flags reset at EOXact */ #define PROC_VACUUM_STATE_MASK \ - (PROC_IN_VACUUM | PROC_VACUUM_FOR_WRAPAROUND) + (PROC_IN_VACUUM | PROC_IN_CIC | PROC_VACUUM_FOR_WRAPAROUND) /* * We allow a small number of "weak" relation locks (AccessShareLock, diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index 01040d76e1..c6edeb36e0 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -29,17 +29,21 @@ */ #define PROCARRAY_VACUUM_FLAG 0x02 /* currently running lazy * vacuum */ +#define PROCARRAY_CIC_FLAG0x04 /* currently running CREATE INDEX + * CONCURRENTLY */ #define PROCARRAY_LOGICAL_DECODING_FLAG 0x10 /* currently doing logical * decoding outside xact */ #define PROCARRAY_SLOTS_XMIN 0x20 /* replication slot xmin, * catalog_xmin */ + /* * Only flags in PROCARRAY_PROC_FLAGS_MASK are considered when matching * PGXACT->vacuumFlags. Other flags are used for different purposes and * have no corresponding PROC flag equivalent. */ #define PROCARRAY_PROC_FLAGS_MASK (PROCARRAY_VACUUM_FLAG | \ + PROCARRAY_CIC_FLAG | \