On Sat, Jan 3, 2015 at 12:42 PM, Peter Geoghegan <[email protected]> wrote:
> There are probably additional factors that make the situation more
> complicated than it is for the ON CONFLICT patch, but it's clear to me
> that the mutual dependency that can be involved with two ordinary
> exclusion constraint inserters is reason enough for those to deadlock.
I looked at the code in more detail, and realized that there were old
bugs in the exclusion constraint related modifications. I attach a
delta patch that fixes them. This is a combined patch that is all that
is needed to apply on top of v1.8.vallock2.tar.gz [1] to have all
available bugfixes.
This unbreaks my previous exclusion constraint test case, as far as it
goes. I am still concerned about the risk of lock starvation/livelocks
here. But I must admit, having stressed the patch with this latest
fix, that it's possible that the real risks have been overestimated.
Maybe this is in the same category as the way L&Y's algorithm could
theoretically starve a session with a pagesplit that needs to move
right indefinitely. It's a theoretical risk that turns out to not
matter in practice, for reasons that lack a real theoretical
justification beyond the fact that sustained very bad luck - a
sustained "perfect storm" - is required for starvation to occur. OTOH,
maybe my OS scheduler doesn't have the characteristic that provoke a
the suspected bug. I really don't know, but I suppose that one or the
other concurrent inserters will tend to win the race often enough - a
draw may be a very rare thing.
[1]
http://www.postgresql.org/message-id/cam3swzrg_htrol-6_wfe6_d_ucuyc28jfapsfh_tra76gkk...@mail.gmail.com
--
Peter Geoghegan
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index ea8c57b..f8a4017 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1133,8 +1133,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* If the index has an associated exclusion constraint, check that.
* This is simpler than the process for uniqueness checks since we
* always insert first and then check. If the constraint is deferred,
- * we check now anyway, but don't throw error on violation; instead
- * we'll queue a recheck event.
+ * we check now anyway, but don't throw error on violation or wait for
+ * a conclusive outcome from a concurrent insertion; instead we'll
+ * queue a recheck event. Similarly, noDupErr callers (speculative
+ * inserters) will recheck later, and wait for a conclusive outcome
+ * then.
*
* An index for an exclusion constraint can't also be UNIQUE (not an
* essential property, we just don't allow it in the grammar), so no
@@ -1142,15 +1145,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*/
if (indexInfo->ii_ExclusionOps != NULL)
{
- bool errorOK = (!indexRelation->rd_index->indimmediate &&
- !noDupErr);
+ bool violationOK = (!indexRelation->rd_index->indimmediate ||
+ noDupErr);
satisfiesConstraint =
check_exclusion_or_unique_constraint(heapRelation,
indexRelation, indexInfo,
tupleid, values, isnull,
- estate, false, errorOK,
- false, NULL);
+ estate, false,
+ violationOK, false, NULL);
}
if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1158,7 +1161,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
!satisfiesConstraint)
{
/*
- * The tuple potentially violates the uniqueness or exclusion
+ * The tuple potentially violates the unique index or exclusion
* constraint, so make a note of the index so that we can re-check
* it later.
*/
@@ -1322,7 +1325,7 @@ ExecCheckIndexConstraints(TupleTableSlot *slot,
* is convenient for deferred exclusion checks; we need not bother queuing
* a deferred event if there is definitely no conflict at insertion time.
*
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
* is impossible.
*
* Note: The indexam is normally responsible for checking unique constraints,
@@ -1335,7 +1338,7 @@ check_exclusion_or_unique_constraint(Relation heap, Relation index,
IndexInfo *indexInfo, ItemPointer tupleid,
Datum *values, bool *isnull,
EState *estate, bool newIndex,
- bool errorOK, bool wait,
+ bool violationOK, bool wait,
ItemPointer conflictTid)
{
Oid *constr_procs;
@@ -1462,7 +1465,7 @@ retry:
* we're not supposed to raise error, just return the fact of the
* potential conflict without waiting to see if it's real.
*/
- if (errorOK && !wait)
+ if (violationOK && !wait)
{
conflict = true;
if (conflictTid)
@@ -1487,7 +1490,7 @@ retry:
if (DirtySnapshot.speculativeToken)
SpeculativeInsertionWait(DirtySnapshot.xmin,
DirtySnapshot.speculativeToken);
- else if (errorOK)
+ else if (violationOK)
XactLockTableWait(xwait, heap, &tup->t_self,
XLTW_RecheckExclusionConstr);
else
@@ -1499,7 +1502,7 @@ retry:
/*
* We have a definite conflict. Return it to caller, or report it.
*/
- if (errorOK)
+ if (violationOK)
{
conflict = true;
if (conflictTid)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 4e5be1a..9bb13df 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -313,8 +313,8 @@ ExecInsert(TupleTableSlot *slot,
* deleting the already-inserted tuple and retrying, but that's fairly
* expensive, so we try to avoid it.
*/
- conflict = false;
vlock:
+ conflict = false;
ItemPointerSetInvalid(&conflictTid);
/*
@@ -354,7 +354,8 @@ vlock:
* which is appropriate only for non-promise tuples) to wait for us
* to decide if we're going to go ahead with the insertion.
*/
- SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+ if (spec != SPEC_NONE)
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
/*
* insert the tuple
@@ -363,7 +364,8 @@ vlock:
* the t_self field.
*/
newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, HEAP_INSERT_SPECULATIVE,
+ estate->es_output_cid,
+ spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0,
NULL);
/*
@@ -404,8 +406,11 @@ vlock:
estate->es_output_cid, NULL, false, &hufd, true);
}
- SpeculativeInsertionLockRelease(GetCurrentTransactionId());
- ClearSpeculativeInsertionState();
+ if (spec != SPEC_NONE)
+ {
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+ }
}
if (conflict)
@@ -1014,6 +1019,17 @@ ExecLockUpdateTuple(EState *estate,
LockTupleExclusive, false, /* wait */
false, &buffer, &hufd);
+ /* Was tuple concurrently super-deleted? */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple.t_data),
+ InvalidTransactionId))
+ {
+ test = HeapTupleUpdated;
+ /* XXX Should be caught within tqual.c */
+ elog(WARNING, "concurrent super delete of tuple (%u,%u)",
+ ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+ ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
+ }
+
if (test == HeapTupleMayBeUpdated)
copyTuple = heap_copytuple(&tuple);
@@ -1145,6 +1161,15 @@ ExecLockUpdateTuple(EState *estate,
*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
planSlot, &onConflict->mt_epqstate,
onConflict->ps.state, canSetTag);
+ else
+ /*
+ * XXX Debug instrumentation. This is useful for stress test
+ * queries with "WHERE target.index = excluded.index", in order
+ * to assert that the UPDATE affects the right row.
+ */
+ elog(WARNING, "did not update tuple (%u,%u)",
+ ItemPointerGetBlockNumber(&tuple.t_data->t_ctid),
+ ItemPointerGetOffsetNumber(&tuple.t_data->t_ctid));
ReleaseBuffer(buffer);
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 9216779..94099f1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -728,6 +728,16 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
snapshot->speculativeToken = 0;
+ /*
+ * Never return "super-deleted" tuples
+ *
+ * XXX: Comment this code out and you'll get conflicts within
+ * ExecLockUpdateTuple(), which result in an infinite loop.
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -943,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1147,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
+ * Immediately VACUUM "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return HEAPTUPLE_DEAD;
+
+ /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers