On Sun, Jun 29, 2014 at 9:12 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Meh.  Even "SELECT 1" is going to be doing *far* more pallocs than that to
> get through raw parsing, parse analysis, planning, and execution startup.
> If you can find a few hundred pallocs we can avoid in trivial queries,
> it would get interesting; but I'll be astonished if saving 4 is measurable.

I got nerd-sniped by this problem today, probably after staring at the
profiling data very similar to what led Andres to ask the question in
the first place.  The gains are indeed not measurable on a
macrobenchmark, but I had to write the patch to figure that out, so
here it is.

Although the gain isn't a measurable percentage of total runtime, it
does appear to be a significant percentage of the palloc traffic on
trivial queries. I did 30-second SELECT-only pgbench runs at scale
factor 10, using prepared queries.  According to perf, on unpatched
master, StartTransactionCommand accounts for 11.46% of the calls to
AllocSetAlloc.  (I imagine this is by runtime, not by call count, but
it probably doesn't matter much either way.)  But with this patch,
StartTransactionCommand's share drops to 4.43%.  Most of that is
coming from AtStart_Inval(), which wouldn't be hard to fix either.

I'm inclined to think that tamping down palloc traffic is worthwhile
even if we can't directly measure the savings on a macrobenchmark.
AllocSetAlloc is often at or near the top of profiling results, but
there's rarely a single caller responsible for enough of those calls
for to make it worth optimizing. But that means that if we refuse to
take patches that save just a few pallocs, we're basically giving up
on ever improving anything in this area, and that doesn't seem like a
good idea either; it's only going to get slowly worse over time as we
add more features.

BTW, if anyone's curious what the biggest source of AllocSetAlloc
calls is on this workload, the answer is ExecInitIndexScan(), which
looks to be indirectly responsible for almost half the total (or just
over half, with the patch applied).  That apparently calls a lot of
different things that allocate small amounts of memory, and it
accounts for nearly all of the AllocSetAlloc traffic that originates
from PortalStart().  I haven't poked around in there enough to know
whether there's anything worth optimizing, but given the degree to
which this patch shifts the profile, I bet the number that it takes to
make a material savings is more like a few dozen than a few hundred.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index f4c0ffa..1db0666 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -90,6 +90,7 @@ static void AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 					  int event, bool row_trigger,
 					  HeapTuple oldtup, HeapTuple newtup,
 					  List *recheckIndexes, Bitmapset *modifiedCols);
+static void AfterTriggerEnlargeQueryState(void);
 
 
 /*
@@ -3203,9 +3204,7 @@ typedef struct AfterTriggersData
 	int			maxtransdepth;	/* allocated len of above arrays */
 } AfterTriggersData;
 
-typedef AfterTriggersData *AfterTriggers;
-
-static AfterTriggers afterTriggers;
+static AfterTriggersData afterTriggers;
 
 static void AfterTriggerExecute(AfterTriggerEvent event,
 					Relation rel, TriggerDesc *trigdesc,
@@ -3228,7 +3227,7 @@ GetCurrentFDWTuplestore()
 {
 	Tuplestorestate *ret;
 
-	ret = afterTriggers->fdw_tuplestores[afterTriggers->query_depth];
+	ret = afterTriggers.fdw_tuplestores[afterTriggers.query_depth];
 	if (ret == NULL)
 	{
 		MemoryContext oldcxt;
@@ -3255,7 +3254,7 @@ GetCurrentFDWTuplestore()
 		CurrentResourceOwner = saveResourceOwner;
 		MemoryContextSwitchTo(oldcxt);
 
-		afterTriggers->fdw_tuplestores[afterTriggers->query_depth] = ret;
+		afterTriggers.fdw_tuplestores[afterTriggers.query_depth] = ret;
 	}
 
 	return ret;
@@ -3271,7 +3270,7 @@ static bool
 afterTriggerCheckState(AfterTriggerShared evtshared)
 {
 	Oid			tgoid = evtshared->ats_tgoid;
-	SetConstraintState state = afterTriggers->state;
+	SetConstraintState state = afterTriggers.state;
 	int			i;
 
 	/*
@@ -3282,19 +3281,22 @@ afterTriggerCheckState(AfterTriggerShared evtshared)
 		return false;
 
 	/*
-	 * Check if SET CONSTRAINTS has been executed for this specific trigger.
+	 * If constraint state exists, SET CONSTRAINTS might have been executed
+	 * either for this trigger or for all triggers.
 	 */
-	for (i = 0; i < state->numstates; i++)
+	if (state != NULL)
 	{
-		if (state->trigstates[i].sct_tgoid == tgoid)
-			return state->trigstates[i].sct_tgisdeferred;
-	}
+		/* Check for SET CONSTRAINTS for this specific trigger. */
+		for (i = 0; i < state->numstates; i++)
+		{
+			if (state->trigstates[i].sct_tgoid == tgoid)
+				return state->trigstates[i].sct_tgisdeferred;
+		}
 
-	/*
-	 * Check if SET CONSTRAINTS ALL has been executed; if so use that.
-	 */
-	if (state->all_isset)
-		return state->all_isdeferred;
+		/* Check for SET CONSTRAINTS ALL. */
+		if (state->all_isset)
+			return state->all_isdeferred;
+	}
 
 	/*
 	 * Otherwise return the default state for the trigger.
@@ -3331,8 +3333,8 @@ afterTriggerAddEvent(AfterTriggerEventList *events,
 		Size		chunksize;
 
 		/* Create event context if we didn't already */
-		if (afterTriggers->event_cxt == NULL)
-			afterTriggers->event_cxt =
+		if (afterTriggers.event_cxt == NULL)
+			afterTriggers.event_cxt =
 				AllocSetContextCreate(TopTransactionContext,
 									  "AfterTriggerEvents",
 									  ALLOCSET_DEFAULT_MINSIZE,
@@ -3375,7 +3377,7 @@ afterTriggerAddEvent(AfterTriggerEventList *events,
 				chunksize /= 2; /* too many shared records */
 			chunksize = Min(chunksize, MAX_CHUNK_SIZE);
 		}
-		chunk = MemoryContextAlloc(afterTriggers->event_cxt, chunksize);
+		chunk = MemoryContextAlloc(afterTriggers.event_cxt, chunksize);
 		chunk->next = NULL;
 		chunk->freeptr = CHUNK_DATA_START(chunk);
 		chunk->endptr = chunk->endfree = (char *) chunk + chunksize;
@@ -3706,7 +3708,7 @@ afterTriggerMarkEvents(AfterTriggerEventList *events,
 				/*
 				 * Mark it as to be fired in this firing cycle.
 				 */
-				evtshared->ats_firing_id = afterTriggers->firing_counter;
+				evtshared->ats_firing_id = afterTriggers.firing_counter;
 				event->ate_flags |= AFTER_TRIGGER_IN_PROGRESS;
 				found = true;
 			}
@@ -3898,40 +3900,28 @@ afterTriggerInvokeEvents(AfterTriggerEventList *events,
 void
 AfterTriggerBeginXact(void)
 {
-	Assert(afterTriggers == NULL);
+	/*
+	 * Initialize after-trigger state structure to empty
+	 */
+	afterTriggers.firing_counter = (CommandId) 1;		/* mustn't be 0 */
+	afterTriggers.query_depth = -1;
 
 	/*
-	 * Build empty after-trigger state structure
+	 * Verify that there is no leftover state remaining.  If these assertions
+	 * trip, it means that AfterTriggerEndXact wasn't called or didn't clean
+	 * up properly.
 	 */
-	afterTriggers = (AfterTriggers)
-		MemoryContextAlloc(TopTransactionContext,
-						   sizeof(AfterTriggersData));
-
-	afterTriggers->firing_counter = (CommandId) 1;		/* mustn't be 0 */
-	afterTriggers->state = SetConstraintStateCreate(8);
-	afterTriggers->events.head = NULL;
-	afterTriggers->events.tail = NULL;
-	afterTriggers->events.tailfree = NULL;
-	afterTriggers->query_depth = -1;
-
-	/* We initialize the arrays to a reasonable size */
-	afterTriggers->query_stack = (AfterTriggerEventList *)
-		MemoryContextAlloc(TopTransactionContext,
-						   8 * sizeof(AfterTriggerEventList));
-	afterTriggers->fdw_tuplestores = (Tuplestorestate **)
-		MemoryContextAllocZero(TopTransactionContext,
-							   8 * sizeof(Tuplestorestate *));
-	afterTriggers->maxquerydepth = 8;
-
-	/* Context for events is created only when needed */
-	afterTriggers->event_cxt = NULL;
-
-	/* Subtransaction stack is empty until/unless needed */
-	afterTriggers->state_stack = NULL;
-	afterTriggers->events_stack = NULL;
-	afterTriggers->depth_stack = NULL;
-	afterTriggers->firing_stack = NULL;
-	afterTriggers->maxtransdepth = 0;
+	Assert(afterTriggers.state == NULL);
+	Assert(afterTriggers.query_stack == NULL);
+	Assert(afterTriggers.fdw_tuplestores == NULL);
+	Assert(afterTriggers.maxquerydepth == 0);
+	Assert(afterTriggers.event_cxt == NULL);
+	Assert(afterTriggers.events.head == NULL);
+	Assert(afterTriggers.state_stack == NULL);
+	Assert(afterTriggers.events_stack == NULL);
+	Assert(afterTriggers.depth_stack == NULL);
+	Assert(afterTriggers.firing_stack == NULL);
+	Assert(afterTriggers.maxtransdepth == 0);
 }
 
 
@@ -3939,48 +3929,15 @@ AfterTriggerBeginXact(void)
  * AfterTriggerBeginQuery()
  *
  *	Called just before we start processing a single query within a
- *	transaction (or subtransaction).  Set up to record AFTER trigger
- *	events queued by the query.  Note that it is allowed to have
- *	nested queries within a (sub)transaction.
+ *	transaction (or subtransaction).  Most of the real work gets deferred
+ *	until somebody actually tries to queue a trigger event.
  * ----------
  */
 void
 AfterTriggerBeginQuery(void)
 {
-	AfterTriggerEventList *events;
-
-	/* Must be inside a transaction */
-	Assert(afterTriggers != NULL);
-
 	/* Increase the query stack depth */
-	afterTriggers->query_depth++;
-
-	/*
-	 * Allocate more space in the query stack if needed.
-	 */
-	if (afterTriggers->query_depth >= afterTriggers->maxquerydepth)
-	{
-		/* repalloc will keep the stack in the same context */
-		int			old_alloc = afterTriggers->maxquerydepth;
-		int			new_alloc = old_alloc * 2;
-
-		afterTriggers->query_stack = (AfterTriggerEventList *)
-			repalloc(afterTriggers->query_stack,
-					 new_alloc * sizeof(AfterTriggerEventList));
-		afterTriggers->fdw_tuplestores = (Tuplestorestate **)
-			repalloc(afterTriggers->fdw_tuplestores,
-					 new_alloc * sizeof(Tuplestorestate *));
-		/* Clear newly-allocated slots for subsequent lazy initialization. */
-		memset(afterTriggers->fdw_tuplestores + old_alloc,
-			   0, (new_alloc - old_alloc) * sizeof(Tuplestorestate *));
-		afterTriggers->maxquerydepth = new_alloc;
-	}
-
-	/* Initialize this query's list to empty */
-	events = &afterTriggers->query_stack[afterTriggers->query_depth];
-	events->head = NULL;
-	events->tail = NULL;
-	events->tailfree = NULL;
+	afterTriggers.query_depth++;
 }
 
 
@@ -4002,11 +3959,18 @@ AfterTriggerEndQuery(EState *estate)
 	AfterTriggerEventList *events;
 	Tuplestorestate *fdw_tuplestore;
 
-	/* Must be inside a transaction */
-	Assert(afterTriggers != NULL);
-
 	/* Must be inside a query, too */
-	Assert(afterTriggers->query_depth >= 0);
+	Assert(afterTriggers.query_depth >= 0);
+
+	/*
+	 * If we never even got as far as initializing the event stack, there
+	 * certainly won't be any events, so exit quickly.
+	 */
+	if (afterTriggers.query_depth >= afterTriggers.maxquerydepth)
+	{
+		afterTriggers.query_depth--;
+		return;
+	}
 
 	/*
 	 * Process all immediate-mode triggers queued by the query, and move the
@@ -4032,10 +3996,10 @@ AfterTriggerEndQuery(EState *estate)
 	 */
 	for (;;)
 	{
-		events = &afterTriggers->query_stack[afterTriggers->query_depth];
-		if (afterTriggerMarkEvents(events, &afterTriggers->events, true))
+		events = &afterTriggers.query_stack[afterTriggers.query_depth];
+		if (afterTriggerMarkEvents(events, &afterTriggers.events, true))
 		{
-			CommandId	firing_id = afterTriggers->firing_counter++;
+			CommandId	firing_id = afterTriggers.firing_counter++;
 
 			/* OK to delete the immediate events after processing them */
 			if (afterTriggerInvokeEvents(events, firing_id, estate, true))
@@ -4046,15 +4010,15 @@ AfterTriggerEndQuery(EState *estate)
 	}
 
 	/* Release query-local storage for events, including tuplestore if any */
-	fdw_tuplestore = afterTriggers->fdw_tuplestores[afterTriggers->query_depth];
+	fdw_tuplestore = afterTriggers.fdw_tuplestores[afterTriggers.query_depth];
 	if (fdw_tuplestore)
 	{
 		tuplestore_end(fdw_tuplestore);
-		afterTriggers->fdw_tuplestores[afterTriggers->query_depth] = NULL;
+		afterTriggers.fdw_tuplestores[afterTriggers.query_depth] = NULL;
 	}
-	afterTriggerFreeEventList(&afterTriggers->query_stack[afterTriggers->query_depth]);
+	afterTriggerFreeEventList(&afterTriggers.query_stack[afterTriggers.query_depth]);
 
-	afterTriggers->query_depth--;
+	afterTriggers.query_depth--;
 }
 
 
@@ -4075,18 +4039,15 @@ AfterTriggerFireDeferred(void)
 	AfterTriggerEventList *events;
 	bool		snap_pushed = false;
 
-	/* Must be inside a transaction */
-	Assert(afterTriggers != NULL);
-
-	/* ... but not inside a query */
-	Assert(afterTriggers->query_depth == -1);
+	/* Must not be inside a query */
+	Assert(afterTriggers.query_depth == -1);
 
 	/*
 	 * If there are any triggers to fire, make sure we have set a snapshot for
 	 * them to use.  (Since PortalRunUtility doesn't set a snap for COMMIT, we
 	 * can't assume ActiveSnapshot is valid on entry.)
 	 */
-	events = &afterTriggers->events;
+	events = &afterTriggers.events;
 	if (events->head != NULL)
 	{
 		PushActiveSnapshot(GetTransactionSnapshot());
@@ -4099,7 +4060,7 @@ AfterTriggerFireDeferred(void)
 	 */
 	while (afterTriggerMarkEvents(events, NULL, false))
 	{
-		CommandId	firing_id = afterTriggers->firing_counter++;
+		CommandId	firing_id = afterTriggers.firing_counter++;
 
 		if (afterTriggerInvokeEvents(events, firing_id, NULL, true))
 			break;				/* all fired */
@@ -4132,7 +4093,7 @@ void
 AfterTriggerEndXact(bool isCommit)
 {
 	/*
-	 * Forget everything we know about AFTER triggers.
+	 * Forget the pending-events list.
 	 *
 	 * Since all the info is in TopTransactionContext or children thereof, we
 	 * don't really need to do anything to reclaim memory.  However, the
@@ -4140,10 +4101,39 @@ AfterTriggerEndXact(bool isCommit)
 	 * soon as possible --- especially if we are aborting because we ran out
 	 * of memory for the list!
 	 */
-	if (afterTriggers && afterTriggers->event_cxt)
-		MemoryContextDelete(afterTriggers->event_cxt);
+	if (afterTriggers.event_cxt)
+	{
+		MemoryContextDelete(afterTriggers.event_cxt);
+		afterTriggers.event_cxt = NULL;
+		afterTriggers.events.head = NULL;
+		afterTriggers.events.tail = NULL;
+		afterTriggers.events.tailfree = NULL;
+	}
+
+	/*
+	 * Forget any subtransaction state as well.  Since this can't be very
+	 * large, we let the eventual reset of TopTransactionContext free the
+	 * memory instead of doing it here.
+	 */
+	afterTriggers.state_stack = NULL;
+	afterTriggers.events_stack = NULL;
+	afterTriggers.depth_stack = NULL;
+	afterTriggers.firing_stack = NULL;
+	afterTriggers.maxtransdepth = 0;
+
+
+	/*
+	 * Forget the query stack and constrant-related state information.  As
+	 * with the subtransaction state information, we don't bother freeing the
+	 * memory here.
+	 */
+	afterTriggers.query_stack = NULL;
+	afterTriggers.fdw_tuplestores = NULL;
+	afterTriggers.maxquerydepth = 0;
+	afterTriggers.state = NULL;
 
-	afterTriggers = NULL;
+	/* No more afterTriggers manipulation until next transaction starts. */
+	afterTriggers.query_depth = -1;
 }
 
 /*
@@ -4157,56 +4147,49 @@ AfterTriggerBeginSubXact(void)
 	int			my_level = GetCurrentTransactionNestLevel();
 
 	/*
-	 * Ignore call if the transaction is in aborted state.  (Probably
-	 * shouldn't happen?)
-	 */
-	if (afterTriggers == NULL)
-		return;
-
-	/*
 	 * Allocate more space in the stacks if needed.  (Note: because the
 	 * minimum nest level of a subtransaction is 2, we waste the first couple
 	 * entries of each array; not worth the notational effort to avoid it.)
 	 */
-	while (my_level >= afterTriggers->maxtransdepth)
+	while (my_level >= afterTriggers.maxtransdepth)
 	{
-		if (afterTriggers->maxtransdepth == 0)
+		if (afterTriggers.maxtransdepth == 0)
 		{
 			MemoryContext old_cxt;
 
 			old_cxt = MemoryContextSwitchTo(TopTransactionContext);
 
 #define DEFTRIG_INITALLOC 8
-			afterTriggers->state_stack = (SetConstraintState *)
+			afterTriggers.state_stack = (SetConstraintState *)
 				palloc(DEFTRIG_INITALLOC * sizeof(SetConstraintState));
-			afterTriggers->events_stack = (AfterTriggerEventList *)
+			afterTriggers.events_stack = (AfterTriggerEventList *)
 				palloc(DEFTRIG_INITALLOC * sizeof(AfterTriggerEventList));
-			afterTriggers->depth_stack = (int *)
+			afterTriggers.depth_stack = (int *)
 				palloc(DEFTRIG_INITALLOC * sizeof(int));
-			afterTriggers->firing_stack = (CommandId *)
+			afterTriggers.firing_stack = (CommandId *)
 				palloc(DEFTRIG_INITALLOC * sizeof(CommandId));
-			afterTriggers->maxtransdepth = DEFTRIG_INITALLOC;
+			afterTriggers.maxtransdepth = DEFTRIG_INITALLOC;
 
 			MemoryContextSwitchTo(old_cxt);
 		}
 		else
 		{
 			/* repalloc will keep the stacks in the same context */
-			int			new_alloc = afterTriggers->maxtransdepth * 2;
+			int			new_alloc = afterTriggers.maxtransdepth * 2;
 
-			afterTriggers->state_stack = (SetConstraintState *)
-				repalloc(afterTriggers->state_stack,
+			afterTriggers.state_stack = (SetConstraintState *)
+				repalloc(afterTriggers.state_stack,
 						 new_alloc * sizeof(SetConstraintState));
-			afterTriggers->events_stack = (AfterTriggerEventList *)
-				repalloc(afterTriggers->events_stack,
+			afterTriggers.events_stack = (AfterTriggerEventList *)
+				repalloc(afterTriggers.events_stack,
 						 new_alloc * sizeof(AfterTriggerEventList));
-			afterTriggers->depth_stack = (int *)
-				repalloc(afterTriggers->depth_stack,
+			afterTriggers.depth_stack = (int *)
+				repalloc(afterTriggers.depth_stack,
 						 new_alloc * sizeof(int));
-			afterTriggers->firing_stack = (CommandId *)
-				repalloc(afterTriggers->firing_stack,
+			afterTriggers.firing_stack = (CommandId *)
+				repalloc(afterTriggers.firing_stack,
 						 new_alloc * sizeof(CommandId));
-			afterTriggers->maxtransdepth = new_alloc;
+			afterTriggers.maxtransdepth = new_alloc;
 		}
 	}
 
@@ -4215,10 +4198,10 @@ AfterTriggerBeginSubXact(void)
 	 * is not saved until/unless changed.  Likewise, we don't make a
 	 * per-subtransaction event context until needed.
 	 */
-	afterTriggers->state_stack[my_level] = NULL;
-	afterTriggers->events_stack[my_level] = afterTriggers->events;
-	afterTriggers->depth_stack[my_level] = afterTriggers->query_depth;
-	afterTriggers->firing_stack[my_level] = afterTriggers->firing_counter;
+	afterTriggers.state_stack[my_level] = NULL;
+	afterTriggers.events_stack[my_level] = afterTriggers.events;
+	afterTriggers.depth_stack[my_level] = afterTriggers.query_depth;
+	afterTriggers.firing_stack[my_level] = afterTriggers.firing_counter;
 }
 
 /*
@@ -4236,26 +4219,19 @@ AfterTriggerEndSubXact(bool isCommit)
 	CommandId	subxact_firing_id;
 
 	/*
-	 * Ignore call if the transaction is in aborted state.  (Probably
-	 * unneeded)
-	 */
-	if (afterTriggers == NULL)
-		return;
-
-	/*
 	 * Pop the prior state if needed.
 	 */
 	if (isCommit)
 	{
-		Assert(my_level < afterTriggers->maxtransdepth);
+		Assert(my_level < afterTriggers.maxtransdepth);
 		/* If we saved a prior state, we don't need it anymore */
-		state = afterTriggers->state_stack[my_level];
+		state = afterTriggers.state_stack[my_level];
 		if (state != NULL)
 			pfree(state);
 		/* this avoids double pfree if error later: */
-		afterTriggers->state_stack[my_level] = NULL;
-		Assert(afterTriggers->query_depth ==
-			   afterTriggers->depth_stack[my_level]);
+		afterTriggers.state_stack[my_level] = NULL;
+		Assert(afterTriggers.query_depth ==
+			   afterTriggers.depth_stack[my_level]);
 	}
 	else
 	{
@@ -4264,7 +4240,7 @@ AfterTriggerEndSubXact(bool isCommit)
 		 * AfterTriggerBeginSubXact, in which case we mustn't risk touching
 		 * stack levels that aren't there.
 		 */
-		if (my_level >= afterTriggers->maxtransdepth)
+		if (my_level >= afterTriggers.maxtransdepth)
 			return;
 
 		/*
@@ -4273,42 +4249,46 @@ AfterTriggerEndSubXact(bool isCommit)
 		 * subtransaction will not add events to query levels started in a
 		 * earlier transaction state.
 		 */
-		while (afterTriggers->query_depth > afterTriggers->depth_stack[my_level])
+		while (afterTriggers.query_depth > afterTriggers.depth_stack[my_level])
 		{
-			Tuplestorestate *ts;
-
-			ts = afterTriggers->fdw_tuplestores[afterTriggers->query_depth];
-			if (ts)
+			if (afterTriggers.query_depth < afterTriggers.maxquerydepth)
 			{
-				tuplestore_end(ts);
-				afterTriggers->fdw_tuplestores[afterTriggers->query_depth] = NULL;
+				Tuplestorestate *ts;
+
+				ts = afterTriggers.fdw_tuplestores[afterTriggers.query_depth];
+				if (ts)
+				{
+					tuplestore_end(ts);
+					afterTriggers.fdw_tuplestores[afterTriggers.query_depth] = NULL;
+				}
+
+				afterTriggerFreeEventList(&afterTriggers.query_stack[afterTriggers.query_depth]);
 			}
 
-			afterTriggerFreeEventList(&afterTriggers->query_stack[afterTriggers->query_depth]);
-			afterTriggers->query_depth--;
+			afterTriggers.query_depth--;
 		}
-		Assert(afterTriggers->query_depth ==
-			   afterTriggers->depth_stack[my_level]);
+		Assert(afterTriggers.query_depth ==
+			   afterTriggers.depth_stack[my_level]);
 
 		/*
 		 * Restore the global deferred-event list to its former length,
 		 * discarding any events queued by the subxact.
 		 */
-		afterTriggerRestoreEventList(&afterTriggers->events,
-									 &afterTriggers->events_stack[my_level]);
+		afterTriggerRestoreEventList(&afterTriggers.events,
+									 &afterTriggers.events_stack[my_level]);
 
 		/*
 		 * Restore the trigger state.  If the saved state is NULL, then this
 		 * subxact didn't save it, so it doesn't need restoring.
 		 */
-		state = afterTriggers->state_stack[my_level];
+		state = afterTriggers.state_stack[my_level];
 		if (state != NULL)
 		{
-			pfree(afterTriggers->state);
-			afterTriggers->state = state;
+			pfree(afterTriggers.state);
+			afterTriggers.state = state;
 		}
 		/* this avoids double pfree if error later: */
-		afterTriggers->state_stack[my_level] = NULL;
+		afterTriggers.state_stack[my_level] = NULL;
 
 		/*
 		 * Scan for any remaining deferred events that were marked DONE or IN
@@ -4318,8 +4298,8 @@ AfterTriggerEndSubXact(bool isCommit)
 		 * (This essentially assumes that the current subxact includes all
 		 * subxacts started after it.)
 		 */
-		subxact_firing_id = afterTriggers->firing_stack[my_level];
-		for_each_event_chunk(event, chunk, afterTriggers->events)
+		subxact_firing_id = afterTriggers.firing_stack[my_level];
+		for_each_event_chunk(event, chunk, afterTriggers.events)
 		{
 			AfterTriggerShared evtshared = GetTriggerSharedData(event);
 
@@ -4334,6 +4314,66 @@ AfterTriggerEndSubXact(bool isCommit)
 	}
 }
 
+/* ----------
+ * AfterTriggerEnlargeQueryState()
+ *
+ *	Prepare the necessary state so that we can record AFTER trigger events
+ *	queued by a query.  It is allowed to have nested queries within a
+ *	(sub)transaction, so we need to have separate state for each query
+ *	nesting level.
+ * ----------
+ */
+static void
+AfterTriggerEnlargeQueryState(void)
+{
+	int		init_depth = afterTriggers.maxquerydepth;
+
+	Assert(afterTriggers.query_depth >= afterTriggers.maxquerydepth);
+
+	if (afterTriggers.maxquerydepth == 0)
+	{
+		int			new_alloc = Max(afterTriggers.query_depth, 8);
+
+		afterTriggers.query_stack = (AfterTriggerEventList *)
+			MemoryContextAlloc(TopTransactionContext,
+							   new_alloc * sizeof(AfterTriggerEventList));
+		afterTriggers.fdw_tuplestores = (Tuplestorestate **)
+			MemoryContextAllocZero(TopTransactionContext,
+								   new_alloc * sizeof(Tuplestorestate *));
+		afterTriggers.maxquerydepth = new_alloc;
+	}
+	else
+	{
+		/* repalloc will keep the stack in the same context */
+		int			old_alloc = afterTriggers.maxquerydepth;
+		int			new_alloc = Max(afterTriggers.query_depth, old_alloc * 2);
+
+		afterTriggers.query_stack = (AfterTriggerEventList *)
+			repalloc(afterTriggers.query_stack,
+					 new_alloc * sizeof(AfterTriggerEventList));
+		afterTriggers.fdw_tuplestores = (Tuplestorestate **)
+			repalloc(afterTriggers.fdw_tuplestores,
+					 new_alloc * sizeof(Tuplestorestate *));
+		/* Clear newly-allocated slots for subsequent lazy initialization. */
+		memset(afterTriggers.fdw_tuplestores + old_alloc,
+			   0, (new_alloc - old_alloc) * sizeof(Tuplestorestate *));
+		afterTriggers.maxquerydepth = new_alloc;
+	}
+
+	/* Initialize new query lists to empty */
+	while (init_depth < afterTriggers.maxquerydepth)
+	{
+		AfterTriggerEventList *events;
+
+		events = &afterTriggers.query_stack[init_depth];
+		events->head = NULL;
+		events->tail = NULL;
+		events->tailfree = NULL;
+
+		++init_depth;
+	}
+}
+
 /*
  * Create an empty SetConstraintState with room for numalloc trigstates
  */
@@ -4417,21 +4457,19 @@ AfterTriggerSetState(ConstraintsSetStmt *stmt)
 {
 	int			my_level = GetCurrentTransactionNestLevel();
 
-	/*
-	 * Ignore call if we aren't in a transaction.  (Shouldn't happen?)
-	 */
-	if (afterTriggers == NULL)
-		return;
+	/* If we haven't already done so, initialize our state. */
+	if (afterTriggers.state == NULL)
+		afterTriggers.state = SetConstraintStateCreate(8);
 
 	/*
 	 * If in a subtransaction, and we didn't save the current state already,
 	 * save it so it can be restored if the subtransaction aborts.
 	 */
 	if (my_level > 1 &&
-		afterTriggers->state_stack[my_level] == NULL)
+		afterTriggers.state_stack[my_level] == NULL)
 	{
-		afterTriggers->state_stack[my_level] =
-			SetConstraintStateCopy(afterTriggers->state);
+		afterTriggers.state_stack[my_level] =
+			SetConstraintStateCopy(afterTriggers.state);
 	}
 
 	/*
@@ -4442,13 +4480,13 @@ AfterTriggerSetState(ConstraintsSetStmt *stmt)
 		/*
 		 * Forget any previous SET CONSTRAINTS commands in this transaction.
 		 */
-		afterTriggers->state->numstates = 0;
+		afterTriggers.state->numstates = 0;
 
 		/*
 		 * Set the per-transaction ALL state to known.
 		 */
-		afterTriggers->state->all_isset = true;
-		afterTriggers->state->all_isdeferred = stmt->deferred;
+		afterTriggers.state->all_isset = true;
+		afterTriggers.state->all_isdeferred = stmt->deferred;
 	}
 	else
 	{
@@ -4622,7 +4660,7 @@ AfterTriggerSetState(ConstraintsSetStmt *stmt)
 		foreach(lc, tgoidlist)
 		{
 			Oid			tgoid = lfirst_oid(lc);
-			SetConstraintState state = afterTriggers->state;
+			SetConstraintState state = afterTriggers.state;
 			bool		found = false;
 			int			i;
 
@@ -4637,7 +4675,7 @@ AfterTriggerSetState(ConstraintsSetStmt *stmt)
 			}
 			if (!found)
 			{
-				afterTriggers->state =
+				afterTriggers.state =
 					SetConstraintStateAddItem(state, tgoid, stmt->deferred);
 			}
 		}
@@ -4656,12 +4694,12 @@ AfterTriggerSetState(ConstraintsSetStmt *stmt)
 	 */
 	if (!stmt->deferred)
 	{
-		AfterTriggerEventList *events = &afterTriggers->events;
+		AfterTriggerEventList *events = &afterTriggers.events;
 		bool		snapshot_set = false;
 
 		while (afterTriggerMarkEvents(events, NULL, true))
 		{
-			CommandId	firing_id = afterTriggers->firing_counter++;
+			CommandId	firing_id = afterTriggers.firing_counter++;
 
 			/*
 			 * Make sure a snapshot has been established in case trigger
@@ -4715,12 +4753,8 @@ AfterTriggerPendingOnRel(Oid relid)
 	AfterTriggerEventChunk *chunk;
 	int			depth;
 
-	/* No-op if we aren't in a transaction.  (Shouldn't happen?) */
-	if (afterTriggers == NULL)
-		return false;
-
 	/* Scan queued events */
-	for_each_event_chunk(event, chunk, afterTriggers->events)
+	for_each_event_chunk(event, chunk, afterTriggers.events)
 	{
 		AfterTriggerShared evtshared = GetTriggerSharedData(event);
 
@@ -4741,9 +4775,9 @@ AfterTriggerPendingOnRel(Oid relid)
 	 * if TRUNCATE/etc is executed by a function or trigger within an updating
 	 * query on the same relation, which is pretty perverse, but let's check.
 	 */
-	for (depth = 0; depth <= afterTriggers->query_depth; depth++)
+	for (depth = 0; depth <= afterTriggers.query_depth; depth++)
 	{
-		for_each_event_chunk(event, chunk, afterTriggers->query_stack[depth])
+		for_each_event_chunk(event, chunk, afterTriggers.query_stack[depth])
 		{
 			AfterTriggerShared evtshared = GetTriggerSharedData(event);
 
@@ -4787,15 +4821,17 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 	Tuplestorestate *fdw_tuplestore = NULL;
 
 	/*
-	 * Check state.  We use normal tests not Asserts because it is possible to
+	 * Check state.  We use a normal test not Assert because it is possible to
 	 * reach here in the wrong state given misconfigured RI triggers, in
 	 * particular deferring a cascade action trigger.
 	 */
-	if (afterTriggers == NULL)
-		elog(ERROR, "AfterTriggerSaveEvent() called outside of transaction");
-	if (afterTriggers->query_depth < 0)
+	if (afterTriggers.query_depth < 0)
 		elog(ERROR, "AfterTriggerSaveEvent() called outside of query");
 
+	/* Be sure we have enough space to record events at this query depth. */
+	if (afterTriggers.query_depth >= afterTriggers.maxquerydepth)
+		AfterTriggerEnlargeQueryState();
+
 	/*
 	 * Validate the event code and collect the associated tuple CTIDs.
 	 *
@@ -4959,7 +4995,7 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 		new_shared.ats_relid = RelationGetRelid(rel);
 		new_shared.ats_firing_id = 0;
 
-		afterTriggerAddEvent(&afterTriggers->query_stack[afterTriggers->query_depth],
+		afterTriggerAddEvent(&afterTriggers.query_stack[afterTriggers.query_depth],
 							 &new_event, &new_shared);
 	}
 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to