On Sun, Jan 03, 2016 at 12:35:56AM -0500, Noah Misch wrote: > On Sat, Jan 02, 2016 at 07:22:13PM -0500, Tom Lane wrote: > > Noah Misch <n...@leadboat.com> writes: > > > I am inclined to add an Assert(portal->status != PORTAL_ACTIVE) to > > > emphasize > > > that this is backup only. MarkPortalActive() callers remain responsible > > > for > > > updating the status to something else before relinquishing control. > > > > No, I do not think that would be an improvement. There is no contract > > saying that this must be done earlier, IMO. > > Indeed, nobody wrote a contract. The assertion would record what has been the > sole standing practice for eleven years (since commit a393fbf9). It would > prompt discussion if a proposed patch would depart from that practice, and > that is a good thing. Also, every addition of dead code should label that > code to aid future readers.
Here's the patch I have in mind. AtAbort_Portals() has an older MarkPortalFailed() call, and the story is somewhat different there per its new comment. That call is plausible to reach with no bug involved, but MarkPortalFailed() would then be the wrong thing. nm
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c index a53673c..632b202 100644 --- a/src/backend/utils/mmgr/portalmem.c +++ b/src/backend/utils/mmgr/portalmem.c @@ -762,13 +762,22 @@ AtAbort_Portals(void) hash_seq_init(&status, PortalHashTable); while ((hentry = (PortalHashEnt *) hash_seq_search(&status)) != NULL) { Portal portal = hentry->portal; - /* Any portal that was actually running has to be considered broken */ + /* + * See similar code in AtSubAbort_Portals(). This would fire if code + * orchestrating multiple top-level transactions within a portal, such + * as VACUUM, caught errors and continued under the same portal with a + * fresh transaction. No part of core PostgreSQL functions that way, + * though it's a fair thing to want. Such code would wish the portal + * to remain ACTIVE, as in PreCommit_Portals(); we don't cater to + * that. Instead, like AtSubAbort_Portals(), interpret this as a bug. + */ + Assert(portal->status != PORTAL_ACTIVE); if (portal->status == PORTAL_ACTIVE) MarkPortalFailed(portal); /* * Do nothing else to cursors held over from a previous transaction. */ @@ -916,26 +925,29 @@ AtSubAbort_Portals(SubTransactionId mySubid, if (portal->activeSubid == mySubid) { /* Maintain activeSubid until the portal is removed */ portal->activeSubid = parentSubid; /* - * Upper-level portals that failed while running in this - * subtransaction must be forced into FAILED state, for the - * same reasons discussed below. + * If a bug in a MarkPortalActive() caller has it miss cleanup + * after having failed while running an upper-level portal in + * this subtransaction, we don't know what else in the portal + * is wrong. Force it into FAILED state, for the same reasons + * discussed below. * * We assume we can get away without forcing upper-level READY * portals to fail, even if they were run and then suspended. * In theory a suspended upper-level portal could have * acquired some references to objects that are about to be * destroyed, but there should be sufficient defenses against * such cases: the portal's original query cannot contain such * references, and any references within, say, cached plans of * PL/pgSQL functions are not from active queries and should * be protected by revalidation logic. */ + Assert(portal->status != PORTAL_ACTIVE); if (portal->status == PORTAL_ACTIVE) MarkPortalFailed(portal); /* * Also, if we failed it during the current subtransaction * (either just above, or earlier), reattach its resource @@ -957,14 +969,19 @@ AtSubAbort_Portals(SubTransactionId mySubid, } /* * Force any live portals of my own subtransaction into FAILED state. * We have to do this because they might refer to objects created or * changed in the failed subtransaction, leading to crashes within - * ExecutorEnd when portalcmds.c tries to close down the portal. + * ExecutorEnd when portalcmds.c tries to close down the portal. Each + * MarkPortalActive() caller ensures it updates the portal status + * again before relinquishing control, so ACTIVE can't happen here. + * If a bug does make it happen, dispose the portal like a normal + * MarkPortalActive() caller would. */ + Assert(portal->status != PORTAL_ACTIVE); if (portal->status == PORTAL_READY || portal->status == PORTAL_ACTIVE) MarkPortalFailed(portal); /* * Allow portalcmds.c to clean up the state it knows about, if we
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers