Re: [PATCHES] nested xacts and phantom Xids
Added to TODO, just so we don't forget later: * Use a phantom command counter for nested subtransactions to reduce tuple overhead --- Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Hmm ... yes, this could be very ugly indeed, but I haven't even looked > > at the executor code so I can't comment. Are executor nodes copyable? > > Nope, and even if we had support for that the executor tree per se > is just the tip of the iceberg. There's also indexscan status, SRF > function internal state, yadda yadda. I think the odds of doing > something with all that stuff for 7.5 are exactly zero ... we'd better > define a stopgap behavior. > > > Oh, and I've been playing with large objects and I've encountered bugs > > elsewhere. I'll look at it with the new patch you just posted. > > Wouldn't surprise me, we've not looked at that yet either. > > I do feel that we have enough things working that we should commit to > nested transactions for 7.5. There will be some things that we have to > restrict, such as cursors and perhaps large objects. But it's surely > better than no subtransactions at all. > > regards, tom lane > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Hmm ... yes, this could be very ugly indeed, but I haven't even looked > at the executor code so I can't comment. Are executor nodes copyable? Nope, and even if we had support for that the executor tree per se is just the tip of the iceberg. There's also indexscan status, SRF function internal state, yadda yadda. I think the odds of doing something with all that stuff for 7.5 are exactly zero ... we'd better define a stopgap behavior. > Oh, and I've been playing with large objects and I've encountered bugs > elsewhere. I'll look at it with the new patch you just posted. Wouldn't surprise me, we've not looked at that yet either. I do feel that we have enough things working that we should commit to nested transactions for 7.5. There will be some things that we have to restrict, such as cursors and perhaps large objects. But it's surely better than no subtransactions at all. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] nested xacts and phantom Xids
On Tue, Jun 29, 2004 at 06:59:20PM -0400, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > As with the bufmgr.c original patch, I don't really know how to test > > that this actually works. [...] > > I forgot to mention to you that that code didn't work at all, btw. Bad news, I guess. > The other theory we could adopt is that cursors stay open till main xact > commit; this would imply not releasing buffer refcounts at subxact > commit, plus any other resources needed by the cursor. We're already > holding locks that way and it probably wouldn't be a big change to make > bufmgr work the same. I'm not sure that there are any other resources > involved, other than the Portal memory which we already handle properly. Well, AFAIR originally I had thought that refcounts should be held at subtrans commit; you suggested that there was no reason for a subtrans to keep a buffer refcount and that was it. I think the open cursor is a good reason why the count should be kept; it appears less useful if you can't use the cursor anywhere out of the level that created it. > Oh, there's another point: what happens if an outer xact level declares > a cursor, which is then FETCHed from by a subtransaction? At minimum we > have the problem that this could change the set of buffer pins held, > which breaks the present bufmgr solution entirely. It gets even more > interesting if you are of the opinion that subtransaction failure should > cause the effects of the FETCH to be undone --- we have no way to do > that at all, because there's no mechanism for saving/restoring the state > of an entire execution plan tree. Hmm ... yes, this could be very ugly indeed, but I haven't even looked at the executor code so I can't comment. Are executor nodes copyable? Oh, and I've been playing with large objects and I've encountered bugs elsewhere. I'll look at it with the new patch you just posted. -- Alvaro Herrera () "Vivir y dejar de vivir son soluciones imaginarias. La existencia está en otra parte" (Andre Breton) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > As with the bufmgr.c original patch, I don't really know how to test > that this actually works. I fooled around with printing what it was > doing during a subtrans commit/abort, and it seems OK, but that's about > it. In what situations can a transaction roll back with a nonzero > reference count in a local buffer? You need an active cursor, eg begin; declare c cursor for select * from tenk1; fetch 1 in c; ... now you've got an open buffer refcount to some page of tenk1 I forgot to mention to you that that code didn't work at all, btw. I have fixed some of the problems in my local version but there's still a fairly large issue, which is what exactly we think the semantics of a cursor declared in a subtransaction ought to be. With bufmgr set up to consider open reference counts as a bug, we cannot hold such a cursor open past subtrans commit. One possible approach is to consider subxact commit the same as main xact commit as far as cursors are concerned: materialize anything declared WITH HOLD, close anything declared without. The other theory we could adopt is that cursors stay open till main xact commit; this would imply not releasing buffer refcounts at subxact commit, plus any other resources needed by the cursor. We're already holding locks that way and it probably wouldn't be a big change to make bufmgr work the same. I'm not sure that there are any other resources involved, other than the Portal memory which we already handle properly. The first approach is a lower-risk path; I'm not sure if the second one might have some hidden gotchas. It seems like the second one would be more flexible though. Any opinions which to pursue? Oh, there's another point: what happens if an outer xact level declares a cursor, which is then FETCHed from by a subtransaction? At minimum we have the problem that this could change the set of buffer pins held, which breaks the present bufmgr solution entirely. It gets even more interesting if you are of the opinion that subtransaction failure should cause the effects of the FETCH to be undone --- we have no way to do that at all, because there's no mechanism for saving/restoring the state of an entire execution plan tree. We might have to prohibit subtransactions from touching outer-level cursors, at least for 7.5. This would in turn make it a bit questionable whether there's any point in letting cursors propagate up out of subtransactions... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] nested xacts and phantom Xids
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote: > There's a good deal more than that missing :-(. Here are the modules or > actions that are called in CommitTransaction and/or AbortTransaction > that have not yet been touched by the patch: > > localbuf.c (refcounts need fixed same as bufmgr) Here is a patch against the original versions of these files; cleaned up bufmgr.c somewhat. Adds the same logic to local buffers (moving the BufferRefCount struct declaration to buf_internals.h so it's shared by both bufmgr.c and localbuf.c). Needs xact.c and xact.h patched as in the second patch. As with the bufmgr.c original patch, I don't really know how to test that this actually works. I fooled around with printing what it was doing during a subtrans commit/abort, and it seems OK, but that's about it. In what situations can a transaction roll back with a nonzero reference count in a local buffer? -- Alvaro Herrera () "I dream about dreams about dreams", sang the nightingale under the pale moon (Sandman) Index: src/include/storage/bufmgr.h === RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/bufmgr.h,v retrieving revision 1.82 diff -c -r1.82 bufmgr.h *** src/include/storage/bufmgr.h31 May 2004 19:24:05 - 1.82 --- src/include/storage/bufmgr.h21 Jun 2004 20:29:08 - *** *** 148,153 --- 148,155 extern char *ShowBufferUsage(void); extern void ResetBufferUsage(void); extern void AtEOXact_Buffers(bool isCommit); + extern void AtSubStart_Buffers(void); + extern void AtEOSubXact_Buffers(bool commit); extern void FlushBufferPool(void); extern BlockNumber BufferGetBlockNumber(Buffer buffer); extern BlockNumber RelationGetNumberOfBlocks(Relation relation); Index: src/include/storage/buf_internals.h === RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/buf_internals.h,v retrieving revision 1.71 diff -c -r1.71 buf_internals.h *** src/include/storage/buf_internals.h 18 Jun 2004 06:14:13 - 1.71 --- src/include/storage/buf_internals.h 29 Jun 2004 13:21:23 - *** *** 175,180 --- 175,189 extern long int BufferFlushCount; extern long int LocalBufferFlushCount; + /* + * We use a list of this struct to keep track of buffer reference + * count checking at subtransaction boundaries. + */ + typedef struct BufferRefCount + { + Buffer buffer; + int refcount; + } BufferRefCount; /* * Bufmgr Interface: *** *** 211,215 --- 220,226 bool *foundPtr); extern void WriteLocalBuffer(Buffer buffer, bool release); extern void AtEOXact_LocalBuffers(bool isCommit); + extern void AtSubStart_LocalBuffers(void); + extern void AtSubEnd_LocalBuffers(bool isCommit); #endif /* BUFMGR_INTERNALS_H */ Index: src/backend/storage/buffer/bufmgr.c === RCS file: /home/alvherre/cvs/pgsql-server/src/backend/storage/buffer/bufmgr.c,v retrieving revision 1.171 diff -c -r1.171 bufmgr.c *** src/backend/storage/buffer/bufmgr.c 18 Jun 2004 06:13:33 - 1.171 --- src/backend/storage/buffer/bufmgr.c 29 Jun 2004 20:26:09 - *** *** 38,43 --- 38,44 #include #include + #include "access/xact.h" #include "lib/stringinfo.h" #include "miscadmin.h" #include "storage/buf_internals.h" *** *** 46,51 --- 47,53 #include "storage/proc.h" #include "storage/smgr.h" #include "utils/relcache.h" + #include "utils/memutils.h" #include "pgstat.h" *** *** 67,72 --- 69,75 static void PinBuffer(BufferDesc *buf); static void UnpinBuffer(BufferDesc *buf); + static inline void BufferFixLeak(Buffer buffer, int should, bool warn); static void WaitIO(BufferDesc *buf); static void StartBufferIO(BufferDesc *buf, bool forInput); static void TerminateBufferIO(BufferDesc *buf, int err_flag); *** *** 826,853 for (i = 0; i < NBuffers; i++) { if (PrivateRefCount[i] != 0) ! { ! BufferDesc *buf = &(BufferDescriptors[i]); ! if (isCommit) ! elog(WARNING, !"buffer refcount leak: [%03d] " !"(rel=%u/%u/%u, blockNum=%u, flags=0x%x, refcount=%u %d)", !i, !buf->tag.rnode.spcNode, buf->tag.rnode.dbNode, !buf->tag.rnode.relNode, !buf->tag.blockNum, buf->flags, !buf->refcount, PrivateRefCount[i]); ! PrivateRefCount[i] = 1; /* make s
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > - GUC vars are rolled back on subxact abort This did not work very well, but here is a revised GUC patch that I think does work. It requires xact.c to export a function to report the current nesting depth, and requires AtEOXact_GUC to be called in all four cleanup paths (main and subxact commit and abort). BTW, why do you have assign_transaction_read_only() in your patch? It seems to me to be useful to create a readonly subxact of a read-write outer transaction. Or is that just not-done-yet? regards, tom lane *** src/backend/utils/misc/README.orig Mon Jan 19 14:04:40 2004 --- src/backend/utils/misc/README Tue Jun 29 15:14:44 2004 *** *** 68,116 would be effective had there never been any SET commands in the current session. ! To handle these cases we must keep track of as many as four distinct ! values for each variable. They are: * actual variable contentsalways the current effective value * reset_value the value to use for RESET - * session_value the "committed" setting for the session - * tentative_value the uncommitted result of SET ! During initialization we set the first three of these (actual, reset_value, ! and session_value) based on whichever non-interactive source has the ! highest priority. All three will have the same value. A SET LOCAL command sets the actual variable (and nothing else). At ! transaction end, the session_value is used to restore the actual variable ! to its pre-transaction value. A SET (or SET SESSION) command sets the actual variable, and if no error, then sets the tentative_value. If the transaction commits, the ! tentative_value is assigned to the session_value and the actual variable ! (which could by now be different, if the SET was followed by SET LOCAL). ! If the transaction aborts, the tentative_value is discarded and the ! actual variable is restored from the session_value. RESET is executed like a SET, but using the reset_value as the desired new value. (We do not provide a RESET LOCAL command, but SET LOCAL TO DEFAULT has the same behavior that RESET LOCAL would.) The source associated with ! the reset_value also becomes associated with the actual and session values. If SIGHUP is received, the GUC code rereads the postgresql.conf configuration file (this does not happen in the signal handler, but at next return to main loop; note that it can be executed while within a transaction). New values from postgresql.conf are assigned to actual ! variable, reset_value, and session_value, but only if each of these has a ! current source priority <= PGC_S_FILE. (It is thus possible for ! reset_value to track the config-file setting even if there is currently ! a different interactive value of the actual variable.) Note that tentative_value is unused and undefined except between a SET command and the end of the transaction. Also notice that we must track ! the source associated with each of the four values. The assign_hook and show_hook routines work only with the actual variable, and are not directly aware of the additional values maintained by GUC. --- 68,133 would be effective had there never been any SET commands in the current session. ! To handle these cases we must keep track of many distinct values for each ! variable. The primary values are: * actual variable contentsalways the current effective value * reset_value the value to use for RESET * tentative_value the uncommitted result of SET ! The reason we need a tentative_value separate from the actual value is ! that when a transaction does SET followed by SET LOCAL, the actual value ! will now be the LOCAL value, but we want to remember the prior SET so that ! that value is restored at transaction commit. ! ! In addition, for each level of transaction (possibly nested) we have to ! remember the transaction-entry-time actual and tentative values, in case ! we need to restore them at transaction end. (The RESET value is essentially ! non-transactional, so it doesn't have to be stacked.) For efficiency these ! stack entries are not constructed until/unless the variable is actually SET ! within a particular transaction. ! ! During initialization we set the actual value and reset_value based on ! whichever non-interactive source has the highest priority. They will ! have the same value. The tentative_value is not meaningful at this point. ! ! A SET command starts by stacking the existing actual and tentative values ! if this hasn't already been done within the current transaction. Then: A SET LOCAL command sets the actual variable (and nothing else). At ! transaction end, the stacked values are used to restore the GUC entry ! to its pre-transaction state. A SET (or SET SESSION) command sets the actual variable
Re: [PATCHES] nested xacts and phantom Xids
On Tue, Jun 29, 2004 at 03:22:52PM -0400, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > - GUC vars are rolled back on subxact abort > > This did not work very well, but here is a revised GUC patch that I think > does work. It requires xact.c to export a function to report the > current nesting depth, and requires AtEOXact_GUC to be called in all > four cleanup paths (main and subxact commit and abort). Very cool, thank you. I had thought about doing something like this but in the end I got scared away of changing too much code here. > BTW, why do you have assign_transaction_read_only() in your patch? It > seems to me to be useful to create a readonly subxact of a read-write > outer transaction. Or is that just not-done-yet? Nah, it's a leftover from back when there wasn't any way to roll back GUC vars. I thought it should be handled similarly to the way the isolation level should be handled. With your patch I think it can be ripped away entirely. -- Alvaro Herrera () Officer Krupke, what are we to do? Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke") ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
On Sun, Jun 27, 2004 at 12:49:10AM -0400, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > I'll work on adding the Xid-cache to PGPROC and using that in > > TransactionIdIsInProgress and the tqual routines. If you want to work > > on that let me know and I'll handle things like the password file, local > > bufmgr refcount, etc. > > Either one is okay, though doing the latter (ie modules you didn't touch > yet) would be probably a bit easier to merge. Will do. -- Alvaro Herrera () "Un poeta es un mundo encerrado en un hombre" (Victor Hugo) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] nested xacts and phantom Xids
On Sat, Jun 26, 2004 at 07:56:09PM -0400, Tom Lane wrote: > BTW, it would help to know what parts of the patch you intend to work on > over the next couple of days. I'm reviewing and editorializing right > now with intent to commit soon, so it would be good if we can avoid > tromping on each others' feet. Oops, I forgot that I had inadvertently left a kludge in commands/trigger.c. Please use this patch for this file instead. -- Alvaro Herrera () Jude: I wish humans laid eggs Ringlord: Why would you want humans to lay eggs? Jude: So I can eat them Index: include/commands/trigger.h === RCS file: /home/alvherre/cvs/pgsql-server/src/include/commands/trigger.h,v retrieving revision 1.45 diff -c -r1.45 trigger.h *** include/commands/trigger.h 29 Nov 2003 22:40:59 - 1.45 --- include/commands/trigger.h 25 Jun 2004 22:59:40 - *** *** 151,198 ItemPointer tupleid, HeapTuple newtuple); - - /* - * Deferred trigger stuff - */ - typedef struct DeferredTriggerStatusData - { - Oid dts_tgoid; - booldts_tgisdeferred; - } DeferredTriggerStatusData; - - typedef struct DeferredTriggerStatusData *DeferredTriggerStatus; - - typedef struct DeferredTriggerEventItem - { - Oid dti_tgoid; - int32 dti_state; - } DeferredTriggerEventItem; - - typedef struct DeferredTriggerEventData *DeferredTriggerEvent; - - typedef struct DeferredTriggerEventData - { - DeferredTriggerEvent dte_next; /* list link */ - int32 dte_event; - Oid dte_relid; - ItemPointerData dte_oldctid; - ItemPointerData dte_newctid; - int32 dte_n_items; - /* dte_item is actually a variable-size array, of length dte_n_items */ - DeferredTriggerEventItem dte_item[1]; - } DeferredTriggerEventData; - - extern void DeferredTriggerInit(void); extern void DeferredTriggerBeginXact(void); extern void DeferredTriggerEndQuery(void); extern void DeferredTriggerEndXact(void); extern void DeferredTriggerAbortXact(void); ! extern void DeferredTriggerSetState(ConstraintsSetStmt *stmt); - /* * in utils/adt/ri_triggers.c */ --- 151,165 ItemPointer tupleid, HeapTuple newtuple); extern void DeferredTriggerInit(void); extern void DeferredTriggerBeginXact(void); extern void DeferredTriggerEndQuery(void); extern void DeferredTriggerEndXact(void); extern void DeferredTriggerAbortXact(void); ! extern void DeferredTriggerBeginSubXact(void); ! extern void DeferredTriggerEndSubXact(bool isCommit); extern void DeferredTriggerSetState(ConstraintsSetStmt *stmt); /* * in utils/adt/ri_triggers.c */ Index: backend/commands/trigger.c === RCS file: /home/alvherre/cvs/pgsql-server/src/backend/commands/trigger.c,v retrieving revision 1.165 diff -c -r1.165 trigger.c *** backend/commands/trigger.c 26 May 2004 04:41:12 - 1.165 --- backend/commands/trigger.c 26 Jun 2004 18:12:01 - *** *** 50,58 MemoryContext per_tuple_context); static void DeferredTriggerSaveEvent(ResultRelInfo *relinfo, int event, bool row_trigger, HeapTuple oldtup, HeapTuple newtup); - static void DeferredTriggerExecute(DeferredTriggerEvent event, int itemno, - Relation rel, TriggerDesc *trigdesc, FmgrInfo *finfo, - MemoryContext per_tuple_context); /* --- 50,55 *** *** 1639,1685 /* -- * Deferred trigger stuff * -- */ ! typedef struct DeferredTriggersData { ! /* Internal data is held in a per-transaction memory context */ ! MemoryContext deftrig_cxt; ! /* ALL DEFERRED or ALL IMMEDIATE */ ! booldeftrig_all_isset; ! booldeftrig_all_isdeferred; ! /* Per trigger state */ ! List *deftrig_trigstates; ! /* List of pending deferred triggers. Previous comment below */ ! DeferredTriggerEvent deftrig_events; ! DeferredTriggerEvent deftrig_events_imm; ! DeferredTriggerEvent deftrig_event_tail; ! } DeferredTriggersData; ! /* -- ! * deftrig_events, deftrig_event_tail: ! * The list of pending deferred trigger events during the current transaction. * ! * deftrig_events is the head, deftrig_event_tail is the last entry. ! * Because this can grow pretty large, we don't use separate List nodes, ! * but instead thread the list through the dte_next fields of the member ! * nodes. Saves just a few bytes per entry, but that adds up. !
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > I'll work on adding the Xid-cache to PGPROC and using that in > TransactionIdIsInProgress and the tqual routines. If you want to work > on that let me know and I'll handle things like the password file, local > bufmgr refcount, etc. Either one is okay, though doing the latter (ie modules you didn't touch yet) would be probably a bit easier to merge. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] nested xacts and phantom Xids
On Sat, Jun 26, 2004 at 07:56:09PM -0400, Tom Lane wrote: > Do we really need SubtransCutoffXid? AFAICS the reason for having it is > only this claim in RecordTransactionCommit: > > * We can't mark committed subtransactions as fully committed, > * because concurrent transactions would see them as committed > * and not as in-progress. Leave them as "subcommitted" until > * the parent transaction is below OldestXmin, per VACUUM. Right, that's the only point where it's used. I don't know clearly if some kind of mechanism will be needed to handle SUBTRANS COMMIT states in pg_clog that were left behind by a crashed subtransaction though. > but I think this is dead wrong. As long as we mark the parent committed > first, there is no race condition. tqual tests that are using snapshots > will need to recognize that the subtransaction is a member of one of the > snapshotted main XIDs, and those that are not will think committed is > committed. So I want to mark subtransactions fully committed in > RecordTransactionCommit, and lose SubtransCutoffXid. Comments? Yes, sounds good. > BTW, it would help to know what parts of the patch you intend to work on > over the next couple of days. I'm reviewing and editorializing right > now with intent to commit soon, so it would be good if we can avoid > tromping on each others' feet. This is really excellent news. I'll work on adding the Xid-cache to PGPROC and using that in TransactionIdIsInProgress and the tqual routines. If you want to work on that let me know and I'll handle things like the password file, local bufmgr refcount, etc. -- Alvaro Herrera () FOO MANE PADME HUM ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Do we really need SubtransCutoffXid? AFAICS the reason for having it is only this claim in RecordTransactionCommit: * We can't mark committed subtransactions as fully committed, * because concurrent transactions would see them as committed * and not as in-progress. Leave them as "subcommitted" until * the parent transaction is below OldestXmin, per VACUUM. but I think this is dead wrong. As long as we mark the parent committed first, there is no race condition. tqual tests that are using snapshots will need to recognize that the subtransaction is a member of one of the snapshotted main XIDs, and those that are not will think committed is committed. So I want to mark subtransactions fully committed in RecordTransactionCommit, and lose SubtransCutoffXid. Comments? BTW, it would help to know what parts of the patch you intend to work on over the next couple of days. I'm reviewing and editorializing right now with intent to commit soon, so it would be good if we can avoid tromping on each others' feet. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] nested xacts and phantom Xids
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote: Regarding GUC, a WIP report: > Given patches for inval.c and guc.c, I would say that the patch is > functionally close enough to done that we could commit to including > it in 7.5 --- the other stuff could be wrapped up post-feature-freeze. I figured I could save the values whenever they are going to change, and restore them if the subtransaction aborts. This seems to work fine (lightly tested). I still have to figure out how to handle allocation for string vars, but I thought I'd post the patch for others to see. Please let me know if it's too ugly. (This patch misses the pieces in xact.c and xact.h but I'm sure the concept is clear.) I'll post a full patch once the missing deferred trigger stuff works. With the patches I posted to inval.c I think this fulfills the requirements, barring the performance issues raised. Comments? -- Alvaro Herrera () "No single strategy is always right (Unless the boss says so)" (Larry Wall) Index: guc.c === RCS file: /home/alvherre/cvs/pgsql-server/src/backend/utils/misc/guc.c,v retrieving revision 1.211 diff -c -w -b -B -c -r1.211 guc.c *** guc.c 11 Jun 2004 03:54:54 - 1.211 --- guc.c 24 Jun 2004 23:41:42 - *** *** 25,30 --- 25,31 #include "utils/guc.h" #include "utils/guc_tables.h" + #include "access/xact.h" #include "catalog/namespace.h" #include "catalog/pg_type.h" #include "commands/async.h" *** *** 54,59 --- 55,61 #include "tcop/tcopprot.h" #include "utils/array.h" #include "utils/builtins.h" + #include "utils/memutils.h" #include "utils/pg_locale.h" #include "pgstat.h" *** *** 76,81 --- 78,85 static const char *assign_log_destination(const char *value, bool doit, GucSource source); + static void SaveGucVariable(struct config_generic *conf); + #ifdef HAVE_SYSLOG extern char *Syslog_facility; extern char *Syslog_ident; *** *** 105,110 --- 109,115 GucSource source); static bool assign_stage_log_stats(bool newval, bool doit, GucSource source); static bool assign_log_stats(bool newval, bool doit, GucSource source); + static bool assign_transaction_read_only(bool newval, bool doit, GucSource source); /* *** *** 172,177 --- 177,183 static intmax_index_keys; static intmax_identifier_length; static intblock_size; + static intnesting_level; static bool integer_datetimes; /* Macros for freeing malloc'd pointers only if appropriate to do so */ *** *** 801,807 GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE }, &XactReadOnly, ! false, NULL, NULL }, { {"add_missing_from", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS, --- 807,813 GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE }, &XactReadOnly, ! false, assign_transaction_read_only, NULL }, { {"add_missing_from", PGC_USERSET, COMPAT_OPTIONS_PREVIOUS, *** *** 1311,1316 --- 1317,1333 BLCKSZ, BLCKSZ, BLCKSZ, NULL, NULL }, + { + /* XXX probably it's a bad idea for this to be GUC_REPORT. */ + {"nesting_level", PGC_INTERNAL, UNGROUPED, + gettext_noop("Shows the current transaction nesting level"), + NULL, + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_REPORT + }, + &nesting_level, + 0, 0, INT_MAX, NULL, NULL + }, + /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL *** *** 2001,2014 return find_option(map_old_guc_names[i+1]); } ! /* Check if the name is qualified, and if so, check if the qualifier * maps to a custom variable class. */ dot = strchr(name, GUC_QUALIFIER_SEPARATOR); if(dot != NULL && is_custom_class(name, dot - name)) ! /* !* Add a placeholder variable for this name !*/ return (struct config_generic*)add_placeholder_variable(name); /* Unknown name */ --- 2018,2030 return find_option(map_old_guc_names[i+1]); } ! /* !* Check if the name is qualified, and if so, check if the qualifier * maps to a custom variable class. */ dot = strchr(name, GUC_QUALIFIER_SEPARATOR); if(dot != NULL && is_custom_class(name, dot - name)) ! /* Add a placeholder variable for
Re: [PATCHES] nested xacts and phantom Xids
On Wed, 2004-06-23 at 13:57, Bruce Momjian wrote: > I am sorry to have given Alvaro another idea that didn't work. No way! Keep having the ideas, please. I've done some more digging in dead time on all of this and I think we're on the right course in general by implementing all of this. ...well done to Alvaro for being able to make the ideas reality. Best regards, Simon Riggs ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > It'd be good to have savepoints right now. I'm not sure it'd be good to > expose the nested transactions implementation if we are going to offer > savepoints later, because it means we will have to keep nested > transactions forever. Nested transactions are *good*. We should not hide them. I don't object to offering spec-compatible savepoints, but the plain fact of the matter is that that's a dumbed-down API. We should not feel constrained to offer only that. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PATCHES] nested xacts and phantom Xids
On Wed, Jun 23, 2004 at 08:57:11AM -0400, Bruce Momjian wrote: > I am sorry to have given Alvaro another idea that didn't work. It allowed me to learn a lot of cool tricks, so it wasn't wasted work. The only think I'm sorry about is that I should have used the time for something more useful, like tackling the remaining problems in the nested xacts implementation proper. > However, thinking of options, I wonder if instead of phantom xids, we > should do phantom cids. Because only the local backend looks at the > command counter (cid). I think it might be alot cleaner. The phantom > cid would have a tuple bit set indicating that instead of being a cid, > it is an index into an array of cmin/cmax pairs. Yeah, maybe this can work. I'm not going to try however, at least not now. If somebody else wants to try, be my guest. -- Alvaro Herrera () "El conflicto es el camino real hacia la unión" ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] nested xacts and phantom Xids
On Wed, Jun 23, 2004 at 08:58:15AM -0400, Bruce Momjian wrote: > If we add nested transactions for 7.5, are we going to need savepoints > too in the same release? If we don't, are we going to get a lot of > complaints? It'd be good to have savepoints right now. I'm not sure it'd be good to expose the nested transactions implementation if we are going to offer savepoints later, because it means we will have to keep nested transactions forever. Maybe it is a good idea to hide the implementation details and offer only the standard SAVEPOINT feature. I'm not sure how hard it is to change the syntax; I don't think it'd be a big deal. -- Alvaro Herrera () "La conclusión que podemos sacar de esos estudios es que no podemos sacar ninguna conclusión de ellos" (Tanenbaum) ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera wrote: > On Sun, Jun 20, 2004 at 04:37:16PM -0400, Tom Lane wrote: > > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > > Here I present the nested transactions patch and the phantom Xids patch > > > that goes with it. > > > > I looked at the phantom XIDs stuff a bit. I still have little confidence > > that the concept is correct :-( but here are some comments on the code > > level. > > Ok. I for one think this got much more complex than I had originally > thought it would be. I agree the changes to Set/Get Xmin/Xmax are way > beyond what one would want, but the alternative would be to spread the > complexity into their callers and I think that would be much worse. > > I don't have a lot of confidence in this either. The patch will be > available in archives if anybody wants to implement this in a cleaner > and safer way; I'll continue working on the rest of the things you > pointed out in the subtransactions patch. I am sorry to have given Alvaro another idea that didn't work. However, thinking of options, I wonder if instead of phantom xids, we should do phantom cids. Because only the local backend looks at the command counter (cid). I think it might be alot cleaner. The phantom cid would have a tuple bit set indicating that instead of being a cid, it is an index into an array of cmin/cmax pairs. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] nested xacts and phantom Xids
If we add nested transactions for 7.5, are we going to need savepoints too in the same release? If we don't, are we going to get a lot of complaints? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] nested xacts and phantom Xids
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote: > Of these the one that I think most urgently needs attention is inval.c; > that is complicated code already and figuring out what should happen > for subtrans commit/abort is not trivial. The others look relatively > straightforward to fix, if tedious. A patch for this is attached. It _seems_ to work ok; conceptually it seems ok too. I have done some testing which tends to indicate that it is right, but I'm not sure of all the implications this code has so I don't know how to test it exhaustively. Of course, regression tests pass, but this doesn't actually prove anything. -- Alvaro Herrera () "¿Que diferencia tiene para los muertos, los huérfanos, y aquellos que han perdido su hogar, si la loca destrucción ha sido realizada bajo el nombre del totalitarismo o del santo nombre de la libertad y la democracia?" (Gandhi) Index: src/include/storage/sinval.h === RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/sinval.h,v retrieving revision 1.35 diff -c -w -b -B -c -r1.35 sinval.h *** sinval.h2 Jun 2004 21:29:29 - 1.35 --- sinval.h22 Jun 2004 04:52:54 - *** *** 20,32 /* ! * We currently support two types of shared-invalidation messages: one that * invalidates an entry in a catcache, and one that invalidates a relcache * entry. More types could be added if needed. The message type is * identified by the first "int16" field of the message struct. Zero or * positive means a catcache inval message (and also serves as the catcache ! * ID field). -1 means a relcache inval message. Other negative values ! * are available to identify other inval message types. * * Relcache invalidation messages usually also cause invalidation of entries * in the smgr's relation cache. This means they must carry both logical --- 20,33 /* ! * We currently support three types of shared-invalidation messages: one that * invalidates an entry in a catcache, and one that invalidates a relcache * entry. More types could be added if needed. The message type is * identified by the first "int16" field of the message struct. Zero or * positive means a catcache inval message (and also serves as the catcache ! * ID field). -1 means a relcache inval message. -2 means a subtransaction ! * boundary message. Other negative values are available to identify other ! * inval message types. * * Relcache invalidation messages usually also cause invalidation of entries * in the smgr's relation cache. This means they must carry both logical *** *** 53,58 --- 54,64 * and so that negative cache entries can be recognized with good accuracy. * (Of course this assumes that all the backends are using identical hashing * code, but that should be OK.) + * + * A subtransaction boundary is not really a cache invalidation message; + * rather it's an implementation artifact for nested transactions. The + * cleanup code for subtransaction abort looks for this message as a boundary + * to know when to stop processing messages. */ typedef struct *** *** 79,89 --- 85,104 */ } SharedInvalRelcacheMsg; + #define SUBXACTBOUNDARYMSG_ID (-2) + + typedef struct + { + int16 id; /* type field --- must be first */ + TransactionId xid;/* transaction id */ + } SubxactBoundaryMsg; + typedef union { int16 id; /* type field --- must be first */ SharedInvalCatcacheMsg cc; SharedInvalRelcacheMsg rc; + SubxactBoundaryMsg sb; } SharedInvalidationMessage; Index: src/backend/utils/cache/inval.c === RCS file: /home/alvherre/cvs/pgsql-server/src/backend/utils/cache/inval.c,v retrieving revision 1.62 diff -c -w -b -B -c -r1.62 inval.c *** inval.c 18 Jun 2004 06:13:52 - 1.62 --- inval.c 23 Jun 2004 00:04:35 - *** *** 66,73 *manipulating the init file is in relcache.c, but we keep track of the *need for it here. * ! *All the request lists are kept in TopTransactionContext memory, since *they need not live beyond the end of the current transaction. * * * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group --- 66,75 *manipulating the init file is in relcache.c, but we keep track of the *need for it here. * ! *All the request lists are kept in CommitContext memory, since *they need not live beyond the end of the current transaction. + *Also, this makes it easy to free messages created in an aborting + *subtransaction. * * * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group ***
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > On Mon, Jun 21, 2004 at 10:28:59PM -0400, Tom Lane wrote: >> Also, rather than labeling each entry individually, it might be better >> to keep a separate list for each level of transaction. Then instead of >> relabeling, you'd just concat the subtrans list to its parent's. Seems >> like this should be faster and less storage. > Right, but it makes harder to detect when a duplicate relcache entry is > going to be inserted. Judging from the commentary in the file this is > an issue. It's only a minor efficiency hack; don't get too tense about avoiding dups. (We don't bother to check for dup catcache-inval entries at all...) The only thing I'd suggest you need to preserve is the business about invalidating catcache before relcache; again that's just an efficiency hack, but it seems simple enough to be worth doing. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] nested xacts and phantom Xids
On Mon, Jun 21, 2004 at 10:28:59PM -0400, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > This is tricky because chunks are prepended to the queue, but it also > > means we can stop processing as soon as a message belongs to another > > transaction. > > AFAIR there isn't any essential semantics to the ordering of the queue > entries, so you can probably get away with reordering if that makes life > any easier. > > Also, rather than labeling each entry individually, it might be better > to keep a separate list for each level of transaction. Then instead of > relabeling, you'd just concat the subtrans list to its parent's. Seems > like this should be faster and less storage. Right, but it makes harder to detect when a duplicate relcache entry is going to be inserted. Judging from the commentary in the file this is an issue. Another idea I had was: 1. at subtransaction start, add a special "boundary" message carrying the new Xid. 2. at subtransaction abort, process the first chunk backwards until the boundary message with the corresponding Xid is found; if it's not, jump to the next and repeat. At subtransaction commit nothing special needs to be done. This avoids having to label each message and to change the message appending scheme. -- Alvaro Herrera () "The eagle never lost so much time, as when he submitted to learn of the crow." (William Blake) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Regarding the invalidation messages, what I'm currently looking at is to > add a TransactionId to each message, which will be CurrentTransactionId > for each new message. When a subxact commits, all its messages are > relabeled to its parent. When a subxact aborts, all messages labeled > with its TransactionId are locally processed and discarded. Sounds plausible offhand. > This is tricky because chunks are prepended to the queue, but it also > means we can stop processing as soon as a message belongs to another > transaction. AFAIR there isn't any essential semantics to the ordering of the queue entries, so you can probably get away with reordering if that makes life any easier. Also, rather than labeling each entry individually, it might be better to keep a separate list for each level of transaction. Then instead of relabeling, you'd just concat the subtrans list to its parent's. Seems like this should be faster and less storage. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] nested xacts and phantom Xids
On Sun, Jun 20, 2004 at 04:37:16PM -0400, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Here I present the nested transactions patch and the phantom Xids patch > > that goes with it. > > I looked at the phantom XIDs stuff a bit. I still have little confidence > that the concept is correct :-( but here are some comments on the code > level. Ok. I for one think this got much more complex than I had originally thought it would be. I agree the changes to Set/Get Xmin/Xmax are way beyond what one would want, but the alternative would be to spread the complexity into their callers and I think that would be much worse. I don't have a lot of confidence in this either. The patch will be available in archives if anybody wants to implement this in a cleaner and safer way; I'll continue working on the rest of the things you pointed out in the subtransactions patch. Sadly, something broke in a really strange way between my last cvs update and today's, because I can't get the patch to initdb. It compiles without warnings (and I did make distclean), but there's a weird bug I'll have to pursue. Regarding the invalidation messages, what I'm currently looking at is to add a TransactionId to each message, which will be CurrentTransactionId for each new message. When a subxact commits, all its messages are relabeled to its parent. When a subxact aborts, all messages labeled with its TransactionId are locally processed and discarded. This is tricky because chunks are prepended to the queue, but it also means we can stop processing as soon as a message belongs to another transaction. I think GUC vars are easier: when a variable is about to change value inside a subtransaction, save the previous value in a list from which it will be restored if the subtransaction later aborts. -- Alvaro Herrera () "Hay dos momentos en la vida de un hombre en los que no debería especular: cuando puede permitírselo y cuando no puede" (Mark Twain) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] nested xacts and phantom Xids
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > What ideas do you have to help him? > > Eat the four-byte overhead. > > Quite frankly, given the other functionality and performance issues > he has to solve in the next ten days (see my other message just now), > I think he'd be a fool to spend one more minute on phantom XIDs. > Maybe for 7.6 someone will have the time to make the idea work. Well, the problem with eating the 4-byte overhead is that it has a performance/storage impact for all installations, not just those that use nested transactions. You were discussing a performance impact for nested transactions, but I assume that is an impact when some on the server are using nested transactions and others are not, while this hit would be for all sites, even when no one is using nested transactions. However, given the list you just supplied, I agree Alvaro should focus on these items and add the 4-bytes to store needed values rather than continue to hack on phantom xids. We can certainly revisit that later, even in 7.6, or never. We can put Manfred on it. He did the compaction in the first place. :-) One point is that if we want to re-add those 4 bytes, we have to get community buy-in for it because it is a step backward for those who are not going to use nested transactions. (I don't want to go the compile-time switch route.) As far as cleaning up some of the items after feature freeze, well, I would like to avoid that, but it seems safe to do because the changes should be very localized, and because win32 will be cleaning up things during feature freeze too, I am sure. However, going into feature freeze knowing a features isn't 100% functional is something we try to avoid, so this would have to be a special exception based on the fact that the features is very difficult, and that Alvaro is working full-time on this. It does take us down the slippery slope of having the release process dictated by completion of nested transactions. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] nested xacts and phantom Xids
Bruce Momjian <[EMAIL PROTECTED]> writes: > What ideas do you have to help him? Eat the four-byte overhead. Quite frankly, given the other functionality and performance issues he has to solve in the next ten days (see my other message just now), I think he'd be a fool to spend one more minute on phantom XIDs. Maybe for 7.6 someone will have the time to make the idea work. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Here I present the nested transactions patch and the phantom Xids patch > that goes with it. I took a very preliminary look through the nested-xacts patch, too. > Missing: rollback of SET CONSTRAINTS and GUC vars. There's a good deal more than that missing :-(. Here are the modules or actions that are called in CommitTransaction and/or AbortTransaction that have not yet been touched by the patch: shared-inval message queue processing (inval.c) lo_commit localbuf.c (refcounts need fixed same as bufmgr) eoxactcallbacks API needs extension GUC namespace (temp namespace cleanup) files (close/drop temp files) reset current userid during xact abort password file updating SetReindexProcessing disable (can these be nested??) Of these the one that I think most urgently needs attention is inval.c; that is complicated code already and figuring out what should happen for subtrans commit/abort is not trivial. The others look relatively straightforward to fix, if tedious. Given patches for inval.c and guc.c, I would say that the patch is functionally close enough to done that we could commit to including it in 7.5 --- the other stuff could be wrapped up post-feature-freeze. BUT (you knew there was a but coming, right?) ... I am really really disturbed about the potential performance hit from this patch. I don't think we want to have a nested-transaction feature that imposes significant overhead on people who aren't actually using nested transactions. As I looked through the patch I found quite a few places that would impose such overhead. The main thing that's bothering me is that TransactionIdIsInProgress has been turned from a relatively cheap test (just scan the in-memory PGPROC array) into an exceedingly expensive one. It's gratutitously badly coded (for N open main transactions it does N searches of the subtrans tree, when one would do), but I don't really want it going to the subtrans tree at all during typical cases. We had talked about keeping a limited number of active subtrans IDs in the PGPROC array, and I think we're really going to need to do that to buy back performance here. (BTW, what is the intended behavior of TransactionIdIsInProgress for a committed or aborted subtransaction whose parent is still open? If it has to recognize aborted subtranses as still in progress then things get very messy, but I think it probably doesn't have to.) I'm quite unhappy with the wide use of SubTransXidsHaveCommonAncestor to replace TransactionIdEquals in tqual.c, too. This turns very cheap tests into very expensive ones --- whether or not you use subtransactions --- and unfortunately tqual.c is coded on the assumption that these tests are cheap. We can *not* afford to repeat SubTransXidsHaveCommonAncestor for every open main transaction every time we check a tuple. In many ways this is the same problem as with TransactionIdIsInProgress, and it probably needs a similar solution. Also I found it annoying that XactLockTableWait waits for a top xact even when target subxact has already aborted; possibly this is even a recipe for deadlock. Perhaps should iterate up the tree one level at a time, waiting for each? Would mean that subxacts have to acquire a lock on their xid that other people can wait for, but this is not necessarily bad. (In fact, if we do that then I don't think XactLockTableWait has to change at all.) So I think there are several must-fix performance issues as well before we can make a decision to accept this for 7.5. I'd feel better about this if we had another month, but with ten days before feature freeze it's gonna be touch and go. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] nested xacts and phantom Xids
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Let me add that outside transactions read those phantom xid only when > > they are doing dirty reads. What I don't understand is when do outside > > transactions see tuples created inside a transaction? INSERT into a > > table with a unique key? > > > Once the main transaction commits, these phantom tuples should work just > > like ordinary tuples except they get their cmin overwritten when they > > are expired, I think. > > You're not doing anything to increase my confidence level in this > concept. You invented it, and even you aren't sure how it works. And your point is? I am trying to help Alvaro. I came up with an idea, and some thought it would help solve the problem. What ideas do you have to help him? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Bruce Momjian <[EMAIL PROTECTED]> writes: > Let me add that outside transactions read those phantom xid only when > they are doing dirty reads. What I don't understand is when do outside > transactions see tuples created inside a transaction? INSERT into a > table with a unique key? > Once the main transaction commits, these phantom tuples should work just > like ordinary tuples except they get their cmin overwritten when they > are expired, I think. You're not doing anything to increase my confidence level in this concept. You invented it, and even you aren't sure how it works. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] nested xacts and phantom Xids
Bruce Momjian wrote: > Tom Lane wrote: > > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > > Here I present the nested transactions patch and the phantom Xids patch > > > that goes with it. > > > > I looked at the phantom XIDs stuff a bit. I still have little confidence > > that the concept is correct :-( but here are some comments on the code > > level. > > > > > + * They are marked immediately in pg_subtrans as direct childs of the topmost > > > + * Xid (no matter how deep we are in the transaction tree), > > > > Why is that a good idea --- won't it cause problems when you > > commit/abort an intermediate level of subtransaction? > > I don't think so. The phantom xid is used only by the outside > transaction waiting for that tuple to be committe or aborted. The > outside tranaction will sleep on the topmost xid completing, then check > the phantom xid status for commit/abort. Within the transaction, I think > he uses command counter to know the creation and destruction sub-xid. > > I think right now phantom xids are used only for making parts of a > subtransaction committed or aborted and only in cases where the tuple is > created and destroyed by parts of the same transaction tree. > > I don't feel too bad about the runtime cost if only subtransactions are > paying that cost. I know we are really stretching the system here but I > would like to try a little more rather than give up and taking a space > hit for all tuples. Let me add that outside transactions read those phantom xid only when they are doing dirty reads. What I don't understand is when do outside transactions see tuples created inside a transaction? INSERT into a table with a unique key? Once the main transaction commits, these phantom tuples should work just like ordinary tuples except they get their cmin overwritten when they are expired, I think. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PATCHES] nested xacts and phantom Xids
Bruce Momjian <[EMAIL PROTECTED]> writes: > I don't feel too bad about the runtime cost if only subtransactions are > paying that cost. That's exactly why I'm so exercised about what's been done to the HeapTupleSet/Get macros. That's significant cost that's paid even when you're not using *any* of this stuff. > I know we are really stretching the system here but I > would like to try a little more rather than give up and taking a space > hit for all tuples. I don't even have any confidence that there are no fundamental bugs in the phantom-xid concept :-(. I'd be willing to play along if an implementation that seemed acceptable speedwise were being offered, but this thing is not preferable to four-more-bytes even if it works. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Here I present the nested transactions patch and the phantom Xids patch > > that goes with it. > > I looked at the phantom XIDs stuff a bit. I still have little confidence > that the concept is correct :-( but here are some comments on the code > level. > > > + * They are marked immediately in pg_subtrans as direct childs of the topmost > > + * Xid (no matter how deep we are in the transaction tree), > > Why is that a good idea --- won't it cause problems when you > commit/abort an intermediate level of subtransaction? I don't think so. The phantom xid is used only by the outside transaction waiting for that tuple to be committe or aborted. The outside tranaction will sleep on the topmost xid completing, then check the phantom xid status for commit/abort. Within the transaction, I think he uses command counter to know the creation and destruction sub-xid. I think right now phantom xids are used only for making parts of a subtransaction committed or aborted and only in cases where the tuple is created and destroyed by parts of the same transaction tree. I don't feel too bad about the runtime cost if only subtransactions are paying that cost. I know we are really stretching the system here but I would like to try a little more rather than give up and taking a space hit for all tuples. --- > > + * All this happens when HeapTupleHeaderSetXmax is called and the Xmin of the > > + * tuple is one of the Xids of the current transaction. > > + * > > + * Note that HeapTupleHeaderGetXmax/GetXmin return the masqueraded Xmin and > > + * Xmax, not the real one in the tuple header. > > I think that putting this sort of logic into Set/Get Xmin/Xmax is a > horrid idea. They are supposed to be transparent fetch/store macros, > not arbitrarily-complex filter functions. They're also supposed to > be reasonably fast :-( > > > + * ... We know to do this when SetXmax is called > > + * on a tuple that has the HEAP_MARKED_FOR_UPDATE bit set. > > Uglier and uglier. > > > *** > > *** 267,274 > > return true; > > > > /* deleting subtransaction aborted */ > > ! if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple))) > > ! return true; > > > > > > Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))); > > > > --- 268,276 > > return true; > > > > /* deleting subtransaction aborted */ > > ! if (tuple->t_infomask & HEAP_XMIN_IS_PHANTOM) > > ! if > > (TransactionPhantomIsCommittedPhantom(HeapTupleHeaderGetPhantomId(tuple))) > > ! return true; > > > > > > Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))); > > > > Er, what happened to checking for the "deleting subtransaction aborted" > case? > > On the whole, I think this was an interesting exercise but also an utter > failure. We'd be far better off to eat the extra 4 bytes per tuple > header and put back the separate Xmax field. The costs in both runtime > and complexity/reliability of this patch are simply not justified by > a 4-byte savings. > > regards, tom lane > > ---(end of broadcast)--- > TIP 6: Have you searched our list archives? > >http://archives.postgresql.org > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PATCHES] nested xacts and phantom Xids
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Here I present the nested transactions patch and the phantom Xids patch > that goes with it. I looked at the phantom XIDs stuff a bit. I still have little confidence that the concept is correct :-( but here are some comments on the code level. > + * They are marked immediately in pg_subtrans as direct childs of the topmost > + * Xid (no matter how deep we are in the transaction tree), Why is that a good idea --- won't it cause problems when you commit/abort an intermediate level of subtransaction? > + * All this happens when HeapTupleHeaderSetXmax is called and the Xmin of the > + * tuple is one of the Xids of the current transaction. > + * > + * Note that HeapTupleHeaderGetXmax/GetXmin return the masqueraded Xmin and > + * Xmax, not the real one in the tuple header. I think that putting this sort of logic into Set/Get Xmin/Xmax is a horrid idea. They are supposed to be transparent fetch/store macros, not arbitrarily-complex filter functions. They're also supposed to be reasonably fast :-( > + * ... We know to do this when SetXmax is called > + * on a tuple that has the HEAP_MARKED_FOR_UPDATE bit set. Uglier and uglier. > *** > *** 267,274 > return true; > > /* deleting subtransaction aborted */ > ! if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple))) > ! return true; > > > Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))); > > --- 268,276 > return true; > > /* deleting subtransaction aborted */ > ! if (tuple->t_infomask & HEAP_XMIN_IS_PHANTOM) > ! if > (TransactionPhantomIsCommittedPhantom(HeapTupleHeaderGetPhantomId(tuple))) > ! return true; > > > Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple))); > Er, what happened to checking for the "deleting subtransaction aborted" case? On the whole, I think this was an interesting exercise but also an utter failure. We'd be far better off to eat the extra 4 bytes per tuple header and put back the separate Xmax field. The costs in both runtime and complexity/reliability of this patch are simply not justified by a 4-byte savings. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org