Re: [PATCHES] nested xacts and phantom Xids

2004-07-10 Thread Bruce Momjian

Added to TODO, just so we don't forget later:

* Use a phantom command counter for nested subtransactions to reduce
  tuple overhead


---

Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Hmm ... yes, this could be very ugly indeed, but I haven't even looked
  at the executor code so I can't comment.  Are executor nodes copyable?
 
 Nope, and even if we had support for that the executor tree per se
 is just the tip of the iceberg.  There's also indexscan status, SRF
 function internal state, yadda yadda.  I think the odds of doing
 something with all that stuff for 7.5 are exactly zero ... we'd better
 define a stopgap behavior.
 
  Oh, and I've been playing with large objects and I've encountered bugs
  elsewhere.  I'll look at it with the new patch you just posted.
 
 Wouldn't surprise me, we've not looked at that yet either.
 
 I do feel that we have enough things working that we should commit to
 nested transactions for 7.5.  There will be some things that we have to
 restrict, such as cursors and perhaps large objects.  But it's surely
 better than no subtransactions at all.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] nested xacts and phantom Xids

2004-06-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 - GUC vars are rolled back on subxact abort

This did not work very well, but here is a revised GUC patch that I think
does work.  It requires xact.c to export a function to report the
current nesting depth, and requires AtEOXact_GUC to be called in all
four cleanup paths (main and subxact commit and abort).

BTW, why do you have assign_transaction_read_only() in your patch?  It
seems to me to be useful to create a readonly subxact of a read-write
outer transaction.  Or is that just not-done-yet?

regards, tom lane


*** src/backend/utils/misc/README.orig  Mon Jan 19 14:04:40 2004
--- src/backend/utils/misc/README   Tue Jun 29 15:14:44 2004
***
*** 68,116 
  would be effective had there never been any SET commands in the current
  session.
  
! To handle these cases we must keep track of as many as four distinct
! values for each variable.  They are:
  
  * actual variable contentsalways the current effective value
  
  * reset_value the value to use for RESET
  
- * session_value   the committed setting for the session
- 
  * tentative_value the uncommitted result of SET
  
! During initialization we set the first three of these (actual, reset_value,
! and session_value) based on whichever non-interactive source has the
! highest priority.  All three will have the same value.
  
  A SET LOCAL command sets the actual variable (and nothing else).  At
! transaction end, the session_value is used to restore the actual variable
! to its pre-transaction value.
  
  A SET (or SET SESSION) command sets the actual variable, and if no error,
  then sets the tentative_value.  If the transaction commits, the
! tentative_value is assigned to the session_value and the actual variable
! (which could by now be different, if the SET was followed by SET LOCAL).
! If the transaction aborts, the tentative_value is discarded and the
! actual variable is restored from the session_value.
  
  RESET is executed like a SET, but using the reset_value as the desired new
  value.  (We do not provide a RESET LOCAL command, but SET LOCAL TO DEFAULT
  has the same behavior that RESET LOCAL would.)  The source associated with
! the reset_value also becomes associated with the actual and session values.
  
  If SIGHUP is received, the GUC code rereads the postgresql.conf
  configuration file (this does not happen in the signal handler, but at
  next return to main loop; note that it can be executed while within a
  transaction).  New values from postgresql.conf are assigned to actual
! variable, reset_value, and session_value, but only if each of these has a
! current source priority = PGC_S_FILE.  (It is thus possible for
! reset_value to track the config-file setting even if there is currently
! a different interactive value of the actual variable.)
  
  Note that tentative_value is unused and undefined except between a SET
  command and the end of the transaction.  Also notice that we must track
! the source associated with each of the four values.
  
  The assign_hook and show_hook routines work only with the actual variable,
  and are not directly aware of the additional values maintained by GUC.
--- 68,133 
  would be effective had there never been any SET commands in the current
  session.
  
! To handle these cases we must keep track of many distinct values for each
! variable.  The primary values are:
  
  * actual variable contentsalways the current effective value
  
  * reset_value the value to use for RESET
  
  * tentative_value the uncommitted result of SET
  
! The reason we need a tentative_value separate from the actual value is
! that when a transaction does SET followed by SET LOCAL, the actual value
! will now be the LOCAL value, but we want to remember the prior SET so that
! that value is restored at transaction commit.
! 
! In addition, for each level of transaction (possibly nested) we have to
! remember the transaction-entry-time actual and tentative values, in case
! we need to restore them at transaction end.  (The RESET value is essentially
! non-transactional, so it doesn't have to be stacked.)  For efficiency these
! stack entries are not constructed until/unless the variable is actually SET
! within a particular transaction.
! 
! During initialization we set the actual value and reset_value based on
! whichever non-interactive source has the highest priority.  They will
! have the same value.  The tentative_value is not meaningful at this point.
! 
! A SET command starts by stacking the existing actual and tentative values
! if this hasn't already been done within the current transaction.  Then:
  
  A SET LOCAL command sets the actual variable (and nothing else).  At
! transaction end, the stacked values are used to restore the GUC entry
! to its pre-transaction state.
  
  A SET (or SET SESSION) command sets the actual variable, and 

Re: [PATCHES] nested xacts and phantom Xids

2004-06-29 Thread Alvaro Herrera
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote:

 There's a good deal more than that missing :-(.  Here are the modules or
 actions that are called in CommitTransaction and/or AbortTransaction
 that have not yet been touched by the patch:
 
 localbuf.c (refcounts need fixed same as bufmgr)

Here is a patch against the original versions of these files; cleaned up
bufmgr.c somewhat.  Adds the same logic to local buffers (moving the
BufferRefCount struct declaration to buf_internals.h so it's shared by
both bufmgr.c and localbuf.c).  Needs xact.c and xact.h patched as in
the second patch.

As with the bufmgr.c original patch, I don't really know how to test
that this actually works.  I fooled around with printing what it was
doing during a subtrans commit/abort, and it seems OK, but that's about
it.  In what situations can a transaction roll back with a nonzero
reference count in a local buffer?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
I dream about dreams about dreams, sang the nightingale
under the pale moon (Sandman)
Index: src/include/storage/bufmgr.h
===
RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/bufmgr.h,v
retrieving revision 1.82
diff -c -r1.82 bufmgr.h
*** src/include/storage/bufmgr.h31 May 2004 19:24:05 -  1.82
--- src/include/storage/bufmgr.h21 Jun 2004 20:29:08 -
***
*** 148,153 
--- 148,155 
  extern char *ShowBufferUsage(void);
  extern void ResetBufferUsage(void);
  extern void AtEOXact_Buffers(bool isCommit);
+ extern void AtSubStart_Buffers(void);
+ extern void AtEOSubXact_Buffers(bool commit);
  extern void FlushBufferPool(void);
  extern BlockNumber BufferGetBlockNumber(Buffer buffer);
  extern BlockNumber RelationGetNumberOfBlocks(Relation relation);
Index: src/include/storage/buf_internals.h
===
RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/buf_internals.h,v
retrieving revision 1.71
diff -c -r1.71 buf_internals.h
*** src/include/storage/buf_internals.h 18 Jun 2004 06:14:13 -  1.71
--- src/include/storage/buf_internals.h 29 Jun 2004 13:21:23 -
***
*** 175,180 
--- 175,189 
  extern long int BufferFlushCount;
  extern long int LocalBufferFlushCount;
  
+ /*
+  * We use a list of this struct to keep track of buffer reference
+  * count checking at subtransaction boundaries.
+  */
+ typedef struct BufferRefCount 
+ {
+   Buffer  buffer;
+   int refcount;
+ } BufferRefCount;
  
  /*
   * Bufmgr Interface:
***
*** 211,215 
--- 220,226 
 bool *foundPtr);
  extern void WriteLocalBuffer(Buffer buffer, bool release);
  extern void AtEOXact_LocalBuffers(bool isCommit);
+ extern void AtSubStart_LocalBuffers(void);
+ extern void AtSubEnd_LocalBuffers(bool isCommit);
  
  #endif   /* BUFMGR_INTERNALS_H */
Index: src/backend/storage/buffer/bufmgr.c
===
RCS file: /home/alvherre/cvs/pgsql-server/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.171
diff -c -r1.171 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c 18 Jun 2004 06:13:33 -  1.171
--- src/backend/storage/buffer/bufmgr.c 29 Jun 2004 20:26:09 -
***
*** 38,43 
--- 38,44 
  #include sys/file.h
  #include unistd.h
  
+ #include access/xact.h
  #include lib/stringinfo.h
  #include miscadmin.h
  #include storage/buf_internals.h
***
*** 46,51 
--- 47,53 
  #include storage/proc.h
  #include storage/smgr.h
  #include utils/relcache.h
+ #include utils/memutils.h
  #include pgstat.h
  
  
***
*** 67,72 
--- 69,75 
  
  static void PinBuffer(BufferDesc *buf);
  static void UnpinBuffer(BufferDesc *buf);
+ static inline void BufferFixLeak(Buffer buffer, int should, bool warn);
  static void WaitIO(BufferDesc *buf);
  static void StartBufferIO(BufferDesc *buf, bool forInput);
  static void TerminateBufferIO(BufferDesc *buf, int err_flag);
***
*** 826,853 
for (i = 0; i  NBuffers; i++)
{
if (PrivateRefCount[i] != 0)
!   {
!   BufferDesc *buf = (BufferDescriptors[i]);
  
!   if (isCommit)
!   elog(WARNING,
!buffer refcount leak: [%03d] 
!(rel=%u/%u/%u, blockNum=%u, flags=0x%x, 
refcount=%u %d),
!i,
!buf-tag.rnode.spcNode, buf-tag.rnode.dbNode,
!buf-tag.rnode.relNode,
!buf-tag.blockNum, buf-flags,
!buf-refcount, PrivateRefCount[i]);
  
!   PrivateRefCount[i] = 1; /* 

Re: [PATCHES] nested xacts and phantom Xids

2004-06-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 As with the bufmgr.c original patch, I don't really know how to test
 that this actually works.  I fooled around with printing what it was
 doing during a subtrans commit/abort, and it seems OK, but that's about
 it.  In what situations can a transaction roll back with a nonzero
 reference count in a local buffer?

You need an active cursor, eg

begin;
declare c cursor for select * from tenk1;
fetch 1 in c;
... now you've got an open buffer refcount to some page of tenk1

I forgot to mention to you that that code didn't work at all, btw.
I have fixed some of the problems in my local version but there's still
a fairly large issue, which is what exactly we think the semantics of a
cursor declared in a subtransaction ought to be.  With bufmgr set up to
consider open reference counts as a bug, we cannot hold such a cursor
open past subtrans commit.

One possible approach is to consider subxact commit the same as main
xact commit as far as cursors are concerned: materialize anything
declared WITH HOLD, close anything declared without.

The other theory we could adopt is that cursors stay open till main xact
commit; this would imply not releasing buffer refcounts at subxact
commit, plus any other resources needed by the cursor.  We're already
holding locks that way and it probably wouldn't be a big change to make
bufmgr work the same.  I'm not sure that there are any other resources
involved, other than the Portal memory which we already handle properly.

The first approach is a lower-risk path; I'm not sure if the second one
might have some hidden gotchas.  It seems like the second one would be
more flexible though.  Any opinions which to pursue?

Oh, there's another point: what happens if an outer xact level declares
a cursor, which is then FETCHed from by a subtransaction?  At minimum we
have the problem that this could change the set of buffer pins held,
which breaks the present bufmgr solution entirely.  It gets even more
interesting if you are of the opinion that subtransaction failure should
cause the effects of the FETCH to be undone --- we have no way to do
that at all, because there's no mechanism for saving/restoring the state
of an entire execution plan tree.  We might have to prohibit
subtransactions from touching outer-level cursors, at least for 7.5.
This would in turn make it a bit questionable whether there's any point
in letting cursors propagate up out of subtransactions...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] nested xacts and phantom Xids

2004-06-29 Thread Alvaro Herrera
On Tue, Jun 29, 2004 at 06:59:20PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  As with the bufmgr.c original patch, I don't really know how to test
  that this actually works.  [...]
 
 I forgot to mention to you that that code didn't work at all, btw.

Bad news, I guess.

 The other theory we could adopt is that cursors stay open till main xact
 commit; this would imply not releasing buffer refcounts at subxact
 commit, plus any other resources needed by the cursor.  We're already
 holding locks that way and it probably wouldn't be a big change to make
 bufmgr work the same.  I'm not sure that there are any other resources
 involved, other than the Portal memory which we already handle properly.

Well, AFAIR originally I had thought that refcounts should be held at
subtrans commit; you suggested that there was no reason for a subtrans
to keep a buffer refcount and that was it.  I think the open cursor is a
good reason why the count should be kept; it appears less useful if you
can't use the cursor anywhere out of the level that created it.

 Oh, there's another point: what happens if an outer xact level declares
 a cursor, which is then FETCHed from by a subtransaction?  At minimum we
 have the problem that this could change the set of buffer pins held,
 which breaks the present bufmgr solution entirely.  It gets even more
 interesting if you are of the opinion that subtransaction failure should
 cause the effects of the FETCH to be undone --- we have no way to do
 that at all, because there's no mechanism for saving/restoring the state
 of an entire execution plan tree.

Hmm ... yes, this could be very ugly indeed, but I haven't even looked
at the executor code so I can't comment.  Are executor nodes copyable?

Oh, and I've been playing with large objects and I've encountered bugs
elsewhere.  I'll look at it with the new patch you just posted.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte (Andre Breton)


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] nested xacts and phantom Xids

2004-06-29 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hmm ... yes, this could be very ugly indeed, but I haven't even looked
 at the executor code so I can't comment.  Are executor nodes copyable?

Nope, and even if we had support for that the executor tree per se
is just the tip of the iceberg.  There's also indexscan status, SRF
function internal state, yadda yadda.  I think the odds of doing
something with all that stuff for 7.5 are exactly zero ... we'd better
define a stopgap behavior.

 Oh, and I've been playing with large objects and I've encountered bugs
 elsewhere.  I'll look at it with the new patch you just posted.

Wouldn't surprise me, we've not looked at that yet either.

I do feel that we have enough things working that we should commit to
nested transactions for 7.5.  There will be some things that we have to
restrict, such as cursors and perhaps large objects.  But it's surely
better than no subtransactions at all.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] nested xacts and phantom Xids

2004-06-27 Thread Alvaro Herrera
On Sun, Jun 27, 2004 at 12:49:10AM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  I'll work on adding the Xid-cache to PGPROC and using that in
  TransactionIdIsInProgress and the tqual routines.  If you want to work
  on that let me know and I'll handle things like the password file, local
  bufmgr refcount, etc.
 
 Either one is okay, though doing the latter (ie modules you didn't touch
 yet) would be probably a bit easier to merge.

Will do.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Un poeta es un mundo encerrado en un hombre (Victor Hugo)


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] nested xacts and phantom Xids

2004-06-26 Thread Alvaro Herrera
On Sat, Jun 26, 2004 at 07:56:09PM -0400, Tom Lane wrote:
 Do we really need SubtransCutoffXid?  AFAICS the reason for having it is
 only this claim in RecordTransactionCommit:
 
  * We can't mark committed subtransactions as fully committed,
  * because concurrent transactions would see them as committed
  * and not as in-progress.  Leave them as subcommitted until
  * the parent transaction is below OldestXmin, per VACUUM.

Right, that's the only point where it's used.  I don't know clearly if
some kind of mechanism will be needed to handle SUBTRANS COMMIT states
in pg_clog that were left behind by a crashed subtransaction though.

 but I think this is dead wrong.  As long as we mark the parent committed
 first, there is no race condition.  tqual tests that are using snapshots
 will need to recognize that the subtransaction is a member of one of the
 snapshotted main XIDs, and those that are not will think committed is
 committed.  So I want to mark subtransactions fully committed in
 RecordTransactionCommit, and lose SubtransCutoffXid.  Comments?

Yes, sounds good.

 BTW, it would help to know what parts of the patch you intend to work on
 over the next couple of days.  I'm reviewing and editorializing right
 now with intent to commit soon, so it would be good if we can avoid
 tromping on each others' feet.

This is really excellent news.

I'll work on adding the Xid-cache to PGPROC and using that in
TransactionIdIsInProgress and the tqual routines.  If you want to work
on that let me know and I'll handle things like the password file, local
bufmgr refcount, etc.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
FOO MANE PADME HUM


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] nested xacts and phantom Xids

2004-06-26 Thread Alvaro Herrera
On Sat, Jun 26, 2004 at 07:56:09PM -0400, Tom Lane wrote:

 BTW, it would help to know what parts of the patch you intend to work on
 over the next couple of days.  I'm reviewing and editorializing right
 now with intent to commit soon, so it would be good if we can avoid
 tromping on each others' feet.

Oops, I forgot that I had inadvertently left a kludge in
commands/trigger.c.  Please use this patch for this file instead.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them
Index: include/commands/trigger.h
===
RCS file: /home/alvherre/cvs/pgsql-server/src/include/commands/trigger.h,v
retrieving revision 1.45
diff -c -r1.45 trigger.h
*** include/commands/trigger.h  29 Nov 2003 22:40:59 -  1.45
--- include/commands/trigger.h  25 Jun 2004 22:59:40 -
***
*** 151,198 
 ItemPointer tupleid,
 HeapTuple newtuple);
  
- 
- /*
-  * Deferred trigger stuff
-  */
- typedef struct DeferredTriggerStatusData
- {
-   Oid dts_tgoid;
-   booldts_tgisdeferred;
- } DeferredTriggerStatusData;
- 
- typedef struct DeferredTriggerStatusData *DeferredTriggerStatus;
- 
- typedef struct DeferredTriggerEventItem
- {
-   Oid dti_tgoid;
-   int32   dti_state;
- } DeferredTriggerEventItem;
- 
- typedef struct DeferredTriggerEventData *DeferredTriggerEvent;
- 
- typedef struct DeferredTriggerEventData
- {
-   DeferredTriggerEvent dte_next;  /* list link */
-   int32   dte_event;
-   Oid dte_relid;
-   ItemPointerData dte_oldctid;
-   ItemPointerData dte_newctid;
-   int32   dte_n_items;
-   /* dte_item is actually a variable-size array, of length dte_n_items */
-   DeferredTriggerEventItem dte_item[1];
- } DeferredTriggerEventData;
- 
- 
  extern void DeferredTriggerInit(void);
  extern void DeferredTriggerBeginXact(void);
  extern void DeferredTriggerEndQuery(void);
  extern void DeferredTriggerEndXact(void);
  extern void DeferredTriggerAbortXact(void);
! 
  extern void DeferredTriggerSetState(ConstraintsSetStmt *stmt);
  
- 
  /*
   * in utils/adt/ri_triggers.c
   */
--- 151,165 
 ItemPointer tupleid,
 HeapTuple newtuple);
  
  extern void DeferredTriggerInit(void);
  extern void DeferredTriggerBeginXact(void);
  extern void DeferredTriggerEndQuery(void);
  extern void DeferredTriggerEndXact(void);
  extern void DeferredTriggerAbortXact(void);
! extern void DeferredTriggerBeginSubXact(void);
! extern void DeferredTriggerEndSubXact(bool isCommit);
  extern void DeferredTriggerSetState(ConstraintsSetStmt *stmt);
  
  /*
   * in utils/adt/ri_triggers.c
   */
Index: backend/commands/trigger.c
===
RCS file: /home/alvherre/cvs/pgsql-server/src/backend/commands/trigger.c,v
retrieving revision 1.165
diff -c -r1.165 trigger.c
*** backend/commands/trigger.c  26 May 2004 04:41:12 -  1.165
--- backend/commands/trigger.c  26 Jun 2004 18:12:01 -
***
*** 50,58 
MemoryContext per_tuple_context);
  static void DeferredTriggerSaveEvent(ResultRelInfo *relinfo, int event,
   bool row_trigger, HeapTuple oldtup, HeapTuple 
newtup);
- static void DeferredTriggerExecute(DeferredTriggerEvent event, int itemno,
-   Relation rel, TriggerDesc *trigdesc, FmgrInfo 
*finfo,
-  MemoryContext per_tuple_context);
  
  
  /*
--- 50,55 
***
*** 1639,1685 
  
  /* --
   * Deferred trigger stuff
   * --
   */
  
! typedef struct DeferredTriggersData
  {
!   /* Internal data is held in a per-transaction memory context */
!   MemoryContext deftrig_cxt;
!   /* ALL DEFERRED or ALL IMMEDIATE */
!   booldeftrig_all_isset;
!   booldeftrig_all_isdeferred;
!   /* Per trigger state */
!   List   *deftrig_trigstates;
!   /* List of pending deferred triggers. Previous comment below */
!   DeferredTriggerEvent deftrig_events;
!   DeferredTriggerEvent deftrig_events_imm;
!   DeferredTriggerEvent deftrig_event_tail;
! } DeferredTriggersData;
  
! /* --
!  * deftrig_events, deftrig_event_tail:
!  * The list of pending deferred trigger events during the current transaction.
   *
!  * deftrig_events is the head, deftrig_event_tail is the last entry.
!  * Because this can grow pretty large, we don't use separate List nodes,
!  * but instead thread the list through the dte_next fields of the member
!  * nodes.  Saves just a few bytes per 

Re: [PATCHES] nested xacts and phantom Xids

2004-06-24 Thread Simon Riggs
On Wed, 2004-06-23 at 13:57, Bruce Momjian wrote:
 I am sorry to have given Alvaro another idea that didn't work. 

No way! Keep having the ideas, please.

I've done some more digging in dead time on all of this and I think
we're on the right course in general by implementing all of this.

...well done to Alvaro for being able to make the ideas reality.

Best regards, Simon Riggs


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] nested xacts and phantom Xids

2004-06-24 Thread Alvaro Herrera
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote:

Regarding GUC, a WIP report:

 Given patches for inval.c and guc.c, I would say that the patch is
 functionally close enough to done that we could commit to including
 it in 7.5 --- the other stuff could be wrapped up post-feature-freeze.

I figured I could save the values whenever they are going to change, and
restore them if the subtransaction aborts.  This seems to work fine
(lightly tested).

I still have to figure out how to handle allocation for string vars, but
I thought I'd post the patch for others to see.  Please let me know if
it's too ugly.  (This patch misses the pieces in xact.c and xact.h but
I'm sure the concept is clear.)

I'll post a full patch once the missing deferred trigger stuff works.
With the patches I posted to inval.c I think this fulfills the
requirements, barring the performance issues raised.

Comments?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
No single strategy is always right (Unless the boss says so)
(Larry Wall)
Index: guc.c
===
RCS file: /home/alvherre/cvs/pgsql-server/src/backend/utils/misc/guc.c,v
retrieving revision 1.211
diff -c -w -b -B -c -r1.211 guc.c
*** guc.c   11 Jun 2004 03:54:54 -  1.211
--- guc.c   24 Jun 2004 23:41:42 -
***
*** 25,30 
--- 25,31 
  #include utils/guc.h
  #include utils/guc_tables.h
  
+ #include access/xact.h
  #include catalog/namespace.h
  #include catalog/pg_type.h
  #include commands/async.h
***
*** 54,59 
--- 55,61 
  #include tcop/tcopprot.h
  #include utils/array.h
  #include utils/builtins.h
+ #include utils/memutils.h
  #include utils/pg_locale.h
  #include pgstat.h
  
***
*** 76,81 
--- 78,85 
  static const char *assign_log_destination(const char *value,
bool doit, GucSource source);
  
+ static void SaveGucVariable(struct config_generic *conf);
+ 
  #ifdef HAVE_SYSLOG
  extern char *Syslog_facility;
  extern char *Syslog_ident;
***
*** 105,110 
--- 109,115 
   GucSource source);
  static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
  static bool assign_log_stats(bool newval, bool doit, GucSource source);
+ static bool assign_transaction_read_only(bool newval, bool doit, GucSource source);
  
  
  /*
***
*** 172,177 
--- 177,183 
  static intmax_index_keys;
  static intmax_identifier_length;
  static intblock_size;
+ static intnesting_level;
  static bool integer_datetimes;
  
  /* Macros for freeing malloc'd pointers only if appropriate to do so */
***
*** 801,807 
GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
XactReadOnly,
!   false, NULL, NULL
},
{
{add_missing_from, PGC_USERSET, COMPAT_OPTIONS_PREVIOUS,
--- 807,813 
GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
XactReadOnly,
!   false, assign_transaction_read_only, NULL
},
{
{add_missing_from, PGC_USERSET, COMPAT_OPTIONS_PREVIOUS,
***
*** 1311,1316 
--- 1317,1333 
BLCKSZ, BLCKSZ, BLCKSZ, NULL, NULL
},
  
+   {
+   /* XXX probably it's a bad idea for this to be GUC_REPORT. */
+   {nesting_level, PGC_INTERNAL, UNGROUPED,
+   gettext_noop(Shows the current transaction nesting level),
+   NULL,
+   GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_REPORT
+   },
+   nesting_level,
+   0, 0, INT_MAX, NULL, NULL
+   },
+ 
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL
***
*** 2001,2014 
return find_option(map_old_guc_names[i+1]);
}
  
!   /* Check if the name is qualified, and if so, check if the qualifier
 * maps to a custom variable class.
 */
dot = strchr(name, GUC_QUALIFIER_SEPARATOR);
if(dot != NULL  is_custom_class(name, dot - name))
!   /*
!* Add a placeholder variable for this name
!*/
return (struct config_generic*)add_placeholder_variable(name);
  
/* Unknown name */
--- 2018,2030 
return find_option(map_old_guc_names[i+1]);
}
  
!   /*
!* Check if the name is qualified, and if so, check if the qualifier
 * maps to a custom variable class.
 */
dot = strchr(name, GUC_QUALIFIER_SEPARATOR);
if(dot != NULL  is_custom_class(name, dot - name))
!   /* Add a placeholder variable for this name */
  

Re: [PATCHES] nested xacts and phantom Xids

2004-06-23 Thread Bruce Momjian
Alvaro Herrera wrote:
 On Sun, Jun 20, 2004 at 04:37:16PM -0400, Tom Lane wrote:
  Alvaro Herrera [EMAIL PROTECTED] writes:
   Here I present the nested transactions patch and the phantom Xids patch
   that goes with it.
  
  I looked at the phantom XIDs stuff a bit.  I still have little confidence
  that the concept is correct :-( but here are some comments on the code
  level.
 
 Ok.  I for one think this got much more complex than I had originally
 thought it would be.  I agree the changes to Set/Get Xmin/Xmax are way
 beyond what one would want, but the alternative would be to spread the
 complexity into their callers and I think that would be much worse.
 
 I don't have a lot of confidence in this either.  The patch will be
 available in archives if anybody wants to implement this in a cleaner
 and safer way; I'll continue working on the rest of the things you
 pointed out in the subtransactions patch.

I am sorry to have given Alvaro another idea that didn't work.  However,
thinking of options, I wonder if instead of phantom xids, we should do
phantom cids.  Because only the local backend looks at the command
counter (cid).  I think it might be alot cleaner.  The phantom cid would
have a tuple bit set indicating that instead of being a cid, it is an
index into an array of cmin/cmax pairs.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] nested xacts and phantom Xids

2004-06-23 Thread Alvaro Herrera
On Wed, Jun 23, 2004 at 08:58:15AM -0400, Bruce Momjian wrote:
 If we add nested transactions for 7.5, are we going to need savepoints
 too in the same release?  If we don't, are we going to get a lot of
 complaints?

It'd be good to have savepoints right now.  I'm not sure it'd be good to
expose the nested transactions implementation if we are going to offer
savepoints later, because it means we will have to keep nested
transactions forever.

Maybe it is a good idea to hide the implementation details and offer
only the standard SAVEPOINT feature.  I'm not sure how hard it is to
change the syntax; I don't think it'd be a big deal.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
La conclusión que podemos sacar de esos estudios es que
no podemos sacar ninguna conclusión de ellos (Tanenbaum)


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] nested xacts and phantom Xids

2004-06-23 Thread Alvaro Herrera
On Wed, Jun 23, 2004 at 08:57:11AM -0400, Bruce Momjian wrote:

 I am sorry to have given Alvaro another idea that didn't work.

It allowed me to learn a lot of cool tricks, so it wasn't wasted work.
The only think I'm sorry about is that I should have used the time for
something more useful, like tackling the remaining problems in the
nested xacts implementation proper.

 However, thinking of options, I wonder if instead of phantom xids, we
 should do phantom cids.  Because only the local backend looks at the
 command counter (cid).  I think it might be alot cleaner.  The phantom
 cid would have a tuple bit set indicating that instead of being a cid,
 it is an index into an array of cmin/cmax pairs.

Yeah, maybe this can work.  I'm not going to try however, at least not
now.  If somebody else wants to try, be my guest.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
El conflicto es el camino real hacia la unión


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] nested xacts and phantom Xids

2004-06-23 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 It'd be good to have savepoints right now.  I'm not sure it'd be good to
 expose the nested transactions implementation if we are going to offer
 savepoints later, because it means we will have to keep nested
 transactions forever.

Nested transactions are *good*.  We should not hide them.

I don't object to offering spec-compatible savepoints, but the plain
fact of the matter is that that's a dumbed-down API.  We should not
feel constrained to offer only that.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] nested xacts and phantom Xids

2004-06-22 Thread Alvaro Herrera
On Sun, Jun 20, 2004 at 08:49:22PM -0400, Tom Lane wrote:

 Of these the one that I think most urgently needs attention is inval.c;
 that is complicated code already and figuring out what should happen
 for subtrans commit/abort is not trivial.  The others look relatively
 straightforward to fix, if tedious.

A patch for this is attached.  It _seems_ to work ok; conceptually it
seems ok too.  I have done some testing which tends to indicate that it
is right, but I'm not sure of all the implications this code has so I
don't know how to test it exhaustively.  Of course, regression tests
pass, but this doesn't actually prove anything.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
¿Que diferencia tiene para los muertos, los huérfanos, y aquellos que han
perdido su hogar, si la loca destrucción ha sido realizada bajo el nombre
del totalitarismo o del santo nombre de la libertad y la democracia? (Gandhi)
Index: src/include/storage/sinval.h
===
RCS file: /home/alvherre/cvs/pgsql-server/src/include/storage/sinval.h,v
retrieving revision 1.35
diff -c -w -b -B -c -r1.35 sinval.h
*** sinval.h2 Jun 2004 21:29:29 -   1.35
--- sinval.h22 Jun 2004 04:52:54 -
***
*** 20,32 
  
  
  /*
!  * We currently support two types of shared-invalidation messages: one that
   * invalidates an entry in a catcache, and one that invalidates a relcache
   * entry.  More types could be added if needed.  The message type is
   * identified by the first int16 field of the message struct.  Zero or
   * positive means a catcache inval message (and also serves as the catcache
!  * ID field).  -1 means a relcache inval message.  Other negative values
!  * are available to identify other inval message types.
   *
   * Relcache invalidation messages usually also cause invalidation of entries
   * in the smgr's relation cache.  This means they must carry both logical
--- 20,33 
  
  
  /*
!  * We currently support three types of shared-invalidation messages: one that
   * invalidates an entry in a catcache, and one that invalidates a relcache
   * entry.  More types could be added if needed.  The message type is
   * identified by the first int16 field of the message struct.  Zero or
   * positive means a catcache inval message (and also serves as the catcache
!  * ID field).  -1 means a relcache inval message.  -2 means a subtransaction
!  * boundary message. Other negative values are available to identify other
!  * inval message types.
   *
   * Relcache invalidation messages usually also cause invalidation of entries
   * in the smgr's relation cache.  This means they must carry both logical
***
*** 53,58 
--- 54,64 
   * and so that negative cache entries can be recognized with good accuracy.
   * (Of course this assumes that all the backends are using identical hashing
   * code, but that should be OK.)
+  *
+  * A subtransaction boundary is not really a cache invalidation message;
+  * rather it's an implementation artifact for nested transactions.  The
+  * cleanup code for subtransaction abort looks for this message as a boundary
+  * to know when to stop processing messages.
   */
  
  typedef struct
***
*** 79,89 
--- 85,104 
 */
  } SharedInvalRelcacheMsg;
  
+ #define SUBXACTBOUNDARYMSG_ID (-2)
+ 
+ typedef struct
+ {
+   int16   id; /* type field --- must be 
first */
+   TransactionId   xid;/* transaction id */
+ } SubxactBoundaryMsg;
+ 
  typedef union
  {
int16   id; /* type field --- must be 
first */
SharedInvalCatcacheMsg cc;
SharedInvalRelcacheMsg rc;
+   SubxactBoundaryMsg sb;
  } SharedInvalidationMessage;
  
  
Index: src/backend/utils/cache/inval.c
===
RCS file: /home/alvherre/cvs/pgsql-server/src/backend/utils/cache/inval.c,v
retrieving revision 1.62
diff -c -w -b -B -c -r1.62 inval.c
*** inval.c 18 Jun 2004 06:13:52 -  1.62
--- inval.c 23 Jun 2004 00:04:35 -
***
*** 66,73 
   *manipulating the init file is in relcache.c, but we keep track of the
   *need for it here.
   *
!  *All the request lists are kept in TopTransactionContext memory, since
   *they need not live beyond the end of the current transaction.
   *
   *
   * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group
--- 66,75 
   *manipulating the init file is in relcache.c, but we keep track of the
   *need for it here.
   *
!  *All the request lists are kept in CommitContext memory, since
   *they need not live beyond the end of the current transaction.
+  *Also, this makes it easy to free messages created in an aborting
+  *subtransaction.
   *
   *
   * Portions Copyright (c) 1996-2003, PostgreSQL Global Development Group

Re: [PATCHES] nested xacts and phantom Xids

2004-06-21 Thread Alvaro Herrera
On Sun, Jun 20, 2004 at 04:37:16PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Here I present the nested transactions patch and the phantom Xids patch
  that goes with it.
 
 I looked at the phantom XIDs stuff a bit.  I still have little confidence
 that the concept is correct :-( but here are some comments on the code
 level.

Ok.  I for one think this got much more complex than I had originally
thought it would be.  I agree the changes to Set/Get Xmin/Xmax are way
beyond what one would want, but the alternative would be to spread the
complexity into their callers and I think that would be much worse.

I don't have a lot of confidence in this either.  The patch will be
available in archives if anybody wants to implement this in a cleaner
and safer way; I'll continue working on the rest of the things you
pointed out in the subtransactions patch.

Sadly, something broke in a really strange way between my last cvs
update and today's, because I can't get the patch to initdb.  It
compiles without warnings (and I did make distclean), but there's a
weird bug I'll have to pursue.


Regarding the invalidation messages, what I'm currently looking at is to
add a TransactionId to each message, which will be CurrentTransactionId
for each new message.  When a subxact commits, all its messages are
relabeled to its parent.  When a subxact aborts, all messages labeled
with its TransactionId are locally processed and discarded.  This is
tricky because chunks are prepended to the queue, but it also means we
can stop processing as soon as a message belongs to another transaction.

I think GUC vars are easier: when a variable is about to change value
inside a subtransaction, save the previous value in a list from which it
will be restored if the subtransaction later aborts.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede (Mark Twain)


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] nested xacts and phantom Xids

2004-06-21 Thread Alvaro Herrera
On Mon, Jun 21, 2004 at 10:28:59PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:

  This is tricky because chunks are prepended to the queue, but it also
  means we can stop processing as soon as a message belongs to another
  transaction.
 
 AFAIR there isn't any essential semantics to the ordering of the queue
 entries, so you can probably get away with reordering if that makes life
 any easier.

 Also, rather than labeling each entry individually, it might be better
 to keep a separate list for each level of transaction.  Then instead of
 relabeling, you'd just concat the subtrans list to its parent's.  Seems
 like this should be faster and less storage.

Right, but it makes harder to detect when a duplicate relcache entry is
going to be inserted.  Judging from the commentary in the file this is
an issue.

Another idea I had was:
1. at subtransaction start, add a special boundary message carrying
the new Xid.
2. at subtransaction abort, process the first chunk backwards until the
boundary message with the corresponding Xid is found; if it's not, jump
to the next and repeat.

At subtransaction commit nothing special needs to be done.

This avoids having to label each message and to change the message
appending scheme.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
The eagle never lost so much time, as
when he submitted to learn of the crow. (William Blake)


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] nested xacts and phantom Xids

2004-06-21 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 On Mon, Jun 21, 2004 at 10:28:59PM -0400, Tom Lane wrote:
 Also, rather than labeling each entry individually, it might be better
 to keep a separate list for each level of transaction.  Then instead of
 relabeling, you'd just concat the subtrans list to its parent's.  Seems
 like this should be faster and less storage.

 Right, but it makes harder to detect when a duplicate relcache entry is
 going to be inserted.  Judging from the commentary in the file this is
 an issue.

It's only a minor efficiency hack; don't get too tense about avoiding dups.
(We don't bother to check for dup catcache-inval entries at all...)
The only thing I'd suggest you need to preserve is the business about
invalidating catcache before relcache; again that's just an efficiency
hack, but it seems simple enough to be worth doing.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Here I present the nested transactions patch and the phantom Xids patch
 that goes with it.

I looked at the phantom XIDs stuff a bit.  I still have little confidence
that the concept is correct :-( but here are some comments on the code
level.

 +  * They are marked immediately in pg_subtrans as direct childs of the topmost
 +  * Xid (no matter how deep we are in the transaction tree),

Why is that a good idea --- won't it cause problems when you
commit/abort an intermediate level of subtransaction?

 +  * All this happens when HeapTupleHeaderSetXmax is called and the Xmin of the
 +  * tuple is one of the Xids of the current transaction.
 +  *
 +  * Note that HeapTupleHeaderGetXmax/GetXmin return the masqueraded Xmin and
 +  * Xmax, not the real one in the tuple header.

I think that putting this sort of logic into Set/Get Xmin/Xmax is a
horrid idea.  They are supposed to be transparent fetch/store macros,
not arbitrarily-complex filter functions.  They're also supposed to
be reasonably fast :-(

 +  * ... We know to do this when SetXmax is called
 +  * on a tuple that has the HEAP_MARKED_FOR_UPDATE bit set.

Uglier and uglier.

 ***
 *** 267,274 
   return true;
   
   /* deleting subtransaction aborted */
 ! if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
 ! return true;
   
   
 Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));
   
 --- 268,276 
   return true;
   
   /* deleting subtransaction aborted */
 ! if (tuple-t_infomask  HEAP_XMIN_IS_PHANTOM)
 ! if 
 (TransactionPhantomIsCommittedPhantom(HeapTupleHeaderGetPhantomId(tuple)))
 ! return true;
   
   
 Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));
   

Er, what happened to checking for the deleting subtransaction aborted
case?

On the whole, I think this was an interesting exercise but also an utter
failure.  We'd be far better off to eat the extra 4 bytes per tuple
header and put back the separate Xmax field.  The costs in both runtime
and complexity/reliability of this patch are simply not justified by
a 4-byte savings.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I don't feel too bad about the runtime cost if only subtransactions are
 paying that cost.

That's exactly why I'm so exercised about what's been done to the
HeapTupleSet/Get macros.  That's significant cost that's paid even when
you're not using *any* of this stuff.

 I know we are really stretching the system here but I
 would like to try a little more rather than give up and taking a space
 hit for all tuples.

I don't even have any confidence that there are no fundamental bugs
in the phantom-xid concept :-(.  I'd be willing to play along if an
implementation that seemed acceptable speedwise were being offered,
but this thing is not preferable to four-more-bytes even if it works.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Alvaro Herrera [EMAIL PROTECTED] writes:
   Here I present the nested transactions patch and the phantom Xids patch
   that goes with it.
  
  I looked at the phantom XIDs stuff a bit.  I still have little confidence
  that the concept is correct :-( but here are some comments on the code
  level.
  
   +  * They are marked immediately in pg_subtrans as direct childs of the topmost
   +  * Xid (no matter how deep we are in the transaction tree),
  
  Why is that a good idea --- won't it cause problems when you
  commit/abort an intermediate level of subtransaction?
 
 I don't think so.  The phantom xid is used only by the outside
 transaction waiting for that tuple to be committe or aborted.  The
 outside tranaction will sleep on the topmost xid completing, then check
 the phantom xid status for commit/abort. Within the transaction, I think
 he uses command counter to know the creation and destruction sub-xid.
 
 I think right now phantom xids are used only for making parts of a
 subtransaction committed or aborted and only in cases where the tuple is
 created and destroyed by parts of the same transaction tree.
 
 I don't feel too bad about the runtime cost if only subtransactions are
 paying that cost.  I know we are really stretching the system here but I
 would like to try a little more rather than give up and taking a space
 hit for all tuples.

Let me add that outside transactions read those phantom xid only when
they are doing dirty reads.  What I don't understand is when do outside
transactions see tuples created inside a transaction?  INSERT into a
table with a unique key?

Once the main transaction commits, these phantom tuples should work just
like ordinary tuples except they get their cmin overwritten when they
are expired, I think.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Let me add that outside transactions read those phantom xid only when
 they are doing dirty reads.  What I don't understand is when do outside
 transactions see tuples created inside a transaction?  INSERT into a
 table with a unique key?

 Once the main transaction commits, these phantom tuples should work just
 like ordinary tuples except they get their cmin overwritten when they
 are expired, I think.

You're not doing anything to increase my confidence level in this
concept.  You invented it, and even you aren't sure how it works.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Let me add that outside transactions read those phantom xid only when
  they are doing dirty reads.  What I don't understand is when do outside
  transactions see tuples created inside a transaction?  INSERT into a
  table with a unique key?
 
  Once the main transaction commits, these phantom tuples should work just
  like ordinary tuples except they get their cmin overwritten when they
  are expired, I think.
 
 You're not doing anything to increase my confidence level in this
 concept.  You invented it, and even you aren't sure how it works.

And your point is?   I am trying to help Alvaro.  I came up with an
idea, and some thought it would help solve the problem.  What ideas do
you have to help him?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Here I present the nested transactions patch and the phantom Xids patch
 that goes with it.

I took a very preliminary look through the nested-xacts patch, too.

 Missing: rollback of SET CONSTRAINTS and GUC vars.

There's a good deal more than that missing :-(.  Here are the modules or
actions that are called in CommitTransaction and/or AbortTransaction
that have not yet been touched by the patch:

shared-inval message queue processing (inval.c)
lo_commit
localbuf.c (refcounts need fixed same as bufmgr)
eoxactcallbacks API needs extension
GUC
namespace (temp namespace cleanup)
files (close/drop temp files)
reset current userid during xact abort
password file updating
SetReindexProcessing disable (can these be nested??)

Of these the one that I think most urgently needs attention is inval.c;
that is complicated code already and figuring out what should happen
for subtrans commit/abort is not trivial.  The others look relatively
straightforward to fix, if tedious.

Given patches for inval.c and guc.c, I would say that the patch is
functionally close enough to done that we could commit to including
it in 7.5 --- the other stuff could be wrapped up post-feature-freeze.

BUT (you knew there was a but coming, right?) ... I am really really
disturbed about the potential performance hit from this patch.
I don't think we want to have a nested-transaction feature that imposes
significant overhead on people who aren't actually using nested
transactions.  As I looked through the patch I found quite a few places
that would impose such overhead.

The main thing that's bothering me is that TransactionIdIsInProgress
has been turned from a relatively cheap test (just scan the in-memory
PGPROC array) into an exceedingly expensive one.  It's gratutitously
badly coded (for N open main transactions it does N searches of the
subtrans tree, when one would do), but I don't really want it going to
the subtrans tree at all during typical cases.  We had talked about
keeping a limited number of active subtrans IDs in the PGPROC array,
and I think we're really going to need to do that to buy back
performance here.

(BTW, what is the intended behavior of TransactionIdIsInProgress for
a committed or aborted subtransaction whose parent is still open?
If it has to recognize aborted subtranses as still in progress then
things get very messy, but I think it probably doesn't have to.)

I'm quite unhappy with the wide use of SubTransXidsHaveCommonAncestor to
replace TransactionIdEquals in tqual.c, too.  This turns very cheap tests
into very expensive ones --- whether or not you use subtransactions ---
and unfortunately tqual.c is coded on the assumption that these tests
are cheap.  We can *not* afford to repeat SubTransXidsHaveCommonAncestor
for every open main transaction every time we check a tuple.  In many
ways this is the same problem as with TransactionIdIsInProgress, and
it probably needs a similar solution.

Also I found it annoying that XactLockTableWait waits for a top xact
even when target subxact has already aborted; possibly this is even a
recipe for deadlock.  Perhaps should iterate up the tree one level at a
time, waiting for each?  Would mean that subxacts have to acquire a
lock on their xid that other people can wait for, but this is not
necessarily bad.  (In fact, if we do that then I don't think
XactLockTableWait has to change at all.)

So I think there are several must-fix performance issues as well before
we can make a decision to accept this for 7.5.

I'd feel better about this if we had another month, but with ten days
before feature freeze it's gonna be touch and go.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] nested xacts and phantom Xids

2004-06-20 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 What ideas do you have to help him?

Eat the four-byte overhead.

Quite frankly, given the other functionality and performance issues
he has to solve in the next ten days (see my other message just now),
I think he'd be a fool to spend one more minute on phantom XIDs.
Maybe for 7.6 someone will have the time to make the idea work.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend