Re: New docs chapter on Transaction Management and related changes

Bruce Momjian Fri, 18 Nov 2022 11:12:15 -0800

On Mon, Nov  7, 2022 at 11:04:46PM +0100, Laurenz Albe wrote:
> On Sat, 2022-11-05 at 10:08 +0000, Simon Riggs wrote:
> > Agreed; new compilation patch attached, including mine and then
> > Robert's suggested rewordings.
> 
> Thanks.  There is clearly a lot of useful information in this.


Sorry again for the long delay in replying to this.

> Some comments:
> 
> > --- a/doc/src/sgml/func.sgml
> > +++ b/doc/src/sgml/func.sgml
> > @@ -24673,7 +24673,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
> >         <para>
> >          Returns the current transaction's ID.  It will assign a new one if the
> >          current transaction does not have one already (because it has not
> > -        performed any database updates).
> > +        performed any database updates);  see <xref
> > +        linkend="transaction-id"/> for details.  If executed in a
> > +        subtransaction this will return the top-level xid;  see <xref
> > +        linkend="subxacts"/> for details.
> >         </para></entry>
> >        </row>
> 
> I would use a comma after "subtransaction", and I think it would be better to write
> "transaction ID" instead of "xid".

Agreed.

> > @@ -24690,6 +24693,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
> >          ID is assigned yet.  (It's best to use this variant if the transaction
> >          might otherwise be read-only, to avoid unnecessary consumption of an
> >          XID.)
> > +        If executed in a subtransaction this will return the top-level xid.
> >         </para></entry>
> >        </row>
> 
> Same as above.

Agreed.

> > @@ -24733,6 +24737,8 @@ SELECT collation for ('foo' COLLATE "de_DE");
> >         <para>
> >          Returns a current <firstterm>snapshot</firstterm>, a data structure
> >          showing which transaction IDs are now in-progress.
> > +        Only top-level xids are included in the snapshot; subxids are not
> > +        shown;  see <xref linkend="subxacts"/> for details.
> >         </para></entry>
> >        </row>
> 
> Again, I would avoid "xid" and "subxid", or at least use "transaction ID (xid)"
> and similar.

Done.

> > --- a/doc/src/sgml/ref/release_savepoint.sgml
> > +++ b/doc/src/sgml/ref/release_savepoint.sgml
> > @@ -34,23 +34,16 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
> >    <title>Description</title>
> >  
> >    <para>
> > -   <command>RELEASE SAVEPOINT</command> destroys a savepoint previously defined
> > -   in the current transaction.
> > +   <command>RELEASE SAVEPOINT</command> will subcommit the subtransaction
> > +   established by the named savepoint, if one exists. This will release
> > +   any resources held by the subtransaction. If there were any
> > +   subtransactions of the named savepoint, these will also be subcommitted.
> >    </para>
> > 
> >    <para>
> 
> "Subtransactions of the named savepoint" is somewhat confusing; how about
> "subtransactions of the subtransaction established by the named savepoint"?
> 
> If that is too long and explicit, perhaps "subtransactions of that subtransaction".

This paragraph has been rewritten to:

   <command>RELEASE SAVEPOINT</command> releases the named savepoint and
   all active savepoints that were created after the named savepoint,
   and frees their resources.  All changes made since the creation of the
   savepoint, excluding changes from rolled-back savepoints, are merged into
   the transaction or savepoint that was active when the named savepoint
   was created.  Changes made after <command>RELEASE SAVEPOINT</command>
   will also be part of this active transaction or savepoint.
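
As a rough illustration of the merge behavior described in that paragraph, here is a minimal sketch.  It assumes a scratch table table1 with a single integer column (a placeholder, in the style of the reference page examples); the savepoint names are arbitrary.

```sql
BEGIN;
    INSERT INTO table1 VALUES (1);
    SAVEPOINT sp1;
    INSERT INTO table1 VALUES (2);
    -- Releasing sp1 merges value 2 into the top-level transaction,
    -- which was the active context when sp1 was created.
    RELEASE SAVEPOINT sp1;
    SAVEPOINT sp2;
    INSERT INTO table1 VALUES (3);
    -- Rolling back to sp2 undoes only value 3; values 1 and 2 are
    -- unaffected because they no longer belong to any savepoint.
    ROLLBACK TO SAVEPOINT sp2;
COMMIT;
-- Values 1 and 2 are committed; value 3 is not.
```

After the release, sp1 can no longer be used as a rollback target, so there is no way to undo value 2 short of rolling back the whole transaction.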

> > @@ -78,7 +71,7 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
> > 
> >    <para>
> >     It is not possible to release a savepoint when the transaction is in
> > -   an aborted state.
> > +   an aborted state, to do that use <xref linkend="sql-rollback-to"/>.
> >    </para>
> > 
> >    <para>
> 
> I think the following is more English:
> "It is not possible ... state; to do that, use <xref .../>."

Changed to:

   It is not possible to release a savepoint when the transaction is in
   an aborted state; to do that, use <xref linkend="sql-rollback-to"/>.

> > --- a/doc/src/sgml/ref/rollback.sgml
> > +++ b/doc/src/sgml/ref/rollback.sgml
> > @@ -56,11 +56,14 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
> >      <term><literal>AND CHAIN</literal></term>
> >      <listitem>
> >       <para>
> > -      If <literal>AND CHAIN</literal> is specified, a new transaction is
> > +      If <literal>AND CHAIN</literal> is specified, a new unaborted transaction is
> >        immediately started with the same transaction characteristics (see <xref
> >        linkend="sql-set-transaction"/>) as the just finished one.  Otherwise,
> >        no new transaction is started.
> 
> I don't think that is an improvement.  "Unaborted" is an un-word.  A new transaction
> is always "unaborted", isn't it?

Agreed.

> > --- a/doc/src/sgml/wal.sgml
> > +++ b/doc/src/sgml/wal.sgml
> > @@ -909,4 +910,36 @@
> >     seem to be a problem in practice.
> >    </para>
> >   </sect1>
> > +
> > + <sect1 id="two-phase">
> > +
> > +  <title>Two-Phase Transactions</title>
> > +
> > +  <para>
> > +   <productname>PostgreSQL</productname> supports a two-phase commit (2PC)
> [...]
> > +   <filename>pg_twophase</filename> directory. Currently-prepared
> > +   transactions can be inspected using <link
> > +   linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.
> > +  </para>
> > + </sect1>
> > +
> >  </chapter>
> 
> I don't like "currently-prepared".  How about:
> "Transactions that are currently prepared can be inspected..."

Yes, now:

   Transactions that are currently prepared can be inspected using <link
   linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.

> This is clearly interesting information, but I don't think the WAL chapter is the right
> place for this.  "pg_twophase" is already mentioned in "storage.sgml", and details about
> when exactly a prepared transaction is persisted may exceed the details level needed by
> the end user.
> 
> I'd look for that information in the reference page for PREPARE TRANSACTION; perhaps
> that would be a better place.  Or, even better, the new "xact.sgml" chapter.

Agreed, moved to xact.sgml.

> > --- /dev/null
> > +++ b/doc/src/sgml/xact.sgml
> 
> +  <title>Transaction Management</title>
> 
> +   The word transaction is often abbreviated as "xact".
> 
> Should use <quote> here.

Done.

> > +   <title>Transactions and Identifiers</title>
> 
> > +   <para>
> > +    Once a transaction writes to the database, it is assigned a
> > +    non-virtual <literal>TransactionId</literal> (or <type>xid</type>),
> > +    e.g., <literal>278394</literal>. Xids are assigned sequentially
> > +    using a global counter used by all databases within the
> > +    <productname>PostgreSQL</productname> cluster. This property is used by
> > +    the transaction system to order transactions by their first database
> > +    write, i.e., lower-numbered xids started writing before higher-numbered
> > +    xids.  Of course, transactions might start in a different order.
> > +   </para>
> 
> "This property"?  How about:
> "Because transaction IDs are assigned sequentially, the transaction system can
> use them to order transactions by their first database write"
> 
> I would want some additional information here: why does the transaction system have
> to order transactions by their first database write?
> 
> "Of course, transactions might start in a different order."
> 
> Now that confuses me.  Are you saying that BEGIN could be in a different order
> than the first database write?  Perhaps like this:
> 
> "Note that the order in which transactions perform their first database write
> might be different from the order in which the transactions started."

I rewrote the paragraph to be:

   Non-virtual <literal>TransactionId</literal>s (or <type>xid</type>),
   e.g., <literal>278394</literal>, are assigned sequentially to
   transactions from a global counter used by all databases within
   the <productname>PostgreSQL</productname> cluster.  This assignment
   happens when a transaction first writes to the database. This means
   lower-numbered xids started writing before higher-numbered xids.
   Note that the order in which transactions perform their first database
   write might be different from the order in which the transactions
   started, particularly if the transaction started with statements that
   only performed database reads.
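
A minimal psql sketch of the assignment timing described in that paragraph.  It assumes a scratch table table1 with one integer column (a placeholder, not part of the patch) and PostgreSQL 13 or later, where pg_current_xact_id_if_assigned() is available.

```sql
BEGIN;
-- No write has happened yet, so no non-virtual xid has been assigned.
SELECT pg_current_xact_id_if_assigned();   -- returns NULL

-- The first write causes an xid to be taken from the global counter.
INSERT INTO table1 VALUES (1);
SELECT pg_current_xact_id_if_assigned();   -- now returns the assigned ID

ROLLBACK;
```

A second session that ran BEGIN later but performed its INSERT earlier would receive the lower xid, which is the ordering difference the last sentence of the paragraph refers to.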

> > +    The internal transaction ID type <type>xid</type> is 32-bits wide
> 
> There should be no hyphen in "32 bits wide", just as in "3 years old".

Done.

> > +                            A 32-bit epoch is incremented during each
> > +    wrap around.
> 
> We usually call this "wraparound" without a space.

Fixed.

> > +                 There is also a 64-bit type <type>xid8</type> which
> > +    includes this epoch and therefore does not wrap around during the
> > +    life of an installation and can be converted to xid by casting.
> 
> Running "and"s.  Better:
> 
> "There is also ... and does not wrap ... life of an installation.
>  <type>xid8</type> can be converted to <type>xid</type> by casting."

I went with:

   There is also a 64-bit type <type>xid8</type> which
   includes this epoch and therefore does not wrap around during the
   life of an installation;  it can be converted to xid by casting.
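
For illustration, the cast can be tried as below.  This is only a sketch: the values returned depend on the cluster, and note that pg_current_xact_id() assigns an xid if the transaction does not already have one.

```sql
-- Both columns refer to the same transaction; the first is the
-- 64-bit xid8 (epoch included), the second is the 32-bit xid
-- obtained by casting.
SELECT pg_current_xact_id()      AS full_id,
       pg_current_xact_id()::xid AS short_id;
```

On a cluster that has not yet wrapped around, the two values would be expected to print the same number, since the epoch is still zero.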

> > +                                      Xids are used as the
> > +    basis for <productname>PostgreSQL</productname>'s <link
> > +    linkend="mvcc">MVCC</link> concurrency mechanism, <link
> > +    linkend="hot-standby">Hot Standby</link>, and Read Replica servers.
> 
> What is the difference between a hot standby and a read replica?  I think
> one of these terms is sufficient.

Agreed, I went with:

   Xids are used as the
   basis for <productname>PostgreSQL</productname>'s <link
   linkend="mvcc">MVCC</link> concurrency mechanism and streaming
   replication.

> > +    In addition to <literal>vxid</literal> and <type>xid</type>,
> > +    when a transaction is prepared for two-phase commit it
> > +    is also identified by a Global Transaction Identifier
> > +    (<acronym>GID</acronym>).
> 
> Better:
> 
> "In addition to <literal>vxid</literal> and <type>xid</type>,
>  prepared transactions also have a Global Transaction Identifier
>  (<acronym>GID</acronym>) that is assigned when the transaction is
>  prepared for two-phase commit."

I went with:

   In addition to <literal>vxid</literal> and <type>xid</type>,
   prepared transactions are also assigned Global Transaction
   Identifiers (<acronym>GID</acronym>).
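
A short sketch of how a GID shows up in practice.  It assumes max_prepared_transactions is set above zero and that a scratch table table1 exists; the GID string 'demo_gid' is made up for the example.

```sql
BEGIN;
INSERT INTO table1 VALUES (1);
-- Prepare the transaction under a caller-chosen GID.
PREPARE TRANSACTION 'demo_gid';

-- The prepared transaction is now detached from this session;
-- the view shows the GID next to the transaction's xid.
SELECT gid, transaction, prepared, owner, database
FROM pg_prepared_xacts;

-- Finish the prepared transaction (ROLLBACK PREPARED would discard it).
COMMIT PREPARED 'demo_gid';
```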

> > +  <sect1 id="xact-locking">
> > +
> > +   <title>Transactions and Locking</title>
> > +
> > +   <para>
> > +    Currently-executing transactions are shown in <link
> > +    linkend="view-pg-locks"><structname>pg_locks</structname></link>
> > +    in columns <structfield>virtualxid</structfield> and
> > +    <structfield>transactionid</structfield>.
> 
> Better:
> 
> "The transaction IDs of currently executing transactions are shown in <link
>  linkend="view-pg-locks"><structname>pg_locks</structname></link>
>  in the columns <structfield>virtualxid</structfield> and
>  <structfield>transactionid</structfield>."

Done.

> > +    Lock waits on table-level locks are shown waiting for
> > +    <structfield>virtualxid</structfield>, while lock waits on row-level
> > +    locks are shown waiting for <structfield>transactionid</structfield>.
> 
> That's not true.  Transactions waiting for table-level locks are shown
> waiting for a "relation" lock in both "pg_stat_activity" and "pg_locks".

I tested and you are right.  I went with more generic wording:

   Some lock types wait on <structfield>virtualxid</structfield>,
   while other types wait on <structfield>transactionid</structfield>.
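
To see which of the two columns a given wait involves, a query along these lines can be run from another session.  It is a sketch only; the column list is from the pg_locks view, and the filter is just one reasonable choice.

```sql
-- Show locks whose target is a virtual or non-virtual transaction ID.
-- Rows with granted = false are sessions currently waiting.
SELECT locktype, virtualxid, transactionid, pid, mode, granted
FROM pg_locks
WHERE locktype IN ('virtualxid', 'transactionid')
ORDER BY pid;
```

Every running transaction holds a lock on its own virtualxid, so an idle cluster still returns at least the row for the session running the query.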

> > +    Row-level read and write locks are recorded directly in locked
> > +    rows and can be inspected using the <xref linkend="pgrowlocks"/>
> > +    extension.  Row-level read locks might also require the assignment
> > +    of multixact IDs (<literal>mxid</literal>). Mxids are recorded in
> > +    the <filename>pg_multixact</filename> directory.
> 
> "are recorded directly in *the* locked rows"

Done.

> I think the mention of multixacts should link to
> <xref linkend="vacuum-for-multixact-wraparound"/>.  Again, I would not
> specifically mention the directory, since it is already described in
> "storage.sgml", but I have no strong opinion there.

Done with:

   Row-level read locks might also require the assignment
   of multixact IDs (<literal>mxid</literal>;  see <xref
   linkend="vacuum-for-multixact-wraparound"/>).

> > +  <sect1 id="subxacts">
> > +
> > +   <title>Subtransactions</title>
> 
> > +    The word subtransaction is often abbreviated as
> > +    <literal>subxact</literal>.
> 
> I'd use <quote>, not <literal>.

Done.

> > +    If a subtransaction is assigned a non-virtual transaction ID,
> > +    its transaction ID is referred to as a <literal>subxid</literal>.
> 
> Again, I would use <quote>, since we don't <literal> "subxid"
> elsewhere.

Done.

> +                                                                   Up to
> +    64 open subxids are cached in shared memory for each backend; after
> +    that point, the overhead increases significantly since we must look
> +    up subxid entries in <filename>pg_subtrans</filename>.
> 
> Comma before "since".  Perhaps you should mention that this means disk I/O.

I went with:

   Up to 64 open subxids are cached in shared memory for each backend; after
   that point, the storage I/O overhead increases significantly, since
   we must look up subxid entries in <filename>pg_subtrans</filename>.

Updated full patch attached.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bd50ea8e48..ef3982f11a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7229,12 +7229,14 @@ local0.*    /var/log/postgresql
             </row>
             <row>
              <entry><literal>%v</literal></entry>
-             <entry>Virtual transaction ID (backendID/localXID)</entry>
+             <entry>Virtual transaction ID (backendID/localXID);  see
+             <xref linkend="transaction-id"/></entry>
              <entry>no</entry>
             </row>
             <row>
              <entry><literal>%x</literal></entry>
-             <entry>Transaction ID (0 if none is assigned)</entry>
+             <entry>Transaction ID (0 if none is assigned);  see
+             <xref linkend="transaction-id"/></entry>
              <entry>no</entry>
             </row>
             <row>
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index b030b36002..fdffba4442 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4992,7 +4992,8 @@ WHERE ...
     <structfield>xmin</structfield> and <structfield>xmax</structfield>.  Transaction identifiers are 32-bit quantities.
     In some contexts, a 64-bit variant <type>xid8</type> is used.  Unlike
     <type>xid</type> values, <type>xid8</type> values increase strictly
-    monotonically and cannot be reused in the lifetime of a database cluster.
+    monotonically and cannot be reused in the lifetime of a database
+    cluster.  See <xref linkend="transaction-id"/> for more details.
    </para>
 
    <para>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index de450cd661..0d6be9a2fa 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -104,6 +104,7 @@
 <!ENTITY protocol   SYSTEM "protocol.sgml">
 <!ENTITY sources    SYSTEM "sources.sgml">
 <!ENTITY storage    SYSTEM "storage.sgml">
+<!ENTITY transaction     SYSTEM "xact.sgml">
 <!ENTITY tablesample-method SYSTEM "tablesample-method.sgml">
 <!ENTITY generic-wal SYSTEM "generic-wal.sgml">
 <!ENTITY custom-rmgr SYSTEM "custom-rmgr.sgml">
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6e0425cb3d..38ff0c82e3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24673,7 +24673,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
        <para>
         Returns the current transaction's ID.  It will assign a new one if the
         current transaction does not have one already (because it has not
-        performed any database updates).
+        performed any database updates);  see <xref
+        linkend="transaction-id"/> for details.  If executed in a
+        subtransaction, this will return the top-level transaction ID;
+        see <xref linkend="subxacts"/> for details.
        </para></entry>
       </row>
 
@@ -24690,6 +24693,8 @@ SELECT collation for ('foo' COLLATE "de_DE");
         ID is assigned yet.  (It's best to use this variant if the transaction
         might otherwise be read-only, to avoid unnecessary consumption of an
         XID.)
+        If executed in a subtransaction, this will return the top-level
+        transaction ID.
        </para></entry>
       </row>
 
@@ -24733,6 +24738,9 @@ SELECT collation for ('foo' COLLATE "de_DE");
        <para>
         Returns a current <firstterm>snapshot</firstterm>, a data structure
         showing which transaction IDs are now in-progress.
+        Only top-level transaction IDs are included in the snapshot;
+        subtransaction IDs are not shown;  see <xref linkend="subxacts"/>
+        for details.
        </para></entry>
       </row>
 
@@ -24787,7 +24795,8 @@ SELECT collation for ('foo' COLLATE "de_DE");
         Is the given transaction ID <firstterm>visible</firstterm> according
         to this snapshot (that is, was it completed before the snapshot was
         taken)?  Note that this function will not give the correct answer for
-        a subtransaction ID.
+        a subtransaction ID (subxid);  see <xref linkend="subxacts"/> for
+        details.
        </para></entry>
       </row>
      </tbody>
@@ -24799,8 +24808,9 @@ SELECT collation for ('foo' COLLATE "de_DE");
     wraps around every 4 billion transactions.  However,
     the functions shown in <xref linkend="functions-pg-snapshot"/> use a
     64-bit type <type>xid8</type> that does not wrap around during the life
-    of an installation, and can be converted to <type>xid</type> by casting if
-    required.  The data type <type>pg_snapshot</type> stores information about
+    of an installation and can be converted to <type>xid</type> by casting if
+    required;  see <xref linkend="transaction-id"/> for details. 
+    The data type <type>pg_snapshot</type> stores information about
     transaction ID visibility at a particular moment in time.  Its components
     are described in <xref linkend="functions-pg-snapshot-parts"/>.
     <type>pg_snapshot</type>'s textual representation is
@@ -24846,7 +24856,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
         xmax</literal> and not in this list was already completed at the time
         of the snapshot, and thus is either visible or dead according to its
         commit status.  This list does not include the transaction IDs of
-        subtransactions.
+        subtransactions (subxids).
        </entry>
       </row>
      </tbody>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index d6d0a3a814..fe138a47bb 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1710,7 +1710,8 @@
      <literal>3</literal> (values under that are reserved) and the
      epoch value is incremented by one.
      In some contexts, the epoch and xid values are
-     considered together as a single 64-bit value.
+     considered together as a single 64-bit value;  see <xref
+     linkend="transaction-id"/> for more details.
     </para>
     <para>
      For more information, see
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e5d622d514..186d7ffa11 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -918,7 +918,8 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
        <structfield>backend_xid</structfield> <type>xid</type>
       </para>
       <para>
-       Top-level transaction identifier of this backend, if any.
+       Top-level transaction identifier of this backend, if any;  see
+       <xref linkend="transaction-id"/>.
       </para></entry>
      </row>
 
@@ -1890,7 +1891,8 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
      </row>
      <row>
       <entry><literal>virtualxid</literal></entry>
-      <entry>Waiting to acquire a virtual transaction ID lock.</entry>
+      <entry>Waiting to acquire a virtual transaction ID lock;  see
+      <xref linkend="transaction-id"/>.</entry>
      </row>
     </tbody>
    </tgroup>
diff --git a/doc/src/sgml/pgrowlocks.sgml b/doc/src/sgml/pgrowlocks.sgml
index 2914bf6e6d..ad15cda668 100644
--- a/doc/src/sgml/pgrowlocks.sgml
+++ b/doc/src/sgml/pgrowlocks.sgml
@@ -57,7 +57,8 @@ pgrowlocks(text) returns setof record
      <row>
       <entry><structfield>locker</structfield></entry>
       <entry><type>xid</type></entry>
-      <entry>Transaction ID of locker, or multixact ID if multitransaction</entry>
+      <entry>Transaction ID of locker, or multixact ID if
+      multitransaction;  see <xref linkend="transaction-id"/></entry>
      </row>
      <row>
       <entry><structfield>multi</structfield></entry>
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 73439c049e..2e271862fc 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -271,6 +271,7 @@ break is not needed in a wider output rendering.
   &brin;
   &hash;
   &storage;
+  &transaction;
   &bki;
   &planstats;
   &backup-manifest;
diff --git a/doc/src/sgml/ref/commit.sgml b/doc/src/sgml/ref/commit.sgml
index 5f244cdd3c..53d830998c 100644
--- a/doc/src/sgml/ref/commit.sgml
+++ b/doc/src/sgml/ref/commit.sgml
@@ -62,6 +62,9 @@ COMMIT [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
       linkend="sql-set-transaction"/>) as the just finished one.  Otherwise,
       no new transaction is started.
      </para>
+     <para>
+      The SQL Standard describes this as a chained transaction.
+     </para>
     </listitem>
    </varlistentry>
   </variablelist>
diff --git a/doc/src/sgml/ref/release_savepoint.sgml b/doc/src/sgml/ref/release_savepoint.sgml
index daf8eb9a43..afefbef297 100644
--- a/doc/src/sgml/ref/release_savepoint.sgml
+++ b/doc/src/sgml/ref/release_savepoint.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refnamediv>
   <refname>RELEASE SAVEPOINT</refname>
-  <refpurpose>destroy a previously defined savepoint</refpurpose>
+  <refpurpose>release a previously defined savepoint</refpurpose>
  </refnamediv>
 
  <refsynopsisdiv>
@@ -34,23 +34,13 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
   <title>Description</title>
 
   <para>
-   <command>RELEASE SAVEPOINT</command> destroys a savepoint previously defined
-   in the current transaction.
-  </para>
-
-  <para>
-   Destroying a savepoint makes it unavailable as a rollback point,
-   but it has no other user visible behavior.  It does not undo the
-   effects of commands executed after the savepoint was established.
-   (To do that, see <xref linkend="sql-rollback-to"/>.)
-   Destroying a savepoint when
-   it is no longer needed allows the system to reclaim some resources
-   earlier than transaction end.
-  </para>
-
-  <para>
-   <command>RELEASE SAVEPOINT</command> also destroys all savepoints that were
-   established after the named savepoint was established.
+   <command>RELEASE SAVEPOINT</command> releases the named savepoint and
+   all active savepoints that were created after the named savepoint,
+   and frees their resources.  All changes made since the creation of the
+   savepoint, excluding changes from rolled-back savepoints, are merged into
+   the transaction or savepoint that was active when the named savepoint
+   was created.  Changes made after <command>RELEASE SAVEPOINT</command>
+   will also be part of this active transaction or savepoint.
   </para>
  </refsect1>
 
@@ -62,7 +52,7 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
     <term><replaceable>savepoint_name</replaceable></term>
     <listitem>
      <para>
-      The name of the savepoint to destroy.
+      The name of the savepoint to release.
      </para>
     </listitem>
    </varlistentry>
@@ -78,7 +68,7 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
 
   <para>
    It is not possible to release a savepoint when the transaction is in
-   an aborted state.
+   an aborted state;  to do that, use <xref linkend="sql-rollback-to"/>.
   </para>
 
   <para>
@@ -93,7 +83,7 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
   <title>Examples</title>
 
   <para>
-   To establish and later destroy a savepoint:
+   To establish and later release a savepoint:
 <programlisting>
 BEGIN;
     INSERT INTO table1 VALUES (3);
@@ -104,6 +94,36 @@ COMMIT;
 </programlisting>
    The above transaction will insert both 3 and 4.
   </para>
+
+  <para>
+   A more complex example with multiple nested subtransactions:
+<programlisting>
+BEGIN;
+    INSERT INTO table1 VALUES (1);
+    SAVEPOINT sp1;
+    INSERT INTO table1 VALUES (2);
+    SAVEPOINT sp2;
+    INSERT INTO table1 VALUES (3);
+    RELEASE SAVEPOINT sp2;
+    INSERT INTO table1 VALUES (4))); -- generates an error
+</programlisting>
+   In this example, the application requests the release of the savepoint
+   <literal>sp2</literal>, which inserted 3.  This changes the insert's
+   transaction context to <literal>sp1</literal>.  When the statement
+   attempting to insert value 4 generates an error, the insertions of 2 and
+   4 are lost because they are in the same, now-rolled back savepoint,
+   and value 3 is in the same transaction context.  The application can
+   now only choose one of these two commands, since all other commands
+   will be ignored with a warning:
+<programlisting>
+   ROLLBACK;
+   ROLLBACK TO SAVEPOINT sp1;
+</programlisting>
+   Choosing <command>ROLLBACK</command> will abort everything, including
+   value 1, whereas <command>ROLLBACK TO SAVEPOINT sp1</command> will retain
+   value 1 and allow the transaction to continue.
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml
index 142f71e774..8272e1605c 100644
--- a/doc/src/sgml/ref/rollback.sgml
+++ b/doc/src/sgml/ref/rollback.sgml
@@ -57,9 +57,9 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
     <listitem>
      <para>
       If <literal>AND CHAIN</literal> is specified, a new transaction is
-      immediately started with the same transaction characteristics (see <xref
-      linkend="sql-set-transaction"/>) as the just finished one.  Otherwise,
-      no new transaction is started.
+      immediately started with the same transaction characteristics (see
+      <xref linkend="sql-set-transaction"/>) as the just finished one.
+      Otherwise, no new transaction is started.
      </para>
     </listitem>
    </varlistentry>
diff --git a/doc/src/sgml/ref/rollback_to.sgml b/doc/src/sgml/ref/rollback_to.sgml
index 27fa95cd1b..32c1bb9723 100644
--- a/doc/src/sgml/ref/rollback_to.sgml
+++ b/doc/src/sgml/ref/rollback_to.sgml
@@ -35,8 +35,9 @@ ROLLBACK [ WORK | TRANSACTION ] TO [ SAVEPOINT ] <replaceable>savepoint_name</re
 
   <para>
    Roll back all commands that were executed after the savepoint was
-   established.  The savepoint remains valid and can be rolled back to
-   again later, if needed.
+   established and then start a new subtransaction at the same transaction level.
+   The savepoint remains valid and can be rolled back to again later,
+   if needed.
   </para>
 
   <para>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7c716fe327..4d8bc659f2 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -1416,7 +1416,8 @@
       </para>
       <para>
        Virtual ID of the transaction targeted by the lock,
-       or null if the target is not a virtual transaction ID
+       or null if the target is not a virtual transaction ID;  see
+       <xref linkend="transactions"/>
       </para></entry>
      </row>
 
@@ -1425,8 +1426,8 @@
        <structfield>transactionid</structfield> <type>xid</type>
       </para>
       <para>
-       ID of the transaction targeted by the lock,
-       or null if the target is not a transaction ID
+       ID of the transaction targeted by the lock, or null if the target
+       is not a transaction ID;  see <xref linkend="transactions"/>
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 6a38b53744..ed7929cbcd 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -4,8 +4,9 @@
  <title>Reliability and the Write-Ahead Log</title>
 
  <para>
-  This chapter explains how the Write-Ahead Log is used to obtain
-  efficient, reliable operation.
+  This chapter explains how to control the reliability of
+  <productname>PostgreSQL</productname>, including details about the
+  Write-Ahead Log.
  </para>
 
  <sect1 id="wal-reliability">
@@ -909,4 +910,5 @@
    seem to be a problem in practice.
   </para>
  </sect1>
+
 </chapter>
diff --git a/doc/src/sgml/xact.sgml b/doc/src/sgml/xact.sgml
new file mode 100644
index 0000000000..ec475bf74a
--- /dev/null
+++ b/doc/src/sgml/xact.sgml
@@ -0,0 +1,204 @@
+<!-- doc/src/sgml/xact.sgml -->
+
+<chapter id="transactions">
+
+ <title>Transaction Processing</title>
+
+ <para>
+  This chapter provides an overview of the internals of
+  <productname>PostgreSQL</productname>'s transaction management system.
+  The word transaction is often abbreviated as <quote>xact</quote>.
+ </para>
+
+ <sect1 id="transaction-id">
+
+  <title>Transactions and Identifiers</title>
+
+  <para>
+   Transactions can be created explicitly using <command>BEGIN</command>
+   and <command>COMMIT</command>, which creates a transaction block.
+   An SQL statement outside of a transaction block automatically uses
+   a single-statement transaction.
+  </para>
+
+  <para>
+   Every transaction is identified by a unique
+   <literal>VirtualTransactionId</literal> (also called
+   <literal>virtualXID</literal> or <literal>vxid</literal>), which
+   is comprised of a backend ID (or <literal>backendID</literal>)
+   and a sequentially-assigned number local to each backend, known as
+   <literal>localXID</literal>.  For example, the virtual transaction
+   ID <literal>4/12532</literal> has a <literal>backendID</literal>
+   of <literal>4</literal> and a <literal>localXID</literal> of
+   <literal>12532</literal>.
+  </para>
+
+  <para>
+   Non-virtual <literal>TransactionId</literal>s (or <type>xid</type>),
+   e.g., <literal>278394</literal>, are assigned sequentially to
+   transactions from a global counter used by all databases within
+   the <productname>PostgreSQL</productname> cluster.  This assignment
+   happens when a transaction first writes to the database. This means
+   lower-numbered xids started writing before higher-numbered xids.
+   Note that the order in which transactions perform their first database
+   write might be different from the order in which the transactions
+   started, particularly if the transaction started with statements that
+   only performed database reads.
+  </para>
+
+  <para>
+   The internal transaction ID type <type>xid</type> is 32 bits wide
+   and <link linkend="vacuum-for-wraparound">wraps around</link> every
+   4 billion transactions. A 32-bit epoch is incremented during each
+   wraparound. There is also a 64-bit type <type>xid8</type> which
+   includes this epoch and therefore does not wrap around during the
+   life of an installation;  it can be converted to xid by casting.
+   The functions in <xref linkend="functions-pg-snapshot"/>
+   return <type>xid8</type> values.  Xids are used as the
+   basis for <productname>PostgreSQL</productname>'s <link
+   linkend="mvcc">MVCC</link> concurrency mechanism and streaming
+   replication.
+  </para>
+
+  <para>
+   When a top-level transaction with a (non-virtual) xid commits,
+   it is marked as committed in the <filename>pg_xact</filename>
+   directory. Additional information is recorded in the
+   <filename>pg_commit_ts</filename> directory if <xref
+   linkend="guc-track-commit-timestamp"/> is enabled.
+  </para>
+
+  <para>
+   In addition to <literal>vxid</literal> and <type>xid</type>,
+   prepared transactions are assigned Global Transaction
+   Identifiers (<acronym>GID</acronym>). GIDs are string literals up
+   to 200 bytes long and must be unique among currently
+   prepared transactions.  The mapping of GID to xid is shown in <link
+   linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.
+  </para>
+ </sect1>
+
+ <sect1 id="xact-locking">
+
+  <title>Transactions and Locking</title>
+
+  <para>
+   The transaction IDs of currently executing transactions are shown in
+   <link linkend="view-pg-locks"><structname>pg_locks</structname></link>
+   in columns <structfield>virtualxid</structfield> and
+   <structfield>transactionid</structfield>.  Read-only transactions
+   will have <structfield>virtualxid</structfield>s but NULL
+   <structfield>transactionid</structfield>s, while read-write transactions
+   will have both columns non-NULL.
+  </para>
+
+  <para>
+   Some lock types wait on <structfield>virtualxid</structfield>,
+   while other types wait on <structfield>transactionid</structfield>.
+   Row-level read and write locks are recorded directly in the locked
+   rows and can be inspected using the <xref linkend="pgrowlocks"/>
+   extension.  Row-level read locks might also require the assignment
+   of multixact IDs (<literal>mxid</literal>;  see <xref
+   linkend="vacuum-for-multixact-wraparound"/>).
+  </para>
+ </sect1>
+
+ <sect1 id="subxacts">
+
+  <title>Subtransactions</title>
+
+  <para>
+   Subtransactions are started inside transactions, allowing large
+   transactions to be broken into smaller units.  Subtransactions can
+   commit or abort without affecting their parent transactions, so the
+   parent transaction can continue.  This makes it easier to recover from
+   errors, which is a common application development pattern.
+   The word subtransaction is often abbreviated as
+   <quote>subxact</quote>.
+  </para>
+
+  <para>
+   Subtransactions can be started explicitly using the
+   <command>SAVEPOINT</command> command, but can also be started in
+   other ways, such as PL/pgSQL's <command>EXCEPTION</command> clause.
+   PL/Python and PL/Tcl also support explicit subtransactions.
+   Subtransactions can also be started from other subtransactions.
+   The top-level transaction and its child subtransactions form a
+   hierarchy or tree, which is why we refer to the main transaction as
+   the top-level transaction.
+  </para>
+
+  <para>
+   If a subtransaction is assigned a non-virtual transaction ID,
+   its transaction ID is referred to as a <quote>subxid</quote>.
+   Read-only subtransactions are not assigned subxids, but once they
+   attempt to write, they will be assigned one. This also causes all of
+   a subxid's parents, up to and including the top-level transaction,
+   to be assigned non-virtual transaction IDs.  We ensure that a parent
+   xid is always lower than any of its child subxids.
+  </para>
+
+  <para>
+   The immediate parent xid of each subxid is recorded in the
+   <filename>pg_subtrans</filename> directory. No entry is made for
+   top-level xids since they do not have a parent, nor is an entry made
+   for read-only subtransactions.
+  </para>
+
+  <para>
+   When a subtransaction commits, all of its committed child
+   subtransactions with subxids will also be considered subcommitted
+   in that transaction.  When a subtransaction aborts, all of its child
+   subtransactions will also be considered aborted.
+  </para>
+
+  <para>
+   When a top-level transaction with an xid commits, all of its
+   subcommitted child subtransactions are also persistently recorded
+   as committed in the <filename>pg_xact</filename> directory.  If the
+   top-level transaction aborts, all its subtransactions are also aborted,
+   even if they were subcommitted.
+  </para>
+
+  <para>
+   The more subtransactions each transaction keeps open (not rolled back
+   or released), the greater the transaction management overhead. Up to
+   64 open subxids are cached in shared memory for each backend; after
+   that point, the storage I/O overhead increases significantly, since
+   we must look up subxid entries in <filename>pg_subtrans</filename>.
+  </para>
+ </sect1>
+
+ <sect1 id="two-phase">
+
+  <title>Two-Phase Transactions</title>
+
+  <para>
+   <productname>PostgreSQL</productname> supports a two-phase commit (2PC)
+   protocol that allows multiple distributed systems to work together
+   in a transactional manner.  The commands are <command>PREPARE
+   TRANSACTION</command>, <command>COMMIT PREPARED</command> and
+   <command>ROLLBACK PREPARED</command>.  Two-phase transactions
+   are intended for use by external transaction management systems.
+   <productname>PostgreSQL</productname> follows the features and model
+   proposed by the X/Open XA standard, but does not implement some of
+   its less frequently used aspects.
+  </para>
+
+  <para>
+   After <command>PREPARE TRANSACTION</command>, the only commands that
+   can act on the prepared transaction are <command>COMMIT PREPARED</command>
+   and <command>ROLLBACK PREPARED</command>.  In general, this prepared
+   state is intended to be of very short duration, but external
+   availability issues might mean transactions stay in this state
+   for an extended interval. Short-lived prepared
+   transactions are stored only in shared memory and WAL.
+   Transactions that span checkpoints are recorded in the
+   <filename>pg_twophase</filename> directory.  Transactions
+   that are currently prepared can be inspected using <link
+   linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.
+  </para>
+ </sect1>
+
+</chapter>
+
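
As an aside that is not part of the patch: the identifier formats described
in the "Transactions and Identifiers" section can be illustrated with a small
Python sketch. The function names below are hypothetical and exist only for
this illustration. The sketch relies on two things: the "backendID/localXID"
text form of a virtual transaction ID shown in the chapter, and the
assumption that an xid8 value holds the epoch in its upper 32 bits and the
32-bit xid in its lower 32 bits.

```python
XID_BITS = 32                      # width of the internal xid type
XID_MASK = (1 << XID_BITS) - 1     # low 32 bits of an xid8 value


def parse_vxid(text):
    """Split a virtual transaction ID such as '4/12532' into
    (backendID, localXID). Hypothetical helper for illustration."""
    backend_id, local_xid = text.split("/")
    return int(backend_id), int(local_xid)


def split_xid8(xid8):
    """Split a 64-bit xid8 value into (epoch, xid), assuming the epoch
    is stored in the upper 32 bits. Hypothetical helper."""
    return xid8 >> XID_BITS, xid8 & XID_MASK


def make_xid8(epoch, xid):
    """Inverse of split_xid8 under the same layout assumption."""
    return (epoch << XID_BITS) | (xid & XID_MASK)


if __name__ == "__main__":
    print(parse_vxid("4/12532"))
    print(split_xid8(make_xid8(1, 278394)))
```

Under that layout assumption, an xid8 and the xid obtained from it agree
until the first wraparound, after which the xid8 keeps growing while the
32-bit xid starts over.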
