Re: [PATCHES] Fast CLUSTER

Simon Riggs Thu, 22 Feb 2007 14:54:00 -0800

On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:

> Cool.  I noticed that the SGML seems broken here:


Corrected. 

> You need to close the <listitem> and <para> opened in the COPY mention.
> 
> > + 
> > + static void
> > + heap_sync_relation(Relation rel)
> > + {
> > +   if (!rel->rd_istemp)
> 
> No comment in this function?

I've added more comments as you suggest.

Thanks for the review.

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com

Index: doc/src/sgml/perform.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/perform.sgml,v
retrieving revision 1.63
diff -c -r1.63 perform.sgml
*** doc/src/sgml/perform.sgml	1 Feb 2007 19:10:24 -0000	1.63
--- doc/src/sgml/perform.sgml	22 Feb 2007 11:50:58 -0000
***************
*** 908,919 ****
      will perform more slowly when <varname>archive_command</varname>
      is set, as a result of their needing to write large amounts of WAL.
      This applies to the following commands: 
!     <command>CREATE TABLE AS SELECT</command>, 
!     <command>CREATE INDEX</command> and also <command>COPY</command>, when
!     it is executed in the same transaction as a prior 
!     <command>CREATE TABLE</command> or <command>TRUNCATE</command> command.
!    </para>
! 
    </sect2>
  
    <sect2 id="populate-analyze">
--- 908,928 ----
      will perform more slowly when <varname>archive_command</varname>
      is set, as a result of their needing to write large amounts of WAL.
      This applies to the following commands: 
!     <itemizedlist>
!      <listitem><para><command>CREATE TABLE AS SELECT</command></para></listitem>
!      <listitem><para><command>CREATE INDEX</command></para></listitem>
!      <listitem><para><command>ALTER TABLE SET TABLESPACE</command></para></listitem>
!      <listitem><para><command>CLUSTER</command></para></listitem>
!      <listitem>
!       <para><command>COPY</command>, when it is executed after one of
! 	  these commands, yet in the same transaction:
! 	  <itemizedlist>
! 	   <listitem><para><command>CREATE TABLE</command></para></listitem>
! 	   <listitem><para><command>TRUNCATE</command></para></listitem>
! 	  </itemizedlist>
!       </para>
!      </listitem>
!   </itemizedlist>
    </sect2>
  
    <sect2 id="populate-analyze">
***************
*** 950,964 ****
      By default, <application>pg_dump</> uses <command>COPY</>, and when
      it is generating a complete schema-and-data dump, it is careful to
      load data before creating indexes and foreign keys.  So in this case
!     the first several guidelines are handled automatically.  What is left
!     for you to do is to set appropriate (i.e., larger than normal) values
!     for <varname>maintenance_work_mem</varname> and
!     <varname>checkpoint_segments</varname>, as well as unsetting 
!     <varname>archive_command</varname> before loading the dump script,
!     and then to run <command>ANALYZE</> afterwards and resetting
!     <varname>archive_command</varname> if required. All of the 
!     parameters can be reset once the load has completed without needing
!     to restart the server, as described in <xref linkend="config-setting">.
     </para>
  
     <para>
--- 959,992 ----
      By default, <application>pg_dump</> uses <command>COPY</>, and when
      it is generating a complete schema-and-data dump, it is careful to
      load data before creating indexes and foreign keys.  So in this case
!     several guidelines are handled automatically.  What is left
!     for you to do is to:
!     <itemizedlist>
!      <listitem><para>Set appropriate (i.e., larger than normal) values
!       for <varname>maintenance_work_mem</varname> and
!       <varname>checkpoint_segments</varname>.</para></listitem>
!      <listitem><para>Unset <varname>archive_command</varname> before
! 	  loading the dump script, and then reset 
! 	  <varname>archive_command</varname> afterwards, if required.</para>
! 	 </listitem>
!      <listitem>
! 	  <para>
!        Consider whether the whole dump can be restored as a single
! 	   transaction, so that COPY commands can be optimized. Remember that
! 	   this will mean that all restore commands will either fully completed
! 	   or fully rolled back. This mode can be specified by passing the 
! 	   <option>-1</> or <option>--single-transaction</> command-line options
! 	   to <application>psql</> or <application>pg_restore</>. When using this
! 	   mode, be aware that even the smallest of errors can rollback a
!        restore that has already run for many hours. However, that might
!        still be preferable to manually cleaning up a complex database
!        after a partially restored dump.
! 	  </para>
! 	 </listitem>
!      <listitem><para>Run <command>ANALYZE</> afterwards.</para></listitem>
! 	All of the above mentioned parameters can be reset once the load has
! 	completed without needing to restart the server, as described in 
! 	<xref linkend="config-setting">.
     </para>
  
     <para>
Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.228
diff -c -r1.228 heapam.c
*** src/backend/access/heap/heapam.c	9 Feb 2007 03:35:33 -0000	1.228
--- src/backend/access/heap/heapam.c	22 Feb 2007 11:51:00 -0000
***************
*** 60,65 ****
--- 60,66 ----
  
  static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
  		   ItemPointerData from, Buffer newbuf, HeapTuple newtup, bool move);
+ static void heap_sync_relation(Relation rel);
  
  
  /* ----------------------------------------------------------------
***************
*** 3991,3996 ****
--- 3992,4003 ----
  /* ----------------
   *		heap_sync - sync a heap, for use when no WAL has been written
   *
+  * 		Will sync all relations associated with the main table,
+  *		i.e. the main heap, indexes, the toast relation and its index
+  *		If new relation types were added, this would need extension
+  *		XXX Indexes are mentioned because its possible in the future that
+  *		we may choose to avoid writing WAL for indexes as well,
+  *		in certain cicrumstances.
   * ----------------
   */
  void
***************
*** 3998,4003 ****
--- 4005,4042 ----
  {
  	if (!rel->rd_istemp)
  	{
+ 		/* main heap */
+ 		heap_sync_relation(rel);
+ 
+ 		/* main heap indexes, if any */
+ 		/* we always use WAL for index inserts, so no need to sync */
+ 
+ 		/* toast heap, if any */
+ 		if (OidIsValid(rel->rd_rel->reltoastrelid))
+ 		{
+ 			 Relation		toastrel;
+ 
+ 			 toastrel = heap_open(rel->rd_rel->reltoastrelid,
+ 								  AccessShareLock);
+ 			 heap_sync_relation(toastrel);
+ 			 heap_close(toastrel, AccessShareLock);
+ 		}
+ 
+ 		/* toast index, if toast heap */
+ 		/* we always use WAL for index inserts, so no need to sync */
+ 	}
+ }
+ 
+ /*
+  * heap_sync_relation
+  *
+  * Worker function for heap_sync(). Performs sync for an individual relation.
+  */
+ static void
+ heap_sync_relation(Relation rel)
+ {
+ 	if (!rel->rd_istemp)
+ 	{
  		/*
  		 * If we skipped using WAL, and it's not a temp relation,
  		 * we must force the relation down to disk before it's
Index: src/backend/commands/cluster.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.156
diff -c -r1.156 cluster.c
*** src/backend/commands/cluster.c	1 Feb 2007 19:10:25 -0000	1.156
--- src/backend/commands/cluster.c	22 Feb 2007 11:51:00 -0000
***************
*** 653,658 ****
--- 653,659 ----
  	char	   *nulls;
  	IndexScanDesc scan;
  	HeapTuple	tuple;
+ 	bool		use_wal;
  
  	/*
  	 * Open the relations we need.
***************
*** 676,681 ****
--- 677,688 ----
  	memset(nulls, 'n', natts * sizeof(char));
  
  	/*
+ 	 * We need to log the copied data in WAL iff WAL archiving is enabled AND
+ 	 * it's not a temp rel.
+ 	 */
+ 	use_wal = XLogArchivingActive() && !NewHeap->rd_istemp;
+ 
+ 	/*
  	 * Scan through the OldHeap on the OldIndex and copy each tuple into the
  	 * NewHeap.
  	 */
***************
*** 722,728 ****
  		if (NewHeap->rd_rel->relhasoids)
  			HeapTupleSetOid(copiedTuple, HeapTupleGetOid(tuple));
  
! 		simple_heap_insert(NewHeap, copiedTuple);
  
  		heap_freetuple(copiedTuple);
  
--- 729,735 ----
  		if (NewHeap->rd_rel->relhasoids)
  			HeapTupleSetOid(copiedTuple, HeapTupleGetOid(tuple));
  
! 		fast_heap_insert(NewHeap, copiedTuple, use_wal);
  
  		heap_freetuple(copiedTuple);
  
***************
*** 736,741 ****
--- 743,752 ----
  
  	index_close(OldIndex, NoLock);
  	heap_close(OldHeap, NoLock);
+ 
+ 	if (!use_wal)
+ 		heap_sync(NewHeap);
+ 
  	heap_close(NewHeap, NoLock);
  }
  
Index: src/backend/commands/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.276
diff -c -r1.276 copy.c
*** src/backend/commands/copy.c	20 Feb 2007 17:32:13 -0000	1.276
--- src/backend/commands/copy.c	22 Feb 2007 11:51:01 -0000
***************
*** 1857,1863 ****
  	 * do not need to worry about inconsistent states.
  	 * As mentioned in comments in utils/rel.h, the in-same-transaction test is
  	 * not completely reliable, since rd_createSubId can be reset to zero in
! 	 * certain cases before the end of the creating transaction. 
  	 * We are doing this for performance only, so we only need to know: 
  	 * if rd_createSubid != InvalidSubTransactionId then it is *always* just 
  	 * created. If we have PITR enabled, then we *must* use_wal
--- 1857,1865 ----
  	 * do not need to worry about inconsistent states.
  	 * As mentioned in comments in utils/rel.h, the in-same-transaction test is
  	 * not completely reliable, since rd_createSubId can be reset to zero in
! 	 * certain cases before the end of the creating transaction.
! 	 * CLUSTER creates a new relfilenode but it removes the old one immediately
! 	 * for other reasons, so we cannot optimise COPY following CLUSTER.
  	 * We are doing this for performance only, so we only need to know: 
  	 * if rd_createSubid != InvalidSubTransactionId then it is *always* just 
  	 * created. If we have PITR enabled, then we *must* use_wal
***************
*** 2121,2148 ****
  	 * If we skipped writing WAL for heaps, then we need to sync
  	 */
  	if (!use_wal)
- 	{
- 		/* main heap */
  		heap_sync(cstate->rel);
  
- 		/* main heap indexes, if any */
- 		/* we always use WAL for index inserts, so no need to sync */
- 
- 		/* toast heap, if any */
- 		if (OidIsValid(cstate->rel->rd_rel->reltoastrelid))
- 		{
- 			 Relation		toastrel;
- 
- 			 toastrel = heap_open(cstate->rel->rd_rel->reltoastrelid,
- 								  AccessShareLock);
- 			 heap_sync(toastrel);
- 			 heap_close(toastrel, AccessShareLock);
- 		}
- 
- 		/* toast index, if toast heap */
- 		/* we always use WAL for index inserts, so no need to sync */
- 	}
- 
  	/* Done, clean up */
  	error_context_stack = errcontext.previous;
  
--- 2123,2130 ----

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Re: [PATCHES] Fast CLUSTER

Reply via email to