The last item on my list before close is making VACUUM FULL and Hot
Standby play nicely together.

The options to do this were and still are:

(1) Add WAL messages for non-transactional relcache invalidations
(2) Allow system relations to be cluster-ed/vacuum full-ed.

(1) was how we did it originally and I believe it worked without
problem. We saw the opportunity to do (2) and it has been worth
exploring.

My approach to (2) was to look at this design-wise. Given my experience
with other aspects of relcache and invalidation, it all looked do-able
without problem. The design seems straightforward in a few ways, though
had a few special cases.

I attach a WIP patch that is sufficient to do psql -c "VACUUM FULL
pg_am;" successfully, it being the easiest case out of the special
cases. It's also easy to remove the special case-avoidance code in
vacuum.c to see the various failures that occur without further work.
I've added tests to cover the new cases, which currently cause make
check to fail solely because I haven't updated the test output, since
this is a WIP.

I'm not aware of any specific technical blockers to continuing with (2).

Having said that I now realise a few things I didn't before:

* Approach (2) effects the core of Postgres, even if you don't use Hot
Standby.
* I've had to remove 7 sanity checks to get the first few VACUUMs
working. ISTM that removing various basic checks in the code is not a
good thing. 
* There are are more special cases than I realised at first: temp,
shared, with-toast, nailed, shared-and-nailed, pg_class, normal system.

Taken together, ISTM that the benefits of progressing towards (2) are
not worth the potential burst radius of any problems introduced. With
the larger number of special cases and the removal of checks we may be
less able to spot or cope with failure. Given that people are getting
edgy about code instability, I would say there's likely to be some here.

My feeling is that we should return to approach (1) for Postgres 9.0,
since we have an approach that works and isolates any change to just Hot
Standby related codepaths. I'm not personally keen to introduce core
changes unrelated to the new feature (HS).

Your thoughts, please.

-- 
 Simon Riggs           www.2ndQuadrant.com
Index: src/backend/catalog/index.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.331
diff -c -r1.331 index.c
*** src/backend/catalog/index.c	22 Jan 2010 16:40:18 -0000	1.331
--- src/backend/catalog/index.c	30 Jan 2010 14:59:11 -0000
***************
*** 600,614 ****
  				 errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
  
  	/*
- 	 * We cannot allow indexing a shared relation after initdb (because
- 	 * there's no way to make the entry in other databases' pg_class).
- 	 */
- 	if (shared_relation && !IsBootstrapProcessingMode())
- 		ereport(ERROR,
- 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- 				 errmsg("shared indexes cannot be created after initdb")));
- 
- 	/*
  	 * Validate shared/non-shared tablespace (must check this before doing
  	 * GetNewRelFileNode, to prevent Assert therein)
  	 */
--- 600,605 ----
Index: src/backend/catalog/toasting.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/catalog/toasting.c,v
retrieving revision 1.28
diff -c -r1.28 toasting.c
*** src/backend/catalog/toasting.c	28 Jan 2010 23:21:11 -0000	1.28
--- src/backend/catalog/toasting.c	30 Jan 2010 14:59:28 -0000
***************
*** 127,141 ****
  
  	/*
  	 * Toast table is shared if and only if its parent is.
- 	 *
- 	 * We cannot allow toasting a shared relation after initdb (because
- 	 * there's no way to mark it toasted in other databases' pg_class).
  	 */
  	shared_relation = rel->rd_rel->relisshared;
- 	if (shared_relation && !IsBootstrapProcessingMode())
- 		ereport(ERROR,
- 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- 				 errmsg("shared tables cannot be toasted after initdb")));
  
  	/*
  	 * Is it already toasted?
--- 127,134 ----
Index: src/backend/commands/cluster.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.195
diff -c -r1.195 cluster.c
*** src/backend/commands/cluster.c	28 Jan 2010 23:21:11 -0000	1.195
--- src/backend/commands/cluster.c	30 Jan 2010 17:28:17 -0000
***************
*** 376,394 ****
  	Relation	OldIndex;
  
  	/*
- 	 * Disallow clustering system relations.  This will definitely NOT work
- 	 * for shared relations (we have no way to update pg_class rows in other
- 	 * databases), nor for nailed-in-cache relations (the relfilenode values
- 	 * for those are hardwired, see relcache.c).  It might work for other
- 	 * system relations, but I ain't gonna risk it.
- 	 */
- 	if (IsSystemRelation(OldHeap))
- 		ereport(ERROR,
- 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- 				 errmsg("\"%s\" is a system catalog",
- 						RelationGetRelationName(OldHeap))));
- 
- 	/*
  	 * Don't allow cluster on temp tables of other backends ... their local
  	 * buffer manager is not going to cope.
  	 */
--- 376,381 ----
***************
*** 605,611 ****
  	 */
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", tableOid);
  
! 	OIDNewHeap = make_new_heap(tableOid, NewHeapName, tableSpace);
  
  	/*
  	 * We don't need CommandCounterIncrement() because make_new_heap did it.
--- 592,602 ----
  	 */
  	snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", tableOid);
  
! 	/* 
! 	 * Override the value of allow_system_table_mods to allow us to create
! 	 * a new relation in system namespace.
! 	 */
! 	OIDNewHeap = make_new_heap(tableOid, NewHeapName, tableSpace, IsSystemRelation(OldHeap));
  
  	/*
  	 * We don't need CommandCounterIncrement() because make_new_heap did it.
***************
*** 621,627 ****
  	CommandCounterIncrement();
  
  	/* Swap the physical files of the old and new heaps. */
! 	swap_relation_files(tableOid, OIDNewHeap, frozenXid);
  
  	CommandCounterIncrement();
  
--- 612,618 ----
  	CommandCounterIncrement();
  
  	/* Swap the physical files of the old and new heaps. */
! 	swap_relation_files(tableOid, OIDNewHeap, frozenXid, IsSystemRelation(OldHeap));
  
  	CommandCounterIncrement();
  
***************
*** 679,685 ****
   * Create the new table that we will fill with correctly-ordered data.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, const char *NewName, Oid NewTableSpace)
  {
  	TupleDesc	OldHeapDesc,
  				tupdesc;
--- 670,676 ----
   * Create the new table that we will fill with correctly-ordered data.
   */
  Oid
! make_new_heap(Oid OIDOldHeap, const char *NewName, Oid NewTableSpace, bool allow_system_table_mods)
  {
  	TupleDesc	OldHeapDesc,
  				tupdesc;
***************
*** 731,737 ****
  										  ONCOMMIT_NOOP,
  										  reloptions,
  										  false,
! 										  allowSystemTableMods);
  
  	ReleaseSysCache(tuple);
  
--- 722,728 ----
  										  ONCOMMIT_NOOP,
  										  reloptions,
  										  false,
! 										  allow_system_table_mods);
  
  	ReleaseSysCache(tuple);
  
***************
*** 1028,1034 ****
   * which is the correct value.
   */
  void
! swap_relation_files(Oid r1, Oid r2, TransactionId frozenXid)
  {
  	Relation	relRelation;
  	HeapTuple	reltup1,
--- 1019,1025 ----
   * which is the correct value.
   */
  void
! swap_relation_files(Oid r1, Oid r2, TransactionId frozenXid, bool allow_system_table_mods)
  {
  	Relation	relRelation;
  	HeapTuple	reltup1,
***************
*** 1125,1131 ****
  		{
  			count = deleteDependencyRecordsFor(RelationRelationId,
  											   relform1->reltoastrelid);
! 			if (count != 1)
  				elog(ERROR, "expected one dependency record for TOAST table, found %ld",
  					 count);
  		}
--- 1116,1122 ----
  		{
  			count = deleteDependencyRecordsFor(RelationRelationId,
  											   relform1->reltoastrelid);
! 			if (count != 1 && !allow_system_table_mods)
  				elog(ERROR, "expected one dependency record for TOAST table, found %ld",
  					 count);
  		}
***************
*** 1133,1139 ****
  		{
  			count = deleteDependencyRecordsFor(RelationRelationId,
  											   relform2->reltoastrelid);
! 			if (count != 1)
  				elog(ERROR, "expected one dependency record for TOAST table, found %ld",
  					 count);
  		}
--- 1124,1130 ----
  		{
  			count = deleteDependencyRecordsFor(RelationRelationId,
  											   relform2->reltoastrelid);
! 			if (count != 1 && !allow_system_table_mods)
  				elog(ERROR, "expected one dependency record for TOAST table, found %ld",
  					 count);
  		}
Index: src/backend/commands/tablecmds.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.320
diff -c -r1.320 tablecmds.c
*** src/backend/commands/tablecmds.c	28 Jan 2010 23:21:11 -0000	1.320
--- src/backend/commands/tablecmds.c	30 Jan 2010 13:57:22 -0000
***************
*** 2902,2908 ****
  			snprintf(NewHeapName, sizeof(NewHeapName),
  					 "pg_temp_%u", tab->relid);
  
! 			OIDNewHeap = make_new_heap(tab->relid, NewHeapName, NewTableSpace);
  
  			/*
  			 * Copy the heap data into the new table with the desired
--- 2902,2908 ----
  			snprintf(NewHeapName, sizeof(NewHeapName),
  					 "pg_temp_%u", tab->relid);
  
! 			OIDNewHeap = make_new_heap(tab->relid, NewHeapName, NewTableSpace, allowSystemTableMods);
  
  			/*
  			 * Copy the heap data into the new table with the desired
***************
*** 2917,2923 ****
  			 * new relfrozenxid because we rewrote all the tuples on
  			 * ATRewriteTable, so no older Xid remains on the table.
  			 */
! 			swap_relation_files(tab->relid, OIDNewHeap, RecentXmin);
  
  			CommandCounterIncrement();
  
--- 2917,2923 ----
  			 * new relfrozenxid because we rewrote all the tuples on
  			 * ATRewriteTable, so no older Xid remains on the table.
  			 */
! 			swap_relation_files(tab->relid, OIDNewHeap, RecentXmin, allowSystemTableMods);
  
  			CommandCounterIncrement();
  
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.403
diff -c -r1.403 vacuum.c
*** src/backend/commands/vacuum.c	6 Jan 2010 05:31:13 -0000	1.403
--- src/backend/commands/vacuum.c	30 Jan 2010 18:24:21 -0000
***************
*** 1187,1193 ****
  	 */
  	if (!(vacstmt->options & VACOPT_FULL))
  		heldoff = lazy_vacuum_rel(onerel, vacstmt, vac_strategy, scanned_all);
! 	else if ((vacstmt->options & VACOPT_INPLACE) || IsSystemRelation(onerel))
  		heldoff = full_vacuum_rel(onerel, vacstmt);
  	else
  	{
--- 1187,1197 ----
  	 */
  	if (!(vacstmt->options & VACOPT_FULL))
  		heldoff = lazy_vacuum_rel(onerel, vacstmt, vac_strategy, scanned_all);
! 	else if ((vacstmt->options & VACOPT_INPLACE) || 
! 				IsSharedRelation(relid) || 
! 				onerel->rd_istemp ||
! 				onerel->rd_isnailed || 
! 				(IsSystemRelation(onerel) && toast_relid != InvalidOid))
  		heldoff = full_vacuum_rel(onerel, vacstmt);
  	else
  	{
Index: src/backend/utils/cache/relcache.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/utils/cache/relcache.c,v
retrieving revision 1.300
diff -c -r1.300 relcache.c
*** src/backend/utils/cache/relcache.c	13 Jan 2010 23:07:08 -0000	1.300
--- src/backend/utils/cache/relcache.c	30 Jan 2010 14:59:59 -0000
***************
*** 2437,2452 ****
  	}
  
  	/*
- 	 * check that hardwired list of shared rels matches what's in the
- 	 * bootstrap .bki file.  If you get a failure here during initdb, you
- 	 * probably need to fix IsSharedRelation() to match whatever you've done
- 	 * to the set of shared relations.
- 	 */
- 	if (shared_relation != IsSharedRelation(relid))
- 		elog(ERROR, "shared_relation flag for \"%s\" does not match IsSharedRelation(%u)",
- 			 relname, relid);
- 
- 	/*
  	 * switch to the cache context to create the relcache entry.
  	 */
  	if (!CacheMemoryContext)
--- 2437,2442 ----
Index: src/bin/initdb/initdb.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.184
diff -c -r1.184 initdb.c
*** src/bin/initdb/initdb.c	26 Jan 2010 16:18:12 -0000	1.184
--- src/bin/initdb/initdb.c	30 Jan 2010 15:02:25 -0000
***************
*** 1900,1906 ****
  
  	PG_CMD_OPEN;
  
! 	PG_CMD_PUTS("ANALYZE;\nVACUUM FULL;\nVACUUM FREEZE;\n");
  
  	PG_CMD_CLOSE;
  
--- 1900,1906 ----
  
  	PG_CMD_OPEN;
  
! 	PG_CMD_PUTS("ANALYZE;\nVACUUM;\nVACUUM FREEZE;\n");
  
  	PG_CMD_CLOSE;
  
***************
*** 1940,1946 ****
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM FULL pg_database;\n",
  		NULL
  	};
  
--- 1940,1946 ----
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */
! 		"VACUUM pg_database;\n",
  		NULL
  	};
  
Index: src/include/commands/cluster.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/commands/cluster.h,v
retrieving revision 1.38
diff -c -r1.38 cluster.h
*** src/include/commands/cluster.h	6 Jan 2010 05:31:14 -0000	1.38
--- src/include/commands/cluster.h	30 Jan 2010 13:55:53 -0000
***************
*** 24,30 ****
  						   bool recheck);
  extern void mark_index_clustered(Relation rel, Oid indexOid);
  extern Oid make_new_heap(Oid OIDOldHeap, const char *NewName,
! 			  Oid NewTableSpace);
! extern void swap_relation_files(Oid r1, Oid r2, TransactionId frozenXid);
  
  #endif   /* CLUSTER_H */
--- 24,31 ----
  						   bool recheck);
  extern void mark_index_clustered(Relation rel, Oid indexOid);
  extern Oid make_new_heap(Oid OIDOldHeap, const char *NewName,
! 			  Oid NewTableSpace, bool allow_system_table_mods);
! extern void swap_relation_files(Oid r1, Oid r2, TransactionId frozenXid,
! 								 bool allow_system_table_mods);
  
  #endif   /* CLUSTER_H */
Index: src/test/regress/sql/vacuum.sql
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/test/regress/sql/vacuum.sql,v
retrieving revision 1.3
diff -c -r1.3 vacuum.sql
*** src/test/regress/sql/vacuum.sql	6 Jan 2010 05:31:14 -0000	1.3
--- src/test/regress/sql/vacuum.sql	30 Jan 2010 17:59:43 -0000
***************
*** 56,67 ****
  
  INSERT INTO vacid (relid, filenode_0)
  SELECT oid, relfilenode FROM pg_class WHERE oid::regclass IN (
!   'pg_am',       -- normal catalog
!   'pg_class',    -- fundamental catalog
!   'pg_database', -- shared catalog
!   'vaccluster' , -- clustered table
!   'vacid',       -- temp table
!   'vactst'       -- normal table
  );
  
  -- only clusterd table should be changed
--- 56,72 ----
  
  INSERT INTO vacid (relid, filenode_0)
  SELECT oid, relfilenode FROM pg_class WHERE oid::regclass IN (
!   'pg_am',       	-- normal catalog
!   'pg_description',	-- normal catalog, with toast
!   'pg_type',		-- nailed catalog
!   'pg_proc',		-- nailed catalog, with toast
!   'pg_tablespace',	-- shared catalog
!   'pg_authid',		-- shared catalog, with toast
!   'pg_class',    	-- fundamental catalog
!   'pg_database', 	-- shared and nailed catalog
!   'vaccluster' , 	-- clustered table
!   'vacid',       	-- temp table
!   'vactst'       	-- normal table
  );
  
  -- only clusterd table should be changed
***************
*** 71,76 ****
--- 76,86 ----
  
  -- all tables should not be changed
  VACUUM (FULL INPLACE) pg_am;
+ VACUUM (FULL INPLACE) pg_description;
+ VACUUM (FULL INPLACE) pg_type;
+ VACUUM (FULL INPLACE) pg_proc;
+ VACUUM (FULL INPLACE) pg_tablespace;
+ VACUUM (FULL INPLACE) pg_authid;
  VACUUM (FULL INPLACE) pg_class;
  VACUUM (FULL INPLACE) pg_database;
  VACUUM (FULL INPLACE) vaccluster;
***************
*** 79,86 ****
  UPDATE vacid SET filenode_2 = relfilenode
    FROM pg_class WHERE oid = relid;
  
- -- only non-system tables should be changed
  VACUUM FULL pg_am;
  VACUUM FULL pg_class;
  VACUUM FULL pg_database;
  VACUUM FULL vaccluster;
--- 89,100 ----
  UPDATE vacid SET filenode_2 = relfilenode
    FROM pg_class WHERE oid = relid;
  
  VACUUM FULL pg_am;
+ VACUUM FULL pg_description;
+ VACUUM FULL pg_type;
+ VACUUM FULL pg_proc;
+ VACUUM FULL pg_tablespace;
+ VACUUM FULL pg_authid;
  VACUUM FULL pg_class;
  VACUUM FULL pg_database;
  VACUUM FULL vaccluster;
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to