Re: [PATCHES] [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled

2007-11-15 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Here's an updated version of the patch. There was a bogus assertion in
> the previous one, comparing against mdsync_cycle_ctr instead of
> mdunlink_cycle_ctr.

Applied with minor corrections.

I'm not sure whether we should consider back-patching this.  Thoughts?

regards, tom lane


Re: [PATCHES] [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled

2007-10-20 Thread Heikki Linnakangas
Here's an updated version of the patch. There was a bogus assertion in
the previous one, comparing against mdsync_cycle_ctr instead of
mdunlink_cycle_ctr.

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas [EMAIL PROTECTED] writes:
>>> The best I can think of is to rename the obsolete file to
>>> relfilenode.stale, when it's scheduled for deletion at next
>>> checkpoint, and check for .stale-suffixed files in GetNewRelFileNode,
>>> and delete them immediately in DropTableSpace.
>> This is getting too Rube Goldbergian for my tastes.  What if we just
>> make DROP TABLESPACE force a checkpoint before proceeding?
>
> Patch attached.
>
> The scenario we're preventing is still possible if for some reason the
> latest checkpoint record is damaged, and we start recovery from the
> previous checkpoint record. I think the probability of that happening,
> together with the OID wrap-around and hitting the relfilenode of a
> recently deleted file with a new one, is low enough to not worry about.
> If we cared, we could fix it by letting the files linger for two
> checkpoint cycles instead of one.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.286
diff -c -r1.286 xlog.c
*** src/backend/access/transam/xlog.c	12 Oct 2007 19:39:59 -	1.286
--- src/backend/access/transam/xlog.c	18 Oct 2007 20:16:56 -
***
*** 45,50 
--- 45,51 
  #include "storage/fd.h"
  #include "storage/pmsignal.h"
  #include "storage/procarray.h"
+ #include "storage/smgr.h"
  #include "storage/spin.h"
  #include "utils/builtins.h"
  #include "utils/pg_locale.h"
***
*** 5668,5673 
--- 5669,5680 
  	checkPoint.time = time(NULL);
  
  	/*
+ 	 * Let the md storage manager to prepare for checkpoint before
+ 	 * we determine the REDO pointer.
+ 	 */
+ 	mdcheckpointbegin();
+ 
+ 	/*
  	 * We must hold WALInsertLock while examining insert state to determine
  	 * the checkpoint REDO pointer.
  	 */
***
*** 5887,5892 
--- 5894,5904 
  	END_CRIT_SECTION();
  
  	/*
+ 	 * Let the md storage manager to do its post-checkpoint work.
+ 	 */
+ 	mdcheckpointend();
+ 
+ 	/*
  	 * Delete old log files (those no longer needed even for previous
  	 * checkpoint).
  	 */
Index: src/backend/commands/tablespace.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/tablespace.c,v
retrieving revision 1.49
diff -c -r1.49 tablespace.c
*** src/backend/commands/tablespace.c	1 Aug 2007 22:45:08 -	1.49
--- src/backend/commands/tablespace.c	18 Oct 2007 20:31:53 -
***
*** 460,472 
  	LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
  
  	/*
! 	 * Try to remove the physical infrastructure
  	 */
  	if (!remove_tablespace_directories(tablespaceoid, false))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
! 				 errmsg("tablespace \"%s\" is not empty",
! 						tablespacename)));
  
  	/* Record the filesystem change in XLOG */
  	{
--- 460,488 
  	LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
  
  	/*
! 	 * Try to remove the physical infrastructure. 
  	 */
  	if (!remove_tablespace_directories(tablespaceoid, false))
! 	{
! 		/*
! 		 * There can be lingering empty files in the directories, left behind
! 		 * by for example DROP TABLE, that have been scheduled for deletion
! 		 * at next checkpoint (see comments in mdunlink() for details). We 
! 		 * could just delete them immediately, but we can't tell them apart
! 		 * from important data files that we mustn't delete. So instead, we
! 		 * force a checkpoint which will clean out any lingering files, and
! 		 * try again.
! 		 */
! 		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
! 		if (!remove_tablespace_directories(tablespaceoid, false))
! 		{
! 			/* Still not empty, the files must be important then */
! 			ereport(ERROR,
! 					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
! 					 errmsg("tablespace \"%s\" is not empty",
! 							tablespacename)));
! 		}
! 	}
  
  	/* Record the filesystem change in XLOG */
  	{
Index: src/backend/storage/smgr/md.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.129
diff -c -r1.129 md.c
*** src/backend/storage/smgr/md.c	3 Jul 2007 14:51:24 -	1.129
--- src/backend/storage/smgr/md.c	20 Oct 2007 14:10:08 -
***
*** 34,39 
--- 34,40 
  /* special values for the segno arg to RememberFsyncRequest */
  #define FORGET_RELATION_FSYNC	(InvalidBlockNumber)
  #define FORGET_DATABASE_FSYNC	(InvalidBlockNumber-1)
+ #define UNLINK_RELATION_REQUEST	(InvalidBlockNumber-2)
  
  /*
   * On Windows, we have to interpret EACCES 

Re: [PATCHES] [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled

2007-10-18 Thread Heikki Linnakangas
Tom Lane wrote:
> Heikki Linnakangas [EMAIL PROTECTED] writes:
>> The best I can think of is to rename the obsolete file to
>> relfilenode.stale, when it's scheduled for deletion at next
>> checkpoint, and check for .stale-suffixed files in GetNewRelFileNode,
>> and delete them immediately in DropTableSpace.
>
> This is getting too Rube Goldbergian for my tastes.  What if we just
> make DROP TABLESPACE force a checkpoint before proceeding?

Patch attached.

The scenario we're preventing is still possible if for some reason the
latest checkpoint record is damaged, and we start recovery from the
previous checkpoint record. I think the probability of that happening,
together with the OID wrap-around and hitting the relfilenode of a
recently deleted file with a new one, is low enough to not worry about.
If we cared, we could fix it by letting the files linger for two
checkpoint cycles instead of one.
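
To make the "linger for N checkpoint cycles" idea concrete, here is a minimal
standalone model of deferred unlinking with a cycle counter. It is only a
sketch and is not the code in the attached patch: the PendingUnlink struct,
the fixed-size array, the function names and the linger_cycles parameter are
all assumptions made up for illustration. Passing 1 gives the behaviour
described above (files go away at the end of the next checkpoint); passing 2
models the two-cycle variant.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 16

/* Hypothetical pending-unlink entry; not the structure used in md.c. */
typedef struct
{
	unsigned	relfilenode;	/* file waiting to be deleted */
	unsigned	cycle_ctr;		/* counter value when the request was queued */
	bool		in_use;
} PendingUnlink;

static PendingUnlink pending[MAX_PENDING];
static unsigned unlink_cycle_ctr = 0;

/* Queue a file for deletion instead of unlinking it right away. */
static bool
queue_unlink(unsigned relfilenode)
{
	for (int i = 0; i < MAX_PENDING; i++)
	{
		if (!pending[i].in_use)
		{
			pending[i] = (PendingUnlink) {relfilenode, unlink_cycle_ctr, true};
			return true;
		}
	}
	return false;				/* table full */
}

/*
 * Run before the REDO pointer is determined: requests queued from now on
 * belong to the next cycle and must survive the checkpoint in progress.
 */
static void
checkpoint_begin(void)
{
	unlink_cycle_ctr++;
}

/*
 * Run after the checkpoint completes: drop entries that are at least
 * linger_cycles checkpoints old.  Returns the number of entries dropped.
 */
static int
checkpoint_end(unsigned linger_cycles)
{
	int			removed = 0;

	assert(linger_cycles >= 1);
	for (int i = 0; i < MAX_PENDING; i++)
	{
		/* unsigned subtraction keeps the age correct across counter wrap */
		if (pending[i].in_use &&
			unlink_cycle_ctr - pending[i].cycle_ctr >= linger_cycles)
		{
			pending[i].in_use = false;	/* a real version would unlink() here */
			removed++;
		}
	}
	return removed;
}

/* Count the entries still waiting for deletion. */
static int
count_pending(void)
{
	int			n = 0;

	for (int i = 0; i < MAX_PENDING; i++)
		if (pending[i].in_use)
			n++;
	return n;
}
```

In this model a request queued before checkpoint_begin() is dropped by
checkpoint_end(1), while one queued after checkpoint_begin() waits for the
following checkpoint. With checkpoint_end(2) every entry survives one extra
checkpoint, at the cost of keeping the empty files around longer.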

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.286
diff -c -r1.286 xlog.c
*** src/backend/access/transam/xlog.c	12 Oct 2007 19:39:59 -	1.286
--- src/backend/access/transam/xlog.c	18 Oct 2007 20:16:56 -
***
*** 45,50 
--- 45,51 
  #include "storage/fd.h"
  #include "storage/pmsignal.h"
  #include "storage/procarray.h"
+ #include "storage/smgr.h"
  #include "storage/spin.h"
  #include "utils/builtins.h"
  #include "utils/pg_locale.h"
***
*** 5668,5673 
--- 5669,5680 
  	checkPoint.time = time(NULL);
  
  	/*
+ 	 * Let the md storage manager to prepare for checkpoint before
+ 	 * we determine the REDO pointer.
+ 	 */
+ 	mdcheckpointbegin();
+ 
+ 	/*
  	 * We must hold WALInsertLock while examining insert state to determine
  	 * the checkpoint REDO pointer.
  	 */
***
*** 5887,5892 
--- 5894,5904 
  	END_CRIT_SECTION();
  
  	/*
+ 	 * Let the md storage manager to do its post-checkpoint work.
+ 	 */
+ 	mdcheckpointend();
+ 
+ 	/*
  	 * Delete old log files (those no longer needed even for previous
  	 * checkpoint).
  	 */
Index: src/backend/commands/tablespace.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/tablespace.c,v
retrieving revision 1.49
diff -c -r1.49 tablespace.c
*** src/backend/commands/tablespace.c	1 Aug 2007 22:45:08 -	1.49
--- src/backend/commands/tablespace.c	18 Oct 2007 20:31:53 -
***
*** 460,472 
  	LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
  
  	/*
! 	 * Try to remove the physical infrastructure
  	 */
  	if (!remove_tablespace_directories(tablespaceoid, false))
! 		ereport(ERROR,
! 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
! 				 errmsg("tablespace \"%s\" is not empty",
! 						tablespacename)));
  
  	/* Record the filesystem change in XLOG */
  	{
--- 460,488 
  	LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
  
  	/*
! 	 * Try to remove the physical infrastructure. 
  	 */
  	if (!remove_tablespace_directories(tablespaceoid, false))
! 	{
! 		/*
! 		 * There can be lingering empty files in the directories, left behind
! 		 * by for example DROP TABLE, that have been scheduled for deletion
! 		 * at next checkpoint (see comments in mdunlink() for details). We 
! 		 * could just delete them immediately, but we can't tell them apart
! 		 * from important data files that we mustn't delete. So instead, we
! 		 * force a checkpoint which will clean out any lingering files, and
! 		 * try again.
! 		 */
! 		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
! 		if (!remove_tablespace_directories(tablespaceoid, false))
! 		{
! 			/* Still not empty, the files must be important then */
! 			ereport(ERROR,
! 					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
! 					 errmsg("tablespace \"%s\" is not empty",
! 							tablespacename)));
! 		}
! 	}
  
  	/* Record the filesystem change in XLOG */
  	{
Index: src/backend/storage/smgr/md.c
===
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.129
diff -c -r1.129 md.c
*** src/backend/storage/smgr/md.c	3 Jul 2007 14:51:24 -	1.129
--- src/backend/storage/smgr/md.c	18 Oct 2007 21:11:43 -
***
*** 34,39 
--- 34,40 
  /* special values for the segno arg to RememberFsyncRequest */
  #define FORGET_RELATION_FSYNC	(InvalidBlockNumber)
  #define FORGET_DATABASE_FSYNC	(InvalidBlockNumber-1)
+ #define UNLINK_RELATION_REQUEST	(InvalidBlockNumber-2)
  
  /*
   * On Windows, we have to interpret EACCES as possibly meaning the same as
***
*** 113,118 
--- 114,123 
   * table remembers the pending operations.	We use a hash table mostly as
   * a convenient way of