Tom Lane wrote:
> I managed to crash the executor in the tablespace.sql test while working
> on a 9.1 patch, and discovered that the postmaster fails to recover
> from that.  The end of postmaster.log looks like
> 
> LOG:  all server processes terminated; reinitializing
> LOG:  database system was interrupted; last known up at 2010-07-11 19:30:07 EDT
> LOG:  database system was not properly shut down; automatic recovery in progress
> LOG:  consistent recovery state reached at 0/EE26F30
> LOG:  redo starts at 0/EE26F30
> FATAL:  directory "/home/postgres/pgsql/src/test/regress/testtablespace/PG_9.1_201004261" already in use as a tablespace
> CONTEXT:  xlog redo create ts: 127158 "/home/postgres/pgsql/src/test/regress/testtablespace"
> LOG:  startup process (PID 13914) exited with exit code 1
> LOG:  aborting startup due to startup process failure
> 
> It looks to me like those well-intentioned recent changes in this area
> broke the crash-recovery case.  Not good.

Sorry for the delay.  I didn't realize this was my code that was broken
until Heikki told me via IM.

The bug is that redo can't replay mkdir()/symlink() and assume they will
always succeed: after a crash, the directory and symlink may already
exist.  I looked at the createdb redo code, and it handles this by
removing the directory before creating it.

The tablespace directory/symlink setup is more complex, so the attached
patch simply runs the drop-tablespace redo logic before the
create-tablespace redo logic.

Ignoring mkdir/symlink creation failure is not an option, because an
existing symlink might point to the wrong location.
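
To illustrate the difference outside the backend, here is a minimal
standalone sketch.  The helper names are hypothetical and this is not
the patch's code; it only assumes POSIX unlink()/symlink()/readlink().
It contrasts "remove, then create" with "ignore EEXIST":

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): replay a "create symlink"
 * step so that running it twice has the same result as running it once.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
replay_create_link(const char *target, const char *linkpath)
{
	assert(target != NULL && linkpath != NULL);

	/* Remove whatever an earlier, interrupted attempt left behind. */
	if (unlink(linkpath) < 0 && errno != ENOENT)
		return -1;

	/* The path is now free, so a leftover link cannot cause EEXIST. */
	return symlink(target, linkpath);
}

/*
 * The alternative argued against above: treat EEXIST as success.
 * Any existing link is kept, even if it points somewhere else.
 */
int
replay_create_link_ignoring_eexist(const char *target, const char *linkpath)
{
	assert(target != NULL && linkpath != NULL);

	if (symlink(target, linkpath) < 0 && errno != EEXIST)
		return -1;
	return 0;
}

/* Return 1 if linkpath is a symlink whose stored target equals 'target'. */
int
link_points_to(const char *linkpath, const char *target)
{
	char		buf[1024];
	ssize_t		len;

	assert(target != NULL && linkpath != NULL);

	len = readlink(linkpath, buf, sizeof(buf) - 1);
	if (len < 0)
		return 0;
	buf[len] = '\0';			/* readlink() does not NUL-terminate */
	return strcmp(buf, target) == 0;
}
```

With a stale link in place, replay_create_link_ignoring_eexist() reports
success but leaves the old target untouched, while replay_create_link()
always ends with the link pointing at the requested target.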

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +
Index: src/backend/commands/tablespace.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/tablespace.c,v
retrieving revision 1.77
diff -c -c -r1.77 tablespace.c
*** src/backend/commands/tablespace.c	18 Jul 2010 04:47:46 -0000	1.77
--- src/backend/commands/tablespace.c	18 Jul 2010 05:17:23 -0000
***************
*** 1355,1368 ****
  	/* Backup blocks are not used in tblspc records */
  	Assert(!(record->xl_info & XLR_BKP_BLOCK_MASK));
  
! 	if (info == XLOG_TBLSPC_CREATE)
! 	{
! 		xl_tblspc_create_rec *xlrec = (xl_tblspc_create_rec *) XLogRecGetData(record);
! 		char	   *location = xlrec->ts_path;
! 
! 		create_tablespace_directories(location, xlrec->ts_id);
! 	}
! 	else if (info == XLOG_TBLSPC_DROP)
  	{
  		xl_tblspc_drop_rec *xlrec = (xl_tblspc_drop_rec *) XLogRecGetData(record);
  
--- 1355,1365 ----
  	/* Backup blocks are not used in tblspc records */
  	Assert(!(record->xl_info & XLR_BKP_BLOCK_MASK));
  
! 	/*
! 	 *	If we are creating a tablespace during recovery, it is unclear
! 	 *	what state it is in, so potentially remove it before creating it.
! 	 */
! 	if (info == XLOG_TBLSPC_DROP || info == XLOG_TBLSPC_CREATE)
  	{
  		xl_tblspc_drop_rec *xlrec = (xl_tblspc_drop_rec *) XLogRecGetData(record);
  
***************
*** 1395,1400 ****
--- 1392,1407 ----
  	}
  	else
  		elog(PANIC, "tblspc_redo: unknown op code %u", info);
+ 
+ 	/* Now create the tablespace we perhaps just removed. */
+ 	if (info == XLOG_TBLSPC_CREATE)
+ 	{
+ 		xl_tblspc_create_rec *xlrec = (xl_tblspc_create_rec *) XLogRecGetData(record);
+ 		char	   *location = xlrec->ts_path;
+ 
+ 		create_tablespace_directories(location, xlrec->ts_id);
+ 	}
+ 
  }
  
  void
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers