Greetings,

  We've come across this rather annoying error happening during our
  builds:

  ERROR:  could not create directory "pg_tblspc/25120/PG_9.3_201212081/231253": File exists

  This is coming from copydir() when called by createdb() during a
  CREATE DATABASE .. TEMPLATE where the template has tables in
  tablespaces.

  createdb() currently takes only an AccessShareLock on pg_tablespace
  when scanning it with SnapshotNow.  That makes it possible for a
  concurrent process to make some uninteresting modification to a
  tablespace (such as an ACL change) in the middle of the scan, which
  causes the heap scan in createdb() to see that tablespace more than
  once: first the prior-to-update tuple and then the new tuple created
  by the update.  copydir() will then, rightfully, complain that it's
  being asked to create a directory which already exists.
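
  As a rough illustration of that race, here is a toy model in C.  It
  is not PostgreSQL code: ToyHeap, toy_insert(), toy_update() and
  toy_scan_count() are hypothetical names, and the model assumes only
  that an update leaves the old tuple version in place, appends the new
  version further along in the heap, and that the scan checks
  visibility at the moment it reaches each tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TUPLES 8

/* One physical tuple version in a toy heap (not PostgreSQL's real layout). */
typedef struct {
    unsigned spc_oid;   /* tablespace OID this version describes */
    bool     live;      /* true while this version is the committed one */
} ToyTuple;

typedef struct {
    ToyTuple tuples[TOY_MAX_TUPLES];
    size_t   ntuples;
} ToyHeap;

/* Append a new live tuple version at the end of the heap. */
static bool toy_insert(ToyHeap *heap, unsigned spc_oid)
{
    if (heap->ntuples >= TOY_MAX_TUPLES)
        return false;
    heap->tuples[heap->ntuples].spc_oid = spc_oid;
    heap->tuples[heap->ntuples].live = true;
    heap->ntuples++;
    return true;
}

/* Model an update: append the new version, then mark the old one dead. */
static bool toy_update(ToyHeap *heap, size_t pos)
{
    if (pos >= heap->ntuples || !heap->tuples[pos].live)
        return false;
    if (!toy_insert(heap, heap->tuples[pos].spc_oid))
        return false;
    heap->tuples[pos].live = false;
    return true;
}

/*
 * Forward scan that tests visibility when each tuple is reached, with no
 * snapshot fixed at scan start.  If concurrent_update is set, another
 * session's update of the tuple at update_pos commits just after the scan
 * has passed it.  Returns how many times target_oid was seen.
 */
static int toy_scan_count(ToyHeap *heap, unsigned target_oid,
                          bool concurrent_update, size_t update_pos)
{
    int seen = 0;

    assert(heap != NULL);
    for (size_t i = 0; i < heap->ntuples; i++) {
        if (heap->tuples[i].live && heap->tuples[i].spc_oid == target_oid)
            seen++;
        if (concurrent_update && i == update_pos)
            toy_update(heap, update_pos);
    }
    return seen;
}
```

  With a heap holding two tablespaces, an undisturbed scan reports each
  OID once.  With the update firing right after the scan passes the old
  version, the same OID is reported twice, which is the situation where
  copydir() gets asked to create the same directory a second time.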

  Given that this is during createdb(), I'm guessing we don't have any
  option but to switch the scan to using ShareLock, since there isn't a
  snapshot available to do an MVCC scan with (and I suspect there could
  be other issues with trying to do that anyway).

  Attached is a patch which does this and corrects the problem for us
  (of course, we're now going to go rework our build system to not
  modify tablespace ACLs, since with this patch concurrent builds end up
  blocking on each other, which is annoying).
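
  For reference, the blocking can be read off the table-level lock
  conflict rules.  The sketch below encodes only the three modes
  relevant here, under the assumption that a tablespace ACL change
  takes RowExclusiveLock on pg_tablespace, as catalog row updates
  conventionally do.  The ToyLockMode names and toy_would_block() are
  hypothetical; this is not the lock manager's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy subset of the table-level lock modes; names are illustrative only. */
typedef enum {
    TOY_ACCESS_SHARE = 0,   /* what createdb() took before the patch */
    TOY_ROW_EXCLUSIVE,      /* assumed mode for an ACL update on the catalog */
    TOY_SHARE,              /* what createdb() takes with the patch */
    TOY_NUM_MODES
} ToyLockMode;

/* toy_conflicts[held][requested] is true when the request must wait. */
static const bool toy_conflicts[TOY_NUM_MODES][TOY_NUM_MODES] = {
    /*                    AccessShare  RowExclusive  Share */
    /* AccessShare  */  { false,       false,        false },
    /* RowExclusive */  { false,       false,        true  },
    /* Share        */  { false,       true,         false },
};

/* Would a request for 'requested' block behind a holder of 'held'? */
static bool toy_would_block(ToyLockMode held, ToyLockMode requested)
{
    assert(held < TOY_NUM_MODES && requested < TOY_NUM_MODES);
    return toy_conflicts[held][requested];
}
```

  Two scans holding ShareLock do not block each other.  The waiting
  comes from the ACL update conflicting with the scan in both
  directions, so builds that both alter tablespace ACLs and create
  databases end up serialized.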

        Thanks,

                Stephen
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
new file mode 100644
index 1f6e02d..60a5099
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** createdb(const CreatedbStmt *stmt)
*** 549,556 ****
  		/*
  		 * Iterate through all tablespaces of the template database, and copy
  		 * each one to the new database.
  		 */
! 		rel = heap_open(TableSpaceRelationId, AccessShareLock);
  		scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
  		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
  		{
--- 549,563 ----
  		/*
  		 * Iterate through all tablespaces of the template database, and copy
  		 * each one to the new database.
+ 		 *
+ 		 * We need to use ShareLock to prevent other processes from updating a
+ 		 * tablespace (such as changing an ACL, for example) or we will end up
+ 		 * running into the same tablespace multiple times during our heap scan
+ 		 * (the prior-to-update tuple and then the new tuple after the update)
+ 		 * and copydir() will, rightfully, complain that the directory already
+ 		 * exists.
  		 */
! 		rel = heap_open(TableSpaceRelationId, ShareLock);
  		scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
  		while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
  		{
*************** createdb(const CreatedbStmt *stmt)
*** 607,613 ****
  			}
  		}
  		heap_endscan(scan);
! 		heap_close(rel, AccessShareLock);
  
  		/*
  		 * We force a checkpoint before committing.  This effectively means
--- 614,620 ----
  			}
  		}
  		heap_endscan(scan);
! 		heap_close(rel, ShareLock);
  
  		/*
  		 * We force a checkpoint before committing.  This effectively means
