Greetings,

We've come across this rather annoying error happening during our builds:

ERROR: could not create directory "pg_tblspc/25120/PG_9.3_201212081/231253": File exists

This is coming from copydir() when called by createdb() during a CREATE DATABASE ... TEMPLATE where the template has tables in tablespaces.

It turns out that createdb() currently only takes an AccessShareLock on pg_tablespace when scanning it with SnapshotNow. That makes it possible for a concurrent process to make some uninteresting modification to a tablespace (such as an ACL change), which causes the heap scan in createdb() to see a given tablespace multiple times: once as the tuple from before the update and again as the new tuple after it. copydir() will then, rightfully, complain that it's being asked to create a directory which already exists.

Given that this is during createdb(), I'm guessing we don't have any option but to switch the scan to using ShareLock, since there isn't a snapshot available to do an MVCC scan with (I'm guessing that there could be other issues trying to do that anyway).

Attached is a patch which does this and corrects the problem for us. Of course, we're now going to go rework our build system to not modify tablespace ACLs, since with this patch concurrent builds end up blocking on each other, which is annoying.

Thanks,

Stephen
colordiff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
new file mode 100644
index 1f6e02d..60a5099
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
*************** createdb(const CreatedbStmt *stmt)
*** 549,556 ****
      /*
       * Iterate through all tablespaces of the template database, and copy
       * each one to the new database.
       */
!     rel = heap_open(TableSpaceRelationId, AccessShareLock);
      scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
      while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
      {
--- 549,563 ----
      /*
       * Iterate through all tablespaces of the template database, and copy
       * each one to the new database.
+      *
+      * We need to use ShareLock to prevent other processes from updating a
+      * tablespace (such as changing an ACL, for example) or we will end up
+      * running into the same tablespace multiple times during our heap scan
+      * (the prior-to-update tuple and then the new tuple after the update)
+      * and copydir() will, rightfully, complain that the directory already
+      * exists.
       */
!     rel = heap_open(TableSpaceRelationId, ShareLock);
      scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
      while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
      {
*************** createdb(const CreatedbStmt *stmt)
*** 607,613 ****
          }
      }
      heap_endscan(scan);
!     heap_close(rel, AccessShareLock);
  
      /*
       * We force a checkpoint before committing. This effectively means
--- 614,620 ----
          }
      }
      heap_endscan(scan);
!     heap_close(rel, ShareLock);
  
      /*
       * We force a checkpoint before committing. This effectively means
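
For anyone who wants to see the failure mode in isolation, here is a toy model of it in plain C. This is only a sketch, not backend code: ToyTuple, ToyHeap, visible_now(), toy_update_and_commit() and toy_scan_count() are all made-up names, visibility is reduced to two flags, and "holding ShareLock" is modeled simply by making the concurrent update wait until the scan is done. The OIDs are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define MAX_TUPLES 8

typedef struct {
    unsigned oid;         /* tablespace OID carried by this tuple version */
    bool xmin_committed;  /* inserting transaction has committed */
    bool xmax_committed;  /* updating/deleting transaction has committed */
} ToyTuple;

typedef struct {
    ToyTuple tuples[MAX_TUPLES];
    int ntuples;
} ToyHeap;

/* SnapshotNow-style check: evaluated at the moment each tuple is examined,
 * not against a snapshot taken when the scan started. */
static bool visible_now(const ToyTuple *t)
{
    return t->xmin_committed && !t->xmax_committed;
}

/* A concurrent UPDATE (say, an ACL change) that commits immediately:
 * the old version becomes dead and a new version is appended to the heap. */
static void toy_update_and_commit(ToyHeap *h, int idx)
{
    assert(h->ntuples < MAX_TUPLES);
    h->tuples[idx].xmax_committed = true;
    h->tuples[h->ntuples++] = (ToyTuple){ h->tuples[idx].oid, true, false };
}

/* Scan the toy heap and return how many times OID 25120 was returned.
 * Without the lock, the update lands right after the scan has passed the
 * old tuple. With the lock, the writer is assumed to block until the scan
 * has finished. */
int toy_scan_count(bool hold_share_lock)
{
    ToyHeap h = {
        .tuples = { { 25120, true, false }, { 25121, true, false } },
        .ntuples = 2
    };
    int seen = 0;
    bool update_done = false;

    for (int i = 0; i < h.ntuples; i++) {
        if (visible_now(&h.tuples[i]) && h.tuples[i].oid == 25120)
            seen++;

        if (!hold_share_lock && !update_done) {
            toy_update_and_commit(&h, 0);
            update_done = true;
        }
    }

    /* The blocked writer only gets to run once the scan is over. */
    if (!update_done)
        toy_update_and_commit(&h, 0);

    return seen;
}
```

Calling toy_scan_count(false) counts the tablespace twice (the old version, then the new one appended behind the scan position), which is the case where copydir() would be asked to create the same directory again. toy_scan_count(true) counts it once.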