On Wed, Sep 27, 2006 at 04:13:34PM -0400, Tom Lane wrote:
> Jon Lapham <[EMAIL PROTECTED]> writes in pgsql-general:
> > If I run...
> > sleep 3; echo starting; createdb bar
> > ...and power off the VM while the "createdb bar" is running.
> 
> > Upon restart, about 50% of the time I can reproduce the following error 
> > message:
> 
> > [EMAIL PROTECTED] ~]$ psql bar
> > psql: FATAL:  database "bar" does not exist
> > [EMAIL PROTECTED] ~]$ createdb bar
> > createdb: database creation failed: ERROR: could not create directory 
> > "base/65536": File exists
> 
> What apparently is happening here is that the same OID has been assigned
> to the new database both times.  Even though the createdb didn't
> complete, the directory it started to build is there and so there's a
> filename collision.
> 
> > So, running "createdb bar" a second time works.
> 
> Yeah, because the OID counter has been advanced, and so the second
> createdb uses a nonconflicting OID.
> 
> In theory this scenario should not happen, because a crash-and-restart
> is supposed to guarantee that the OID counter comes up at or beyond
> where it was before the crash.
> 
> After thinking about it for a while, I believe the problem is that
> CREATE DATABASE is breaking the "WAL rule": it's allowing a data change
> (specifically, creation of the new DB subdirectory) to hit disk without
> having guaranteed that associated WAL entries were flushed first.
> Specifically, if we generated an XLOG_NEXTOID WAL entry to record the
> consumption of an OID for the database, there isn't anything ensuring
> that record gets to disk before the mkdir occurs.  (i.e., the comment in
> XLogPutNextOid is correct as far as it goes, but it fails to account
> for outside-the-database effects such as creation of a directory named
> after the OID.)  Hence after restart the OID counter might not get
> advanced as far as it should have been.
> 
> We could fix this two different ways:
> 
> 1. Put an XLogFlush into createdb() somewhere between making the
> pg_database entry and starting to create subdirectories.
> 
> 2. Check for conflicting database directories while assigning the OID,
> comparable to what GetNewRelFileNode() does for table files.
> 
> #2 has some appeal because it could deal with random junk in
> $PGDATA/base regardless of how the junk got there.  However, to do that
> in a really bulletproof way we'd have to check all the tablespace
> directories too, and that's starting to get a tad tedious for something
> that shouldn't happen anyway.
> 
> So I'm leaning to #1 as a suitably low-effort fix.  Thoughts?
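
To make the ordering problem concrete, here is a toy model of the failure
and of fix #1. It is only a sketch: ToyCluster, its fields, and the
flush_before_mkdir flag are invented for illustration and are not
PostgreSQL code. It assumes one WAL record per consumed OID, which is
simpler than what the backend really does.

```python
class ToyCluster:
    """Toy model only: an OID counter, a WAL buffer, and a 'disk'."""

    def __init__(self, next_oid=65536):
        self.next_oid = next_oid        # in-memory OID counter
        self.checkpoint_oid = next_oid  # counter value at last checkpoint
        self.wal_buffer = []            # WAL records not yet flushed
        self.wal_on_disk = []           # WAL records that survive a crash
        self.dirs_on_disk = set()       # stands in for $PGDATA/base

    def flush_wal(self):
        """Move buffered WAL records to stable storage."""
        self.wal_on_disk.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def createdb(self, flush_before_mkdir):
        oid = self.next_oid
        self.next_oid += 1
        # Record the new counter value (plays the role of XLOG_NEXTOID).
        self.wal_buffer.append(self.next_oid)
        if flush_before_mkdir:
            self.flush_wal()            # fix #1: WAL first, then mkdir
        if oid in self.dirs_on_disk:
            raise FileExistsError(f"base/{oid}")
        self.dirs_on_disk.add(oid)      # the mkdir: hits disk immediately
        return oid

    def crash_and_restart(self):
        """Unflushed WAL is lost; the counter is rebuilt from what survived."""
        self.wal_buffer.clear()
        self.next_oid = max([self.checkpoint_oid] + self.wal_on_disk)
```

Without the flush, a crash after the first createdb leaves base/65536 on
disk while the counter restarts at 65536, so the next createdb collides
and the one after that succeeds, as in Jon's report. With the flush, the
counter restarts at 65537 and nothing collides.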

It'd be nice to clean things up, but I understand the reluctance to do
so. Maybe a good compromise would be to warn about files that are
present in $PGDATA but don't show up in any catalogs.

Then again, if we're doing that, we could probably just nuke 'em...
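
A rough sketch of what the "warn" half might look like, run outside the
server. The function name and its arguments are hypothetical; it assumes
the caller has already fetched the set of database OIDs from pg_database,
and it only looks at numerically named directories directly under base/.

```python
import os


def find_orphan_db_dirs(base_dir, known_db_oids):
    """Return names of numeric subdirectories of base_dir that do not
    correspond to any OID in known_db_oids (assumed to come from
    pg_database). Non-numeric entries and plain files are ignored."""
    orphans = []
    for name in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, name)
        if not os.path.isdir(path):
            continue                    # only directories are candidates
        if not (name.isascii() and name.isdigit()):
            continue                    # skip things like pgsql_tmp
        if int(name) not in known_db_oids:
            orphans.append(name)
    return orphans


def warn_about_orphans(base_dir, known_db_oids):
    """Print one warning per orphaned directory; nothing is deleted."""
    for name in find_orphan_db_dirs(base_dir, known_db_oids):
        print(f"WARNING: {os.path.join(base_dir, name)} matches no database")
```

This does nothing about the other tablespace directories Tom mentions,
so it would not be bulletproof either.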
-- 
Jim Nasby                                            [EMAIL PROTECTED]
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
