Dirk Lutzebaeck and I just spent a tense couple of hours trying to
figure out why a large database Down Under wasn't coming up after being
reloaded from a base backup plus PITR recovery.  The symptoms were that
the recovery went fine, but backend processes would fail at startup or
soon after with "could not open relation XX/XX/XX: No such file" type of
errors.

The answer that ultimately emerged was that they'd been running a
nightly maintenance script that did REINDEX SYSTEM (among other things
I suppose).  The PITR base backup included pg_internal.init files that
were appropriate when it was taken, and the PITR recovery process did
nothing whatsoever to update 'em :-(.  So incoming backends picked up
init files with obsolete relfilenode values.

We don't actually need to *update* the file, per se, we only need to
remove it if no longer valid --- the next incoming backend will rebuild
it.  I could see fixing this by making WAL recovery run around and zap
all the .init files (only problem is to find 'em), or we could add a new
kind of WAL record saying "remove the .init file for database XYZ"
to be emitted whenever someone removes the active one.  Thoughts?

Meanwhile, if you're trying to recover from a PITR backup and it's not
working, try removing any pg_internal.init files you can find.
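
For anyone who wants to script that workaround, here is a minimal sketch
in Python.  It is not part of any PostgreSQL tooling: the function names
and the dry_run flag are made up for illustration, and it assumes the
server is stopped and that you pass in the path of the restored data
directory.  It simply walks the tree looking for files named exactly
pg_internal.init, since finding them is the only hard part.

```python
from pathlib import Path

# Name of the relcache init file mentioned above.
INIT_FILE_NAME = "pg_internal.init"


def find_init_files(data_dir):
    """Return every pg_internal.init file below data_dir, sorted by path."""
    return sorted(Path(data_dir).rglob(INIT_FILE_NAME))


def remove_init_files(data_dir, dry_run=True):
    """Find the init files and, unless dry_run is set, delete them.

    Returns the list of paths that were found, so the caller can log
    them.  Deleting is safe in the sense described above: the next
    incoming backend rebuilds the file.
    """
    found = find_init_files(data_dir)
    for path in found:
        if not dry_run:
            path.unlink()
    return found
```

Call remove_init_files() on the data directory first with the default
dry_run=True to see what it would delete, then again with dry_run=False
before starting the postmaster.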

                        regards, tom lane
