On Wed, Feb 14, 2007 at 11:34:47PM +0100, Peter Kovacs wrote:
> The post I quoted also says:
> 
> "That should not happen; it should always be possible to detect whether
> the file is stale, *if* your start script is written correctly."
> 
> "That" in the quote refers to my description:
> "On system startup PostgreSQL 8.1.4 refuses to start due to the pid
> file is [being] left over from [a] previous "session" on Solaris 10
> x86."

Ah, I hadn't noticed that this was in fact a problem you had
experienced.

> Nothing did "actually break", but something happened which should not
> have happened if my "start script is [had been] written correctly".

No, something did break then.

> The issue here is:
> Does the SMF setup described in the referenced Sun document make sure
> that the postmaster process will start and become functional without
> human intervention even if there is a pid file left over from a
> previously crashed (abruptly terminated) postmaster process?

I don't know.  But having read Tom Lane's reply to you, I don't see what
our SMF manifest and start method are doing wrong.

Hmm, let's see, your log file said:

    LOG:  00000: shutting down
    LOCATION:  ShutdownXLOG, xlog.c:5031

    <I infer the restart is here>

    FATAL:  58P01: could not remove old lock file "postmaster.pid": No
    such file or directory
    HINT:  The file seems accidentally left over, but it could not be
    removed. Please remove the file by hand and try again.

The hint and the fatal message come from the code that Tom Lane quoted.

This means that the exclusive-create open(2) of the lock file failed
with EEXIST or EACCES, that no other process was found with the pid
recorded in the lock file, and that by the time PostgreSQL went to
remove the lockfile, the lockfile was already gone!

What removed the lockfile, and why should PostgreSQL complain at that
point instead of just looping and trying again?
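
To make "looping and trying again" concrete, here is a rough sketch of
the kind of retry loop I have in mind.  It is in Python rather than C
for brevity, it is NOT PostgreSQL's actual lock file code, and the names
acquire_lock, pid_alive and max_tries are made up for illustration.  It
assumes a POSIX system, and that the lock file holds a pid on its first
line.

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid appears to exist."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def acquire_lock(path, max_tries=100):
    """Hypothetical helper: create `path` exclusively, clearing stale locks.

    Returns True once we own the lock file, False if a live process
    appears to hold it.
    """
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            pass  # a lock file is already there; inspect it below
        else:
            with os.fdopen(fd, "w") as f:
                f.write("%d\n" % os.getpid())
            return True

        try:
            with open(path) as f:
                other = int(f.readline().strip() or 0)
        except FileNotFoundError:
            continue  # file vanished after our open() failed: retry
        except ValueError:
            other = 0  # unparseable contents: treat the file as stale

        if other > 0 and pid_alive(other):
            return False  # held by a running process

        try:
            os.unlink(path)
        except FileNotFoundError:
            pass  # already removed by someone else: fine, retry
    raise RuntimeError("could not acquire lock file %r" % path)
```

The point is only the last try/except: ENOENT from the unlink is treated
as "somebody else already did what we wanted" and the loop goes around
again, rather than being reported as fatal.  The retry limit keeps two
racing starters from spinning forever.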

Nico
-- 
