Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-27 Thread Magnus Hagander
On Thu, Aug 27, 2009 at 02:38, Tom Lane t...@sss.pgh.pa.us wrote: I did have another thought. It could compare the time from uptime to the timestamp on the lock file. If the server's been restarted since the time in the lock file then it must be stale. uhm. unless clock's been changed...
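
The boot-time comparison suggested here could be sketched in shell roughly as follows. This is only an illustration, not code from the thread: the function names are hypothetical, and the wrapper assumes Linux (`/proc/stat` for the boot time) and GNU `stat -c %Y` for the file's modification time. As the reply points out, a clock change between boot and the check would defeat the comparison.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the lock file's mtime is
# earlier than the boot time, i.e. the file predates the last reboot.
# Both arguments are seconds since the Unix epoch.
lock_older_than_boot() {
    lock_mtime=$1
    boot_time=$2
    [ "$lock_mtime" -lt "$boot_time" ]
}

# Linux-only wrapper (assumption): boot time comes from the btime line
# of /proc/stat, the file's mtime from GNU stat. Returns 2 if either
# value cannot be read. It only reports; it never removes the file.
lockfile_predates_boot() {
    lockfile=$1
    boot_time=$(awk '/^btime /{print $2}' /proc/stat) || return 2
    lock_mtime=$(stat -c %Y "$lockfile") || return 2
    lock_older_than_boot "$lock_mtime" "$boot_time"
}
```

A script would call `lockfile_predates_boot "$PGDATA/postmaster.pid"` and treat exit status 0 as "written before the last boot"; what to do with that answer is a separate question, since later messages in the thread argue the startup script should never remove the lock file itself.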

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: In general I'd not recommend that an init script go messing with the contents of the postmaster.pid file, which it would have to do to have any of this logic in the script. But LSB specifically provides the pidofproc function to extract the pid info.

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: This brings me back round to what I was looking at recently -- the possibility of trying to make an LSB-conforming init script for PostgreSQL. I'm having a lot of trouble, though, trying to get either the postmaster or pg_ctl to behave well

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Chander Ganesan
Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: In general I'd not recommend that an init script go messing with the contents of the postmaster.pid file, which it would have to do to have any of this logic in the script. But LSB specifically provides the pidofproc function

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: start_daemon doesn't provide for switching to a non-root userid according to that spec, so it seems like *it's* missing a crucial detail. Hmmm... I didn't see anything requiring that it only be run by root. Do you see something that suggests that it

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Tom Lane
I wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Thanks Andrew, Alvaro, and Chander. You've given me some thoughts to toss around. Of course, any of these is going to be somewhat more complex than using [ pg_ctl -w ] Yeah. I wonder if we shouldn't expend a bit more effort to

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Greg Stark
On Thu, Aug 27, 2009 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: Attached is a simple patch that uses the environment-variable approach. This is a whole lot more self-contained than what would be needed to pass the PID as an explicit switch, so I'm inclined to do it this way. You could

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Thu, Aug 27, 2009 at 12:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: Attached is a simple patch that uses the environment-variable approach. So with this change you would have the startup script not remove the lock file? Huh? The startup script shouldn't

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Greg Stark
On Thu, Aug 27, 2009 at 1:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: So with this change you would have the startup script not remove the lock file? Huh? The startup script shouldn't *ever* remove the lock file. That's been true all along, and this doesn't change it. I thought that was the

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Thu, Aug 27, 2009 at 1:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Huh? The startup script shouldn't *ever* remove the lock file. That's been true all along, and this doesn't change it. I thought that was the whole difference between using pg_ctl to start

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Andrew Dunstan
Tom Lane wrote: I was actually having second thoughts about the idea of using file locking. The only environment in which I've heard of file locks not being trustworthy is NFS, and if you're running a DB on NFS you've probably got worse problems than this one. Notably, if you mistakenly try

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: Has anyone heard of other contexts in which file locks don't work? Has Windows got them? Yes. But they are mandatory rather than advisory, I believe. Probably wouldn't matter for our purposes? I guess what we'd need is a writer's

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Hmm. As stated, I would expect pg_ctl to make it worse. I've been playing with this, and I think the problem was that we wanted a non-zero exit from the script if the start failed. That's trivial with pg_ctl -w but not running postgres directly. I guess
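
The "non-zero exit if the start failed" requirement mentioned here might look something like the sketch below when built on `pg_ctl -w`. The function name and the default values of `PG_CTL` and `PGDATA` are placeholders chosen for illustration; a real init script would set them for its own installation, and this is not taken from any script in the thread.

```shell
#!/bin/sh
# PG_CTL and PGDATA are assumed to be set by the surrounding init script;
# the defaults below are placeholders for illustration only.
PG_CTL=${PG_CTL:-pg_ctl}
PGDATA=${PGDATA:-/var/lib/pgsql/data}

# Start the server, wait for it (-w), and pass failure on to the caller
# as a non-zero return value.
start_server() {
    if "$PG_CTL" start -w -D "$PGDATA"; then
        echo "started"
        return 0
    fi
    echo "start failed" >&2
    return 1
}
```

The script's `start` action could then end with `start_server || exit 1`, so that anything depending on the database sees the failure. Whether using pg_ctl for this is wise at boot time is exactly what the rest of the thread debates.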

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Andrew Dunstan
Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: Hmm. As stated, I would expect pg_ctl to make it worse. I've been playing with this, and I think the problem was that we wanted a non-zero exit from the script if the start failed. That's trivial with pg_ctl -w but not

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Alvaro Herrera
Kevin Grittner wrote: The reason is that we don't want certain other processes attempting to start until and unless the database they use has started successfully. This is something we're not quite ready on, yet. We need some mechanism that allows scripts to verify not only that postmaster

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Here's a snippet from my F11 system: $SU -l postgres -c "$PGENGINE/postmaster -p '$PGPORT' -D '$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null sleep 2 pid=`pidof -s $PGENGINE/postmaster` if [ $pid ] && [ -f

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Chander Ganesan
Alvaro Herrera wrote: Kevin Grittner wrote: The reason is that we don't want certain other processes attempting to start until and unless the database they use has started successfully. This is something we're not quite ready on, yet. We need some mechanism that allows scripts to

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Chander Ganesan chan...@otg-nc.com wrote: Alvaro Herrera wrote: Kevin Grittner wrote: The reason is that we don't want certain other processes attempting to start until and unless the database they use has started successfully. This is something we're not quite ready on, yet. We need

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Thanks Andrew, Alvaro, and Chander. You've given me some thoughts to toss around. Of course, any of these is going to be somewhat more complex than using [ pg_ctl -w ] Yeah. I wonder if we shouldn't expend a bit more effort to make that way

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Of course, this is a complete kluge --- it assumes the postmaster will create its pidfile in less than two seconds. And for that matter, it's not very proof against the case of a pre-existing postmaster. But in any case, it (intentionally) doesn't wait

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Alvaro Herrera
Chander Ganesan wrote: Alvaro Herrera wrote: This is something we're not quite ready on, yet. We need some mechanism that allows scripts to verify not only that postmaster started, but also that it has finished recovery. You can sort-of do it by attempting a connection and checking the
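
The "attempt a connection and check the result" workaround can be approximated with a generic polling loop. The sketch below is an assumption-laden illustration: `wait_until_ready` is a hypothetical helper name, and the probe command is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command once per second until it
# succeeds or the timeout (in seconds) runs out.
# Usage: wait_until_ready TIMEOUT PROBE [ARGS...]
wait_until_ready() {
    remaining=$1
    shift
    while :; do
        # The probe's output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

One possible probe is `wait_until_ready 60 psql -d postgres -c 'SELECT 1'`. Because the loop looks only at the exit status, it cannot tell "still in recovery" from "not running at all", which is part of why this approach is described in the thread as only a sort-of solution.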

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: The two ways I can see to do that are to add a command line switch to the postmaster, or to pass the PID as an environment variable, say PG_GRANDPARENT_PID. The latter is a bit uglier but it would require touching much less code (and documentation).

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Alvaro Herrera alvhe...@commandprompt.com wrote: That's within my definition of ugly, yes :-) My ideal tool would do something like $ pg_ping -h foo -p IN_RECOVERY $ echo $? 2 $ # sleep a bit ... $ pg_ping -h foo -p READY $ echo $? 0 Cool, but how would you do that

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: You're thinking that pg_ctl would capture its parent PID and pass it to the postmaster one way or the other? That seems like it covers the specific issue you were referencing up-thread. It has been bubbling around in my head that we have

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Alvaro Herrera alvhe...@commandprompt.com wrote: That's within my definition of ugly, yes :-) My ideal tool would do something like $ pg_ping -h foo -p IN_RECOVERY $ echo $? 2 $ # sleep a bit ... $ pg_ping -h foo -p

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Alvaro Herrera
Kevin Grittner wrote: Alvaro Herrera alvhe...@commandprompt.com wrote: That's within my definition of ugly, yes :-) My ideal tool would do something like $ pg_ping -h foo -p IN_RECOVERY $ echo $? 2 $ # sleep a bit ... $ pg_ping -h foo -p READY $ echo $?

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Andrew Dunstan
Tom Lane wrote: But in any case, it (intentionally) doesn't wait for the postmaster to be ready to accept connections, so it's not solving Kevin's problem. Maybe we need a --wait mode for pg_ctl status that would test connecting to the database the same way it

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Only if they are running at times when your postmaster(s) aren't ... Well, those rsync scripts for pushing the PITR base backup and the WAL stream to other machines are crontab jobs which kick off once per minute. Still, just from a security point of

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Still, this seems like it's not as deterministic as it should be. Is there any reasonable way to pin it down beyond the PID? Like also saving a start time into the postmaster.pid file and checking that

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Alvaro Herrera
Kevin Grittner wrote: You wouldn't object to using either of those in a Linux service script, though, would you? Yeah, operating-system-specific init scripts do not need to be portable :-) Of course, they need to work across a wide range of Linux systems ... -- Alvaro Herrera

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: stuff like vacuum scripts could surely be run from a different userid. My first thought was they have to run as the database superuser. (In our case, that is the same as the OS user running the cluster.)

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: How would you get the latter in a portable fashion? (Do not mention ps please ... and I don't want to hear about lsof either ...) You wouldn't object to using either of those in a Linux service script,

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Aidan Van Dyk
* Tom Lane t...@sss.pgh.pa.us [090825 18:43]: How would you get the latter in a portable fashion? (Do not mention ps please ... and I don't want to hear about lsof either ...) Can postmaster keep an exclusive lock on its own pid file the entire time it's running? If you can open it and lock

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Tom Lane
Aidan Van Dyk ai...@highrise.ca writes: Can postmaster keep an exclusive lock on its own pid file the entire time it's running? That's been discussed, but file locking isn't all that portable or trustworthy :-(

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-25 Thread Chander Ganesan
Tom Lane wrote: Aidan Van Dyk ai...@highrise.ca writes: Can postmaster keep an exclusive lock on its own pid file the entire time it's running? That's been discussed, but file locking isn't all that portable or trustworthy :-( regards, tom lane What about

[HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Josh Berkus
... for the simple reason that nobody is maintaining it. Wheeler just pointed out to me today that the OSX startup script hasn't been updated since 7.4 and contains misinformation and dangerous scripting. Other startup scripts there are equally dilapidated, and aren't used by the linux distros

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Chander Ganesan
Josh Berkus wrote: ... for the simple reason that nobody is maintaining it. Wheeler just pointed out to me today that the OSX startup script hasn't been updated since 7.4 and contains misinformation and dangerous scripting. Other startup scripts there are equally dilapidated, and aren't used

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Chander Ganesan chan...@otg-nc.com writes: Josh Berkus wrote: ... for the simple reason that nobody is maintaining it. Wheeler just pointed out to me today that the OSX startup script hasn't been updated since 7.4 and contains misinformation and dangerous scripting. Other startup scripts

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Alvaro Herrera
Josh Berkus wrote: ... for the simple reason that nobody is maintaining it. Wheeler just pointed out to me today that the OSX startup script hasn't been updated since 7.4 and contains misinformation and dangerous scripting. Other startup scripts there are equally dilapidated, and aren't

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Alvaro Herrera
Tom Lane wrote: (Personally, I use scripts based on start-scripts/osx/ for a number of services on my own machines, so if there's something wrong with them I'd definitely like to know what it is.) What kind of based on? I mean, are there some changes of yours that could be applied to the

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes: Tom, (Personally, I use scripts based on start-scripts/osx/ for a number of services on my own machines, so if there's something wrong with them I'd definitely like to know what it is.) I quote: # What to use to start up the postmaster (we do NOT use

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread David E. Wheeler
On Aug 19, 2009, at 11:48 AM, Tom Lane wrote: (Personally, I use scripts based on start-scripts/osx/ for a number of services on my own machines, so if there's something wrong with them I'd definitely like to know what it is.) +1. Please don't remove the start scripts. I use them on every

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: (Personally, I use scripts based on start-scripts/osx/ for a number of services on my own machines, so if there's something wrong with them I'd definitely like to know what it is.) What kind of based on? I mean, are there

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: we do NOT use pg_ctl for [postmaster start], as it adds no value and can cause the postmaster to misrecognize a stale lock file And? That statement was and remains perfectly correct. Is this mentioned in the

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Josh Berkus
Tom, # What to use to start up the postmaster (we do NOT use pg_ctl for this, # as it adds no value and can cause the postmaster to misrecognize a stale # lock file) DAEMON=$prefix/bin/postmaster And? That statement was and remains perfectly correct. We don't use pg_ctl to start the

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Bruce Momjian
Josh Berkus wrote: Tom, # What to use to start up the postmaster (we do NOT use pg_ctl for this, # as it adds no value and can cause the postmaster to misrecognize a stale # lock file) DAEMON=$prefix/bin/postmaster And? That statement was and remains perfectly correct. We don't

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: we do NOT use pg_ctl for [postmaster start], as it adds no value and can cause the postmaster to misrecognize a stale lock file And? That statement was and remains perfectly correct. Is this mentioned

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread David E. Wheeler
On Aug 19, 2009, at 2:03 PM, Tom Lane wrote: These considerations don't apply to ordinary hand launching of the postmaster, for the primary reason that the chance of a false PID match is several orders of magnitude smaller when you're talking about a manual restart --- the likely postmaster

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Bruce Momjian
Should we add a comment to the startup scripts linking this email? http://archives.postgresql.org/message-id/28922.1250715...@sss.pgh.pa.us --- Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes:

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes: Nice summary, Tom. Do the distro packagers know this, though? All the active ones I know of learned it the hard way, or were paying attention when someone else did. Still, it wouldn't be a bad thing for us to document it somewhere.

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Greg Stark
On Wed, Aug 19, 2009 at 10:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: What this all leads to is that it's safe to launch a postmaster from an init script via something like su - postgres sh -c postmaster ... Surely you don't want -? If you run postgres's .profile etc. then random user

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Josh Berkus
Tom, (Personally, I use scripts based on start-scripts/osx/ for a number of services on my own machines, so if there's something wrong with them I'd definitely like to know what it is.) I quote: # What to use to start up the postmaster (we do NOT use pg_ctl for this, # as it adds no value

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: The problem is that after a system crash and reboot, an old postmaster.pid file might be left behind. The postmaster can only safely remove this lock file if it is *certain* that it doesn't represent another live postmaster process. Otherwise it is
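
To make the hazard concrete, here is a rough shell approximation of a "does the recorded PID belong to a live process" probe. It is an illustration only: the helper name is hypothetical, it assumes the PID sits on the first line of the file, and it does not claim to reproduce the postmaster's own check. It is deliberately read-only, in keeping with the point made elsewhere in the thread that a startup script should never remove the lock file.

```shell
#!/bin/sh
# Hypothetical helper: report whether the PID on the first line of a pid
# file names a process this user can signal. It never modifies the file.
# Exit status: 0 = such a process exists,
#              2 = file unreadable or no usable PID in it,
#              other non-zero = kill -0 failed (no such process, or it
#              belongs to another user).
pidfile_process_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 2
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|0|*[!0-9]*) return 2 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}
```

After a crash and reboot, the PID left in the file may have been handed to an unrelated process, in which case this probe answers "alive" even though no postmaster is running. That false match is the stale-lock-file problem the message describes.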

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Right -- we did run into this in spades when our backup server, running dozens of instances of PostgreSQL in warm standby to confirm the integrity of the files received, crashed hard. I wasn't sure if this was the problem being addressed.

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Wed, Aug 19, 2009 at 10:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: What this all leads to is that it's safe to launch a postmaster from an init script via something like su - postgres sh -c postmaster ... Surely you don't want -? If you run

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Well, using a different user per instance is a good idea because then the safety analysis I gave holds rigorously for each instance. It doesn't get you out of the problem by itself, because the problem

Re: [HACKERS] We should Axe /contrib/start-scripts

2009-08-19 Thread Kevin Grittner
I wrote: Oh, right -- it does let PostgreSQL automatically deal with the file left by a different instance, but could still fail on its own file. Er, it does let PostgreSQL automatically deal with a different instance using the PID matching what this instance left in its file, but could be