Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
On Fri, Nov 17, 2006 at 11:40:36PM -0500, Tom Lane wrote:
> Stephen Harris [EMAIL PROTECTED] writes:
> > Why not, after calling fork() create a new process group with setsid()
> > and then instead of killing the recovery thread, kill the whole process
> > group (-PID rather than PID)? Then every process (the recovery thread,
> > the system, the script, any child of the script) will all receive the
> > signal.
>
> This seems like a good answer if setsid and/or setpgrp are universally
> available. I fear it won't work on Windows though :-(. Also, each

It's POSIX, so I would suppose it's standard on most modern *nix
platforms. Windows... bluh. I wonder how perl handles POSIX::setsid()
on Windows!

> backend would become its own process group leader --- does anyone know
> if adding hundreds of process groups would slow down any popular kernels?

Shouldn't hurt. This is, after all, what using & in a command line shell
with job control (csh, ksh, tcsh, bash, zsh) does. Because you only run
one archive or recovery thread at a time (which is very good and very
clever) you won't have too many process groups at any instant in time.

> [ thinks for a bit... ] Another issue is that there'd be a race
> condition during backend start: if the postmaster tries to kill -PID
> before the backend has managed to execute setsid, it wouldn't work.

*ponder* Bugger. Standard solutions (eg try three times with a second
pause) would mitigate this, but... Hmm.

Another idea is to make the shutdown be more co-operative under control
of the script; eg an exit code of 0 means xlog is now available, a code
of 1 means the log is non-existent (so recovery is complete) and an exit
code of 255 means failure to recover; perform database shutdown.

In this way a solution similar to the existing trigger files (recovery
complete) could be used. It's a little messy in that pg_ctl wouldn't be
used to shut down the database; the script would essentially tell the
recovery thread to abort, which would tell the main postmaster to shut
down. We'd have no clients connected, no child process running, so a
smart shutdown would work.

--
rgds
Stephen
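A minimal sketch of the exit-code convention suggested in this message, under stated assumptions: the function name `wait_for_log`, the three file arguments, the polling limit, and the choice to treat an abort trigger or a timeout as code 255 are all hypothetical and do not come from the thread or from PostgreSQL.

```shell
#!/bin/sh
# Illustrative only: names, arguments, and the polling limit are hypothetical.
# Exit codes follow the convention suggested in the message:
#   0   = the requested xlog file is now available
#   1   = no further log exists, recovery is complete
#   255 = recovery failed, shut the database down
wait_for_log() {
    wanted_file=$1
    done_file=$2      # trigger file meaning "recovery complete"
    abort_file=$3     # trigger file meaning "give up"
    max_polls=$4
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        if [ -f "$abort_file" ]; then return 255; fi
        if [ -f "$wanted_file" ]; then return 0; fi
        if [ -f "$done_file" ]; then return 1; fi
        polls=$((polls + 1))
        sleep 1
    done
    return 255        # assumption: a timeout counts as failure
}

# Demo: an abort trigger is present, so the caller should see 255.
demo_dir=$(mktemp -d)
touch "$demo_dir/abort"
rc=0
wait_for_log "$demo_dir/wanted" "$demo_dir/done" "$demo_dir/abort" 3 || rc=$?
echo "exit code: $rc"
rm -rf "$demo_dir"
```

The point of the design is that the caller never has to signal the script: every outcome, including shutdown, is reported through the exit status.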
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Stephen Harris [EMAIL PROTECTED] writes:
> Doing a shutdown immediate isn't too clever because it actually leaves
> the recovery threads running
>
> LOG: restored log file 00010001003E from archive
> LOG: received immediate shutdown request
> LOG: restored log file 00010001003F from archive

Hm, that should work --- AFAICS the startup process should abort on
SIGQUIT the same as any regular backend.

[ thinks... ] Ah-hah, man system(3) tells the tale:

    system() ignores the SIGINT and SIGQUIT signals, and blocks the
    SIGCHLD signal, while waiting for the command to terminate. If this
    might cause the application to miss a signal that would have killed
    it, the application should examine the return value from system()
    and take whatever action is appropriate to the application if the
    command terminated due to receipt of a signal.

So the SIGQUIT went to the recovery script command and was missed by the
startup process. It looks to me like your script actually ignored the
signal, which you'll need to fix, but it also looks like we are not
checking for these cases in RestoreArchivedFile(), which we'd better fix.
As the code stands, if the recovery script is killed by a signal, we'd
take that as normal termination of the recovery and proceed to come up,
which is definitely the Wrong Thing.

regards, tom lane
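The man page's advice to "examine the return value" can be illustrated at shell level. This is only a sketch, not the check inside RestoreArchivedFile(): the helper name `classify_status` is hypothetical, and the "128 + signal number" arithmetic is a common shell convention rather than a guarantee (POSIX only promises a status greater than 128, and ksh93 uses a different encoding).

```shell
#!/bin/sh
# Classify a wait status the way a caller of system() would need to.
# POSIX shells report death by signal as a status greater than 128;
# 128 + signal number is the common convention (ksh93 differs).
classify_status() {
    if [ "$1" -eq 0 ]; then
        echo "exited normally"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "failed with exit code $1"
    fi
}

# Hypothetical stand-in for a recovery command that is killed mid-run.
status=0
sh -c 'kill -s TERM $$' || status=$?
result=$(classify_status "$status")
echo "$result"
```

A caller that only distinguishes success from failure, as the startup process did at the time of this message, cannot separate "killed by a signal" from an ordinary non-zero exit.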
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Stephen Harris [EMAIL PROTECTED] writes:
> However, it seems the signal wasn't sent at all.

Now that I think about it, the behavior of system() is predicated on the
assumption that SIGINT and SIGQUIT originate with the tty driver and are
broadcast to all members of the session's process group --- so the called
command will get them too, and there's no need for system() to do
anything except wait to see whether the called command dies or traps the
signal.

This does not apply to signals originated by the postmaster --- it
doesn't even know that the child process is doing a system(), much less
have any way to signal the grandchild. Ugh.

Reimplementing system() seems pretty ugly, but maybe we have no choice.
It strikes me that system() has a race condition as defined anyway,
because if a signal arrives between blocking the handler and issuing the
fork(), it'll disappear into the ether; and the same at the end of the
routine.

regards, tom lane
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
On Fri, Nov 17, 2006 at 10:49:39PM -0500, Tom Lane wrote:
> Stephen Harris [EMAIL PROTECTED] writes:
> > However, it seems the signal wasn't sent at all.
>
> Now that I think about it, the behavior of system() is predicated on the
> assumption that SIGINT and SIGQUIT originate with the tty driver and are
> broadcast to all members of the session's process group --- so the
>
> This does not apply to signals originated by the postmaster --- it
> doesn't even know that the child process is doing a system(), much less
> have any way to signal the grandchild. Ugh.

Why not, after calling fork() create a new process group with setsid()
and then instead of killing the recovery thread, kill the whole process
group (-PID rather than PID)? Then every process (the recovery thread,
the system, the script, any child of the script) will all receive the
signal.

--
rgds
Stephen
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
On Fri, Nov 17, 2006 at 05:03:44PM -0500, Tom Lane wrote:
> Stephen Harris [EMAIL PROTECTED] writes:
> > Doing a shutdown immediate isn't too clever because it actually leaves
> > the recovery threads running
> >
> > LOG: restored log file 00010001003E from archive
> > LOG: received immediate shutdown request
> > LOG: restored log file 00010001003F from archive
>
> Hm, that should work --- AFAICS the startup process should abort on
> SIGQUIT the same as any regular backend.
>
> [ thinks... ] Ah-hah, man system(3) tells the tale:
>
>     system() ignores the SIGINT and SIGQUIT signals, and blocks the
>     SIGCHLD signal, while waiting for the command to terminate. If this
>     might cause the application to miss a signal that would have killed
>     it, the application should examine the return value from system()
>     and take whatever action is appropriate to the application if the
>     command terminated due to receipt of a signal.
>
> So the SIGQUIT went to the recovery script command and was missed by the
> startup process. It looks to me like your script actually ignored the
> signal, which you'll need to fix, but it also looks like we are not

My script was just a ksh script and didn't do anything special with
signals. Essentially it does

    #!/bin/ksh -p

    [...variable setup...]
    while [ ! -f $wanted_file ]
    do
      if [ -f $abort_file ]
      then
        exit 1
      fi
      sleep 5
    done
    cat $wanted_file

I know signals can be deferred in scripts (a signal sent to the script
during the sleep will be deferred if a trap handler had been written for
the signal) but they _do_ get delivered.

However, it seems the signal wasn't sent at all. Once the wanted file
appeared the recovery thread from postmaster started a _new_ script for
the next log. I'll rewrite the script in perl (probably Monday when I'm
back in the office) and stick lots of signal() traps in to see if
anything does get sent to the script.

> As the code stands, if the recovery script is killed by a signal, we'd
> take that as normal termination of the recovery and proceed to come up,
> which is definitely the Wrong Thing.

Oh good; that means I'm not mad :-)

--
rgds
Stephen
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Stephen Harris [EMAIL PROTECTED] writes:
> My script was just a ksh script and didn't do anything special with
> signals. Essentially it does
>
>     #!/bin/ksh -p
>
>     [...variable setup...]
>     while [ ! -f $wanted_file ]
>     do
>       if [ -f $abort_file ]
>       then
>         exit 1
>       fi
>       sleep 5
>     done
>     cat $wanted_file
>
> I know signals can be deferred in scripts (a signal sent to the script
> during the sleep will be deferred if a trap handler had been written for
> the signal) but they _do_ get delivered.

Sure, but it might be getting delivered to, say, your sleep command. You
haven't checked the return value of sleep to handle any errors that may
occur. As it stands you have to check for errors from every single
command executed by your script.

That doesn't seem terribly practical to expect of users. As long as
Postgres is using SIGQUIT for its own communication it seems it really
ought to arrange to block the signal while the script is running so it
will receive the signals it expects once the script ends.

Alternatively perhaps Postgres really ought to be using USR1/USR2 or
other signals that library routines won't think they have any business
rearranging.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
On Fri, Nov 17, 2006 at 09:39:39PM -0500, Gregory Stark wrote:
> Stephen Harris [EMAIL PROTECTED] writes:
> > [...variable setup...]
> > while [ ! -f $wanted_file ]
> > do
> >   if [ -f $abort_file ]
> >   then
> >     exit 1
> >   fi
> >   sleep 5
> > done
> > cat $wanted_file
> >
> > I know signals can be deferred in scripts (a signal sent to the script
> > during
>
> Sure, but it might be getting delivered to, say, your sleep command. You

No. The sleep command keeps on running. I could see that using ps.

To the best of my knowledge, a random child process of the script
wouldn't even get a signal. All the postmaster recovery thread knows
about is the system() - ie sh -c. All sh knows about is the ksh process.
Neither postmaster nor sh knows about sleep, and so sleep wouldn't
receive the signal (unless it was sent to all processes in the process
group).

Here's an example from Solaris 10 demonstrating lack of signal
propagation.

    $ uname -sr
    SunOS 5.10
    $ echo $0
    /bin/sh
    $ cat x
    #!/bin/ksh -p

    sleep 1
    $ ./x &
    4622
    $ kill 4622
    $ 4622 Terminated
    $ ps -ef | grep sleep
        sweh  4624  4602  0 22:13:13 pts/1  0:00 grep sleep
        sweh  4623     1  0 22:13:04 pts/1  0:00 sleep 1

This is, in fact, what proper job control shells do. Doing the same test
with ksh as the command shell will kill the sleep :-)

    $ echo $0
    -ksh
    $ ./x &
    [1] 4632
    $ kill %1
    [1] + Terminated ./x
    $ ps -ef | grep sleep
        sweh  4635  4582  0 22:15:17 pts/1  0:00 grep sleep

[ Aside: The only way I've been able to guarantee all processes and child
processes and everything to be killed is to run a subprocess with
setsid() to create a new process group and kill the whole process group.
It's a pain ]

If postmaster was sending a signal to the system() process then sh -c
might not signal the ksh script, anyway. The ksh script might terminate,
or it might defer until sleep had finished. Only if postmaster had
signalled a complete process group would sleep ever see the signal.

--
rgds
Stephen
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Stephen Harris [EMAIL PROTECTED] writes:
> On Fri, Nov 17, 2006 at 10:49:39PM -0500, Tom Lane wrote:
> > This does not apply to signals originated by the postmaster --- it
> > doesn't even know that the child process is doing a system(), much
> > less have any way to signal the grandchild. Ugh.
>
> Why not, after calling fork() create a new process group with setsid()
> and then instead of killing the recovery thread, kill the whole process
> group (-PID rather than PID)? Then every process (the recovery thread,
> the system, the script, any child of the script) will all receive the
> signal.

This seems like a good answer if setsid and/or setpgrp are universally
available. I fear it won't work on Windows though :-(. Also, each
backend would become its own process group leader --- does anyone know
if adding hundreds of process groups would slow down any popular kernels?

[ thinks for a bit... ] Another issue is that there'd be a race condition
during backend start: if the postmaster tries to kill -PID before the
backend has managed to execute setsid, it wouldn't work.

regards, tom lane
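The process-group idea and the start-up race can be sketched at shell level. This is an illustration under assumptions, not how the postmaster does or would do it: it relies on a `setsid` command-line utility (shipped with util-linux and BusyBox, but not part of POSIX), and the "ready" marker file is a hypothetical way for the child to show that it has already entered its new session before the group is signalled.

```shell
#!/bin/sh
# Assumes a `setsid` utility (util-linux or BusyBox); it is not part of POSIX.
tmp=$(mktemp -d)

# Start a stand-in "recovery script" in its own session. Because the job is
# started in the background by a non-interactive shell, setsid can exec in
# place, so $! is also the new process group ID.
setsid sh -c 'touch "$1"; sleep 30; echo "not reached"' sh "$tmp/ready" &
leader=$!

# Guard against the start-up race: do not signal the group until the child
# has shown that it is running inside the new session.
while [ ! -f "$tmp/ready" ]; do sleep 1; done

# A negative PID addresses the whole process group: sh and its sleep.
kill -s TERM -- "-$leader"

status=0
wait "$leader" || status=$?
echo "leader wait status: $status"
rm -rf "$tmp"
```

Signalling `-$leader` before the marker appears would target a process group that does not exist yet, which is the race described in the message above.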
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Gregory Stark [EMAIL PROTECTED] writes:
> Sure, but it might be getting delivered to, say, your sleep command. You
> haven't checked the return value of sleep to handle any errors that may
> occur. As it stands you have to check for errors from every single
> command executed by your script.

The expectation is that something like SIGINT or SIGQUIT would be
delivered to both the sleep command and the shell process running the
script. So the shell should fail anyway. (Of course, a nontrivial
archive or recovery script had better be checking for failures at each
step, but this is not very relevant to the immediate problem.)

> Alternatively perhaps Postgres really ought to be using USR1/USR2 or
> other signals that library routines won't think they have any business
> rearranging.

The existing signal assignments were all picked for what seem to me to be
good reasons; I'm disinclined to change them. In any case, the important
point here is that we'd really like an archive or recovery script, or for
that matter any command executed via system() from a backend, to abort
when the parent backend is SIGINT'd or SIGQUIT'd.

Stephen's idea of executing setsid() at each backend start seems
interesting, but is there a way that will work on Windows?

regards, tom lane
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Tom Lane [EMAIL PROTECTED] writes:
> Gregory Stark [EMAIL PROTECTED] writes:
> > Sure, but it might be getting delivered to, say, your sleep command.
> > You haven't checked the return value of sleep to handle any errors
> > that may occur. As it stands you have to check for errors from every
> > single command executed by your script.
>
> The expectation is that something like SIGINT or SIGQUIT would be
> delivered to both the sleep command and the shell process running the
> script. So the shell should fail anyway. (Of course, a nontrivial
> archive or recovery script had better be checking for failures at each
> step, but this is not very relevant to the immediate problem.)

Hm, I tried to test that before I sent that. But I guess my test was
faulty since I was really testing what process the terminal handling
delivered the signal to:

    $ cat /tmp/test.sh
    #!/bin/sh

    echo before
    sleep 5 || echo sleep failed
    echo after

    $ sh /tmp/test.sh ; echo $?
    before
    ^\
    /tmp/test.sh: line 4: 23407 Quit    sleep 5
    sleep failed
    after
    0

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Shutting down a warm standby database in 8.2beta3
Gregory Stark [EMAIL PROTECTED] writes:
> Hm, I tried to test that before I sent that. But I guess my test was
> faulty since I was really testing what process the terminal handling
> delivered the signal to:

Interesting. I tried the same test on HPUX, and find that its /bin/sh
seems to ignore SIGQUIT but not SIGINT:

    $ sh /tmp/test.sh ; echo $?
    before
    -- typed ^C here
    130
    $ sh /tmp/test.sh ; echo $?
    before
    -- typed ^\ here
    /tmp/test.sh[4]: 25166 Quit(coredump)
    sleep failed
    after
    0
    $

There is nothing in the shell man page about this :-(

That seems to leave us back at square one. How can we ensure an archive
or recovery script will fail on being signaled? (Obviously we can't
prevent someone from trapping the signal, but it'd be good if the default
behavior was this way.)

regards, tom lane
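One way a script author can opt in to "fail on being signaled" is an explicit trap. The sketch below is a hypothetical skeleton, not a recommended PostgreSQL recovery script and not the default behavior asked about above: the file names, the exit code 2, and the harness around the script are all assumptions. It runs `sleep` in the background and uses `wait`, because a shell defers a trap until the foreground command finishes, whereas `wait` is interrupted by a trapped signal. The demo sends SIGTERM rather than SIGQUIT because background jobs of a non-interactive shell start with SIGINT and SIGQUIT ignored.

```shell
#!/bin/sh
dir=$(mktemp -d)
cat > "$dir/restore.sh" <<'EOF'
#!/bin/sh
# Hypothetical restore-script skeleton: turn a signal into a non-zero exit.
sleep_pid=
on_signal() {
    [ -n "$sleep_pid" ] && kill "$sleep_pid" 2>/dev/null
    exit 2
}
trap on_signal INT QUIT TERM
touch "$1"                 # tell the demo harness the trap is installed
while :; do
    sleep 5 &              # background + wait, so the trap runs promptly
    sleep_pid=$!
    wait "$sleep_pid"
done
EOF

# Harness: start the script, wait until its trap is in place, signal it.
sh "$dir/restore.sh" "$dir/ready" &
pid=$!
while [ ! -f "$dir/ready" ]; do sleep 1; done
kill -s TERM "$pid"

status=0
wait "$pid" || status=$?
echo "script exit status: $status"
rm -rf "$dir"
```

The handler also kills the pending `sleep`, so no orphaned child is left behind of the kind shown in the Solaris transcript earlier in the thread.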