On 1/31/13 5:42 PM, MauMau wrote:
> Thank you for sharing your experience.  So you also considered making
> postmaster SIGKILL children like me, didn't you?  I bet most people
> who encounter this problem would feel the same way.
> 
> It is definitely pg_ctl that needs to be prepared, not the users.  It may
> not be easy to find the postgres processes to SIGKILL if multiple
> instances are running on the same host.  Just doing "pkill postgres"
> will unexpectedly terminate the postgres processes of other instances.

In my case, it was one backend process segfaulting, and then some other
backend processes didn't respond to the subsequent SIGQUIT sent out by
the postmaster.  So pg_ctl didn't have any part in it.

We ended up addressing that by installing a nagios event handler that
checked for this situation and cleaned it up.
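
On the multi-instance concern quoted above: one way to avoid hitting
other instances is to start from the postmaster PID recorded on the
first line of postmaster.pid in the data directory, and only touch
processes whose parent is that PID.  A minimal sketch of the first step
follows.  The helper name read_postmaster_pid() is hypothetical (it is
not part of PostgreSQL or pg_ctl), and this is not necessarily what the
event handler mentioned above did.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of PostgreSQL: return the postmaster PID
 * stored on the first line of <datadir>/postmaster.pid, or -1 on failure.
 */
long
read_postmaster_pid(const char *datadir)
{
    char    path[4096];
    char    line[64];
    char   *end;
    long    pid;
    int     n;
    FILE   *fp;

    n = snprintf(path, sizeof(path), "%s/postmaster.pid", datadir);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;              /* path did not fit in the buffer */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;              /* no pid file: instance is probably down */

    if (fgets(line, sizeof(line), fp) == NULL)
    {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    /* The first line is expected to hold only the decimal PID. */
    pid = strtol(line, &end, 10);
    if (end == line || pid <= 0)
        return -1;

    return pid;
}
```

With that PID in hand, the remaining children of that one postmaster
can be selected by parent PID (for example with "pkill -P") instead of
by process name.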

> I would like to make a patch that changes SIGQUIT to SIGKILL when
> postmaster terminates children.  Any other better ideas?

That was my idea back then, but there were some concerns about it.

I found an old patch that I had prepared for this, which I have
attached.  YMMV.

From bebb95abe7a55173cab0558da3373d6a3631465b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pet...@postgresql.org>
Date: Wed, 16 Dec 2009 17:19:14 +0200
Subject: [PATCH 3/3] Time out the ereport() call in quickdie() after 60 seconds

---
 src/backend/tcop/postgres.c |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b2fb501..ab6805a 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -191,6 +191,7 @@ static bool IsTransactionExitStmtList(List *parseTrees);
 static bool IsTransactionStmtList(List *parseTrees);
 static void drop_unnamed_stmt(void);
 static void SigHupHandler(SIGNAL_ARGS);
+static void quickdie_alarm_handler(SIGNAL_ARGS);
 static void log_disconnections(int code, Datum arg);
 
 
@@ -2539,9 +2540,17 @@ void
 quickdie(SIGNAL_ARGS)
 {
        sigaddset(&BlockSig, SIGQUIT); /* prevent nested calls */
+       sigdelset(&BlockSig, SIGALRM);
        PG_SETMASK(&BlockSig);
 
        /*
+        * Set up a timeout in case the ereport() call below blocks for a
+        * long time.
+        */
+       pqsignal(SIGALRM, quickdie_alarm_handler);
+       alarm(60);
+
+       /*
         * If we're aborting out of client auth, don't risk trying to send
         * anything to the client; we will likely violate the protocol,
         * not to mention that we may have interrupted the guts of OpenSSL
@@ -2586,6 +2595,22 @@ quickdie(SIGNAL_ARGS)
 }
 
 /*
+ * Take over quickdie()'s work if the alarm expired.
+ */
+static void
+quickdie_alarm_handler(SIGNAL_ARGS)
+{
+       /*
+        * We got here if ereport() was blocking, so don't go there again
+        * except when really asked for.
+        */
+       elog(DEBUG5, "quickdie aborted by alarm");
+
+       on_exit_reset();
+       exit(2);
+}
+
+/*
  * Shutdown signal from postmaster: abort transaction and exit
  * at soonest convenient time
  */
-- 
1.6.5
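
For anyone who wants to try the timeout technique outside the backend,
here is a standalone sketch of the same idea: the SIGQUIT handler
unblocks SIGALRM, arms alarm(), and then gets stuck, and the alarm
handler forces the exit.  All names (demo_quickdie, demo_alarm_handler,
run_quickdie_demo) are made up for the demo, and pause() stands in for
an ereport() call that never returns.  Unlike the patch, the alarm
handler here calls _exit() directly, since that is async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo code; none of these names exist in PostgreSQL. */
static volatile unsigned int demo_timeout = 60;

static void
demo_alarm_handler(int signo)
{
    (void) signo;
    /* Skip atexit callbacks and leave immediately. */
    _exit(2);
}

static void
demo_quickdie(int signo)
{
    struct sigaction sa;
    sigset_t    unblock;

    (void) signo;

    /* Make sure SIGALRM can be delivered while this handler runs. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGALRM);
    sigprocmask(SIG_UNBLOCK, &unblock, NULL);

    sa.sa_handler = demo_alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, NULL);
    alarm(demo_timeout);

    /* Stand-in for an ereport() call that blocks forever. */
    for (;;)
        pause();
}

/*
 * Fork a child that receives SIGQUIT and then hangs in its handler.
 * Returns the child's exit status, or -1 on error.  timeout_secs must
 * be nonzero, because alarm(0) would cancel the timer instead.
 */
int
run_quickdie_demo(unsigned int timeout_secs)
{
    pid_t       pid;
    int         status;

    if (timeout_secs == 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        struct sigaction sa;
        sigset_t    unblock;

        demo_timeout = timeout_secs;

        sa.sa_handler = demo_quickdie;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGQUIT, &sa, NULL);

        /* In case SIGQUIT was blocked in the inherited signal mask. */
        sigemptyset(&unblock);
        sigaddset(&unblock, SIGQUIT);
        sigprocmask(SIG_UNBLOCK, &unblock, NULL);

        raise(SIGQUIT);
        _exit(0);               /* not reached if the handler hangs */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_quickdie_demo(1) should come back after about one second
with the child's exit status from the alarm handler, rather than
hanging on the stuck SIGQUIT handler.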
