I wrote:
> "Kevin Grittner" <[email protected]> writes:
>> Thanks Andrew, Alvaro, and Chander. You've given me some thoughts to
>> toss around. Of course, any of these is going to be somewhat more
>> complex than using [ pg_ctl -w ]
> Yeah. I wonder if we shouldn't expend a bit more effort to make that
> way bulletproof. As I mentioned the other day, if there were a way for
> pg_ctl to pass down its parent's PID then we could have the postmaster
> exclude that as a false match, and then using pg_ctl would be just as
> safe as invoking the postmaster directly.
> The two ways I can see to do that are to add a command line switch
> to the postmaster, or to pass the PID as an environment variable,
> say "PG_GRANDPARENT_PID". The latter is a bit uglier but it would
> require touching much less code (and documentation).
Attached is a simple patch that uses the environment-variable approach.
This is a whole lot more self-contained than what would be needed to
pass the PID as an explicit switch, so I'm inclined to do it this way.
You could argue that a bad guy could confuse matters by intentionally
passing an existing postmaster's PID in this variable --- but someone
with the ability to launch the postmaster can probably also remove an
existing lockfile, so I don't think there's a credible security risk.
Any objections?
regards, tom lane
Index: src/backend/utils/init/miscinit.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/init/miscinit.c,v
retrieving revision 1.176
diff -c -r1.176 miscinit.c
*** src/backend/utils/init/miscinit.c 12 Aug 2009 20:53:30 -0000 1.176
--- src/backend/utils/init/miscinit.c 26 Aug 2009 23:24:56 -0000
***************
*** 683,689 ****
int len;
int encoded_pid;
pid_t other_pid;
! pid_t my_pid = getpid();
/*
* We need a loop here because of race conditions. But don't loop
forever
--- 683,728 ----
int len;
int encoded_pid;
pid_t other_pid;
! pid_t my_pid,
! my_p_pid,
! my_gp_pid;
! const char *envvar;
!
! /*
! * If the PID in the lockfile is our own PID or our parent's or
! * grandparent's PID, then the file must be stale (probably left over
from
! * a previous system boot cycle). We need to check this because of the
! * likelihood that a reboot will assign exactly the same PID as we had
in
! * the previous reboot, or one that's only one or two counts larger and
! * hence the lockfile's PID now refers to an ancestor shell process. We
! * allow pg_ctl to pass down its parent shell PID (our grandparent PID)
! * via the environment variable PG_GRANDPARENT_PID; this is so that
! * launching the postmaster via pg_ctl can be just as reliable as
! * launching it directly. There is no provision for detecting
! * further-removed ancestor processes, but if the init script is written
! * carefully then all but the immediate parent shell will be root-owned
! * processes and so the kill test will fail with EPERM. Note that we
! * cannot get a false negative this way, because an existing postmaster
! * would surely never launch a competing postmaster or pg_ctl process
! * directly.
! */
! my_pid = getpid();
!
! #ifndef WIN32
! my_p_pid = getppid();
! #else
! /*
! * Windows hasn't got getppid(), but doesn't need it since it's not
! * using real kill() either...
! */
! my_p_pid = 0;
! #endif
!
! envvar = getenv("PG_GRANDPARENT_PID");
! if (envvar)
! my_gp_pid = atoi(envvar);
! else
! my_gp_pid = 0;
/*
* We need a loop here because of race conditions. But don't loop
forever
***************
*** 745,761 ****
/*
* Check to see if the other process still exists
*
! * If the PID in the lockfile is our own PID or our parent's
PID, then
! * the file must be stale (probably left over from a previous
system
! * boot cycle). We need this test because of the likelihood
that a
! * reboot will assign exactly the same PID as we had in the
previous
! * reboot. Also, if there is just one more process launch
in this
! * reboot than in the previous one, the lockfile might mention
our
! * parent's PID. We can reject that since we'd never be
launched
! * directly by a competing postmaster. We can't detect
grandparent
! * processes unfortunately, but if the init script is written
! * carefully then all but the immediate parent shell will be
! * root-owned processes and so the kill test will fail with
EPERM.
*
* We can treat the EPERM-error case as okay because that error
* implies that the existing process has a different userid
than we
--- 784,794 ----
/*
* Check to see if the other process still exists
*
! * Per discussion above, my_pid, my_p_pid, and my_gp_pid can be
! * ignored as false matches.
! *
! * Normally kill() will fail with ESRCH if the given PID doesn't
! * exist.
*
* We can treat the EPERM-error case as okay because that error
* implies that the existing process has a different userid
than we
***************
*** 772,789 ****
* Unix socket file belonging to an instance of Postgres being
run by
* someone else, at least on machines where /tmp hasn't got a
* stickybit.)
- *
- * Windows hasn't got getppid(), but doesn't need it since it's
not
- * using real kill() either...
- *
- * Normally kill() will fail with ESRCH if the given PID doesn't
- * exist.
*/
! if (other_pid != my_pid
! #ifndef WIN32
! && other_pid != getppid()
! #endif
! )
{
if (kill(other_pid, 0) == 0 ||
(errno != ESRCH && errno != EPERM))
--- 805,813 ----
* Unix socket file belonging to an instance of Postgres being
run by
* someone else, at least on machines where /tmp hasn't got a
* stickybit.)
*/
! if (other_pid != my_pid && other_pid != my_p_pid &&
! other_pid != my_gp_pid)
{
if (kill(other_pid, 0) == 0 ||
(errno != ESRCH && errno != EPERM))
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.111
diff -c -r1.111 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c 11 Jun 2009 14:49:07 -0000 1.111
--- src/bin/pg_ctl/pg_ctl.c 26 Aug 2009 23:24:56 -0000
***************
*** 672,677 ****
--- 672,692 ----
unlimit_core_size();
#endif
+ /*
+ * If possible, tell the postmaster our parent shell's PID (see the
+ * comments in CreateLockFile() for motivation). Windows hasn't got
+ * getppid() unfortunately.
+ */
+ #ifndef WIN32
+ {
+ static char env_var[32];
+
+ snprintf(env_var, sizeof(env_var), "PG_GRANDPARENT_PID=%d",
+ (int) getppid());
+ putenv(env_var);
+ }
+ #endif
+
exitcode = start_postmaster();
if (exitcode != 0)
{
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers