subject:"\[HACKERS\] pg_terminate

On Thu, Mar 15, 2012 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

 No longer true with pg_cancel_backend not-by-superuser, no?

 No.  That doesn't affect the above argument in the least.  And in fact
 if there's any question whatsoever as to whether unprivileged
 cross-backend signals are secure, they are not going in in the first
 place.

Okay, well, I believe there is a race in handling common
administrative signals that *might* possibly matter.  In the past,
pg_cancel_backend was superuser only, which is a lot like saying only
available to people who can be the postgres user and run kill.  The
cancellation packet is handled via first checking the other backend's
BackendKey and then SIGINTing it, leaving only the most narrow
possibility for a misdirected SIGINT.

But it really is unfortunate that I, a user, run a query or have a
problematic connection of my own role and just want the thing to stop,
but I can't do anything about it without superuser.  In recognition of
that, pg_cancel_backend now can operate on backends owned by the same
user (once again, checked before the signal is processed by the
receiver, just like with the cancellation packet).

There was some angst (but I'm not sure about how specific or uniform)
to extend such signaling power to pg_terminate_backend, and the only
objection I can think of is there is this race, or so it would seem to
me.  Maybe it's too small to care, in which case we can just extend
the same policy to pg_terminate_backend, or maybe it's not, in which
case we could get rid of any signaling race conditions.

The only hypothetical person who would be happy with the current
situation, if characterized correctly, would be one who thinks that
pid-race on SIGINT from non-superusers (long has been true in the form
of the cancellation packet) is okay but on SIGTERM such a race is not.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

On Thu, Mar 15, 2012 at 11:01 PM, Daniel Farina dan...@heroku.com wrote:
 Okay, well, I believe there is a race in handling common
 administrative signals that *might* possibly matter.  In the past,
 pg_cancel_backend was superuser only, which is a lot like saying only
 available to people who can be the postgres user and run kill.  The
 cancellation packet is handled via first checking the other backend's
 BackendKey and then SIGINTing it, leaving only the most narrow
 possibility for a misdirected SIGINT.

Attached is a patch that sketches a removal of the caveat by relying
on the BackendId in PGPROC instead of the pid.  Basically, the idea is
to make it work more like how cancellation keys work, except for
internal SQL functions.  I think the unsatisfying crux here is the
uniqueness of BackendId over the life of one *postmaster* invocation:

sinvaladt.c

MyBackendId = (stateP - segP-procState[0]) + 1;
/* Advertise assigned backend ID in MyProc */
MyProc-backendId = MyBackendId;

I'm not sure precisely what to think about how this numbering winds up
working on quick inspection.  Clearly, if BackendIds can be reused
quickly then the pid-race problem comes back in spirit right away.

But given the contract of MyBackendId as I understand it (it just has
to be unique among all backends at any given time), it could be
changed.  I don't *think* it's used for its arithmetic relationship to
its underlying components anywhere.

Another similar solution (not attached) would be to send information
about the originating backend through PGPROC and having the check be
against those rather than merely a correct and unambiguous
MyBackendId.

I also see now that cancellation packets does not have this caveat
because the postmaster is control of all forking and joining in a
serially executed path, so it need not worry about pid racing.

Another nice use for this might be to, say, deliver the memory or
performance stats of another process while-in-execution, without
having to be superuser or and/or gdbing in back to the calling backend.

-- 
fdr
From f466fe53e8e64cfa49bf56dbdf5920f9ea4e3562 Mon Sep 17 00:00:00 2001
From: Daniel Farina dan...@heroku.com
Date: Thu, 15 Mar 2012 20:46:38 -0700
Subject: [PATCH] Implement race-free sql-originated backend
 cancellation/termination

If promising, this patch requires a few documentation updates in
addendum.

Since 0495aaad8b337642830a4d4e82f8b8c02b27b1be, pg_cancel_backend can
be run on backends that have the same role as the backend
pg_cancel_backend is being invoked from.  Since that time, a
documented caveat exists stating that there was a race whereby a
process death-and-recycle could result in an otherwise unrelated
backend receiving SIGINT.

Now it is desirable for pg_terminate_backend -- which also has the
effect of having the backend exit, and closing the socket -- to also
be usable by non-superusers.  Presuming SIGINT was acceptable to race,
it's not clear that it's acceptable for SIGTERM to race in the same
way, so this patch seeks to try to do something about that.

This patch attempts to close this race condition by targeting query
cancellation/termination against the per-backend-startup unique
BackendId to unambiguously identify the session rather than a PID.
This makes the SQL function act more like how cancellation keys work
already (perhaps these paths can be converged somehow).

Signed-off-by: Daniel Farina dan...@heroku.com
---
 src/backend/access/transam/twophase.c |4 +
 src/backend/storage/ipc/procsignal.c  |3 +
 src/backend/storage/lmgr/proc.c   |3 +
 src/backend/tcop/postgres.c   |   55 +
 src/backend/utils/adt/misc.c  |  104 +++-
 src/include/miscadmin.h   |1 +
 src/include/storage/proc.h|   17 +
 src/include/storage/procsignal.h  |1 +
 8 files changed, 146 insertions(+), 42 deletions(-)

*** a/src/backend/access/transam/twophase.c
--- b/src/backend/access/transam/twophase.c
***
*** 62,67 
--- 62,68 
  #include storage/procarray.h
  #include storage/sinvaladt.h
  #include storage/smgr.h
+ #include storage/spin.h
  #include utils/builtins.h
  #include utils/memutils.h
  #include utils/timestamp.h
***
*** 326,331  MarkAsPreparing(TransactionId xid, const char *gid,
--- 327,335 
  	proc-backendId = InvalidBackendId;
  	proc-databaseId = databaseid;
  	proc-roleId = owner;
+ 	SpinLockInit(MyProc-adminMutex);
+ 	proc-adminAction = ADMIN_ACTION_NONE;
+ 	proc-adminBackendId = InvalidBackendId;
  	proc-lwWaiting = false;
  	proc-lwWaitMode = 0;
  	proc-lwWaitLink = NULL;
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***
*** 258,263  procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 258,266 
  	if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT))
  		HandleNotifyInterrupt();
  
+ 	if (CheckProcSignal(PROCSIG_ADMIN_ACTION_INTERRUPT))
+

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Noah Misch

On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

+1

 I imagine the problem is a race condition whereby a pid might be
 reused by another process owned by another user (doesn't that also
 affect pg_cancel_backend?).  Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

No, I think the hazard you identify here is orthogonal to the question of when
to authorize pg_terminate_backend().  As you note downthread, protocol-level
cancellations available in released versions already exhibit this hazard.  I
wouldn't mind a clean fix for this, but it's an independent subject.


Here I discussed a hazard specific to allowing pg_terminate_backend():
http://archives.postgresql.org/message-id/20110602045955.gc8...@tornado.gateway.2wire.net

To summarize, user code can trap SIGINT cancellations, but it cannot trap
SIGTERM terminations.  If a backend is executing a SECURITY DEFINER function
when another backend of the same role calls pg_terminate_backend() thereon,
the pg_terminate_backend() caller could achieve something he cannot achieve in
PostgreSQL 9.1.  I vote that this is an acceptable loss.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

On Fri, Mar 16, 2012 at 3:42 PM, Noah Misch n...@leadboat.com wrote:
On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
Parallel to pg_cancel_backend, it'd be nice to allow the user to just
outright kill a backend that they own (politely, with a SIGTERM),
aborting any transactions in progress, including the idle transaction,
and closing the socket.

I imagine the problem is a race condition whereby a pid might be
reused by another process owned by another user (doesn't that also
affect pg_cancel_backend?). Shall we just do everything using the
MyCancelKey (which I think could just be called SessionKey,
SessionSecret, or even just Session) as to ensure we have no case
of mistaken identity? Or does that end up being problematic?

No, I think the hazard you identify here is orthogonal to the question of when
to authorize pg_terminate_backend(). As you note downthread, protocol-level
cancellations available in released versions already exhibit this hazard. I
wouldn't mind a clean fix for this, but it's an independent subject.

Here I discussed a hazard specific to allowing pg_terminate_backend():
http://archives.postgresql.org/message-id/20110602045955.gc8...@tornado.gateway.2wire.net

To summarize, user code can trap SIGINT cancellations, but it cannot trap
SIGTERM terminations. If a backend is executing a SECURITY DEFINER function
when another backend of the same role calls pg_terminate_backend() thereon,
the pg_terminate_backend() caller could achieve something he cannot achieve in
PostgreSQL 9.1. I vote that this is an acceptable loss.

I'll throw out a patch that just lets this hazard exist and see what
happens, although it is obsoleted/incompatible with the one already
attached.

--
fdr

Implement-race-free-sql-originated-backend-cancellation-v2.patch.gz
Description: GNU Zip compressed data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

On Fri, Mar 16, 2012 at 4:42 PM, Daniel Farina dan...@heroku.com wrote:
 Hmm. Well, here's a patch that implements exactly that, I think,

That version had some screws loose due to some editor snafus.
Hopefully all better.

-- 
fdr


Implement-race-free-sql-originated-backend-cancellation-v3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina

Parallel to pg_cancel_backend, it'd be nice to allow the user to just
outright kill a backend that they own (politely, with a SIGTERM),
aborting any transactions in progress, including the idle transaction,
and closing the socket.

I imagine the problem is a race condition whereby a pid might be
reused by another process owned by another user (doesn't that also
affect pg_cancel_backend?).  Shall we just do everything using the
MyCancelKey (which I think could just be called SessionKey,
SessionSecret, or even just Session) as to ensure we have no case
of mistaken identity? Or does that end up being problematic?

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Fujii Masao

On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

+1

 I imagine the problem is a race condition whereby a pid might be
 reused by another process owned by another user (doesn't that also
 affect pg_cancel_backend?).

Yes, but I think it's too unlikely to happen. Not sure if we really
should worry about that.

 Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

What if pid is unfortunately reused after passing the test of MyCancelKey
and before sending the signal?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina

On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

 What if pid is unfortunately reused after passing the test of MyCancelKey
 and before sending the signal?

The way MyCancelKey is checked now is backwards, in my mind.  It seems
like it would be better checked by the receiving PID (one can use a
check/recheck also, if so inclined).  Is there a large caveat to that?
I'm working on a small patch to do this I hope to post soon (modulus
difficulties), but am fully aware that messing around PGPROC and
signal handling can be a bit fiddly.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Tom Lane

Daniel Farina dan...@heroku.com writes:
 The way MyCancelKey is checked now is backwards, in my mind.  It seems
 like it would be better checked by the receiving PID (one can use a
 check/recheck also, if so inclined).  Is there a large caveat to that?

You mean, other than the fact that kill(2) can't transmit such a key?

But actually I don't see what you hope to gain from such a change,
even if it can be made to work.  Anyone who can do kill(SIGINT) can
do kill(SIGKILL), say --- so you have to be able to trust the signal
sender.  What's the point of not trusting it to verify the client
identity?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina

On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 The way MyCancelKey is checked now is backwards, in my mind.  It seems
 like it would be better checked by the receiving PID (one can use a
 check/recheck also, if so inclined).  Is there a large caveat to that?

 You mean, other than the fact that kill(2) can't transmit such a key?

I was planning on using an out-of-line mechanism. Bad idea?

 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

No longer true with pg_cancel_backend not-by-superuser, no?  Now there
are new people who can do kill(SIGINT) (modulus the already existing
cancel requests).

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Tom Lane

Daniel Farina dan...@heroku.com writes:
 On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

 No longer true with pg_cancel_backend not-by-superuser, no?

No.  That doesn't affect the above argument in the least.  And in fact
if there's any question whatsoever as to whether unprivileged
cross-backend signals are secure, they are not going in in the first
place.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-07-01 Thread Torello Querci

2011/6/2 Noah Misch n...@leadboat.com:
 On Wed, Jun 01, 2011 at 10:26:34PM -0400, Josh Kupershmidt wrote:
 On Wed, Jun 1, 2011 at 5:55 PM, Noah Misch n...@leadboat.com wrote:
  On Sun, May 29, 2011 at 10:56:02AM -0400, Josh Kupershmidt wrote:
  Looking around, I see there were real problems[1] with sending SIGTERM
  to individual backends back in 2005 or so, and pg_terminate_backend()
  was only deemed safe enough to put in for 8.4 [2]. So expanding
  pg_terminate_backend() privileges does make me a tad nervous.
 
  The documentation for the CREATE USER flag would boil down to omit this 
  flag
  only if you're worried about undiscovered PostgreSQL bugs in this area. 
  ?I'd
  echo Tom's sentiment from the first thread, In any case I think we have to
  solve it, not create new mechanisms to try to ignore it.

 I do agree with Tom's sentiment from that thread. But, if we are
 confident that pg_terminate_backend() is safe enough to relax
 permissions on, then I take it you agree we should plan to extend this
 power to all users?

 Yes; that's what I was trying to say.

 Having thought about this some more, I do now see a risk.  Currently, a 
 SECURITY
 DEFINER function (actually any function, but that's where it matters) can trap
 query_canceled.  By doing so, the author can ensure that only superusers and
 crashes may halt the function during a section protected in this way.  One 
 might
 use it to guard a series of updates made over dblink.  pg_terminate_backend()
 breaks this protection.  I've never designed something this way; it only
 suffices when you merely sort-of-care about transactional integrity.  Perhaps
 it's an acceptable loss for this feature?

 And if so, is this patch a good first step on that path?


Understand that the pg_terminate_backend() is able to kill process
that need not to be killed.
I suppose that looking inside the internal postgreql table in order to
not allow a normal db owner to kill a superuser connection can avoid
this problem?

 Yes.

  Reading through those old threads made me realize this patch would
  give database owners the ability to kill off autovacuum workers. Seems
  like we'd want to restrict that power to superusers.
 
  Would we? ?Any old user can already stifle VACUUM by holding a transaction 
  open.

 This is true, though it's possible we might at some point want a
 backend process which really shouldn't be killable by non-superusers
 (if vacuum/autovacuum isn't one already.) Actually, I could easily
 imagine a superuser running an important query on a database getting
 peeved if a non-superuser were allowed to cancel/terminate his
 queries.

 That's really a different level of user isolation than we have.  If your
 important query runs on a database owned by someone else, calls functions 
 owned
 by someone else, or reads tables owned by someone else, you're substantially 
 at
 the mercy of those object owners.  That situation probably is unsatisfactory 
 to
 some folks.  Adding the possibility that a database owner could cancel your
 query seems like an extension of that codependency more than a new exposure.

If I am the database owner I need to be able to manage my DB. Ok for
superuser connection (and internal administrative process like
autovacuum)
I am the developer, not the DBA, so sometimes, when I wrong something,
I need to kill my session if I wrong something 

Can we suppose, in a more generic case,  that an user can kill
connection only from the same user even if this is not the database
owner?

Best Regards, Torello

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-06-01 Thread Josh Kupershmidt

On Wed, Jun 1, 2011 at 5:55 PM, Noah Misch n...@leadboat.com wrote:
 On Sun, May 29, 2011 at 10:56:02AM -0400, Josh Kupershmidt wrote:
 Looking around, I see there were real problems[1] with sending SIGTERM
 to individual backends back in 2005 or so, and pg_terminate_backend()
 was only deemed safe enough to put in for 8.4 [2]. So expanding
 pg_terminate_backend() privileges does make me a tad nervous.

 The documentation for the CREATE USER flag would boil down to omit this flag
 only if you're worried about undiscovered PostgreSQL bugs in this area.  I'd
 echo Tom's sentiment from the first thread, In any case I think we have to
 solve it, not create new mechanisms to try to ignore it.

I do agree with Tom's sentiment from that thread. But, if we are
confident that pg_terminate_backend() is safe enough to relax
permissions on, then I take it you agree we should plan to extend this
power to all users? And if so, is this patch a good first step on that
path?

 Reading through those old threads made me realize this patch would
 give database owners the ability to kill off autovacuum workers. Seems
 like we'd want to restrict that power to superusers.

 Would we?  Any old user can already stifle VACUUM by holding a transaction 
 open.

This is true, though it's possible we might at some point want a
backend process which really shouldn't be killable by non-superusers
(if vacuum/autovacuum isn't one already.) Actually, I could easily
imagine a superuser running an important query on a database getting
peeved if a non-superuser were allowed to cancel/terminate his
queries.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-29 Thread Josh Kupershmidt

On Sun, May 29, 2011 at 5:04 AM, Noah Misch n...@leadboat.com wrote:
What risks arise from unconditionally allowing these calls for the same user's
backends? `pg_cancel_backend' ought to be safe enough; the user always has
access to the standard cancellation protocol, making the SQL interface a mere
convenience (albeit a compelling one). `pg_terminate_backend' does open up
access to a new behavior, but no concrete risks come to mind.

Looking around, I see there were real problems[1] with sending SIGTERM
to individual backends back in 2005 or so, and pg_terminate_backend()
was only deemed safe enough to put in for 8.4 [2]. So expanding
pg_terminate_backend() privileges does make me a tad nervous.

Reading through those old threads made me realize this patch would
give database owners the ability to kill off autovacuum workers. Seems
like we'd want to restrict that power to superusers.

On the other hand, this *would* be substantial new authority for database
owners. Seems like a reasonable authority to grant, though.

And I also realized that this patch's approach might force us to
maintain a permissions wart if we ever want to implement fine-grained
control for this stuff, e.g. a per-role setting enabling self-kills.
It would be a bit lame to have to document Use this CREATE/ALTER ROLE
flag. Or be the database owner. Or be a superuser.

Now, a few technical comments about the patch:
1.) This bit looks dangerous:
+ backend = pgstat_fetch_stat_beentry(i);
+ if (backend-st_procpid == pid) {

Since pgstat_fetch_stat_beentry() might return NULL.

I think you want BackendPidGetProc().

Ah, thanks for the pointer.

Josh

--
[1]
http://postgresql.1045698.n5.nabble.com/pg-terminate-backend-idea-td1930120.html
[2]
http://postgresql.1045698.n5.nabble.com/Re-COMMITTERS-pgsql-Add-pg-terminate-backend-to-allow-terminating-only-a-single-td1983763i20.html

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-28 Thread Josh Kupershmidt

On Fri, Mar 11, 2011 at 8:54 AM, Bruce Momjian br...@momjian.us wrote:
 I have added it to the next commit fest.

Hi Torello,

I have volunteered (more accurately, Greg Smith volunteered me :-)
to be a reviewer for this patch. I know you're a bit new here, so I
thought I'd outline where this patch stands and what's expected if
you'd like to move it along.

We organize patch reviews via commitfests lasting a month or so.
Some more information about this process:
http://wiki.postgresql.org/wiki/CommitFest

Each commitfest is a period wherein you can expect to receive some
feedback on your patch and advice on things which might need to be
improved (in this case, it's my job to provide you this feedback).
Your patch is in the upcoming commitfest, scheduled to run from June
15 to July 14.

So if you're interested in being responsible for this patch, or some
variant of it, eventually making its way into PostgreSQL 9.2, you
should be willing to update your patch based on feedback, request
advice, etc. during this period. If you're not interested in getting
sucked into this process that's OK -- just please advise us if that's
the case, and maybe someone else will be willing to take charge of the
patch.

Anssi and I posted some initial feedback on the patch's goals earlier.
I would like to ultimately see users have the capability to
pg_cancel_backend() their own queries. But I could at least conceive
of others not wanting this behavior enabled by default. So perhaps
this patch's approach of granting extra privs to the database owner
could work as a first attempt. And maybe a later version could
introduce a GUC allowing the DBA to control whether users can
cancel/terminate their backends, or we could instead have an option
flag to CREATE/ALTER ROLE, allowing per-user configuration.

It would be helpful to hear from others whether this patch's goals
would work as a first pass at this problem, so that Torello doesn't
waste time on a doomed approach. Also, it might be helpful to add an
entry on the Todo list for 'allow non-superusers to use
pg_cancel_backend()', in case this patch gets sunk.

Now, a few technical comments about the patch:
1.) This bit looks dangerous:
+backend = pgstat_fetch_stat_beentry(i);
+if (backend-st_procpid == pid) {

Since pgstat_fetch_stat_beentry() might return NULL.

I'm a bit suspicious about whether looping through
pgstat_fetch_stat_beentry() is the best way to determine the database
owner for a given backend PID, but I haven't dug in enough yet to
suggest a better alternative.

2.) The way the code inside pg_signal_backend() is structured, doing:
  select pg_cancel_backend(12345);

as non-superuser, where '12345' is a fictitious PID, can now give you
the incorrect error message:

  ERROR:  must be superuser or target database owner to signal other
server processes

3.) No documentation adjustments, and the comments need some cleaup.
Torello: I'll be happy to handle comments/documentation for you as a
native English speaker, so you don't have to worry about this part.

That's it for now. Torello, I look forward to hearing back from you,
and hope that you have some time to work on this patch further.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-03-11 Thread Bruce Momjian

Kevin Grittner wrote:
 Torello Querci tque...@gmail.com wrote:
  
  I attach a path for this
  
 It's too late in the release cycle to consider this for version 9.1.
 Please add it to the open CommitFest for consideration for 9.2:
  
 https://commitfest.postgresql.org/action/commitfest_view/open

I have added it to the next commit fest.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-27 Thread Josh Kupershmidt

On Mon, Feb 14, 2011 at 8:58 AM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 On 02/14/2011 02:10 PM, Torello Querci wrote:

 I suppose that give the right to the owner db user to terminate or
 cancel other session connected to the database which it is owner is a
 good thing.
 I not see any security problem because this user can cancel or
 terminate only the session related with the own database,
 but if you think that this is a problem, a configuration parameter can be
 used.

 For what it's worth, a big +1 from me. We have pretty much the same use
 case.

 It would be good if you could also terminate your own connections.

The superuser-only restriction for pg_cancel_backend() has been a pet
peeve of mine as well. I actually posted a patch a while back to let
users pg_cancel_backend() their own queries, see:
http://archives.postgresql.org/pgsql-admin/2010-02/msg00052.php

IMO it'd be better to do away with this patch's check of:
/* If the user not is the superuser, need to be the db owner. */

and instead just check if the target session's user matches that of
the cancel requester.

Additionally, this patch keeps all the permission checking inside
pg_signal_backend(). That's fine if we're sure that we want
pg_cancel_backend() and pg_terminate_backend() to undergo the same
permissions check, but perhaps it's a bad idea to relax the
permissions check on pg_terminate_backend() ?

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Torello Querci

Hi,

this is the first time that I post here, so if I wrong please don't kill me ...
I see that pg_terminate_backend and pg_cancel_backend  can be execute
only by admin users.
This approach seems to be too restrictive in a lots of real situation.

In dept, I have a situation where it is created one database machine
for all the postgresql database.
This database machine is managed by IT staff that have created two
user for each application.
One user is the owner db user that create, drop, grant on this db,
while the other user is the application db.

In this situation I (the developer) not able to disconnect any client
and stop any high weight queries.
Unfortunately the application run on application server that is
manager, again, by IT staff and I not have the right to stop it.

I suppose that give the right to the owner db user to terminate or
cancel other session connected to the database which it is owner is a
good thing.
I not see any security problem because this user can cancel or
terminate only the session related with the own database,
but if you think that this is a problem, a configuration parameter can be used.

Of course I can create a function with admin right that do the same
thing but the IT staff need to install, configure, and give the right
grant.
So, I suppose, that this can to be only a workaround, not the solution.

Sorry for my English.

I attach a path for this


Best Regards, Torello
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 5bda4af..5327447 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -33,6 +33,7 @@
 #include storage/procarray.h
 #include utils/builtins.h
 #include tcop/tcopprot.h
+#include pgstat.h
 
 #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
 
@@ -75,9 +76,33 @@ static bool
 pg_signal_backend(int pid, int sig)
 {
 	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-			(errmsg(must be superuser to signal other server processes;
+	{
+bool haveRight = false;
+PgBackendStatus *backend;
+
+		/* If the user not is the superuser, need to be the db owner. */
+		if (pg_database_ownercheck(MyDatabaseId, GetUserId())) {
+
+/* Check for the specify backend in the stat info table */
+int nBackend = pgstat_fetch_stat_numbackends();
+int i;
+for (i = 1; i=nBackend; ++i) {
+backend = pgstat_fetch_stat_beentry(i);
+if (backend-st_procpid == pid) {
+if (backend-st_databaseid == MyDatabaseId)
+haveRight = true;
+break;
+}
+}
+}
+
+if (!haveRight)
+			ereport(ERROR,
+	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg(must be superuser or database destination owner to signal other server processes;
+	}
+
+
 
 	if (!IsBackendPid(pid))
 	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Anssi Kääriäinen


On 02/14/2011 02:10 PM, Torello Querci wrote:

I suppose that give the right to the owner db user to terminate or
cancel other session connected to the database which it is owner is a
good thing.
I not see any security problem because this user can cancel or
terminate only the session related with the own database,
but if you think that this is a problem, a configuration parameter can be used.
For what it's worth, a big +1 from me. We have pretty much the same use 
case.


It would be good if you could also terminate your own connections.

 - Anssi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Kevin Grittner

Torello Querci tque...@gmail.com wrote:
 
 I attach a path for this
 
It's too late in the release cycle to consider this for version 9.1.
Please add it to the open CommitFest for consideration for 9.2:
 
https://commitfest.postgresql.org/action/commitfest_view/open
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-17 Thread Bruce Momjian

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The closest thing I can think of to an automated test is to run repeated
  sets of the parallel regression tests, and each time SIGTERM a randomly
  chosen backend at a randomly chosen time.  Then see if anything funny
 
  Yep, that was my plan, plus running the parallel regression tests you
  get the possibility of 2 backends.
 
 I was intentionally suggesting only one kill per test cycle.  Multiple
 kills will probably create an O(N^2) explosion in the set of possible
 downstream-failure deltas.  I doubt you'd really get any improvement
 in testing coverage to justify the much larger amount of hand validation
 needed.
 
 It also strikes me that you could make some simple alterations to the
 regression tests to reduce the set of observable downstream deltas.
 For example, anyplace where a test loads a table with successive INSERTs
 and that table is used by later tests, wrap the INSERT sequence with
 BEGIN/END.  Then there is only one possible downstream delta (empty
 table) and not N different possibilities for an N-row table.

I have added pg_terminate_backend() to use SIGTERM and will start
running tests as discussed with Tom.  I will post my scripts too.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Magnus Hagander

Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   I think if we want pg_terminate_backend, we have to do it the
   way that was originally discussed: have it issue SIGTERM and fix
   whatever needs to be fixed in the SIGTERM code path.
  
   Well, with no movement on this TODO item since it was removed in
   8.0, I am willing to give users something that works most of the
   time.
  
  If the users want it so bad, why has no one stepped up to do the
  testing?
 
 Good question.  Tom and I talked about this on the phone today.
 
 I think the problem is testing to try to prove the lack of a bug.  How
 long does someone test to know they have reasonably proven a bug
 doesn't exist?  

Right. I think we have *a lot* of users that regularly use SIGTERM to
kill the backends, becuase they have no idea it's dangerous. They just
use ps to find the backend and kill to see it go away.

If we had a *big* problem with it, we'd hear about it. So I doubt we
have. But it's quite possible that we have a narrow problem that can
cause problems in some issues - but I expect most people who run this
stuff without knowing it's dangerous aren't running the worlds largest
and most loaded databases...

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Magnus Hagander

Tom Lane wrote:
 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.  If no one is willing
 to do any such testing, then as far as I'm concerned there is no
 market for the feature and we should drop it from TODO.

If someone can come up with an automated script to do this kind of
testing, I can commit a VM or three to running this 24/7 for a month,
easily... But I don't trust myself in coming up with a test-case that's
good enough :-P

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane [EMAIL PROTECTED] writes:

 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:

   on_shmem_exit(cleanup_routine, arg);
   PG_TRY();
   {
   ... do something ...
   cancel_shmem_exit(cleanup_routine, arg);
   }
   PG_CATCH();
   {
   cancel_shmem_exit(cleanup_routine, arg);
   cleanup_routine(arg);
   PG_RE_THROW();
   }
   PG_END_TRY();

[We would also have to block SIGTERM around the second cancel_shmem_exit and
cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
run them in the reverse order.]

This seems exceptionally fragile. Someone's bound to add another case without
realizing it. I thought about doing something like having a flag
in_unprotected_pg_try which PG_TRY would set to true, and on_shmem_exit would
clear. Then any LWLock or other shmem operation could assert it's cleared.

But that doesn't work for nested PG_TRY blocks. Possibly we could get it to
work by keeping a pair of counters but I'm not sure that's sufficient.

Are all the known cases LWLocks? Is there a reason we couldn't just have a
generic cleanup routine which just releases all LWLocks we hold before
exiting?

Or weakly -- an assert in CHECK_FOR_INTERRUPTS that there are no LWLocks held
or if there are that there's a cleanup routine in place. We wouldn't know for
sure that it's the *right* cleanup routine or enough cleanup routines but
hopefully we would catch the error often enough to notice.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Gregory Stark [EMAIL PROTECTED] writes:


 Or weakly -- an assert in CHECK_FOR_INTERRUPTS 

oops, that's irrelevant of course. 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:

 [We would also have to block SIGTERM around the second cancel_shmem_exit and
 cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
 run them in the reverse order.]

No, we wouldn't, because a SIGTERM can only actually fire at a
CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
one in the cleanup code.

 Are all the known cases LWLocks?

*None* of the known cases are LWLocks, nor any other resource that we
have generic cleanup code for.  The problem cases are one-off resources
that it seemed we could avoid having a real cleanup mechanism for.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane [EMAIL PROTECTED] writes:

 [We would also have to block SIGTERM around the second cancel_shmem_exit and
 cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to 
 be?)
 run them in the reverse order.]

 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

Wait, huh? In that case I don't see what advantage any of this has over
Bruce's patch. And his approach seemed a lot more robust.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

 Wait, huh? In that case I don't see what advantage any of this has over
 Bruce's patch. And his approach seemed a lot more robust.

 Maybe I missed something, but I thought he was just proposing some
 macro syntactic sugar over the same code that I described.

No, I meant the earlier patch which you rejected with the flag in MyProc. I
realize there were other issues but the initial concern was that it wouldn't
respond promptly because it would wait for CHECK_FOR_INTERRUPTS. But if
sigterm was never handled except at a CHECK_FOR_INTERRUPTS then that was never
a factor.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

 Wait, huh? In that case I don't see what advantage any of this has over
 Bruce's patch. And his approach seemed a lot more robust.

Maybe I missed something, but I thought he was just proposing some
macro syntactic sugar over the same code that I described.

I'm not against syntactic sugar, but it doesn't seem to be totally
clean; I think you would need two repetitions of the function and
arg anyway:

PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg);
{
... risky code here ...
}
PG_END_ENSURE_ERROR_CLEANUP(cleanup_function, arg);

because both the before and after parts need to know the function
name and the arg to pass it.  The only way to avoid duplication would be
if the whole controlled code block was actually an argument to the
macro:

PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg,
... risky code here ...
);

but there are very good reasons not to do that when the controlled
code can be large; for instance that #if can't be used portably
in a macro argument.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Gregory Stark [EMAIL PROTECTED] writes:
 No, I meant the earlier patch which you rejected with the flag in MyProc. I
 realize there were other issues but the initial concern was that it wouldn't
 respond promptly because it would wait for CHECK_FOR_INTERRUPTS.

No, that's not the concern in the least.  The concern is that something
could trap the attempted throwing of the error *after*
CHECK_FOR_INTERRUPTS has noticed the signal.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I am starting to think we need to analyze the source code rather than
  testing, because of what we are testing for.
 
 I was thinking about that a bit more, in connection with trying to
 verbalize why I don't like your patch ;-)
 
 The fundamental assumption here is that we think that proc_exit() from
 the main loop of PostgresMain() should be safe, because that's exercised
 anytime a client quits or dies unexpectedly, and so it should be a
 well-tested path.  What is lacking such test coverage is the many code
 paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
 point in the backend.  Some of those points might be (in fact are known
 to be, as of today) holding locks or otherwise in a state that prevents
 a clean proc_exit().
 
 Your patch proposes to replace that code path by one that fakes a
 QueryCancel error and then does proc_exit() once control reaches the
 outer level.  While that certainly *should* work (ignoring the question
 of whether control gets to the outer level in any reasonable amount of
 time), it's a bit of a stretch to claim that it's a well-tested case.
 At the moment it would only arise if a QueryCancel interrupt arrived at
 approximately the same time that the client died or sent a disconnect
 message, which is surely far less probable than client death by itself.
 So I don't buy the argument that this is a better-tested path, and
 I definitely don't want to introduce new complexity in order to make it
 happen like that.

I would like my use of SIGINT to be like someone pressing ^C and then \q
in psql, but I can see that the SIGINT might be delivered in places
where libpq would not deliver it, like during shutdown.  Hopefully that
could be addressed, but I agree using SIGTERM is better and more useful.

 Now, as to whether a SIGTERM-style exit is safe.  With the exception
 of the PG_CATCH issues that we already know about, any other case seems
 like a clear coding error: you'd have to be doing something that you
 know could throw an error, without having made provision to clean up
 whatever unclean state you are in.  It'd be nice to think that we
 haven't been that silly anywhere.  Experience however teaches that such
 an assumption is hubris.
 
 I think that the way forward is to fix the PG_CATCH issues (which I will
 try to get done later this week) and then get someone to mount a serious
 effort to break it, as outlined in previous discussions:
 http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I was thinking of running two parallel regression tests and killing one
at random and see if the others complete.  One problem is that the
regression tests issue a lot of server error messages so somehow I would
need to filter out the normal errors, which I think I could do.

Magnus has offered to test too.

 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.  If no one is willing
 to do any such testing, then as far as I'm concerned there is no market
 for the feature and we should drop it from TODO.

Agreed, but the feature is going to be asked for again soon so we would
probably have to put it on the features we don't want, and that is going
to look pretty odd.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.

 If someone can come up with an automated script to do this kind of
 testing, I can commit a VM or three to running this 24/7 for a month,
 easily... But I don't trust myself in coming up with a test-case that's
 good enough :-P

The closest thing I can think of to an automated test is to run repeated
sets of the parallel regression tests, and each time SIGTERM a randomly
chosen backend at a randomly chosen time.  Then see if anything funny
happens.  The hard part here is distinguishing expected from unexpected
regression outputs, especially in view of the fact that some of the
tests depend on database contents set up by earlier tests.

I'm thinking that you could automatically discard the regression diff
for the specific test that got SIGTERM'd, as long as it looked like
the normal output up to the point where the terminated by
administrator error appears.  Then what you'd have is the potential for
downstream failures due to things not being created, which *should* fall
into a fairly stylized set of possible diffs.  So get the script to
throw away any diffs that exactly match ones seen previously.  Run it
for awhile, and then hand-validate the set of diffs that it's saved
... or if any of 'em look funny, report.

One gotcha I can think of is that killing the prepared_xacts test
can leave you with open 2PC transactions, which will interfere with
starting the next cycle of the tests (you have to kill them before you
can dropdb).  But you could add a rollback prepared to the driver
script to clean out any uncommitted prepared xact.

Whether this is workable or not depends on the size of the set of
expected downstream-failure diffs.  My gut feeling from many years of
watching regression test crashes is that it'd be large but not
completely impractical to look through by hand.

I haven't time to write something like that myself, but offhand it seems
like it could be done without more than a day or so's work, especially
if you start from the buildfarm infrastructure.

BTW, don't forget to include autovac workers in the set of SIGTERM
target candidates.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian

Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
  if there is some reasonable amount of testing done during this
  development cycle to try to expose any problems.
 
  If someone can come up with an automated script to do this kind of
  testing, I can commit a VM or three to running this 24/7 for a month,
  easily... But I don't trust myself in coming up with a test-case that's
  good enough :-P
 
 The closest thing I can think of to an automated test is to run repeated
 sets of the parallel regression tests, and each time SIGTERM a randomly
 chosen backend at a randomly chosen time.  Then see if anything funny

Yep, that was my plan, plus running the parallel regression tests you
get the possibility of 2 backends.

 happens.  The hard part here is distinguishing expected from unexpected
 regression outputs, especially in view of the fact that some of the
 tests depend on database contents set up by earlier tests.

Oh, that is a problem.  I was going to get the typical errors, sort them
and put them in a file.  Then when I run the test, I sort the log again
and use diff | grep '' to see an differences.  If there are cases that
generate new errors on a previous failure, I will have to add that
output too.

 I'm thinking that you could automatically discard the regression diff
 for the specific test that got SIGTERM'd, as long as it looked like
 the normal output up to the point where the terminated by
 administrator error appears.  Then what you'd have is the potential for
 downstream failures due to things not being created, which *should* fall
 into a fairly stylized set of possible diffs.  So get the script to
 throw away any diffs that exactly match ones seen previously.  Run it
 for awhile, and then hand-validate the set of diffs that it's saved
 ... or if any of 'em look funny, report.

Yep, see above.

 One gotcha I can think of is that killing the prepared_xacts test
 can leave you with open 2PC transactions, which will interfere with
 starting the next cycle of the tests (you have to kill them before you
 can dropdb).  But you could add a rollback prepared to the driver
 script to clean out any uncommitted prepared xact.

Good idea.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The closest thing I can think of to an automated test is to run repeated
 sets of the parallel regression tests, and each time SIGTERM a randomly
 chosen backend at a randomly chosen time.  Then see if anything funny

 Yep, that was my plan, plus running the parallel regression tests you
 get the possibility of 2 backends.

I was intentionally suggesting only one kill per test cycle.  Multiple
kills will probably create an O(N^2) explosion in the set of possible
downstream-failure deltas.  I doubt you'd really get any improvement
in testing coverage to justify the much larger amount of hand validation
needed.

It also strikes me that you could make some simple alterations to the
regression tests to reduce the set of observable downstream deltas.
For example, anyplace where a test loads a table with successive INSERTs
and that table is used by later tests, wrap the INSERT sequence with
BEGIN/END.  Then there is only one possible downstream delta (empty
table) and not N different possibilities for an N-row table.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The closest thing I can think of to an automated test is to run repeated
  sets of the parallel regression tests, and each time SIGTERM a randomly
  chosen backend at a randomly chosen time.  Then see if anything funny
 
  Yep, that was my plan, plus running the parallel regression tests you
  get the possibility of 2 backends.
 
 I was intentionally suggesting only one kill per test cycle.  Multiple
 kills will probably create an O(N^2) explosion in the set of possible
 downstream-failure deltas.  I doubt you'd really get any improvement
 in testing coverage to justify the much larger amount of hand validation
 needed.

No, the point is that you have 2 backends to choose from to kill.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_terminate_backend() idea

Gregory Stark wrote:
   Add pg_terminate_backend() to allow terminating only a single session.
 
 I'm interested in this because I was already looking for a solution to the
 out of signals problem we have.
 
 I think we could expand this by having a bunch of boolean flags, one each for
 different conditions including the sinval processing conditions, interrupt,
 info, and terminate. (Any more?)
 
 The two things we would have to check to be sure of is:
 
 1) Do we care about how many times events are processed? Ie, if you press
 interrupt twice is it important that that be handled as an interrupt twice? It
 doesn't work that way currently for interrupt but are any of the other
 conditions sensitive to this? I don't think so.

Not that I can think of.

 2) Do we care what order things happen in? Ie, if you send an info request and
 then a cancel request is it ok if the cancel is handled first. I don't see why
 not myself. And if it's a terminate request we *clear* don't want to bother
 handling any other events first.

I don't think we care.

 It seems to me we could replace all of the above with either SIGINT or USR1
 and have a bunch of boolean flags in MyProc. I'm not sure of the implication
 for sinval processing of having to get a whole bunch of LWLocks though.

Keep in mind PGPROC-terminate was easier because once it was set it was
never cleared.  If you need to clear the flag and keep going the code is
going to need to be a little more sophisticated to avoid lost interrupts.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_terminate_backend() issues

Tom Lane wrote:
 I wrote:
  All in all, this patch wasn't ready to apply without review.  I'm
  currently looking to see whether it's salvageable or not.
 
 After further study, I've concluded that it is in fact not salvageable,
 and I respectfully request that it be reverted.

OK, reverted.

 I was able to get things to more or less work most of the time with
 three or four additional ugly hacks in postgres.c, but I still don't
 have any great confidence that there aren't windows where a terminate
 request wouldn't be ignored (even without considering the uncooperative-
 function scenario).  Moreover it's entirely unclear that this approach
 actually dodges any of the hypothetical bugs in SIGTERM handling.

I don't understand.  If we call proc_exit(0) instead, it is the same as
someone exiting the backend via PQfinish().

 Given the complexity that it adds, I think it's fair to say that this
 approach is probably buggier than the other way.

 I think if we want pg_terminate_backend, we have to do it the way that
 was originally discussed: have it issue SIGTERM and fix whatever needs
 to be fixed in the SIGTERM code path.  As the TODO list used to say
 before it got editorialized upon, this is mostly a matter of testing
 (something that I can tell this patch was sadly lacking :-().
 We do need also to go around and fix any places that think a PG_TRY
 block is a sufficient way to clean up state in shared memory --- cf
 this AFAIK-still-unfixed bug:
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

Well, with no movement on this TODO item since it was removed in 8.0, I
am willing to give users something that works most of the time.  As you
said already, cancel isn't going to be 100% reliable anyway because of
places we can't check for cancel.

I will work on a new version of the patch that people can see.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I was able to get things to more or less work most of the time with
 three or four additional ugly hacks in postgres.c, but I still don't
 have any great confidence that there aren't windows where a terminate
 request wouldn't be ignored (even without considering the uncooperative-
 function scenario).  Moreover it's entirely unclear that this approach
 actually dodges any of the hypothetical bugs in SIGTERM handling.

 I don't understand.  If we call proc_exit(0) instead, it is the same as
 someone exiting the backend via PQfinish().

Well, using proc_exit() instead of die() might have alleviated that
particular objection, but there are still the others.

 I think if we want pg_terminate_backend, we have to do it the way that
 was originally discussed: have it issue SIGTERM and fix whatever needs
 to be fixed in the SIGTERM code path.

 Well, with no movement on this TODO item since it was removed in 8.0, I
 am willing to give users something that works most of the time.

If the users want it so bad, why has no one stepped up to do the
testing?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

I did a quick grep for PG_CATCH uses to see what we have along the lines
of this bug:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

In current sources there are three places at risk:

btbulkdelete, as noted in the above message
pg_start_backup needs to reset forcePageWrites = false
createdb wants to UnlockSharedObject and delete any already-copied files

In createdb, the Unlock actually doesn't need to get cleaned up, since
transaction abort would release the lock anyway.  But leaving a possibly
large mess of orphaned files doesn't seem nice.

ISTM that there will be more cases like this in future, so we need a
general solution anyway.  I propose the following sort of code structure
for these situations:

on_shmem_exit(cleanup_routine, arg);
PG_TRY();
{
... do something ...
cancel_shmem_exit(cleanup_routine, arg);
}
PG_CATCH();
{
cancel_shmem_exit(cleanup_routine, arg);
cleanup_routine(arg);
PG_RE_THROW();
}
PG_END_TRY();

where cancel_shmem_exit is defined to pop the latest shmem_exit_list
entry if it matches the passed arguments (I don't see any particular
need to search further than that in the list).  This structure
guarantees that cleanup_routine will be called on the way out of
either a plain ERROR or a FATAL exit.

Thoughts?  We clearly must do something about this before we can even
think of calling retail SIGTERM a supported feature ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() idea

2008-04-15 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 Gregory Stark wrote:

 It seems to me we could replace all of the above with either SIGINT or USR1
 and have a bunch of boolean flags in MyProc. I'm not sure of the implication
 for sinval processing of having to get a whole bunch of LWLocks though.

 Keep in mind PGPROC-terminate was easier because once it was set it was
 never cleared.  If you need to clear the flag and keep going the code is
 going to need to be a little more sophisticated to avoid lost interrupts.

That's kind of the point. I don't think we care about lost interrupts for any
of these things. As long as we know a sinval flush happened I think we don't
care how many more happened. Or a cancel request or terminate request.

For sinval of course it would be important to clear the flags (with MyProc
locked) prior to processing the queue in case one arrives as soon as we're
finished. But I think that's the only detail.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane wrote:
 I did a quick grep for PG_CATCH uses to see what we have along the lines
 of this bug:
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php
 
 In current sources there are three places at risk:
 
 btbulkdelete, as noted in the above message
 pg_start_backup   needs to reset forcePageWrites = false
 createdb wants to UnlockSharedObject and delete any already-copied files
 
 In createdb, the Unlock actually doesn't need to get cleaned up, since
 transaction abort would release the lock anyway.  But leaving a possibly
 large mess of orphaned files doesn't seem nice.
 
 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:
 
   on_shmem_exit(cleanup_routine, arg);
   PG_TRY();
   {
   ... do something ...
   cancel_shmem_exit(cleanup_routine, arg);
   }
   PG_CATCH();
   {
   cancel_shmem_exit(cleanup_routine, arg);
   cleanup_routine(arg);
   PG_RE_THROW();
   }
   PG_END_TRY();
 
 where cancel_shmem_exit is defined to pop the latest shmem_exit_list
 entry if it matches the passed arguments (I don't see any particular
 need to search further than that in the list).  This structure
 guarantees that cleanup_routine will be called on the way out of
 either a plain ERROR or a FATAL exit.
 
 Thoughts?  We clearly must do something about this before we can even
 think of calling retail SIGTERM a supported feature ...

Here is a little different idea.  Seems we want something more like:

PG_ON_EXIT();
{
cleanup_routine(arg);
}
   PG_TRY();
   {
   ... do something ...
   }
   PG_CATCH();
   {
   PG_RE_THROW();
   }
   PG_END_TRY();

PG_ON_EXIT() would place cleanup_routine() into the shmem_exit_list and
it would be called at the end of PG_CATCH and removed from
shmem_exit_list by PG_END_TRY().  Another idea would be to define the
PG_CATCH() first so it would be put in the shmem_exit_list before the
TRY was tried, but I assume you don't want all those functions to run on
exit.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Bruce Momjian [EMAIL PROTECTED] writes:
 Here is a little different idea.  Seems we want something more like:

Yeah, I thought about building a set of macros that would have this
ability folded in, but it didn't seem like much notational advantage
given that you have to write the cleanup routine out-of-line anyway.

Also, until we have a lot more than three use-sites, more macros
don't seem to be worth the trouble.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Here is a little different idea.  Seems we want something more like:
 
 Yeah, I thought about building a set of macros that would have this
 ability folded in, but it didn't seem like much notational advantage
 given that you have to write the cleanup routine out-of-line anyway.

Yea, agreed.  I am a little worried about the out-of-line issue because
some cases might become more complex because you have to pass lots of
parameters, perhaps.

 Also, until we have a lot more than three use-sites, more macros
 don't seem to be worth the trouble.

Oh, only three use sites.  I thought we had three wrong sites but that
more had to be changed.  Yea, go for it!

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I think if we want pg_terminate_backend, we have to do it the way that
  was originally discussed: have it issue SIGTERM and fix whatever needs
  to be fixed in the SIGTERM code path.
 
  Well, with no movement on this TODO item since it was removed in 8.0, I
  am willing to give users something that works most of the time.
 
 If the users want it so bad, why has no one stepped up to do the
 testing?

Good question.  Tom and I talked about this on the phone today.

I think the problem is testing to try to prove the lack of a bug.  How
long does someone test to know they have reasonably proven a bug doesn't
exist?  

I think the other problem is what to test.  One SIGTERM problem that was
reported was related to schema changes.  Certainly that has to be
tested, but what else:  prepared statements, queries, HOT updates,
connect/disconnect, notify?  There are lots of place to test and it is
hard to know which ones, and for how long.

I am starting to think we need to analyze the source code rather than
testing, because of what we are testing for.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend() issues

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I was able to get things to more or less work most of the time with
  three or four additional ugly hacks in postgres.c, but I still don't
  have any great confidence that there aren't windows where a terminate
  request wouldn't be ignored (even without considering the uncooperative-
  function scenario).  Moreover it's entirely unclear that this approach
  actually dodges any of the hypothetical bugs in SIGTERM handling.
 
  I don't understand.  If we call proc_exit(0) instead, it is the same as
  someone exiting the backend via PQfinish().
 
 Well, using proc_exit() instead of die() might have alleviated that
 particular objection, but there are still the others.

Tom, thanks for you feedback.  It was very helpful.

I have adjusted the patch to return a locked PGPROC entry, initialize
the PGPROC-terminate boolean, called proc_exit(), and moved the
PGPROC-terminate test out of the cancel loop entirely so lost cancel
signals are not as big a problem anymore.

I agree it would be best for pg_terminate_backend() and for the
postmaster use of SIGTERM if SIGTERM was more reliable about cleanup, so
hopefully that will work for 8.4.  I have attached my patch in hopes we
can use it as a backup in case we can't get SIGTERM to work for
individual backends for 8.4.

[ I have posted about testing in a separate email.]

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.432
diff -c -c -r1.432 func.sgml
*** doc/src/sgml/func.sgml  15 Apr 2008 20:28:46 -  1.432
--- doc/src/sgml/func.sgml  16 Apr 2008 03:59:20 -
***
*** 11849,11854 
--- 11849,11857 
  primarypg_cancel_backend/primary
 /indexterm
 indexterm
+ primarypg_terminate_backend/primary
+/indexterm
+indexterm
  primarypg_reload_conf/primary
 /indexterm
 indexterm
***
*** 11885,11890 
--- 11888,11900 
/row
row
 entry
+ 
literalfunctionpg_terminate_backend/function(parameterpid/parameter 
typeint/)/literal
+ /entry
+entrytypeboolean/type/entry
+entryTerminate a backend/entry
+   /row
+   row
+entry
  literalfunctionpg_reload_conf/function()/literal
  /entry
 entrytypeboolean/type/entry
***
*** 11907,11915 
 /para
  
 para
! functionpg_cancel_backend/ sends a query cancel
! (systemitemSIGINT/) signal to a backend process identified by
! process ID.  The process ID of an active backend can be found from
  the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the
  commandpostgres/command processes on the server with
--- 11917,11926 
 /para
  
 para
! functionpg_cancel_backend/ and functionpg_terminate_backend/
! send a query cancel (systemitemSIGINT/) signal to a backend process
! identified by process ID.  The
! process ID of an active backend can be found from
  the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the
  commandpostgres/command processes on the server with
Index: doc/src/sgml/runtime.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.413
diff -c -c -r1.413 runtime.sgml
*** doc/src/sgml/runtime.sgml   15 Apr 2008 20:28:46 -  1.413
--- doc/src/sgml/runtime.sgml   16 Apr 2008 03:59:20 -
***
*** 1372,1377 
--- 1372,1384 
  well.
 /para
/important
+ 
+   para
+To terminate a session while allowing other sessions to continue, use
+functionpg_terminate_backend()/ (xref
+linkend=functions-admin-signal-table) rather than sending a signal
+to the child process.
+   /para
   /sect1
  
   sect1 id=preventing-server-spoofing
Index: src/backend/access/transam/twophase.c
===
RCS file: /cvsroot/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.41
diff -c -c -r1.41 twophase.c
*** src/backend/access/transam/twophase.c   25 Mar 2008 22:42:42 -  
1.41
--- src/backend/access/transam/twophase.c   16 Apr 2008 03:59:20 -
***
*** 283,288 
--- 283,289 
gxact-proc.databaseId = databaseid;
gxact-proc.roleId = owner;
gxact-proc.inCommit = false;
+   gxact-proc.terminate = false;
gxact-proc.vacuumFlags = 0;
gxact-proc.lwWaiting = false;
gxact-proc.lwExclusive = false;
Index:

Re: [HACKERS] pg_terminate_backend() issues

Bruce Momjian [EMAIL PROTECTED] writes:
 I am starting to think we need to analyze the source code rather than
 testing, because of what we are testing for.

I was thinking about that a bit more, in connection with trying to
verbalize why I don't like your patch ;-)

The fundamental assumption here is that we think that proc_exit() from
the main loop of PostgresMain() should be safe, because that's exercised
anytime a client quits or dies unexpectedly, and so it should be a
well-tested path.  What is lacking such test coverage is the many code
paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
point in the backend.  Some of those points might be (in fact are known
to be, as of today) holding locks or otherwise in a state that prevents
a clean proc_exit().

Your patch proposes to replace that code path by one that fakes a
QueryCancel error and then does proc_exit() once control reaches the
outer level.  While that certainly *should* work (ignoring the question
of whether control gets to the outer level in any reasonable amount of
time), it's a bit of a stretch to claim that it's a well-tested case.
At the moment it would only arise if a QueryCancel interrupt arrived at
approximately the same time that the client died or sent a disconnect
message, which is surely far less probable than client death by itself.
So I don't buy the argument that this is a better-tested path, and
I definitely don't want to introduce new complexity in order to make it
happen like that.

Now, as to whether a SIGTERM-style exit is safe.  With the exception
of the PG_CATCH issues that we already know about, any other case seems
like a clear coding error: you'd have to be doing something that you
know could throw an error, without having made provision to clean up
whatever unclean state you are in.  It'd be nice to think that we
haven't been that silly anywhere.  Experience however teaches that such
an assumption is hubris.

I think that the way forward is to fix the PG_CATCH issues (which I will
try to get done later this week) and then get someone to mount a serious
effort to break it, as outlined in previous discussions:
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
if there is some reasonable amount of testing done during this
development cycle to try to expose any problems.  If no one is willing
to do any such testing, then as far as I'm concerned there is no market
for the feature and we should drop it from TODO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Magnus Hagander

  Since I have a stuck backend without client again, I'll have to
 kill
  -SIGTERM a backend. Fortunately, I do have console access to
 that
  machine and it's not win32 but a decent OS.
 
 
 
  You do know that on Windows you can use pg_ctl to send a pseudo
  SIGTERM to a backend, don't you?
 The main issue still is that console access id required, on any OS.

Yeah.
Though for the Windows case only, we could easily enough make it
possible to run pg_ctl kill remotely, since we use a named pipe. Does
this seem like a good or bad idea?

//Magnus


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Andreas Pflug

Magnus Hagander wrote:
 Since I have a stuck backend without client again, I'll have to
 
 kill
 
 -SIGTERM a backend. Fortunately, I do have console access to
 
 that
 
 machine and it's not win32 but a decent OS.


 
 You do know that on Windows you can use pg_ctl to send a pseudo
 SIGTERM to a backend, don't you?
   
 The main issue still is that console access id required, on any OS.
 

 Yeah.
 Though for the Windows case only, we could easily enough make it
 possible to run pg_ctl kill remotely, since we use a named pipe. Does
 this seem like a good or bad idea?
   

Not too helpful. How to kill a win32 backend from a linux workstation?
Additionally, NP requires an authenticated RPC connection. I you're not
allowed to access the console, you probably haven't got sufficient
access permissions to NP as well, or you'd need extra policy tweaking or
so. Nightmarish, just to avoid the easy and intuitive way.

Regards,
Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Tom Lane

Magnus Hagander [EMAIL PROTECTED] writes:
 Though for the Windows case only, we could easily enough make it
 possible to run pg_ctl kill remotely, since we use a named pipe. Does
 this seem like a good or bad idea?

Seems like we'd be opening a can of security worms :-(

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Magnus Hagander

  Though for the Windows case only, we could easily enough make it
  possible to run pg_ctl kill remotely, since we use a named pipe.
 Does
  this seem like a good or bad idea?
 
 Seems like we'd be opening a can of security worms :-(

Not really, standard windows ACL already applies to everything, so you
need to be an admin on the machine to make it work.

Anyhoo, I don't really see the gain in it, which also seems to be what
others think, so let's just drop that idea.

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

Andreas Pflug [EMAIL PROTECTED] writes:
 utils/adt/misc.c says:
 //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
 Datum
 *pg_terminate_backend*(PG_FUNCTION_ARGS)

 Well, AFAIR there were no more issues raised about code paths that don't 
 clean up correctly, so can we please
 remove that comment and make the function live finally? 

No, you have that backwards.  The burden of proof is on those who want
it to show that it's now safe.  The situation is not different than it
was before, except that we can now actually point to a specific bug that
did exist, whereas the original concern was just an unfocused one that
the code path hadn't been adequately exercised.  That concern is now
even more pressing than it was.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

Andrew Dunstan wrote:


 Andreas Pflug wrote:

 Since I have a stuck backend without client again, I'll have to kill
 -SIGTERM a backend. Fortunately, I do have console access to that
 machine and it's not win32 but a decent OS.
  


 You do know that on Windows you can use pg_ctl to send a pseudo
 SIGTERM to a backend, don't you?
The main issue still is that console access id required, on any OS.

Regards,
Andreas


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] pg_terminate_backend

Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
   
 utils/adt/misc.c says:
 //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
 Datum
 *pg_terminate_backend*(PG_FUNCTION_ARGS)
 

   
 Well, AFAIR there were no more issues raised about code paths that don't 
 clean up correctly, so can we please
 remove that comment and make the function live finally? 
 

 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.
   

If the backend's stuck, I'll have to SIGTERM it, whether there's
pg_terminate_backend or not. Ultimately, if resources should remain
locked, there's no chance except restarting the whole server anyway.
SIGTERM gives me a fair chance (90%) that it will work without restart.

The persistent refusal of supporting the function makes it more painful
to execute, but not less necessary.

Regards,
Andreas


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Bruce Momjian

Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
  utils/adt/misc.c says:
  //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
  Datum
  *pg_terminate_backend*(PG_FUNCTION_ARGS)
 
  Well, AFAIR there were no more issues raised about code paths that don't 
  clean up correctly, so can we please
  remove that comment and make the function live finally? 
 
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.

I am not sure how you prove the non-existance of a bug.  Ideas?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] pg_terminate_backend

Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.

 I am not sure how you prove the non-existance of a bug.  Ideas?

What I'm looking for is some concentrated testing.  The fact that some
people once in a while SIGTERM a backend doesn't give me any confidence
in it.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] pg_terminate_backend

 What I'm looking for is some concentrated testing.  The fact that some
 people once in a while SIGTERM a backend doesn't give me any confidence
 in it.

Now wait a minute, is there some risk of lockup if I kill a backend ?
Cause I do that relatively often (say 20 times a day, when some web
users time out but their query keeps running). Should I rather not do it
?

Thanks,
Csaba.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] pg_terminate_backend

Csaba Nagy [EMAIL PROTECTED] writes:
 Now wait a minute, is there some risk of lockup if I kill a backend ?
 Cause I do that relatively often (say 20 times a day, when some web
 users time out but their query keeps running). Should I rather not do it
 ?

statement_timeout is your friend.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] pg_terminate_backend

You didn't answer the original question: is killing SIGTERM a backend
known/suspected to be dangerous ? And if yes, what's the risk (pointers
to discussions would be nice too).

 statement_timeout is your friend.

I know, but unfortunately I can't use it. I did try to use
statement_timeout and it worked out quite bad (due to our usage
scenario).

Some of the web requests which time out on the web should still go
through... and we have activities which should not observe statement
timeout at all, i.e. they must finish however long that takes.

I know it would be possible to use a different user with it's own
statement timeout for those requests, but that means we have to rewrite
a lot of code which is not possible immediately, and our admins would
resist to add even more configuration (additional users=additional
connection pool+caches and all to be configured). We also can fix the
queries so no timeout happens in the first place, but that will take us
even more time.

Cheers,
Csaba.



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] pg_terminate_backend

Andreas Pflug [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.

 If the backend's stuck, I'll have to SIGTERM it, whether there's
 pg_terminate_backend or not.

Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
is necessary or appropriate.  It seems to me the above is assuming the
existence of unknown backend bugs, exactly the same thing you think
I shouldn't be assuming ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] pg_terminate_backend

On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
 You didn't answer the original question: is killing SIGTERM a backend
  ^^^
Nevermind, I don't do that. I do 'kill backend_pid' without specifying
the signal, and I'm sufficiently unfamiliar with the unix signal names
to have confused them. Is a plain kill still dangerous ?

Thanks,
Csaba.




---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] pg_terminate_backend

 Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
 is necessary or appropriate.  It seems to me the above is assuming the
 existence of unknown backend bugs, exactly the same thing you think
 I shouldn't be assuming ...

I do know a case where a plain kill will seem to be stucked: on vacuum
of a big table. I guess when it starts an index's cleanup scan it will
insist to finish it before stopping. I'm not sure if that's the cause,
but I have seen delays of 30 minutes for killing a vacuum... it's true
that finally it always did die... but it's also true that I have 'kill
-9'-ed it before because I thought it's stucked.

Cheers,
Csaba.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] pg_terminate_backend

Csaba Nagy [EMAIL PROTECTED] writes:
 I do know a case where a plain kill will seem to be stucked: on vacuum
 of a big table. I guess when it starts an index's cleanup scan it will
 insist to finish it before stopping.

We've fixed a few cases of missing CHECK_FOR_INTERRUPTS lately, and will
fix more if you can point them out.  Note though that SIGTERM is just as
vulnerable to that as SIGINT.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] pg_terminate_backend

Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
   
 Tom Lane wrote:
 
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.
   

   
 If the backend's stuck, I'll have to SIGTERM it, whether there's
 pg_terminate_backend or not.
 

 Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
 is necessary or appropriate. 
Last night, I had a long-running query I launched from pgAdmin. It was
happily running and completing on the server (took about 2 hours), and
the backend went back to IDLE. pgAdmin didn't get back a response,
assuming the query was still running. Apparently, the VPN router had
interrupted the connection silently without notifying either side of the
tcp connection. Since the backend is IDLE, there's no query to cancel
and SIGINT won't help. So Stuck for me means a backend *not*
responding to SIGINT.
BTW, there's another scenario where SIGINT won't help. Imagine an app
running wild hammering the server with queries regardless of query
cancels (maybe some retry mechanism). You'd like to interrupt that
connection, i.e. get rid of the backend.

Regards,
Andreas


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] pg_terminate_backend

Csaba Nagy wrote:
 On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
   
 You didn't answer the original question: is killing SIGTERM a backend
 
   ^^^
 Nevermind, I don't do that. I do 'kill backend_pid' without specifying
 the signal, and I'm sufficiently unfamiliar with the unix signal names
 to have confused them. Is a plain kill still dangerous ?
   
SIGTERM is the default kill parameter, so you do exactly what I'm
talking about.

Regards,
Andreas


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

Csaba Nagy [EMAIL PROTECTED] writes:
 On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
 You didn't answer the original question: is killing SIGTERM a backend
   ^^^
 Nevermind, I don't do that. I do 'kill backend_pid' without specifying
 the signal,

man kill says the default is SIGTERM.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] pg_terminate_backend

Bruce Momjian wrote:


 I am not sure how you prove the non-existance of a bug.  Ideas?
   
Would be worth at least the Nobel prize :-)

Regards,
Andreas



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

 man kill says the default is SIGTERM.

OK, so that means I do use it... is it known to be dangerous ? I thought
till now that it is safe to use. What about select pg_cancel_backend()
?

Thanks,
Csaba.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] pg_terminate_backend

Csaba Nagy wrote:
 man kill says the default is SIGTERM.
 

 OK, so that means I do use it... is it known to be dangerous ? I thought
 till now that it is safe to use. 
Apparently you never suffered any problems from that; neither did I.

 What about select pg_cancel_backend()
   

That's the function wrapper around kill -SIGINT, which is probably the
way you could safely stop your queries most of the time.


Regards,
Andreas


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread [EMAIL PROTECTED]








I am not sure how you prove the non-existance of a bug.  Ideas?





I do that by deleting all of my code (usually by accident :-)

No code, no bugs!

 -- Korry

Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Bruce Momjian

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  No, you have that backwards.  The burden of proof is on those who want
  it to show that it's now safe.  The situation is not different than it
  was before, except that we can now actually point to a specific bug that
  did exist, whereas the original concern was just an unfocused one that
  the code path hadn't been adequately exercised.  That concern is now
  even more pressing than it was.
 
  I am not sure how you prove the non-existance of a bug.  Ideas?
 
 What I'm looking for is some concentrated testing.  The fact that some
 people once in a while SIGTERM a backend doesn't give me any confidence
 in it.

OK, here is an opportunity for someone to run tests to get this into
8.2.  The code already exists in CVS, but we need testing to enable it.
I would think running a huge workload and killing it over and over again
would be a good test.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] pg_terminate_backend