Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-27 Thread Yugo Nagata
On Fri, 23 Jun 2017 19:43:35 -0400
Stephen Frost  wrote:

> Alvaro, all,
> 
> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > Yugo Nagata wrote:
> > 
> > > I tried to make it. Patch attached. 
> > > 
> > > It is easy and simple. Although I haven't tried for autovacuum worker,
> > > I confirmed I can change other process' parameters without affecting 
> > > others.
> > > Do you want this in PG?
> > 
> > One advantage of this gadget is that you can signal backends that you
> > own without being superuser, so +1 for the general idea.  (Please do
> > create a fresh thread so that this can be appropriately linked to in
> > commitfest app, though.)
> 
> Well, that wouldn't quite work with the proposed patch because the
> proposed patch REVOKE's execute from public...

Yes. It is intentional to revoke execute from public because we need
to change configuration before signaling other backend and it needs
superuser privilege.
 
> I'm trying to figure out how it's actually useful to be able to signal a
> backend that you own about a config change that you *aren't* able to
> make without being a superuser..  Now, if you could tell the other
> backend to use a new value for some USERSET GUC, then *that* would be
> useful and interesting.
> 
> I haven't got any particularly great ideas for how to make that happen
> though.
> 
> > You need a much bigger patch for the autovacuum worker.  The use case I
> > had in mind is that you have a maintenance window and can run fast
> > vacuuming during it, but if the table is large and vacuum can't finish
> > within that window then you want vacuum to slow down, without stopping
> > it completely.  But implementing this requires juggling the
> > stdVacuumCostDelay/Limit values during the reload, which are currently
> > read at start of vacuuming only; the working values are overwritten from
> > those during a rebalance.
> 
> Being able to change those values during a vacuuming run would certainly
> be useful, I've had cases where I would have liked to have been able to
> do just exactly that.  That's largely independent of this though.
> 
> Thanks!
> 
> Stephen


-- 
Yugo Nagata 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-27 Thread Yugo Nagata
On Sat, 24 Jun 2017 08:09:52 +0900
Michael Paquier  wrote:

> On Sat, Jun 24, 2017 at 5:07 AM, Alvaro Herrera
>  wrote:
> > Yugo Nagata wrote:
> >
> >> I tried to make it. Patch attached.
> >>
> >> It is easy and simple. Although I haven't tried for autovacuum worker,
> >> I confirmed I can change other process' parameters without affecting 
> >> others.
> >> Do you want this in PG?
> 
> Just browsing the patch...
> 
> +if (r == SIGNAL_BACKEND_NOSUPERUSER)
> +ereport(WARNING,
> +(errmsg("must be a superuser to terminate superuser
> process")));
> +
> +if (r == SIGNAL_BACKEND_NOPERMISSION)
> +ereport(WARNING,
> + (errmsg("must be a member of the role whose process
> is being terminated or member of pg_signal_backend")));
> Both messages are incorrect. This is not trying to terminate a process.

I'll fix this.

> 
> +Datum
> +pg_reload_conf_pid(PG_FUNCTION_ARGS)
> I think that the naming is closer to pg_reload_backend.
> 
> Documentation is needed as well.

Agree with pg_reload_backend and I'll write the documetation.

> 
> > One advantage of this gadget is that you can signal backends that you
> > own without being superuser, so +1 for the general idea.  (Please do
> > create a fresh thread so that this can be appropriately linked to in
> > commitfest app, though.)
> 
> That would be nice.

Sure. I'll create a new thread and attach the new patch to it.

> 
> > You need a much bigger patch for the autovacuum worker.  The use case I
> > had in mind is that you have a maintenance window and can run fast
> > vacuuming during it, but if the table is large and vacuum can't finish
> > within that window then you want vacuum to slow down, without stopping
> > it completely.  But implementing this requires juggling the
> > stdVacuumCostDelay/Limit values during the reload, which are currently
> > read at start of vacuuming only; the working values are overwritten from
> > those during a rebalance.
> 
> Yeah, that's independent from the patch above.

Agree. It would be another feature from pg_reload_backend and
I think it would be implmented as another patch.



> -- 
> Michael
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Yugo Nagata 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-23 Thread Stephen Frost
Alvaro, all,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Yugo Nagata wrote:
> 
> > I tried to make it. Patch attached. 
> > 
> > It is easy and simple. Although I haven't tried for autovacuum worker,
> > I confirmed I can change other process' parameters without affecting others.
> > Do you want this in PG?
> 
> One advantage of this gadget is that you can signal backends that you
> own without being superuser, so +1 for the general idea.  (Please do
> create a fresh thread so that this can be appropriately linked to in
> commitfest app, though.)

Well, that wouldn't quite work with the proposed patch because the
proposed patch REVOKE's execute from public...

I'm trying to figure out how it's actually useful to be able to signal a
backend that you own about a config change that you *aren't* able to
make without being a superuser..  Now, if you could tell the other
backend to use a new value for some USERSET GUC, then *that* would be
useful and interesting.

I haven't got any particularly great ideas for how to make that happen
though.

> You need a much bigger patch for the autovacuum worker.  The use case I
> had in mind is that you have a maintenance window and can run fast
> vacuuming during it, but if the table is large and vacuum can't finish
> within that window then you want vacuum to slow down, without stopping
> it completely.  But implementing this requires juggling the
> stdVacuumCostDelay/Limit values during the reload, which are currently
> read at start of vacuuming only; the working values are overwritten from
> those during a rebalance.

Being able to change those values during a vacuuming run would certainly
be useful, I've had cases where I would have liked to have been able to
do just exactly that.  That's largely independent of this though.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-23 Thread Michael Paquier
On Sat, Jun 24, 2017 at 5:07 AM, Alvaro Herrera
 wrote:
> Yugo Nagata wrote:
>
>> I tried to make it. Patch attached.
>>
>> It is easy and simple. Although I haven't tried for autovacuum worker,
>> I confirmed I can change other process' parameters without affecting others.
>> Do you want this in PG?

Just browsing the patch...

+if (r == SIGNAL_BACKEND_NOSUPERUSER)
+ereport(WARNING,
+(errmsg("must be a superuser to terminate superuser
process")));
+
+if (r == SIGNAL_BACKEND_NOPERMISSION)
+ereport(WARNING,
+ (errmsg("must be a member of the role whose process
is being terminated or member of pg_signal_backend")));
Both messages are incorrect. This is not trying to terminate a process.

+Datum
+pg_reload_conf_pid(PG_FUNCTION_ARGS)
I think that the naming is closer to pg_reload_backend.

Documentation is needed as well.

> One advantage of this gadget is that you can signal backends that you
> own without being superuser, so +1 for the general idea.  (Please do
> create a fresh thread so that this can be appropriately linked to in
> commitfest app, though.)

That would be nice.

> You need a much bigger patch for the autovacuum worker.  The use case I
> had in mind is that you have a maintenance window and can run fast
> vacuuming during it, but if the table is large and vacuum can't finish
> within that window then you want vacuum to slow down, without stopping
> it completely.  But implementing this requires juggling the
> stdVacuumCostDelay/Limit values during the reload, which are currently
> read at start of vacuuming only; the working values are overwritten from
> those during a rebalance.

Yeah, that's independent from the patch above.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-23 Thread Alvaro Herrera
Yugo Nagata wrote:

> I tried to make it. Patch attached. 
> 
> It is easy and simple. Although I haven't tried for autovacuum worker,
> I confirmed I can change other process' parameters without affecting others.
> Do you want this in PG?

One advantage of this gadget is that you can signal backends that you
own without being superuser, so +1 for the general idea.  (Please do
create a fresh thread so that this can be appropriately linked to in
commitfest app, though.)

You need a much bigger patch for the autovacuum worker.  The use case I
had in mind is that you have a maintenance window and can run fast
vacuuming during it, but if the table is large and vacuum can't finish
within that window then you want vacuum to slow down, without stopping
it completely.  But implementing this requires juggling the
stdVacuumCostDelay/Limit values during the reload, which are currently
read at start of vacuuming only; the working values are overwritten from
those during a rebalance.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-23 Thread Yugo Nagata
On Thu, 22 Jun 2017 14:08:30 +0900
Michael Paquier  wrote:

> On Thu, Jun 22, 2017 at 1:52 PM, Yugo Nagata  wrote:
> > On Thu, 22 Jun 2017 12:05:19 +0900
> > Michael Paquier  wrote:
> >> signal-able). Different thought, but I'd love to see a SQL function
> >> that allows triggering SIGHUP on a specific process, like an
> >> autovacuum worker to change its cost parameters. There is
> >> pg_reload_conf() to do so but that's system-wide.
> >
> > For example, is that like this?
> >
> > =# alter system set autovacuum_vacuum_cost_delay to 10;
> > =# select pg_reload_conf();
> > =# alter system reset autovacuum_vacuum_cost_delay;
> 
> No need to reset the parameter afterwards as this makes it sensible to
> updates afterwards, but you have the idea. Note that this is rather
> recent, as autovacuum listens to SIGHUP only since a75fb9b3.

I tried to make it. Patch attached. 

It is easy and simple. Although I haven't tried for autovacuum worker,
I confirmed I can change other process' parameters without affecting others.
Do you want this in PG?


[session A (PID:18375)]
=# show autovacuum_vacuum_cost_delay;
 autovacuum_vacuum_cost_delay 
--
 20ms
(1 row)


[session B]
postgres=# alter system set autovacuum_vacuum_cost_delay to 10;
ALTER SYSTEM
postgres=# select pg_reload_conf(18375);
 pg_reload_conf 

 t
(1 row)

postgres=# show autovacuum_vacuum_cost_delay;
 autovacuum_vacuum_cost_delay 
--
 20ms
(1 row)


[session A again]
postgres=# show autovacuum_vacuum_cost_delay;
 autovacuum_vacuum_cost_delay 
--
 10ms
(1 row)



> -- 
> Michael


-- 
Yugo Nagata 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0fdad0c..d79faf7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1128,6 +1128,7 @@ REVOKE EXECUTE ON FUNCTION pg_wal_replay_pause() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_wal_replay_resume() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_rotate_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_reload_conf() FROM public;
+REVOKE EXECUTE ON FUNCTION pg_reload_conf(int) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_current_logfile() FROM public;
 REVOKE EXECUTE ON FUNCTION pg_current_logfile(text) FROM public;
 
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 62341b8..cc173ba 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -339,6 +339,36 @@ pg_reload_conf(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(true);
 }
 
+/*
+ * Signal to reload the database configuration for a specified backend
+ *
+ * Permission checking for this function is managed through the normal
+ * GRANT system.
+ */
+Datum
+pg_reload_conf_pid(PG_FUNCTION_ARGS)
+{
+	int			pid = PG_GETARG_INT32(0);
+	int			r = pg_signal_backend(pid, SIGHUP);
+
+	if (r == SIGNAL_BACKEND_NOSUPERUSER)
+		ereport(WARNING,
+(errmsg("must be a superuser to terminate superuser process")));
+
+	if (r == SIGNAL_BACKEND_NOPERMISSION)
+		ereport(WARNING,
+ (errmsg("must be a member of the role whose process is being terminated or member of pg_signal_backend")));
+
+	if (r)
+	{
+		ereport(WARNING,
+(errmsg("failed to send signal to backend: %d", pid)));
+		PG_RETURN_BOOL(false);
+	}
+
+	PG_RETURN_BOOL(true);
+}
+
 
 /*
  * Rotate log file
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 1191b4a..7258f15 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3255,6 +3255,8 @@ DESCR("true if wal replay is paused");
 
 DATA(insert OID = 2621 ( pg_reload_conf			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ ));
 DESCR("reload configuration files");
+DATA(insert OID = 3409 ( pg_reload_conf			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 16 "23" _null_ _null_ _null_ _null_ _null_ pg_reload_conf_pid _null_ _null_ _null_ ));
+DESCR("reload configuration files");
 DATA(insert OID = 2622 ( pg_rotate_logfile		PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ));
 DESCR("rotate log file");
 DATA(insert OID = 3800 ( pg_current_logfilePGNSP PGUID 12 1 0 0 0 f f f f f f v s 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_current_logfile _null_ _null_ _null_ ));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Michael Paquier
On Thu, Jun 22, 2017 at 1:52 PM, Yugo Nagata  wrote:
> On Thu, 22 Jun 2017 12:05:19 +0900
> Michael Paquier  wrote:
>> signal-able). Different thought, but I'd love to see a SQL function
>> that allows triggering SIGHUP on a specific process, like an
>> autovacuum worker to change its cost parameters. There is
>> pg_reload_conf() to do so but that's system-wide.
>
> For example, is that like this?
>
> =# alter system set autovacuum_vacuum_cost_delay to 10;
> =# select pg_reload_conf();
> =# alter system reset autovacuum_vacuum_cost_delay;

No need to reset the parameter afterwards as this makes it sensible to
updates afterwards, but you have the idea. Note that this is rather
recent, as autovacuum listens to SIGHUP only since a75fb9b3.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Yugo Nagata
On Thu, 22 Jun 2017 12:05:19 +0900
Michael Paquier  wrote:

> On Thu, Jun 22, 2017 at 11:52 AM, Andres Freund  wrote:
> > On 2017-06-22 11:49:47 +0900, Yugo Nagata wrote:
> >> I agree that we can kill theses processes by the OS command. However,
> >> It seems to me that pg_{cancel,terminate}_backend don't need to be able to
> >> kill processes except for client backends because we can do same thing by
> >> the OS command if necessary, and acutually these functions cannot kill
> >> most other processes, for example, background writer. Are the autovacuum
> >> launcher and background worker special for these functions?
> >
> > I strongly disagree with this - I think it's quite useful to be able to
> > kill things via SQL that can hold lock on database objects.   I'm not
> > seeing which problem would be solved by prohibiting this?
> 
> +1. Controlling them as of now is useful, particularly now that all
> processes show wait events, even auxiliary ones (though not

I got it. I agree that the current behaviour and it isn't worth changint it.
In my understand, processes that the functions can kill (client backends,
background workers, autovacuum processes) are ones that can hold lock
on database objects.

> signal-able). Different thought, but I'd love to see a SQL function
> that allows triggering SIGHUP on a specific process, like an
> autovacuum worker to change its cost parameters. There is
> pg_reload_conf() to do so but that's system-wide.

For example, is that like this?

=# alter system set autovacuum_vacuum_cost_delay to 10;
=# select pg_reload_conf();
=# alter system reset autovacuum_vacuum_cost_delay;

> -- 
> Michael


-- 
Yugo Nagata 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Michael Paquier
On Thu, Jun 22, 2017 at 11:52 AM, Andres Freund  wrote:
> On 2017-06-22 11:49:47 +0900, Yugo Nagata wrote:
>> I agree that we can kill theses processes by the OS command. However,
>> It seems to me that pg_{cancel,terminate}_backend don't need to be able to
>> kill processes except for client backends because we can do same thing by
>> the OS command if necessary, and acutually these functions cannot kill
>> most other processes, for example, background writer. Are the autovacuum
>> launcher and background worker special for these functions?
>
> I strongly disagree with this - I think it's quite useful to be able to
> kill things via SQL that can hold lock on database objects.   I'm not
> seeing which problem would be solved by prohibiting this?

+1. Controlling them as of now is useful, particularly now that all
processes show wait events, even auxiliary ones (though not
signal-able). Different thought, but I'd love to see a SQL function
that allows triggering SIGHUP on a specific process, like an
autovacuum worker to change its cost parameters. There is
pg_reload_conf() to do so but that's system-wide.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Andres Freund
On 2017-06-22 11:49:47 +0900, Yugo Nagata wrote:
> On Wed, 21 Jun 2017 11:04:34 -0400
> Robert Haas  wrote:
> 
> > On Wed, Jun 21, 2017 at 7:56 AM, Yugo Nagata  wrote:
> > > I have found that we can cancel/terminate autovacuum launchers and
> > > background worker processes by pg_cancel/terminate_backend function.
> > > I'm wondering this behavior is not expected and if not I want to fix it.
> > 
> > I think it is expected.  Even if we blocked it, those processes have
> > to cope gracefully with SIGTERM, because anyone with access to the OS
> > user can kill them that way by hand.
> 
> I agree that we can kill theses processes by the OS command. However,
> It seems to me that pg_{cancel,terminate}_backend don't need to be able to
> kill processes except for client backends because we can do same thing by
> the OS command if necessary, and acutually these functions cannot kill
> most other processes, for example, background writer. Are the autovacuum
> launcher and background worker special for these functions?

I strongly disagree with this - I think it's quite useful to be able to
kill things via SQL that can hold lock on database objects.   I'm not
seeing which problem would be solved by prohibiting this?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Yugo Nagata
On Wed, 21 Jun 2017 11:04:34 -0400
Robert Haas  wrote:

> On Wed, Jun 21, 2017 at 7:56 AM, Yugo Nagata  wrote:
> > I have found that we can cancel/terminate autovacuum launchers and
> > background worker processes by pg_cancel/terminate_backend function.
> > I'm wondering this behavior is not expected and if not I want to fix it.
> 
> I think it is expected.  Even if we blocked it, those processes have
> to cope gracefully with SIGTERM, because anyone with access to the OS
> user can kill them that way by hand.

I agree that we can kill theses processes by the OS command. However,
It seems to me that pg_{cancel,terminate}_backend don't need to be able to
kill processes except for client backends because we can do same thing by
the OS command if necessary, and acutually these functions cannot kill
most other processes, for example, background writer. Are the autovacuum
launcher and background worker special for these functions?

> 
> > However, we can terminate background workers by pg_terminate_backend.
> > In the following example, I terminated the logical replication launcher,
> > and this process did not appear again[1].
> >
> > postgres=# select pg_terminate_backend(30902);
> >  pg_terminate_backend
> > --
> >  t
> > (1 row)
> 
> That seems to be a bug in logical replication.
> 
> > Similarly, we can terminate autovacuum launcher by pg_terminate_backend,
> > but a new process is restarted by postmaster in this case.[2]
> >
> > postgres=# select pg_terminate_backend(30900);
> >  pg_terminate_backend
> > --
> >  t
> > (1 row)
> 
> That is as I would expect.
> 
> > [2]
> > On the other hand, when we use pg_cancel_backend for autovacuum launcher,
> > it causes the following error. I'll report the detail in another thread.
> >
> >  ERROR:  can't attach the same segment more than once
> 
> I think that's a bug.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Yugo Nagata 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Alvaro Herrera
Yugo Nagata wrote:

> However, we can terminate background workers by pg_terminate_backend.
> In the following example, I terminated the logical replication launcher,
> and this process did not appear again[1]. 
> 
> postgres=# select pg_terminate_backend(30902);
>  pg_terminate_backend 
> --
>  t
> (1 row)

I think failing to restart the replication launcher after it stops is
probably a bug, and should be fixed by having postmaster start another
one if it dies.

> Similarly, we can terminate autovacuum launcher by pg_terminate_backend,
> but a new process is restarted by postmaster in this case.[2]

Yeah, this is operating as intended.  You can turn autovacuum off and
the launcher should go away, and turn it back on and launcher should
start.  So we expect the autovac launcher to respond to signals.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Robert Haas
On Wed, Jun 21, 2017 at 7:56 AM, Yugo Nagata  wrote:
> I have found that we can cancel/terminate autovacuum launchers and
> background worker processes by pg_cancel/terminate_backend function.
> I'm wondering this behavior is not expected and if not I want to fix it.

I think it is expected.  Even if we blocked it, those processes have
to cope gracefully with SIGTERM, because anyone with access to the OS
user can kill them that way by hand.

> However, we can terminate background workers by pg_terminate_backend.
> In the following example, I terminated the logical replication launcher,
> and this process did not appear again[1].
>
> postgres=# select pg_terminate_backend(30902);
>  pg_terminate_backend
> --
>  t
> (1 row)

That seems to be a bug in logical replication.

> Similarly, we can terminate autovacuum launcher by pg_terminate_backend,
> but a new process is restarted by postmaster in this case.[2]
>
> postgres=# select pg_terminate_backend(30900);
>  pg_terminate_backend
> --
>  t
> (1 row)

That is as I would expect.

> [2]
> On the other hand, when we use pg_cancel_backend for autovacuum launcher,
> it causes the following error. I'll report the detail in another thread.
>
>  ERROR:  can't attach the same segment more than once

I think that's a bug.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-21 Thread Yugo Nagata
Hi,

I have found that we can cancel/terminate autovacuum launchers and
background worker processes by pg_cancel/terminate_backend function.
I'm wondering this behavior is not expected and if not I want to fix it.


The current pg_stat_activity shows background workers and autovacuum
lancher as below. It made me come up with the question.

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  | wait_event  |backend_type 
---+-+-
 30902 | LogicalLauncherMain | background worker
 30900 | AutoVacuumMain  | autovacuum launcher
 30923 | | client backend
 30898 | BgWriterMain| background writer
 30897 | CheckpointerMain| checkpointer
 30899 | WalWriterMain   | walwriter
(6 rows)

We cannot use pg_terminate/cancel_backend for most processes
except client backends. For example, when I tried to terminate
the background writer, I got a warning and failed.

postgres=# select pg_terminate_backend(30899);
WARNING:  PID 30899 is not a PostgreSQL server process
 pg_terminate_backend 
--
 f
(1 row)

However, we can terminate background workers by pg_terminate_backend.
In the following example, I terminated the logical replication launcher,
and this process did not appear again[1]. 

postgres=# select pg_terminate_backend(30902);
 pg_terminate_backend 
--
 t
(1 row)

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  |wait_event |backend_type 
---+---+-
 30900 | AutoVacuumMain| autovacuum launcher
 30923 |   | client backend
 30898 | BgWriterHibernate | background writer
 30897 | CheckpointerMain  | checkpointer
 30899 | WalWriterMain | walwriter
(5 rows)

Similarly, we can terminate autovacuum launcher by pg_terminate_backend,
but a new process is restarted by postmaster in this case.[2]

postgres=# select pg_terminate_backend(30900);
 pg_terminate_backend 
--
 t
(1 row)

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  |wait_event |backend_type 
---+---+-
 32483 | AutoVacuumMain| autovacuum launcher
 30923 |   | client backend
 30898 | BgWriterHibernate | background writer
 30897 | CheckpointerMain  | checkpointer
 30899 | WalWriterMain | walwriter
(5 rows)


My question is whether the behavior of pg_terminate/cancel_backend is
expected. If these functions should succeed only for client backends,
we need to fix the behavior. Attached is a patch to fix it in that case.

In my patch, process type is checked in pg_signal_backend(), and if it is
background worker or autovacuum launcher then throw a warning and fail. 
BackendPidGetProc() returns valid PGPROC for proccesses that are initialized
by PostgresInit(), and, in my understand, all such proccess are client
backends, background workers, and autovacuum launcher. So, if this is
neither background woker nor autovacuum launcher, this should be
a normal client backend. For this check, I added a new field,
isAutoVacuumLauncher, to PGPROC.

Any comments would be appreciated.

-
[1]
AFAIK, we have to restart the server to enable logical replication after this.
I'm not sure this is expected, but I found the following comment in
ProcessInterrupts(). Does "can be stopped at any time" mean that we can
drop this process completely?

2852 else if (IsLogicalLauncher())
2853 {
2854 ereport(DEBUG1,
2855 (errmsg("logical replication launcher shutting 
down")));
2856 
2857 /* The logical replication launcher can be stopped at any 
time. */
2858 proc_exit(0);
2859 }

When the logical replication launcher receive SIGTERM, this exits with 
exitstatus 0,
so this is not restarted by the postmaster.

[2]
On the other hand, when we use pg_cancel_backend for autovacuum launcher,
it causes the following error. I'll report the detail in another thread.

 ERROR:  can't attach the same segment more than once

-

Regards,

-- 
Yugo Nagata 


pg_terminate_backend.pach
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-06-27 Thread Magnus Hagander
On Tue, Jun 26, 2012 at 10:58 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 29, 2012 at 3:04 AM, Daniel Farina dan...@heroku.com wrote:
 On Mon, Mar 26, 2012 at 12:14 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote:
 On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
  Parallel to pg_cancel_backend, it'd be nice to allow the user to just
  outright kill a backend that they own (politely, with a SIGTERM),
  aborting any transactions in progress, including the idle transaction,
  and closing the socket.
 
  +1

 Here's a patch implementing the simple version, with no more guards
 against signal racing than have been seen previously.  The more
 elaborate variants to close those races is being discussed in a
 parallel thread, but I thought I'd get this simple version out there.

 Review:

 After reading through the threads, it looks like there was no real
 objection to this approach -- pid recycling is not something we're
 concerned about.

 I think you're missing a doc update though, in func.sgml:

 For the less restrictive functionpg_cancel_backend/, the role of an
 active backend can be found from
 the structfieldusename/structfield column of the
 structnamepg_stat_activity/structname view.

 Also, high-availability.sgml makes reference to pg_cancel_backend(), and
 it might be worthwhile to add an ...and pg_terminate_backend() there.

 Other than that, it looks good to me.

 Good comments. Patch attached to address the doc issues.  The only
 iffy thing is that the paragraph For the less restrictive... I have
 opted to remove in its entirely.  I think the documents are already
 pretty clear about the same-user rule, and I'm not sure if this is the
 right place for a crash-course on attributes in pg_stat_activity (but
 maybe it is).

 ...and pg_terminate_backend seems exactly right.

 And I think now that the system post-patch doesn't have such a strange
 contrast between the ability to send SIGINT vs. SIGTERM, such a
 contrast may not be necessary.

 I'm also not sure what the policy is about filling paragraphs in the
 manual.  I filled one, which increases the sgml churn a bit. git
 (show|diff) --word-diff helps clean it up.

 I went ahead and committed this.

 I kinda think we should back-patch this into 9.2.  It doesn't involve
 a catalog change, and would make the behavior consistent between the
 two releases, instead of changing in 9.1 and then changing again in
 9.2.  Thoughts?

+1.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-06-27 Thread Robert Haas
On Wed, Jun 27, 2012 at 8:13 AM, Magnus Hagander mag...@hagander.net wrote:
 I went ahead and committed this.

 I kinda think we should back-patch this into 9.2.  It doesn't involve
 a catalog change, and would make the behavior consistent between the
 two releases, instead of changing in 9.1 and then changing again in
 9.2.  Thoughts?

 +1.

Three votes with no objections is good enough for me, so, done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-06-26 Thread Robert Haas
On Thu, Mar 29, 2012 at 3:04 AM, Daniel Farina dan...@heroku.com wrote:
 On Mon, Mar 26, 2012 at 12:14 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote:
 On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
  Parallel to pg_cancel_backend, it'd be nice to allow the user to just
  outright kill a backend that they own (politely, with a SIGTERM),
  aborting any transactions in progress, including the idle transaction,
  and closing the socket.
 
  +1

 Here's a patch implementing the simple version, with no more guards
 against signal racing than have been seen previously.  The more
 elaborate variants to close those races is being discussed in a
 parallel thread, but I thought I'd get this simple version out there.

 Review:

 After reading through the threads, it looks like there was no real
 objection to this approach -- pid recycling is not something we're
 concerned about.

 I think you're missing a doc update though, in func.sgml:

 For the less restrictive functionpg_cancel_backend/, the role of an
 active backend can be found from
 the structfieldusename/structfield column of the
 structnamepg_stat_activity/structname view.

 Also, high-availability.sgml makes reference to pg_cancel_backend(), and
 it might be worthwhile to add an ...and pg_terminate_backend() there.

 Other than that, it looks good to me.

 Good comments. Patch attached to address the doc issues.  The only
 iffy thing is that the paragraph For the less restrictive... I have
 opted to remove in its entirely.  I think the documents are already
 pretty clear about the same-user rule, and I'm not sure if this is the
 right place for a crash-course on attributes in pg_stat_activity (but
 maybe it is).

 ...and pg_terminate_backend seems exactly right.

 And I think now that the system post-patch doesn't have such a strange
 contrast between the ability to send SIGINT vs. SIGTERM, such a
 contrast may not be necessary.

 I'm also not sure what the policy is about filling paragraphs in the
 manual.  I filled one, which increases the sgml churn a bit. git
 (show|diff) --word-diff helps clean it up.

I went ahead and committed this.

I kinda think we should back-patch this into 9.2.  It doesn't involve
a catalog change, and would make the behavior consistent between the
two releases, instead of changing in 9.1 and then changing again in
9.2.  Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-06-26 Thread Daniel Farina
On Tue, Jun 26, 2012 at 1:58 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Mar 29, 2012 at 3:04 AM, Daniel Farina dan...@heroku.com wrote:
 On Mon, Mar 26, 2012 at 12:14 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote:
 On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
  Parallel to pg_cancel_backend, it'd be nice to allow the user to just
  outright kill a backend that they own (politely, with a SIGTERM),
  aborting any transactions in progress, including the idle transaction,
  and closing the socket.
 
  +1

 Here's a patch implementing the simple version, with no more guards
 against signal racing than have been seen previously.  The more
 elaborate variants to close those races is being discussed in a
 parallel thread, but I thought I'd get this simple version out there.

 Review:

 After reading through the threads, it looks like there was no real
 objection to this approach -- pid recycling is not something we're
 concerned about.

 I think you're missing a doc update though, in func.sgml:

 For the less restrictive functionpg_cancel_backend/, the role of an
 active backend can be found from
 the structfieldusename/structfield column of the
 structnamepg_stat_activity/structname view.

 Also, high-availability.sgml makes reference to pg_cancel_backend(), and
 it might be worthwhile to add an ...and pg_terminate_backend() there.

 Other than that, it looks good to me.

 Good comments. Patch attached to address the doc issues.  The only
 iffy thing is that the paragraph For the less restrictive... I have
 opted to remove in its entirely.  I think the documents are already
 pretty clear about the same-user rule, and I'm not sure if this is the
 right place for a crash-course on attributes in pg_stat_activity (but
 maybe it is).

 ...and pg_terminate_backend seems exactly right.

 And I think now that the system post-patch doesn't have such a strange
 contrast between the ability to send SIGINT vs. SIGTERM, such a
 contrast may not be necessary.

 I'm also not sure what the policy is about filling paragraphs in the
 manual.  I filled one, which increases the sgml churn a bit. git
 (show|diff) --word-diff helps clean it up.

 I went ahead and committed this.

 I kinda think we should back-patch this into 9.2.  It doesn't involve
 a catalog change, and would make the behavior consistent between the
 two releases, instead of changing in 9.1 and then changing again in
 9.2.  Thoughts?

I think that would be good.

It is at the level of pain whereas I would backpatch from day one, but
I think it would be a welcome change to people who couldn't justify
gnashing their teeth and distributing a tweaked Postgres like I would.
 It saves me some effort, too.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-29 Thread Daniel Farina
On Mon, Mar 26, 2012 at 12:14 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote:
 On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
  Parallel to pg_cancel_backend, it'd be nice to allow the user to just
  outright kill a backend that they own (politely, with a SIGTERM),
  aborting any transactions in progress, including the idle transaction,
  and closing the socket.
 
  +1

 Here's a patch implementing the simple version, with no more guards
 against signal racing than have been seen previously.  The more
 elaborate variants to close those races is being discussed in a
 parallel thread, but I thought I'd get this simple version out there.

 Review:

 After reading through the threads, it looks like there was no real
 objection to this approach -- pid recycling is not something we're
 concerned about.

 I think you're missing a doc update though, in func.sgml:

 For the less restrictive functionpg_cancel_backend/, the role of an
 active backend can be found from
 the structfieldusename/structfield column of the
 structnamepg_stat_activity/structname view.

 Also, high-availability.sgml makes reference to pg_cancel_backend(), and
 it might be worthwhile to add an ...and pg_terminate_backend() there.

 Other than that, it looks good to me.

Good comments. Patch attached to address the doc issues.  The only
iffy thing is that the paragraph For the less restrictive... I have
opted to remove in its entirely.  I think the documents are already
pretty clear about the same-user rule, and I'm not sure if this is the
right place for a crash-course on attributes in pg_stat_activity (but
maybe it is).

...and pg_terminate_backend seems exactly right.

And I think now that the system post-patch doesn't have such a strange
contrast between the ability to send SIGINT vs. SIGTERM, such a
contrast may not be necessary.

I'm also not sure what the policy is about filling paragraphs in the
manual.  I filled one, which increases the sgml churn a bit. git
(show|diff) --word-diff helps clean it up.

-- 
fdr


Extend-same-role-backend-management-to-pg_terminate_backend-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-26 Thread Jeff Davis
On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote:
 On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
  On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
  Parallel to pg_cancel_backend, it'd be nice to allow the user to just
  outright kill a backend that they own (politely, with a SIGTERM),
  aborting any transactions in progress, including the idle transaction,
  and closing the socket.
 
  +1
 
 Here's a patch implementing the simple version, with no more guards
 against signal racing than have been seen previously.  The more
 elaborate variants to close those races is being discussed in a
 parallel thread, but I thought I'd get this simple version out there.

Review:

After reading through the threads, it looks like there was no real
objection to this approach -- pid recycling is not something we're
concerned about.

I think you're missing a doc update though, in func.sgml:

For the less restrictive functionpg_cancel_backend/, the role of an
active backend can be found from
the structfieldusename/structfield column of the
structnamepg_stat_activity/structname view.

Also, high-availability.sgml makes reference to pg_cancel_backend(), and
it might be worthwhile to add an ...and pg_terminate_backend() there.

Other than that, it looks good to me.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-20 Thread Daniel Farina
On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

 +1

Here's a patch implementing the simple version, with no more guards
against signal racing than have been seen previously.  The more
elaborate variants to close those races is being discussed in a
parallel thread, but I thought I'd get this simple version out there.

-- 
fdr
From 73c794a0cce148c2848adfb06be9aac985ac41d8 Mon Sep 17 00:00:00 2001
From: Daniel Farina dan...@heroku.com
Date: Sun, 18 Mar 2012 20:08:37 -0700
Subject: [PATCH] Extend same-role backend management to pg_terminate_backend

This makes it more similar to pg_cancel_backend, except it gives users
the ability to close runaway connections entirely.

Signed-off-by: Daniel Farina dan...@heroku.com
---
 doc/src/sgml/func.sgml   |6 +-
 src/backend/utils/adt/misc.c |   12 +++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 34fea16..7cece3a 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 14403,14409  SELECT set_config('log_statement_stats', 'off', false);
  literalfunctionpg_terminate_backend(parameterpid/parameter typeint/)/function/literal
  /entry
 entrytypeboolean/type/entry
!entryTerminate a backend/entry
/row
   /tbody
  /tgroup
--- 14403,14413 
  literalfunctionpg_terminate_backend(parameterpid/parameter typeint/)/function/literal
  /entry
 entrytypeboolean/type/entry
!entryTerminate a backend.  You can execute this against
! another backend that has exactly the same role as the user
! calling the function.  In all other cases, you must be a
! superuser.
!/entry
/row
   /tbody
  /tgroup
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***
*** 162,179  pg_cancel_backend(PG_FUNCTION_ARGS)
  }
  
  /*
!  * Signal to terminate a backend process.  Only allowed by superuser.
   */
  Datum
  pg_terminate_backend(PG_FUNCTION_ARGS)
  {
! 	if (!superuser())
  		ereport(ERROR,
  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
! 			 errmsg(must be superuser to terminate other server processes),
!  errhint(You can cancel your own processes with pg_cancel_backend().)));
  
! 	PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM) == SIGNAL_BACKEND_SUCCESS);
  }
  
  /*
--- 162,181 
  }
  
  /*
!  * Signal to terminate a backend process.  This is allowed if you are superuser
!  * or have the same role as the process being terminated.
   */
  Datum
  pg_terminate_backend(PG_FUNCTION_ARGS)
  {
! 	int			r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
! 
! 	if (r == SIGNAL_BACKEND_NOPERMISSION)
  		ereport(ERROR,
  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
!  (errmsg(must be superuser or have the same role to terminate backends running in other server processes;
  
! 	PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
  }
  
  /*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-17 Thread Noah Misch
On Fri, Mar 16, 2012 at 04:42:07PM -0700, Daniel Farina wrote:
 On Fri, Mar 16, 2012 at 3:42 PM, Noah Misch n...@leadboat.com wrote:
  On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
  I imagine the problem is a race condition whereby a pid might be
  reused by another process owned by another user (doesn't that also
  affect pg_cancel_backend?). ?Shall we just do everything using the
  MyCancelKey (which I think could just be called SessionKey,
  SessionSecret, or even just Session) as to ensure we have no case
  of mistaken identity? Or does that end up being problematic?
 
  No, I think the hazard you identify here is orthogonal to the question of 
  when
  to authorize pg_terminate_backend(). ?As you note downthread, protocol-level
  cancellations available in released versions already exhibit this hazard. ?I
  wouldn't mind a clean fix for this, but it's an independent subject.
 
 Hmm. Well, here's a patch that implements exactly that, I think,
 similar to the one posted to this thread, but not using BackendIds,
 but rather the newly-introduced SessionId.  Would appreciate
 comments.  Because an out-of-band signaling method has been integrated
 more complex behaviors -- such as closing the
 TERM-against-SECURITY-DEFINER-FUNCTION hazard -- can be addressed.
 For now I've only attempted to solve the problem of backend ambiguity,
 which basically necessitated out-of-line information transfer as per
 the usual means.

This patch still changes the policy for pg_terminate_backend(), and it does
not fix other SIGINT senders like processCancelRequest() and ProcSleep().  If
you're concerned about PID-reuse races, audit all backend signalling.  Either
fix all such problems or propose a plan to get there eventually.  Any further
discussion of this topic needs a new subject line; mixing its consideration
with proposals to change the policy behind pg_terminate_backend() reduces the
chances of the right people commenting on these distinct questions.

Currently, when pg_terminate_backend() follows a pg_cancel_backend() on which
the target has yet to act, the eventual outcome is a terminated process.  With
this patch, the pg_terminate_backend() becomes a no-op with this warning:

 !  ereport(WARNING,
 !  
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 !   (errmsg(process is busy responding 
 to administrative 
 !   request)),
 !   (errhint(This is temporary, and may 
 be retried.;

That's less useful than the current behavior.


That being said, I can't get too excited about closing PID-reuse races.  I've
yet to see another program do so.  I've never seen a trouble report around
this race for any software.  Every OS I have used assigns PIDs so as to
maximize the reuse interval, which seems like an important POLA measure given
typical admin formulae like kill `ps | grep ...`.  This defense can only
matter in fork-bomb conditions, at which point a stray signal is minor.

I do think it's worth keeping this idea in a back pocket for achieving those
more complex behaviors, should we ever desire them.

  Here I discussed a hazard specific to allowing pg_terminate_backend():
  http://archives.postgresql.org/message-id/20110602045955.gc8...@tornado.gateway.2wire.net
 
  To summarize, user code can trap SIGINT cancellations, but it cannot trap
  SIGTERM terminations. ?If a backend is executing a SECURITY DEFINER function
  when another backend of the same role calls pg_terminate_backend() thereon,
  the pg_terminate_backend() caller could achieve something he cannot achieve 
  in
  PostgreSQL 9.1. ?I vote that this is an acceptable loss.
 
 I'll throw out a patch that just lets this hazard exist and see what
 happens, although it is obsoleted/incompatible with the one already
 attached.

+1.  Has anyone actually said that the PID-reuse race or the thing I mention
above should block such a patch?  I poked back through the threads I could
remember and found nothing.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Daniel Farina
On Thu, Mar 15, 2012 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

 No longer true with pg_cancel_backend not-by-superuser, no?

 No.  That doesn't affect the above argument in the least.  And in fact
 if there's any question whatsoever as to whether unprivileged
 cross-backend signals are secure, they are not going in in the first
 place.

Okay, well, I believe there is a race in handling common
administrative signals that *might* possibly matter.  In the past,
pg_cancel_backend was superuser only, which is a lot like saying only
available to people who can be the postgres user and run kill.  The
cancellation packet is handled via first checking the other backend's
BackendKey and then SIGINTing it, leaving only the most narrow
possibility for a misdirected SIGINT.

But it really is unfortunate that I, a user, run a query or have a
problematic connection of my own role and just want the thing to stop,
but I can't do anything about it without superuser.  In recognition of
that, pg_cancel_backend now can operate on backends owned by the same
user (once again, checked before the signal is processed by the
receiver, just like with the cancellation packet).

There was some angst (but I'm not sure about how specific or uniform)
to extend such signaling power to pg_terminate_backend, and the only
objection I can think of is there is this race, or so it would seem to
me.  Maybe it's too small to care, in which case we can just extend
the same policy to pg_terminate_backend, or maybe it's not, in which
case we could get rid of any signaling race conditions.

The only hypothetical person who would be happy with the current
situation, if characterized correctly, would be one who thinks that
pid-race on SIGINT from non-superusers (long has been true in the form
of the cancellation packet) is okay but on SIGTERM such a race is not.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Daniel Farina
On Thu, Mar 15, 2012 at 11:01 PM, Daniel Farina dan...@heroku.com wrote:
 Okay, well, I believe there is a race in handling common
 administrative signals that *might* possibly matter.  In the past,
 pg_cancel_backend was superuser only, which is a lot like saying only
 available to people who can be the postgres user and run kill.  The
 cancellation packet is handled via first checking the other backend's
 BackendKey and then SIGINTing it, leaving only the most narrow
 possibility for a misdirected SIGINT.

Attached is a patch that sketches a removal of the caveat by relying
on the BackendId in PGPROC instead of the pid.  Basically, the idea is
to make it work more like how cancellation keys work, except for
internal SQL functions.  I think the unsatisfying crux here is the
uniqueness of BackendId over the life of one *postmaster* invocation:

sinvaladt.c

MyBackendId = (stateP - segP-procState[0]) + 1;
/* Advertise assigned backend ID in MyProc */
MyProc-backendId = MyBackendId;

I'm not sure precisely what to think about how this numbering winds up
working on quick inspection.  Clearly, if BackendIds can be reused
quickly then the pid-race problem comes back in spirit right away.

But given the contract of MyBackendId as I understand it (it just has
to be unique among all backends at any given time), it could be
changed.  I don't *think* it's used for its arithmetic relationship to
its underlying components anywhere.

Another similar solution (not attached) would be to send information
about the originating backend through PGPROC and having the check be
against those rather than merely a correct and unambiguous
MyBackendId.

I also see now that cancellation packets does not have this caveat
because the postmaster is control of all forking and joining in a
serially executed path, so it need not worry about pid racing.

Another nice use for this might be to, say, deliver the memory or
performance stats of another process while-in-execution, without
having to be superuser or and/or gdbing in back to the calling backend.

-- 
fdr
From f466fe53e8e64cfa49bf56dbdf5920f9ea4e3562 Mon Sep 17 00:00:00 2001
From: Daniel Farina dan...@heroku.com
Date: Thu, 15 Mar 2012 20:46:38 -0700
Subject: [PATCH] Implement race-free sql-originated backend
 cancellation/termination

If promising, this patch requires a few documentation updates in
addendum.

Since 0495aaad8b337642830a4d4e82f8b8c02b27b1be, pg_cancel_backend can
be run on backends that have the same role as the backend
pg_cancel_backend is being invoked from.  Since that time, a
documented caveat exists stating that there was a race whereby a
process death-and-recycle could result in an otherwise unrelated
backend receiving SIGINT.

Now it is desirable for pg_terminate_backend -- which also has the
effect of having the backend exit, and closing the socket -- to also
be usable by non-superusers.  Presuming SIGINT was acceptable to race,
it's not clear that it's acceptable for SIGTERM to race in the same
way, so this patch seeks to try to do something about that.

This patch attempts to close this race condition by targeting query
cancellation/termination against the per-backend-startup unique
BackendId to unambiguously identify the session rather than a PID.
This makes the SQL function act more like how cancellation keys work
already (perhaps these paths can be converged somehow).

Signed-off-by: Daniel Farina dan...@heroku.com
---
 src/backend/access/transam/twophase.c |4 +
 src/backend/storage/ipc/procsignal.c  |3 +
 src/backend/storage/lmgr/proc.c   |3 +
 src/backend/tcop/postgres.c   |   55 +
 src/backend/utils/adt/misc.c  |  104 +++-
 src/include/miscadmin.h   |1 +
 src/include/storage/proc.h|   17 +
 src/include/storage/procsignal.h  |1 +
 8 files changed, 146 insertions(+), 42 deletions(-)

*** a/src/backend/access/transam/twophase.c
--- b/src/backend/access/transam/twophase.c
***
*** 62,67 
--- 62,68 
  #include storage/procarray.h
  #include storage/sinvaladt.h
  #include storage/smgr.h
+ #include storage/spin.h
  #include utils/builtins.h
  #include utils/memutils.h
  #include utils/timestamp.h
***
*** 326,331  MarkAsPreparing(TransactionId xid, const char *gid,
--- 327,335 
  	proc-backendId = InvalidBackendId;
  	proc-databaseId = databaseid;
  	proc-roleId = owner;
+ 	SpinLockInit(MyProc-adminMutex);
+ 	proc-adminAction = ADMIN_ACTION_NONE;
+ 	proc-adminBackendId = InvalidBackendId;
  	proc-lwWaiting = false;
  	proc-lwWaitMode = 0;
  	proc-lwWaitLink = NULL;
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***
*** 258,263  procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 258,266 
  	if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT))
  		HandleNotifyInterrupt();
  
+ 	if (CheckProcSignal(PROCSIG_ADMIN_ACTION_INTERRUPT))
+ 		

Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Noah Misch
On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

+1

 I imagine the problem is a race condition whereby a pid might be
 reused by another process owned by another user (doesn't that also
 affect pg_cancel_backend?).  Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

No, I think the hazard you identify here is orthogonal to the question of when
to authorize pg_terminate_backend().  As you note downthread, protocol-level
cancellations available in released versions already exhibit this hazard.  I
wouldn't mind a clean fix for this, but it's an independent subject.


Here I discussed a hazard specific to allowing pg_terminate_backend():
http://archives.postgresql.org/message-id/20110602045955.gc8...@tornado.gateway.2wire.net

To summarize, user code can trap SIGINT cancellations, but it cannot trap
SIGTERM terminations.  If a backend is executing a SECURITY DEFINER function
when another backend of the same role calls pg_terminate_backend() thereon,
the pg_terminate_backend() caller could achieve something he cannot achieve in
PostgreSQL 9.1.  I vote that this is an acceptable loss.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Daniel Farina
On Fri, Mar 16, 2012 at 3:42 PM, Noah Misch n...@leadboat.com wrote:
 On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

 +1

 I imagine the problem is a race condition whereby a pid might be
 reused by another process owned by another user (doesn't that also
 affect pg_cancel_backend?).  Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

 No, I think the hazard you identify here is orthogonal to the question of when
 to authorize pg_terminate_backend().  As you note downthread, protocol-level
 cancellations available in released versions already exhibit this hazard.  I
 wouldn't mind a clean fix for this, but it's an independent subject.

Hmm. Well, here's a patch that implements exactly that, I think,
similar to the one posted to this thread, but not using BackendIds,
but rather the newly-introduced SessionId.  Would appreciate
comments.  Because an out-of-band signaling method has been integrated
more complex behaviors -- such as closing the
TERM-against-SECURITY-DEFINER-FUNCTION hazard -- can be addressed.
For now I've only attempted to solve the problem of backend ambiguity,
which basically necessitated out-of-line information transfer as per
the usual means.

 Here I discussed a hazard specific to allowing pg_terminate_backend():
 http://archives.postgresql.org/message-id/20110602045955.gc8...@tornado.gateway.2wire.net

 To summarize, user code can trap SIGINT cancellations, but it cannot trap
 SIGTERM terminations.  If a backend is executing a SECURITY DEFINER function
 when another backend of the same role calls pg_terminate_backend() thereon,
 the pg_terminate_backend() caller could achieve something he cannot achieve in
 PostgreSQL 9.1.  I vote that this is an acceptable loss.

I'll throw out a patch that just lets this hazard exist and see what
happens, although it is obsoleted/incompatible with the one already
attached.

-- 
fdr


Implement-race-free-sql-originated-backend-cancellation-v2.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-16 Thread Daniel Farina
On Fri, Mar 16, 2012 at 4:42 PM, Daniel Farina dan...@heroku.com wrote:
 Hmm. Well, here's a patch that implements exactly that, I think,

That version had some screws loose due to some editor snafus.
Hopefully all better.

-- 
fdr


Implement-race-free-sql-originated-backend-cancellation-v3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina
Parallel to pg_cancel_backend, it'd be nice to allow the user to just
outright kill a backend that they own (politely, with a SIGTERM),
aborting any transactions in progress, including the idle transaction,
and closing the socket.

I imagine the problem is a race condition whereby a pid might be
reused by another process owned by another user (doesn't that also
affect pg_cancel_backend?).  Shall we just do everything using the
MyCancelKey (which I think could just be called SessionKey,
SessionSecret, or even just Session) as to ensure we have no case
of mistaken identity? Or does that end up being problematic?

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Fujii Masao
On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote:
 Parallel to pg_cancel_backend, it'd be nice to allow the user to just
 outright kill a backend that they own (politely, with a SIGTERM),
 aborting any transactions in progress, including the idle transaction,
 and closing the socket.

+1

 I imagine the problem is a race condition whereby a pid might be
 reused by another process owned by another user (doesn't that also
 affect pg_cancel_backend?).

Yes, but I think it's too unlikely to happen. Not sure if we really
should worry about that.

 Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

What if pid is unfortunately reused after passing the test of MyCancelKey
and before sending the signal?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina
On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Shall we just do everything using the
 MyCancelKey (which I think could just be called SessionKey,
 SessionSecret, or even just Session) as to ensure we have no case
 of mistaken identity? Or does that end up being problematic?

 What if pid is unfortunately reused after passing the test of MyCancelKey
 and before sending the signal?

The way MyCancelKey is checked now is backwards, in my mind.  It seems
like it would be better checked by the receiving PID (one can use a
check/recheck also, if so inclined).  Is there a large caveat to that?
I'm working on a small patch to do this I hope to post soon (modulus
difficulties), but am fully aware that messing around PGPROC and
signal handling can be a bit fiddly.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 The way MyCancelKey is checked now is backwards, in my mind.  It seems
 like it would be better checked by the receiving PID (one can use a
 check/recheck also, if so inclined).  Is there a large caveat to that?

You mean, other than the fact that kill(2) can't transmit such a key?

But actually I don't see what you hope to gain from such a change,
even if it can be made to work.  Anyone who can do kill(SIGINT) can
do kill(SIGKILL), say --- so you have to be able to trust the signal
sender.  What's the point of not trusting it to verify the client
identity?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Daniel Farina
On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 The way MyCancelKey is checked now is backwards, in my mind.  It seems
 like it would be better checked by the receiving PID (one can use a
 check/recheck also, if so inclined).  Is there a large caveat to that?

 You mean, other than the fact that kill(2) can't transmit such a key?

I was planning on using an out-of-line mechanism. Bad idea?

 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

No longer true with pg_cancel_backend not-by-superuser, no?  Now there
are new people who can do kill(SIGINT) (modulus the already existing
cancel requests).

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend for same-role

2012-03-15 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Thu, Mar 15, 2012 at 10:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But actually I don't see what you hope to gain from such a change,
 even if it can be made to work.  Anyone who can do kill(SIGINT) can
 do kill(SIGKILL), say --- so you have to be able to trust the signal
 sender.  What's the point of not trusting it to verify the client
 identity?

 No longer true with pg_cancel_backend not-by-superuser, no?

No.  That doesn't affect the above argument in the least.  And in fact
if there's any question whatsoever as to whether unprivileged
cross-backend signals are secure, they are not going in in the first
place.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-07-01 Thread Torello Querci
2011/6/2 Noah Misch n...@leadboat.com:
 On Wed, Jun 01, 2011 at 10:26:34PM -0400, Josh Kupershmidt wrote:
 On Wed, Jun 1, 2011 at 5:55 PM, Noah Misch n...@leadboat.com wrote:
  On Sun, May 29, 2011 at 10:56:02AM -0400, Josh Kupershmidt wrote:
  Looking around, I see there were real problems[1] with sending SIGTERM
  to individual backends back in 2005 or so, and pg_terminate_backend()
  was only deemed safe enough to put in for 8.4 [2]. So expanding
  pg_terminate_backend() privileges does make me a tad nervous.
 
  The documentation for the CREATE USER flag would boil down to omit this 
  flag
  only if you're worried about undiscovered PostgreSQL bugs in this area. 
  ?I'd
  echo Tom's sentiment from the first thread, In any case I think we have to
  solve it, not create new mechanisms to try to ignore it.

 I do agree with Tom's sentiment from that thread. But, if we are
 confident that pg_terminate_backend() is safe enough to relax
 permissions on, then I take it you agree we should plan to extend this
 power to all users?

 Yes; that's what I was trying to say.

 Having thought about this some more, I do now see a risk.  Currently, a 
 SECURITY
 DEFINER function (actually any function, but that's where it matters) can trap
 query_canceled.  By doing so, the author can ensure that only superusers and
 crashes may halt the function during a section protected in this way.  One 
 might
 use it to guard a series of updates made over dblink.  pg_terminate_backend()
 breaks this protection.  I've never designed something this way; it only
 suffices when you merely sort-of-care about transactional integrity.  Perhaps
 it's an acceptable loss for this feature?

 And if so, is this patch a good first step on that path?


Understand that the pg_terminate_backend() is able to kill process
that need not to be killed.
I suppose that looking inside the internal postgreql table in order to
not allow a normal db owner to kill a superuser connection can avoid
this problem?

 Yes.

  Reading through those old threads made me realize this patch would
  give database owners the ability to kill off autovacuum workers. Seems
  like we'd want to restrict that power to superusers.
 
  Would we? ?Any old user can already stifle VACUUM by holding a transaction 
  open.

 This is true, though it's possible we might at some point want a
 backend process which really shouldn't be killable by non-superusers
 (if vacuum/autovacuum isn't one already.) Actually, I could easily
 imagine a superuser running an important query on a database getting
 peeved if a non-superuser were allowed to cancel/terminate his
 queries.

 That's really a different level of user isolation than we have.  If your
 important query runs on a database owned by someone else, calls functions 
 owned
 by someone else, or reads tables owned by someone else, you're substantially 
 at
 the mercy of those object owners.  That situation probably is unsatisfactory 
 to
 some folks.  Adding the possibility that a database owner could cancel your
 query seems like an extension of that codependency more than a new exposure.

If I am the database owner I need to be able to manage my DB. Ok for
superuser connection (and internal administrative process like
autovacuum)
I am the developer, not the DBA, so sometimes, when I wrong something,
I need to kill my session if I wrong something 

Can we suppose, in a more generic case,  that an user can kill
connection only from the same user even if this is not the database
owner?

Best Regards, Torello

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-06-01 Thread Josh Kupershmidt
On Wed, Jun 1, 2011 at 5:55 PM, Noah Misch n...@leadboat.com wrote:
 On Sun, May 29, 2011 at 10:56:02AM -0400, Josh Kupershmidt wrote:
 Looking around, I see there were real problems[1] with sending SIGTERM
 to individual backends back in 2005 or so, and pg_terminate_backend()
 was only deemed safe enough to put in for 8.4 [2]. So expanding
 pg_terminate_backend() privileges does make me a tad nervous.

 The documentation for the CREATE USER flag would boil down to omit this flag
 only if you're worried about undiscovered PostgreSQL bugs in this area.  I'd
 echo Tom's sentiment from the first thread, In any case I think we have to
 solve it, not create new mechanisms to try to ignore it.

I do agree with Tom's sentiment from that thread. But, if we are
confident that pg_terminate_backend() is safe enough to relax
permissions on, then I take it you agree we should plan to extend this
power to all users? And if so, is this patch a good first step on that
path?

 Reading through those old threads made me realize this patch would
 give database owners the ability to kill off autovacuum workers. Seems
 like we'd want to restrict that power to superusers.

 Would we?  Any old user can already stifle VACUUM by holding a transaction 
 open.

This is true, though it's possible we might at some point want a
backend process which really shouldn't be killable by non-superusers
(if vacuum/autovacuum isn't one already.) Actually, I could easily
imagine a superuser running an important query on a database getting
peeved if a non-superuser were allowed to cancel/terminate his
queries.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-29 Thread Josh Kupershmidt
On Sun, May 29, 2011 at 5:04 AM, Noah Misch n...@leadboat.com wrote:
 What risks arise from unconditionally allowing these calls for the same user's
 backends?  `pg_cancel_backend' ought to be safe enough; the user always has
 access to the standard cancellation protocol, making the SQL interface a mere
 convenience (albeit a compelling one).  `pg_terminate_backend' does open up
 access to a new behavior, but no concrete risks come to mind.

Looking around, I see there were real problems[1] with sending SIGTERM
to individual backends back in 2005 or so, and pg_terminate_backend()
was only deemed safe enough to put in for 8.4 [2]. So expanding
pg_terminate_backend() privileges does make me a tad nervous.

Reading through those old threads made me realize this patch would
give database owners the ability to kill off autovacuum workers. Seems
like we'd want to restrict that power to superusers.

 On the other hand, this *would* be substantial new authority for database
 owners.  Seems like a reasonable authority to grant, though.

And I also realized that this patch's approach might force us to
maintain a permissions wart if we ever want to implement fine-grained
control for this stuff, e.g. a per-role setting enabling self-kills.
It would be a bit lame to have to document Use this CREATE/ALTER ROLE
flag. Or be the database owner. Or be a superuser.

 Now, a few technical comments about the patch:
 1.) This bit looks dangerous:
 +                backend = pgstat_fetch_stat_beentry(i);
 +                if (backend-st_procpid == pid) {

 Since pgstat_fetch_stat_beentry() might return NULL.

 I think you want BackendPidGetProc().

Ah, thanks for the pointer.

Josh

--
[1] 
http://postgresql.1045698.n5.nabble.com/pg-terminate-backend-idea-td1930120.html
[2] 
http://postgresql.1045698.n5.nabble.com/Re-COMMITTERS-pgsql-Add-pg-terminate-backend-to-allow-terminating-only-a-single-td1983763i20.html

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-05-28 Thread Josh Kupershmidt
On Fri, Mar 11, 2011 at 8:54 AM, Bruce Momjian br...@momjian.us wrote:
 I have added it to the next commit fest.

Hi Torello,

I have volunteered (more accurately, Greg Smith volunteered me :-)
to be a reviewer for this patch. I know you're a bit new here, so I
thought I'd outline where this patch stands and what's expected if
you'd like to move it along.

We organize patch reviews via commitfests lasting a month or so.
Some more information about this process:
http://wiki.postgresql.org/wiki/CommitFest

Each commitfest is a period wherein you can expect to receive some
feedback on your patch and advice on things which might need to be
improved (in this case, it's my job to provide you this feedback).
Your patch is in the upcoming commitfest, scheduled to run from June
15 to July 14.

So if you're interested in being responsible for this patch, or some
variant of it, eventually making its way into PostgreSQL 9.2, you
should be willing to update your patch based on feedback, request
advice, etc. during this period. If you're not interested in getting
sucked into this process that's OK -- just please advise us if that's
the case, and maybe someone else will be willing to take charge of the
patch.

Anssi and I posted some initial feedback on the patch's goals earlier.
I would like to ultimately see users have the capability to
pg_cancel_backend() their own queries. But I could at least conceive
of others not wanting this behavior enabled by default. So perhaps
this patch's approach of granting extra privs to the database owner
could work as a first attempt. And maybe a later version could
introduce a GUC allowing the DBA to control whether users can
cancel/terminate their backends, or we could instead have an option
flag to CREATE/ALTER ROLE, allowing per-user configuration.

It would be helpful to hear from others whether this patch's goals
would work as a first pass at this problem, so that Torello doesn't
waste time on a doomed approach. Also, it might be helpful to add an
entry on the Todo list for 'allow non-superusers to use
pg_cancel_backend()', in case this patch gets sunk.

Now, a few technical comments about the patch:
1.) This bit looks dangerous:
+backend = pgstat_fetch_stat_beentry(i);
+if (backend-st_procpid == pid) {

Since pgstat_fetch_stat_beentry() might return NULL.

I'm a bit suspicious about whether looping through
pgstat_fetch_stat_beentry() is the best way to determine the database
owner for a given backend PID, but I haven't dug in enough yet to
suggest a better alternative.

2.) The way the code inside pg_signal_backend() is structured, doing:
  select pg_cancel_backend(12345);

as non-superuser, where '12345' is a fictitious PID, can now give you
the incorrect error message:

  ERROR:  must be superuser or target database owner to signal other
server processes

3.) No documentation adjustments, and the comments need some cleaup.
Torello: I'll be happy to handle comments/documentation for you as a
native English speaker, so you don't have to worry about this part.

That's it for now. Torello, I look forward to hearing back from you,
and hope that you have some time to work on this patch further.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-03-11 Thread Bruce Momjian
Kevin Grittner wrote:
 Torello Querci tque...@gmail.com wrote:
  
  I attach a path for this
  
 It's too late in the release cycle to consider this for version 9.1.
 Please add it to the open CommitFest for consideration for 9.2:
  
 https://commitfest.postgresql.org/action/commitfest_view/open

I have added it to the next commit fest.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-27 Thread Josh Kupershmidt
On Mon, Feb 14, 2011 at 8:58 AM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 On 02/14/2011 02:10 PM, Torello Querci wrote:

 I suppose that give the right to the owner db user to terminate or
 cancel other session connected to the database which it is owner is a
 good thing.
 I not see any security problem because this user can cancel or
 terminate only the session related with the own database,
 but if you think that this is a problem, a configuration parameter can be
 used.

 For what it's worth, a big +1 from me. We have pretty much the same use
 case.

 It would be good if you could also terminate your own connections.

The superuser-only restriction for pg_cancel_backend() has been a pet
peeve of mine as well. I actually posted a patch a while back to let
users pg_cancel_backend() their own queries, see:
http://archives.postgresql.org/pgsql-admin/2010-02/msg00052.php

IMO it'd be better to do away with this patch's check of:
/* If the user not is the superuser, need to be the db owner. */

and instead just check if the target session's user matches that of
the cancel requester.

Additionally, this patch keeps all the permission checking inside
pg_signal_backend(). That's fine if we're sure that we want
pg_cancel_backend() and pg_terminate_backend() to undergo the same
permissions check, but perhaps it's a bad idea to relax the
permissions check on pg_terminate_backend() ?

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Torello Querci
Hi,

this is the first time that I post here, so if I wrong please don't kill me ...
I see that pg_terminate_backend and pg_cancel_backend  can be execute
only by admin users.
This approach seems to be too restrictive in a lots of real situation.

In dept, I have a situation where it is created one database machine
for all the postgresql database.
This database machine is managed by IT staff that have created two
user for each application.
One user is the owner db user that create, drop, grant on this db,
while the other user is the application db.

In this situation I (the developer) not able to disconnect any client
and stop any high weight queries.
Unfortunately the application run on application server that is
manager, again, by IT staff and I not have the right to stop it.

I suppose that give the right to the owner db user to terminate or
cancel other session connected to the database which it is owner is a
good thing.
I not see any security problem because this user can cancel or
terminate only the session related with the own database,
but if you think that this is a problem, a configuration parameter can be used.

Of course I can create a function with admin right that do the same
thing but the IT staff need to install, configure, and give the right
grant.
So, I suppose, that this can to be only a workaround, not the solution.

Sorry for my English.

I attach a path for this


Best Regards, Torello
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 5bda4af..5327447 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -33,6 +33,7 @@
 #include storage/procarray.h
 #include utils/builtins.h
 #include tcop/tcopprot.h
+#include pgstat.h
 
 #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
 
@@ -75,9 +76,33 @@ static bool
 pg_signal_backend(int pid, int sig)
 {
 	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-			(errmsg(must be superuser to signal other server processes;
+	{
+bool haveRight = false;
+PgBackendStatus *backend;
+
+		/* If the user not is the superuser, need to be the db owner. */
+		if (pg_database_ownercheck(MyDatabaseId, GetUserId())) {
+
+/* Check for the specify backend in the stat info table */
+int nBackend = pgstat_fetch_stat_numbackends();
+int i;
+for (i = 1; i=nBackend; ++i) {
+backend = pgstat_fetch_stat_beentry(i);
+if (backend-st_procpid == pid) {
+if (backend-st_databaseid == MyDatabaseId)
+haveRight = true;
+break;
+}
+}
+}
+
+if (!haveRight)
+			ereport(ERROR,
+	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg(must be superuser or database destination owner to signal other server processes;
+	}
+
+
 
 	if (!IsBackendPid(pid))
 	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Anssi Kääriäinen

On 02/14/2011 02:10 PM, Torello Querci wrote:

I suppose that give the right to the owner db user to terminate or
cancel other session connected to the database which it is owner is a
good thing.
I not see any security problem because this user can cancel or
terminate only the session related with the own database,
but if you think that this is a problem, a configuration parameter can be used.
For what it's worth, a big +1 from me. We have pretty much the same use 
case.


It would be good if you could also terminate your own connections.

 - Anssi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend and pg_cancel_backend by not administrator user

2011-02-14 Thread Kevin Grittner
Torello Querci tque...@gmail.com wrote:
 
 I attach a path for this
 
It's too late in the release cycle to consider this for version 9.1.
Please add it to the open CommitFest for consideration for 9.2:
 
https://commitfest.postgresql.org/action/commitfest_view/open
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-17 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The closest thing I can think of to an automated test is to run repeated
  sets of the parallel regression tests, and each time SIGTERM a randomly
  chosen backend at a randomly chosen time.  Then see if anything funny
 
  Yep, that was my plan, plus running the parallel regression tests you
  get the possibility of 2 backends.
 
 I was intentionally suggesting only one kill per test cycle.  Multiple
 kills will probably create an O(N^2) explosion in the set of possible
 downstream-failure deltas.  I doubt you'd really get any improvement
 in testing coverage to justify the much larger amount of hand validation
 needed.
 
 It also strikes me that you could make some simple alterations to the
 regression tests to reduce the set of observable downstream deltas.
 For example, anyplace where a test loads a table with successive INSERTs
 and that table is used by later tests, wrap the INSERT sequence with
 BEGIN/END.  Then there is only one possible downstream delta (empty
 table) and not N different possibilities for an N-row table.

I have added pg_terminate_backend() to use SIGTERM and will start
running tests as discussed with Tom.  I will post my scripts too.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Magnus Hagander
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   I think if we want pg_terminate_backend, we have to do it the
   way that was originally discussed: have it issue SIGTERM and fix
   whatever needs to be fixed in the SIGTERM code path.
  
   Well, with no movement on this TODO item since it was removed in
   8.0, I am willing to give users something that works most of the
   time.
  
  If the users want it so bad, why has no one stepped up to do the
  testing?
 
 Good question.  Tom and I talked about this on the phone today.
 
 I think the problem is testing to try to prove the lack of a bug.  How
 long does someone test to know they have reasonably proven a bug
 doesn't exist?  

Right. I think we have *a lot* of users that regularly use SIGTERM to
kill the backends, becuase they have no idea it's dangerous. They just
use ps to find the backend and kill to see it go away.

If we had a *big* problem with it, we'd hear about it. So I doubt we
have. But it's quite possible that we have a narrow problem that can
cause problems in some issues - but I expect most people who run this
stuff without knowing it's dangerous aren't running the worlds largest
and most loaded databases...

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Magnus Hagander
Tom Lane wrote:
 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.  If no one is willing
 to do any such testing, then as far as I'm concerned there is no
 market for the feature and we should drop it from TODO.

If someone can come up with an automated script to do this kind of
testing, I can commit a VM or three to running this 24/7 for a month,
easily... But I don't trust myself in coming up with a test-case that's
good enough :-P

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:

   on_shmem_exit(cleanup_routine, arg);
   PG_TRY();
   {
   ... do something ...
   cancel_shmem_exit(cleanup_routine, arg);
   }
   PG_CATCH();
   {
   cancel_shmem_exit(cleanup_routine, arg);
   cleanup_routine(arg);
   PG_RE_THROW();
   }
   PG_END_TRY();

[We would also have to block SIGTERM around the second cancel_shmem_exit and
cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
run them in the reverse order.]

This seems exceptionally fragile. Someone's bound to add another case without
realizing it. I thought about doing something like having a flag
in_unprotected_pg_try which PG_TRY would set to true, and on_shmem_exit would
clear. Then any LWLock or other shmem operation could assert it's cleared.

But that doesn't work for nested PG_TRY blocks. Possibly we could get it to
work by keeping a pair of counters but I'm not sure that's sufficient.

Are all the known cases LWLocks? Is there a reason we couldn't just have a
generic cleanup routine which just releases all LWLocks we hold before
exiting?

Or weakly -- an assert in CHECK_FOR_INTERRUPTS that there are no LWLocks held
or if there are that there's a cleanup routine in place. We wouldn't know for
sure that it's the *right* cleanup routine or enough cleanup routines but
hopefully we would catch the error often enough to notice.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:


 Or weakly -- an assert in CHECK_FOR_INTERRUPTS 

oops, that's irrelevant of course. 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:

 [We would also have to block SIGTERM around the second cancel_shmem_exit and
 cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
 run them in the reverse order.]

No, we wouldn't, because a SIGTERM can only actually fire at a
CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
one in the cleanup code.

 Are all the known cases LWLocks?

*None* of the known cases are LWLocks, nor any other resource that we
have generic cleanup code for.  The problem cases are one-off resources
that it seemed we could avoid having a real cleanup mechanism for.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 [We would also have to block SIGTERM around the second cancel_shmem_exit and
 cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to 
 be?)
 run them in the reverse order.]

 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

Wait, huh? In that case I don't see what advantage any of this has over
Bruce's patch. And his approach seemed a lot more robust.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

 Wait, huh? In that case I don't see what advantage any of this has over
 Bruce's patch. And his approach seemed a lot more robust.

 Maybe I missed something, but I thought he was just proposing some
 macro syntactic sugar over the same code that I described.

No, I meant the earlier patch which you rejected with the flag in MyProc. I
realize there were other issues but the initial concern was that it wouldn't
respond promptly because it would wait for CHECK_FOR_INTERRUPTS. But if
sigterm was never handled except at a CHECK_FOR_INTERRUPTS then that was never
a factor.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 No, we wouldn't, because a SIGTERM can only actually fire at a
 CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
 one in the cleanup code.

 Wait, huh? In that case I don't see what advantage any of this has over
 Bruce's patch. And his approach seemed a lot more robust.

Maybe I missed something, but I thought he was just proposing some
macro syntactic sugar over the same code that I described.

I'm not against syntactic sugar, but it doesn't seem to be totally
clean; I think you would need two repetitions of the function and
arg anyway:

PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg);
{
... risky code here ...
}
PG_END_ENSURE_ERROR_CLEANUP(cleanup_function, arg);

because both the before and after parts need to know the function
name and the arg to pass it.  The only way to avoid duplication would be
if the whole controlled code block was actually an argument to the
macro:

PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg,
... risky code here ...
);

but there are very good reasons not to do that when the controlled
code can be large; for instance that #if can't be used portably
in a macro argument.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 No, I meant the earlier patch which you rejected with the flag in MyProc. I
 realize there were other issues but the initial concern was that it wouldn't
 respond promptly because it would wait for CHECK_FOR_INTERRUPTS.

No, that's not the concern in the least.  The concern is that something
could trap the attempted throwing of the error *after*
CHECK_FOR_INTERRUPTS has noticed the signal.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I am starting to think we need to analyze the source code rather than
  testing, because of what we are testing for.
 
 I was thinking about that a bit more, in connection with trying to
 verbalize why I don't like your patch ;-)
 
 The fundamental assumption here is that we think that proc_exit() from
 the main loop of PostgresMain() should be safe, because that's exercised
 anytime a client quits or dies unexpectedly, and so it should be a
 well-tested path.  What is lacking such test coverage is the many code
 paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
 point in the backend.  Some of those points might be (in fact are known
 to be, as of today) holding locks or otherwise in a state that prevents
 a clean proc_exit().
 
 Your patch proposes to replace that code path by one that fakes a
 QueryCancel error and then does proc_exit() once control reaches the
 outer level.  While that certainly *should* work (ignoring the question
 of whether control gets to the outer level in any reasonable amount of
 time), it's a bit of a stretch to claim that it's a well-tested case.
 At the moment it would only arise if a QueryCancel interrupt arrived at
 approximately the same time that the client died or sent a disconnect
 message, which is surely far less probable than client death by itself.
 So I don't buy the argument that this is a better-tested path, and
 I definitely don't want to introduce new complexity in order to make it
 happen like that.

I would like my use of SIGINT to be like someone pressing ^C and then \q
in psql, but I can see that the SIGINT might be delivered in places
where libpq would not deliver it, like during shutdown.  Hopefully that
could be addressed, but I agree using SIGTERM is better and more useful.

 Now, as to whether a SIGTERM-style exit is safe.  With the exception
 of the PG_CATCH issues that we already know about, any other case seems
 like a clear coding error: you'd have to be doing something that you
 know could throw an error, without having made provision to clean up
 whatever unclean state you are in.  It'd be nice to think that we
 haven't been that silly anywhere.  Experience however teaches that such
 an assumption is hubris.
 
 I think that the way forward is to fix the PG_CATCH issues (which I will
 try to get done later this week) and then get someone to mount a serious
 effort to break it, as outlined in previous discussions:
 http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I was thinking of running two parallel regression tests and killing one
at random and see if the others complete.  One problem is that the
regression tests issue a lot of server error messages so somehow I would
need to filter out the normal errors, which I think I could do.

Magnus has offered to test too.

 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.  If no one is willing
 to do any such testing, then as far as I'm concerned there is no market
 for the feature and we should drop it from TODO.

Agreed, but the feature is going to be asked for again soon so we would
probably have to put it on the features we don't want, and that is going
to look pretty odd.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
 if there is some reasonable amount of testing done during this
 development cycle to try to expose any problems.

 If someone can come up with an automated script to do this kind of
 testing, I can commit a VM or three to running this 24/7 for a month,
 easily... But I don't trust myself in coming up with a test-case that's
 good enough :-P

The closest thing I can think of to an automated test is to run repeated
sets of the parallel regression tests, and each time SIGTERM a randomly
chosen backend at a randomly chosen time.  Then see if anything funny
happens.  The hard part here is distinguishing expected from unexpected
regression outputs, especially in view of the fact that some of the
tests depend on database contents set up by earlier tests.

I'm thinking that you could automatically discard the regression diff
for the specific test that got SIGTERM'd, as long as it looked like
the normal output up to the point where the terminated by
administrator error appears.  Then what you'd have is the potential for
downstream failures due to things not being created, which *should* fall
into a fairly stylized set of possible diffs.  So get the script to
throw away any diffs that exactly match ones seen previously.  Run it
for awhile, and then hand-validate the set of diffs that it's saved
... or if any of 'em look funny, report.

One gotcha I can think of is that killing the prepared_xacts test
can leave you with open 2PC transactions, which will interfere with
starting the next cycle of the tests (you have to kill them before you
can dropdb).  But you could add a rollback prepared to the driver
script to clean out any uncommitted prepared xact.

Whether this is workable or not depends on the size of the set of
expected downstream-failure diffs.  My gut feeling from many years of
watching regression test crashes is that it'd be large but not
completely impractical to look through by hand.

I haven't time to write something like that myself, but offhand it seems
like it could be done without more than a day or so's work, especially
if you start from the buildfarm infrastructure.

BTW, don't forget to include autovac workers in the set of SIGTERM
target candidates.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
  if there is some reasonable amount of testing done during this
  development cycle to try to expose any problems.
 
  If someone can come up with an automated script to do this kind of
  testing, I can commit a VM or three to running this 24/7 for a month,
  easily... But I don't trust myself in coming up with a test-case that's
  good enough :-P
 
 The closest thing I can think of to an automated test is to run repeated
 sets of the parallel regression tests, and each time SIGTERM a randomly
 chosen backend at a randomly chosen time.  Then see if anything funny

Yep, that was my plan, plus running the parallel regression tests you
get the possibility of 2 backends.

 happens.  The hard part here is distinguishing expected from unexpected
 regression outputs, especially in view of the fact that some of the
 tests depend on database contents set up by earlier tests.

Oh, that is a problem.  I was going to get the typical errors, sort them
and put them in a file.  Then when I run the test, I sort the log again
and use diff | grep '' to see an differences.  If there are cases that
generate new errors on a previous failure, I will have to add that
output too.

 I'm thinking that you could automatically discard the regression diff
 for the specific test that got SIGTERM'd, as long as it looked like
 the normal output up to the point where the terminated by
 administrator error appears.  Then what you'd have is the potential for
 downstream failures due to things not being created, which *should* fall
 into a fairly stylized set of possible diffs.  So get the script to
 throw away any diffs that exactly match ones seen previously.  Run it
 for awhile, and then hand-validate the set of diffs that it's saved
 ... or if any of 'em look funny, report.

Yep, see above.

 One gotcha I can think of is that killing the prepared_xacts test
 can leave you with open 2PC transactions, which will interfere with
 starting the next cycle of the tests (you have to kill them before you
 can dropdb).  But you could add a rollback prepared to the driver
 script to clean out any uncommitted prepared xact.

Good idea.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The closest thing I can think of to an automated test is to run repeated
 sets of the parallel regression tests, and each time SIGTERM a randomly
 chosen backend at a randomly chosen time.  Then see if anything funny

 Yep, that was my plan, plus running the parallel regression tests you
 get the possibility of 2 backends.

I was intentionally suggesting only one kill per test cycle.  Multiple
kills will probably create an O(N^2) explosion in the set of possible
downstream-failure deltas.  I doubt you'd really get any improvement
in testing coverage to justify the much larger amount of hand validation
needed.

It also strikes me that you could make some simple alterations to the
regression tests to reduce the set of observable downstream deltas.
For example, anyplace where a test loads a table with successive INSERTs
and that table is used by later tests, wrap the INSERT sequence with
BEGIN/END.  Then there is only one possible downstream delta (empty
table) and not N different possibilities for an N-row table.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-16 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  The closest thing I can think of to an automated test is to run repeated
  sets of the parallel regression tests, and each time SIGTERM a randomly
  chosen backend at a randomly chosen time.  Then see if anything funny
 
  Yep, that was my plan, plus running the parallel regression tests you
  get the possibility of 2 backends.
 
 I was intentionally suggesting only one kill per test cycle.  Multiple
 kills will probably create an O(N^2) explosion in the set of possible
 downstream-failure deltas.  I doubt you'd really get any improvement
 in testing coverage to justify the much larger amount of hand validation
 needed.

No, the point is that you have 2 backends to choose from to kill.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_terminate_backend() idea

2008-04-15 Thread Bruce Momjian
Gregory Stark wrote:
   Add pg_terminate_backend() to allow terminating only a single session.
 
 I'm interested in this because I was already looking for a solution to the
 out of signals problem we have.
 
 I think we could expand this by having a bunch of boolean flags, one each for
 different conditions including the sinval processing conditions, interrupt,
 info, and terminate. (Any more?)
 
 The two things we would have to check to be sure of is:
 
 1) Do we care about how many times events are processed? Ie, if you press
 interrupt twice is it important that that be handled as an interrupt twice? It
 doesn't work that way currently for interrupt but are any of the other
 conditions sensitive to this? I don't think so.

Not that I can think of.

 2) Do we care what order things happen in? Ie, if you send an info request and
 then a cancel request is it ok if the cancel is handled first. I don't see why
 not myself. And if it's a terminate request we *clear* don't want to bother
 handling any other events first.

I don't think we care.

 It seems to me we could replace all of the above with either SIGINT or USR1
 and have a bunch of boolean flags in MyProc. I'm not sure of the implication
 for sinval processing of having to get a whole bunch of LWLocks though.

Keep in mind PGPROC-terminate was easier because once it was set it was
never cleared.  If you need to clear the flag and keep going the code is
going to need to be a little more sophisticated to avoid lost interrupts.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Bruce Momjian
Tom Lane wrote:
 I wrote:
  All in all, this patch wasn't ready to apply without review.  I'm
  currently looking to see whether it's salvageable or not.
 
 After further study, I've concluded that it is in fact not salvageable,
 and I respectfully request that it be reverted.

OK, reverted.

 I was able to get things to more or less work most of the time with
 three or four additional ugly hacks in postgres.c, but I still don't
 have any great confidence that there aren't windows where a terminate
 request wouldn't be ignored (even without considering the uncooperative-
 function scenario).  Moreover it's entirely unclear that this approach
 actually dodges any of the hypothetical bugs in SIGTERM handling.

I don't understand.  If we call proc_exit(0) instead, it is the same as
someone exiting the backend via PQfinish().

 Given the complexity that it adds, I think it's fair to say that this
 approach is probably buggier than the other way.

 I think if we want pg_terminate_backend, we have to do it the way that
 was originally discussed: have it issue SIGTERM and fix whatever needs
 to be fixed in the SIGTERM code path.  As the TODO list used to say
 before it got editorialized upon, this is mostly a matter of testing
 (something that I can tell this patch was sadly lacking :-().
 We do need also to go around and fix any places that think a PG_TRY
 block is a sufficient way to clean up state in shared memory --- cf
 this AFAIK-still-unfixed bug:
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

Well, with no movement on this TODO item since it was removed in 8.0, I
am willing to give users something that works most of the time.  As you
said already, cancel isn't going to be 100% reliable anyway because of
places we can't check for cancel.

I will work on a new version of the patch that people can see.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I was able to get things to more or less work most of the time with
 three or four additional ugly hacks in postgres.c, but I still don't
 have any great confidence that there aren't windows where a terminate
 request wouldn't be ignored (even without considering the uncooperative-
 function scenario).  Moreover it's entirely unclear that this approach
 actually dodges any of the hypothetical bugs in SIGTERM handling.

 I don't understand.  If we call proc_exit(0) instead, it is the same as
 someone exiting the backend via PQfinish().

Well, using proc_exit() instead of die() might have alleviated that
particular objection, but there are still the others.

 I think if we want pg_terminate_backend, we have to do it the way that
 was originally discussed: have it issue SIGTERM and fix whatever needs
 to be fixed in the SIGTERM code path.

 Well, with no movement on this TODO item since it was removed in 8.0, I
 am willing to give users something that works most of the time.

If the users want it so bad, why has no one stepped up to do the
testing?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Tom Lane
I did a quick grep for PG_CATCH uses to see what we have along the lines
of this bug:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

In current sources there are three places at risk:

btbulkdelete, as noted in the above message
pg_start_backup needs to reset forcePageWrites = false
createdb wants to UnlockSharedObject and delete any already-copied files

In createdb, the Unlock actually doesn't need to get cleaned up, since
transaction abort would release the lock anyway.  But leaving a possibly
large mess of orphaned files doesn't seem nice.

ISTM that there will be more cases like this in future, so we need a
general solution anyway.  I propose the following sort of code structure
for these situations:

on_shmem_exit(cleanup_routine, arg);
PG_TRY();
{
... do something ...
cancel_shmem_exit(cleanup_routine, arg);
}
PG_CATCH();
{
cancel_shmem_exit(cleanup_routine, arg);
cleanup_routine(arg);
PG_RE_THROW();
}
PG_END_TRY();

where cancel_shmem_exit is defined to pop the latest shmem_exit_list
entry if it matches the passed arguments (I don't see any particular
need to search further than that in the list).  This structure
guarantees that cleanup_routine will be called on the way out of
either a plain ERROR or a FATAL exit.

Thoughts?  We clearly must do something about this before we can even
think of calling retail SIGTERM a supported feature ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() idea

2008-04-15 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 Gregory Stark wrote:

 It seems to me we could replace all of the above with either SIGINT or USR1
 and have a bunch of boolean flags in MyProc. I'm not sure of the implication
 for sinval processing of having to get a whole bunch of LWLocks though.

 Keep in mind PGPROC-terminate was easier because once it was set it was
 never cleared.  If you need to clear the flag and keep going the code is
 going to need to be a little more sophisticated to avoid lost interrupts.

That's kind of the point. I don't think we care about lost interrupts for any
of these things. As long as we know a sinval flush happened I think we don't
care how many more happened. Or a cancel request or terminate request.

For sinval of course it would be important to clear the flags (with MyProc
locked) prior to processing the queue in case one arrives as soon as we're
finished. But I think that's the only detail.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Bruce Momjian
Tom Lane wrote:
 I did a quick grep for PG_CATCH uses to see what we have along the lines
 of this bug:
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php
 
 In current sources there are three places at risk:
 
 btbulkdelete, as noted in the above message
 pg_start_backup   needs to reset forcePageWrites = false
 createdb wants to UnlockSharedObject and delete any already-copied files
 
 In createdb, the Unlock actually doesn't need to get cleaned up, since
 transaction abort would release the lock anyway.  But leaving a possibly
 large mess of orphaned files doesn't seem nice.
 
 ISTM that there will be more cases like this in future, so we need a
 general solution anyway.  I propose the following sort of code structure
 for these situations:
 
   on_shmem_exit(cleanup_routine, arg);
   PG_TRY();
   {
   ... do something ...
   cancel_shmem_exit(cleanup_routine, arg);
   }
   PG_CATCH();
   {
   cancel_shmem_exit(cleanup_routine, arg);
   cleanup_routine(arg);
   PG_RE_THROW();
   }
   PG_END_TRY();
 
 where cancel_shmem_exit is defined to pop the latest shmem_exit_list
 entry if it matches the passed arguments (I don't see any particular
 need to search further than that in the list).  This structure
 guarantees that cleanup_routine will be called on the way out of
 either a plain ERROR or a FATAL exit.
 
 Thoughts?  We clearly must do something about this before we can even
 think of calling retail SIGTERM a supported feature ...

Here is a little different idea.  Seems we want something more like:

PG_ON_EXIT();
{
cleanup_routine(arg);
}
   PG_TRY();
   {
   ... do something ...
   }
   PG_CATCH();
   {
   PG_RE_THROW();
   }
   PG_END_TRY();

PG_ON_EXIT() would place cleanup_routine() into the shmem_exit_list and
it would be called at the end of PG_CATCH and removed from
shmem_exit_list by PG_END_TRY().  Another idea would be to define the
PG_CATCH() first so it would be put in the shmem_exit_list before the
TRY was tried, but I assume you don't want all those functions to run on
exit.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Here is a little different idea.  Seems we want something more like:

Yeah, I thought about building a set of macros that would have this
ability folded in, but it didn't seem like much notational advantage
given that you have to write the cleanup routine out-of-line anyway.

Also, until we have a lot more than three use-sites, more macros
don't seem to be worth the trouble.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Here is a little different idea.  Seems we want something more like:
 
 Yeah, I thought about building a set of macros that would have this
 ability folded in, but it didn't seem like much notational advantage
 given that you have to write the cleanup routine out-of-line anyway.

Yea, agreed.  I am a little worried about the out-of-line issue because
some cases might become more complex because you have to pass lots of
parameters, perhaps.

 Also, until we have a lot more than three use-sites, more macros
 don't seem to be worth the trouble.

Oh, only three use sites.  I thought we had three wrong sites but that
more had to be changed.  Yea, go for it!

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I think if we want pg_terminate_backend, we have to do it the way that
  was originally discussed: have it issue SIGTERM and fix whatever needs
  to be fixed in the SIGTERM code path.
 
  Well, with no movement on this TODO item since it was removed in 8.0, I
  am willing to give users something that works most of the time.
 
 If the users want it so bad, why has no one stepped up to do the
 testing?

Good question.  Tom and I talked about this on the phone today.

I think the problem is testing to try to prove the lack of a bug.  How
long does someone test to know they have reasonably proven a bug doesn't
exist?  

I think the other problem is what to test.  One SIGTERM problem that was
reported was related to schema changes.  Certainly that has to be
tested, but what else:  prepared statements, queries, HOT updates,
connect/disconnect, notify?  There are lots of place to test and it is
hard to know which ones, and for how long.

I am starting to think we need to analyze the source code rather than
testing, because of what we are testing for.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I was able to get things to more or less work most of the time with
  three or four additional ugly hacks in postgres.c, but I still don't
  have any great confidence that there aren't windows where a terminate
  request wouldn't be ignored (even without considering the uncooperative-
  function scenario).  Moreover it's entirely unclear that this approach
  actually dodges any of the hypothetical bugs in SIGTERM handling.
 
  I don't understand.  If we call proc_exit(0) instead, it is the same as
  someone exiting the backend via PQfinish().
 
 Well, using proc_exit() instead of die() might have alleviated that
 particular objection, but there are still the others.

Tom, thanks for you feedback.  It was very helpful.

I have adjusted the patch to return a locked PGPROC entry, initialize
the PGPROC-terminate boolean, called proc_exit(), and moved the
PGPROC-terminate test out of the cancel loop entirely so lost cancel
signals are not as big a problem anymore.

I agree it would be best for pg_terminate_backend() and for the
postmaster use of SIGTERM if SIGTERM was more reliable about cleanup, so
hopefully that will work for 8.4.  I have attached my patch in hopes we
can use it as a backup in case we can't get SIGTERM to work for
individual backends for 8.4.

[ I have posted about testing in a separate email.]

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.432
diff -c -c -r1.432 func.sgml
*** doc/src/sgml/func.sgml  15 Apr 2008 20:28:46 -  1.432
--- doc/src/sgml/func.sgml  16 Apr 2008 03:59:20 -
***
*** 11849,11854 
--- 11849,11857 
  primarypg_cancel_backend/primary
 /indexterm
 indexterm
+ primarypg_terminate_backend/primary
+/indexterm
+indexterm
  primarypg_reload_conf/primary
 /indexterm
 indexterm
***
*** 11885,11890 
--- 11888,11900 
/row
row
 entry
+ 
literalfunctionpg_terminate_backend/function(parameterpid/parameter 
typeint/)/literal
+ /entry
+entrytypeboolean/type/entry
+entryTerminate a backend/entry
+   /row
+   row
+entry
  literalfunctionpg_reload_conf/function()/literal
  /entry
 entrytypeboolean/type/entry
***
*** 11907,11915 
 /para
  
 para
! functionpg_cancel_backend/ sends a query cancel
! (systemitemSIGINT/) signal to a backend process identified by
! process ID.  The process ID of an active backend can be found from
  the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the
  commandpostgres/command processes on the server with
--- 11917,11926 
 /para
  
 para
! functionpg_cancel_backend/ and functionpg_terminate_backend/
! send a query cancel (systemitemSIGINT/) signal to a backend process
! identified by process ID.  The
! process ID of an active backend can be found from
  the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the
  commandpostgres/command processes on the server with
Index: doc/src/sgml/runtime.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.413
diff -c -c -r1.413 runtime.sgml
*** doc/src/sgml/runtime.sgml   15 Apr 2008 20:28:46 -  1.413
--- doc/src/sgml/runtime.sgml   16 Apr 2008 03:59:20 -
***
*** 1372,1377 
--- 1372,1384 
  well.
 /para
/important
+ 
+   para
+To terminate a session while allowing other sessions to continue, use
+functionpg_terminate_backend()/ (xref
+linkend=functions-admin-signal-table) rather than sending a signal
+to the child process.
+   /para
   /sect1
  
   sect1 id=preventing-server-spoofing
Index: src/backend/access/transam/twophase.c
===
RCS file: /cvsroot/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.41
diff -c -c -r1.41 twophase.c
*** src/backend/access/transam/twophase.c   25 Mar 2008 22:42:42 -  
1.41
--- src/backend/access/transam/twophase.c   16 Apr 2008 03:59:20 -
***
*** 283,288 
--- 283,289 
gxact-proc.databaseId = databaseid;
gxact-proc.roleId = owner;
gxact-proc.inCommit = false;
+   gxact-proc.terminate = false;
gxact-proc.vacuumFlags = 0;
gxact-proc.lwWaiting = false;
gxact-proc.lwExclusive = false;
Index: 

Re: [HACKERS] pg_terminate_backend() issues

2008-04-15 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I am starting to think we need to analyze the source code rather than
 testing, because of what we are testing for.

I was thinking about that a bit more, in connection with trying to
verbalize why I don't like your patch ;-)

The fundamental assumption here is that we think that proc_exit() from
the main loop of PostgresMain() should be safe, because that's exercised
anytime a client quits or dies unexpectedly, and so it should be a
well-tested path.  What is lacking such test coverage is the many code
paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
point in the backend.  Some of those points might be (in fact are known
to be, as of today) holding locks or otherwise in a state that prevents
a clean proc_exit().

Your patch proposes to replace that code path by one that fakes a
QueryCancel error and then does proc_exit() once control reaches the
outer level.  While that certainly *should* work (ignoring the question
of whether control gets to the outer level in any reasonable amount of
time), it's a bit of a stretch to claim that it's a well-tested case.
At the moment it would only arise if a QueryCancel interrupt arrived at
approximately the same time that the client died or sent a disconnect
message, which is surely far less probable than client death by itself.
So I don't buy the argument that this is a better-tested path, and
I definitely don't want to introduce new complexity in order to make it
happen like that.

Now, as to whether a SIGTERM-style exit is safe.  With the exception
of the PG_CATCH issues that we already know about, any other case seems
like a clear coding error: you'd have to be doing something that you
know could throw an error, without having made provision to clean up
whatever unclean state you are in.  It'd be nice to think that we
haven't been that silly anywhere.  Experience however teaches that such
an assumption is hubris.

I think that the way forward is to fix the PG_CATCH issues (which I will
try to get done later this week) and then get someone to mount a serious
effort to break it, as outlined in previous discussions:
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
if there is some reasonable amount of testing done during this
development cycle to try to expose any problems.  If no one is willing
to do any such testing, then as far as I'm concerned there is no market
for the feature and we should drop it from TODO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Magnus Hagander
  Since I have a stuck backend without client again, I'll have to
 kill
  -SIGTERM a backend. Fortunately, I do have console access to
 that
  machine and it's not win32 but a decent OS.
 
 
 
  You do know that on Windows you can use pg_ctl to send a pseudo
  SIGTERM to a backend, don't you?
 The main issue still is that console access id required, on any OS.

Yeah.
Though for the Windows case only, we could easily enough make it
possible to run pg_ctl kill remotely, since we use a named pipe. Does
this seem like a good or bad idea?

//Magnus


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Andreas Pflug
Magnus Hagander wrote:
 Since I have a stuck backend without client again, I'll have to
 
 kill
 
 -SIGTERM a backend. Fortunately, I do have console access to
 
 that
 
 machine and it's not win32 but a decent OS.


 
 You do know that on Windows you can use pg_ctl to send a pseudo
 SIGTERM to a backend, don't you?
   
 The main issue still is that console access id required, on any OS.
 

 Yeah.
 Though for the Windows case only, we could easily enough make it
 possible to run pg_ctl kill remotely, since we use a named pipe. Does
 this seem like a good or bad idea?
   

Not too helpful. How to kill a win32 backend from a linux workstation?
Additionally, NP requires an authenticated RPC connection. I you're not
allowed to access the console, you probably haven't got sufficient
access permissions to NP as well, or you'd need extra policy tweaking or
so. Nightmarish, just to avoid the easy and intuitive way.

Regards,
Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Though for the Windows case only, we could easily enough make it
 possible to run pg_ctl kill remotely, since we use a named pipe. Does
 this seem like a good or bad idea?

Seems like we'd be opening a can of security worms :-(

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend

2006-08-18 Thread Magnus Hagander
  Though for the Windows case only, we could easily enough make it
  possible to run pg_ctl kill remotely, since we use a named pipe.
 Does
  this seem like a good or bad idea?
 
 Seems like we'd be opening a can of security worms :-(

Not really, standard windows ACL already applies to everything, so you
need to be an admin on the machine to make it work.

Anyhoo, I don't really see the gain in it, which also seems to be what
others think, so let's just drop that idea.

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 utils/adt/misc.c says:
 //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
 Datum
 *pg_terminate_backend*(PG_FUNCTION_ARGS)

 Well, AFAIR there were no more issues raised about code paths that don't 
 clean up correctly, so can we please
 remove that comment and make the function live finally? 

No, you have that backwards.  The burden of proof is on those who want
it to show that it's now safe.  The situation is not different than it
was before, except that we can now actually point to a specific bug that
did exist, whereas the original concern was just an unfocused one that
the code path hadn't been adequately exercised.  That concern is now
even more pressing than it was.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Andrew Dunstan wrote:


 Andreas Pflug wrote:

 Since I have a stuck backend without client again, I'll have to kill
 -SIGTERM a backend. Fortunately, I do have console access to that
 machine and it's not win32 but a decent OS.
  


 You do know that on Windows you can use pg_ctl to send a pseudo
 SIGTERM to a backend, don't you?
The main issue still is that console access id required, on any OS.

Regards,
Andreas


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
   
 utils/adt/misc.c says:
 //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
 Datum
 *pg_terminate_backend*(PG_FUNCTION_ARGS)
 

   
 Well, AFAIR there were no more issues raised about code paths that don't 
 clean up correctly, so can we please
 remove that comment and make the function live finally? 
 

 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.
   

If the backend's stuck, I'll have to SIGTERM it, whether there's
pg_terminate_backend or not. Ultimately, if resources should remain
locked, there's no chance except restarting the whole server anyway.
SIGTERM gives me a fair chance (90%) that it will work without restart.

The persistent refusal of supporting the function makes it more painful
to execute, but not less necessary.

Regards,
Andreas


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Bruce Momjian
Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
  utils/adt/misc.c says:
  //* Disabled in 8.0 due to reliability concerns; FIXME someday *//
  Datum
  *pg_terminate_backend*(PG_FUNCTION_ARGS)
 
  Well, AFAIR there were no more issues raised about code paths that don't 
  clean up correctly, so can we please
  remove that comment and make the function live finally? 
 
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.

I am not sure how you prove the non-existance of a bug.  Ideas?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.  The situation is not different than it
 was before, except that we can now actually point to a specific bug that
 did exist, whereas the original concern was just an unfocused one that
 the code path hadn't been adequately exercised.  That concern is now
 even more pressing than it was.

 I am not sure how you prove the non-existance of a bug.  Ideas?

What I'm looking for is some concentrated testing.  The fact that some
people once in a while SIGTERM a backend doesn't give me any confidence
in it.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Csaba Nagy
 What I'm looking for is some concentrated testing.  The fact that some
 people once in a while SIGTERM a backend doesn't give me any confidence
 in it.

Now wait a minute, is there some risk of lockup if I kill a backend ?
Cause I do that relatively often (say 20 times a day, when some web
users time out but their query keeps running). Should I rather not do it
?

Thanks,
Csaba.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Csaba Nagy [EMAIL PROTECTED] writes:
 Now wait a minute, is there some risk of lockup if I kill a backend ?
 Cause I do that relatively often (say 20 times a day, when some web
 users time out but their query keeps running). Should I rather not do it
 ?

statement_timeout is your friend.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Csaba Nagy
You didn't answer the original question: is killing SIGTERM a backend
known/suspected to be dangerous ? And if yes, what's the risk (pointers
to discussions would be nice too).

 statement_timeout is your friend.

I know, but unfortunately I can't use it. I did try to use
statement_timeout and it worked out quite bad (due to our usage
scenario).

Some of the web requests which time out on the web should still go
through... and we have activities which should not observe statement
timeout at all, i.e. they must finish however long that takes.

I know it would be possible to use a different user with it's own
statement timeout for those requests, but that means we have to rewrite
a lot of code which is not possible immediately, and our admins would
resist to add even more configuration (additional users=additional
connection pool+caches and all to be configured). We also can fix the
queries so no timeout happens in the first place, but that will take us
even more time.

Cheers,
Csaba.



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.

 If the backend's stuck, I'll have to SIGTERM it, whether there's
 pg_terminate_backend or not.

Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
is necessary or appropriate.  It seems to me the above is assuming the
existence of unknown backend bugs, exactly the same thing you think
I shouldn't be assuming ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Csaba Nagy
On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
 You didn't answer the original question: is killing SIGTERM a backend
  ^^^
Nevermind, I don't do that. I do 'kill backend_pid' without specifying
the signal, and I'm sufficiently unfamiliar with the unix signal names
to have confused them. Is a plain kill still dangerous ?

Thanks,
Csaba.




---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Csaba Nagy
 Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
 is necessary or appropriate.  It seems to me the above is assuming the
 existence of unknown backend bugs, exactly the same thing you think
 I shouldn't be assuming ...

I do know a case where a plain kill will seem to be stucked: on vacuum
of a big table. I guess when it starts an index's cleanup scan it will
insist to finish it before stopping. I'm not sure if that's the cause,
but I have seen delays of 30 minutes for killing a vacuum... it's true
that finally it always did die... but it's also true that I have 'kill
-9'-ed it before because I thought it's stucked.

Cheers,
Csaba.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Csaba Nagy [EMAIL PROTECTED] writes:
 I do know a case where a plain kill will seem to be stucked: on vacuum
 of a big table. I guess when it starts an index's cleanup scan it will
 insist to finish it before stopping.

We've fixed a few cases of missing CHECK_FOR_INTERRUPTS lately, and will
fix more if you can point them out.  Note though that SIGTERM is just as
vulnerable to that as SIGINT.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Tom Lane wrote:
 Andreas Pflug [EMAIL PROTECTED] writes:
   
 Tom Lane wrote:
 
 No, you have that backwards.  The burden of proof is on those who want
 it to show that it's now safe.
   

   
 If the backend's stuck, I'll have to SIGTERM it, whether there's
 pg_terminate_backend or not.
 

 Stuck?  You have not shown us a case where SIGTERM rather than SIGINT
 is necessary or appropriate. 
Last night, I had a long-running query I launched from pgAdmin. It was
happily running and completing on the server (took about 2 hours), and
the backend went back to IDLE. pgAdmin didn't get back a response,
assuming the query was still running. Apparently, the VPN router had
interrupted the connection silently without notifying either side of the
tcp connection. Since the backend is IDLE, there's no query to cancel
and SIGINT won't help. So Stuck for me means a backend *not*
responding to SIGINT.
BTW, there's another scenario where SIGINT won't help. Imagine an app
running wild hammering the server with queries regardless of query
cancels (maybe some retry mechanism). You'd like to interrupt that
connection, i.e. get rid of the backend.

Regards,
Andreas


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Csaba Nagy wrote:
 On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
   
 You didn't answer the original question: is killing SIGTERM a backend
 
   ^^^
 Nevermind, I don't do that. I do 'kill backend_pid' without specifying
 the signal, and I'm sufficiently unfamiliar with the unix signal names
 to have confused them. Is a plain kill still dangerous ?
   
SIGTERM is the default kill parameter, so you do exactly what I'm
talking about.

Regards,
Andreas


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Csaba Nagy [EMAIL PROTECTED] writes:
 On Thu, 2006-08-03 at 18:10, Csaba Nagy wrote:
 You didn't answer the original question: is killing SIGTERM a backend
   ^^^
 Nevermind, I don't do that. I do 'kill backend_pid' without specifying
 the signal,

man kill says the default is SIGTERM.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Bruce Momjian wrote:


 I am not sure how you prove the non-existance of a bug.  Ideas?
   
Would be worth at least the Nobel prize :-)

Regards,
Andreas



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Csaba Nagy
 man kill says the default is SIGTERM.

OK, so that means I do use it... is it known to be dangerous ? I thought
till now that it is safe to use. What about select pg_cancel_backend()
?

Thanks,
Csaba.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Andreas Pflug
Csaba Nagy wrote:
 man kill says the default is SIGTERM.
 

 OK, so that means I do use it... is it known to be dangerous ? I thought
 till now that it is safe to use. 
Apparently you never suffered any problems from that; neither did I.

 What about select pg_cancel_backend()
   

That's the function wrapper around kill -SIGINT, which is probably the
way you could safely stop your queries most of the time.


Regards,
Andreas


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread [EMAIL PROTECTED]







I am not sure how you prove the non-existance of a bug.  Ideas?





I do that by deleting all of my code (usually by accident :-)

No code, no bugs!

 -- Korry











Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  No, you have that backwards.  The burden of proof is on those who want
  it to show that it's now safe.  The situation is not different than it
  was before, except that we can now actually point to a specific bug that
  did exist, whereas the original concern was just an unfocused one that
  the code path hadn't been adequately exercised.  That concern is now
  even more pressing than it was.
 
  I am not sure how you prove the non-existance of a bug.  Ideas?
 
 What I'm looking for is some concentrated testing.  The fact that some
 people once in a while SIGTERM a backend doesn't give me any confidence
 in it.

OK, here is an opportunity for someone to run tests to get this into
8.2.  The code already exists in CVS, but we need testing to enable it.
I would think running a huge workload and killing it over and over again
would be a good test.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 What I'm looking for is some concentrated testing.  The fact that some
 people once in a while SIGTERM a backend doesn't give me any confidence
 in it.

 OK, here is an opportunity for someone to run tests to get this into
 8.2.  The code already exists in CVS, but we need testing to enable it.
 I would think running a huge workload and killing it over and over again
 would be a good test.

Big multiprocess workload and you kill individual processes at random
while letting the rest run.  It probably needs to be something that
stresses more of the code than pgbench would, too.  (For instance,
it'd be a good idea if some of the workload involved having a few 2PC
transactions getting prepared and then either committed or rolled
back ... SIGTERM during a COMMIT PREPARED strikes me as the sort of
corner case that's probably never been exercised.)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pg_terminate_backend

2006-08-03 Thread Bruce Momjian

Thanks.  Good plan.

---

Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  What I'm looking for is some concentrated testing.  The fact that some
  people once in a while SIGTERM a backend doesn't give me any confidence
  in it.
 
  OK, here is an opportunity for someone to run tests to get this into
  8.2.  The code already exists in CVS, but we need testing to enable it.
  I would think running a huge workload and killing it over and over again
  would be a good test.
 
 Big multiprocess workload and you kill individual processes at random
 while letting the rest run.  It probably needs to be something that
 stresses more of the code than pgbench would, too.  (For instance,
 it'd be a good idea if some of the workload involved having a few 2PC
 transactions getting prepared and then either committed or rolled
 back ... SIGTERM during a COMMIT PREPARED strikes me as the sort of
 corner case that's probably never been exercised.)
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] pg_terminate_backend

2006-08-02 Thread Andreas Pflug
Since I have a stuck backend without client again, I'll have to kill -SIGTERM a 
backend. Fortunately, I do 
have console access to that machine and it's not win32 but a decent OS. For 
other cases I'd really really really 
appreciate if that function would make it into 8.2.

utils/adt/misc.c says:

#*ifdef* NOT_USED

//* Disabled in 8.0 due to reliability concerns; FIXME someday *//
Datum
*pg_terminate_backend*(PG_FUNCTION_ARGS)

Well, AFAIR there were no more issues raised about code paths that don't clean 
up correctly, so can we please
remove that comment and make the function live finally? 

Regards,
Andreas



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pg_terminate_backend

2006-08-02 Thread Andrew Dunstan



Andreas Pflug wrote:

Since I have a stuck backend without client again, I'll have to kill -SIGTERM a backend. Fortunately, I do 
have console access to that machine and it's not win32 but a decent OS.
 



You do know that on Windows you can use pg_ctl to send a pseudo SIGTERM 
to a backend, don't you?


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pg_terminate_backend idea

2006-07-10 Thread Rod Taylor
On Tue, 2005-06-21 at 23:34 -0400, Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Tom Lane wrote:
  In any case the correct way to solve the problem is to find out what's
  being left corrupt by SIGTERM, rather than install more messiness in
  order to avoid facing the real issue ...
 
  I am confused.  Are you talking about the client SIGTERM or the server? 
 
 I am talking about Rod Taylor's reports that SIGTERM'ing individual
 backends tends to lead to lock table corrupted crashes awhile later.
 Now, I've been playing the part of Chicken Little on this for awhile,
 but seeing an actual report of problems from the field certainly
 strengthens my feelings about it.

Bringing this thread back to life.

I have not seen a lock table corruption issue with SIGTERM in 8.1 on
Solaris/Sun IV, Linux/AMD64, or Linux/Intel. I don't recall seeing one
on 8.0.3 either though I'm pretty sure there were several on 8.0.1.

There are times when locks for a process hang around for a few minutes
before getting cleared. I don't recall whether they were ungranted table
locks or entries waiting on a transaction ID lock, but the source was
Slony and a large pg_listener structure with more than 2 pages (yes,
pages not tuples).

I have also seen processes refusing to acknowledge the signal and exit
during btree index builds, but that's not a data corruption issue.
-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pg_terminate_backend idea

2005-06-24 Thread Bruce Momjian
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Assuming we don't get such a case, and a chance to fix it, before 8.1
  (while still hoping we will get it fixed properly, we can't be sure, can
  we? If we were, it'd be fixed already). In this case, will you consider
  such a kludgy solution as a temporary fix to resolve a problem that a
  lot of users are having? And then plan to have it removed once sending
  SIGTERM directly to a backend can be considered safe?
 
 Kluges tend to become institutionalized, so my reaction is no.  It's
 also worth pointing out that with so little understanding of the problem
 Rod is reporting, it's tough to make a convincing case that this kluge
 will avoid it.  SIGTERM exit *shouldn't* be leaving any corrupted
 locktable entries behind; it's not that much different from the normal
 case.  Until we find out what's going on, introducing still another exit
 path isn't really going to make me feel more comfortable, no matter how
 close it's alleged to be to the normal path.

I have been running some tests to try to see the lock table corruption
but I have been unable to reproduce the problem.  I have attached my
crude test scripts.  I would run the scripts and then open another
session as a different user and do UPDATE and LOCK to cause the psql
session to block.

The only functional difference I can see between a SIGTERM exit and a
cancel followed by a normal exit is the call to
AbortCurrentTransaction().  Could that be significant?  Because I can't
reproduce the failure I can't know for sure.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
#!/usr/contrib/bin/expect --
set timeout -1
eval spawn sql test
expect -nocase test=
send begin;\r
expect -nocase test=
send lock pg_class;\r
expect -nocase test=
send select * from pg_locks;\r
expect -nocase test=
send update test set x=3;\r
expect -nocase test=
expect eof
exit
while :
do
XPID=`/letc/ps_sysv -ef | grep 'postgres test'|grep -v grep|awk '{print 
$2}'`
if [ $XPID !=  ]
thenkill $XPID
echo $XPID
XPID=`/letc/ps_sysv -ef | grep 'psql test'|grep -v execargs|awk 
'{print $2}'`
kill $XPID
fi
sleep 1
done

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] pg_terminate_backend idea

2005-06-24 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 I have been running some tests to try to see the lock table corruption
 but I have been unable to reproduce the problem.

It's possible that what Rod was complaining of is fixed in CVS tip ---
see discussion about RemoveFromWaitQueue() bug.  If so, it would have
been a bug that could be seen in other circumstances too, but maybe
SIGTERM made it more probable for some reason.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] pg_terminate_backend idea

2005-06-24 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  I have been running some tests to try to see the lock table corruption
  but I have been unable to reproduce the problem.
 
 It's possible that what Rod was complaining of is fixed in CVS tip ---
 see discussion about RemoveFromWaitQueue() bug.  If so, it would have
 been a bug that could be seen in other circumstances too, but maybe
 SIGTERM made it more probable for some reason.

Was that backpatched to 8.0.X?  If not, I can test that branch of CVS
for the problem.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


  1   2   >