Re: Allow non-superuser to cancel superuser tasks.

2024-04-14 Thread Michael Paquier
On Fri, Apr 12, 2024 at 01:32:42PM +0500, Kirill Reshke wrote:
> +#
> +# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
> +# Portions Copyright (c) 1994, Regents of the University of California
> +#
> (Like in src/test/signals/Makefile)
> 
> at the beginning of each added file?

Assuming that these files are merged in 2024, you could just use:
Copyright (c) 2024, PostgreSQL Global Development Group

See for example slotsync.c introduced recently in commit ddd5f4f54a02.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-12 Thread Kirill Reshke
On Thu, 11 Apr 2024 at 16:55, Kirill Reshke  wrote:

> 7)
>
> > +# Copyright (c) 2022-2024, PostgreSQL Global Development Group
> > [...]
> > +# Copyright (c) 2024-2024, PostgreSQL Global Development Group
>
> > These need to be cleaned up.
>
> > +# Makefile for src/test/recovery
> > +#
> > +# src/test/recovery/Makefile
>
> > This is incorrect, twice.
>
> Cleaned up, thanks!

Oh, wait, I did this wrong.

Should i use

+# Copyright (c) 2024-2024, PostgreSQL Global Development Group

(Like in src/test/signals/meson.build &
src/test/signals/t/001_signal_autovacuum.pl)
or

+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
(Like in src/test/signals/Makefile)

at the beginning of each added file?




Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Kyotaro Horiguchi
At Fri, 12 Apr 2024 09:10:35 +0900, Michael Paquier  wrote 
in 
> On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> > The test doesn't fail because pg_terminate_backend actually meets his
> > point: autovac is killed. But while dying, autovac also receives
> > segfault. Thats because of injections points:
> > 
> > (gdb) bt
> > #0  0x56361c3379ea in tas (lock=0x7fbcb9632224  > access memory at address 0x7fbcb9632224>) at
> > ../../../../src/include/storage/s_lock.h:228
> > #1  ConditionVariableCancelSleep () at condition_variable.c:238
...
> > #3  0x56361c330a40 in CleanupProcSignalState (status=, arg=) at procsignal.c:240
> > #4  0x56361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
> > #9  0x56361c3378d7 in ConditionVariableTimedSleep
> > (cv=0x7fbcb9632224, timeout=timeout@entry=-1,
...
> I can see this stack trace as well.  Capturing a bit more than your
> own stack, this is crashing in the autovacuum worker while waiting on
> a condition variable when processing a ProcessInterrupts().
> 
> That may point to a legit bug with condition variables in this
> context, actually?  From what I can see, issuing a signal on a backend
> process waiting with a condition variable is able to process the
> interrupt correctly.

ProcSignalInit sets up CleanupProcSignalState to be called via
on_shmem_exit. If the CV is allocated in a dsm segment, shmem_exit
should have detached the region for the CV.  CV cleanup code should be
invoked via before_shmem_exit.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Michael Paquier
On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> The test doesn't fail because pg_terminate_backend actually meets his
> point: autovac is killed. But while dying, autovac also receives
> segfault. Thats because of injections points:
> 
> (gdb) bt
> #0  0x56361c3379ea in tas (lock=0x7fbcb9632224  access memory at address 0x7fbcb9632224>) at
> ../../../../src/include/storage/s_lock.h:228
> #1  ConditionVariableCancelSleep () at condition_variable.c:238
> #2  0x56361c337e4b in ConditionVariableBroadcast
> (cv=0x7fbcb66f498c) at condition_variable.c:310
> #3  0x56361c330a40 in CleanupProcSignalState (status= out>, arg=) at procsignal.c:240
> #4  0x56361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
> #5  0x56361c3288fc in proc_exit_prepare (code=code@entry=1) at ipc.c:198
> #6  0x56361c3289bf in proc_exit (code=code@entry=1) at ipc.c:111
> #7  0x56361c49ffa8 in errfinish (filename=,
> lineno=, funcname=0x56361c654370 <__func__.16>
> "ProcessInterrupts") at elog.c:592
> #8  0x56361bf7191e in ProcessInterrupts () at postgres.c:3264
> #9  0x56361c3378d7 in ConditionVariableTimedSleep
> (cv=0x7fbcb9632224, timeout=timeout@entry=-1,
> wait_event_info=117440513) at condition_variable.c:196
> #10 0x56361c337d0b in ConditionVariableTimedSleep
> (wait_event_info=, timeout=-1, cv=) at
> condition_variable.c:135
> #11 ConditionVariableSleep (cv=,
> wait_event_info=) at condition_variable.c:98
> 
> discovered because of
> # Release injection point.
> $node->safe_psql('postgres',
> "SELECT injection_point_detach('autovacuum-worker-start');");
> added
> 
> v4 also suffers from that. i will try to fix that.

I can see this stack trace as well.  Capturing a bit more than your
own stack, this is crashing in the autovacuum worker while waiting on
a condition variable when processing a ProcessInterrupts().

That may point to a legit bug with condition variables in this
context, actually?  From what I can see, issuing a signal on a backend
process waiting with a condition variable is able to process the
interrupt correctly.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Kirill Reshke
On Thu, 11 Apr 2024 at 19:07, Nathan Bossart  wrote:
>
> On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> > Posting updated version of this patch with comments above addressed.
>
> I look for a commitfest entry for this one, but couldn't find it.  Would
> you mind either creating one

Done: https://commitfest.postgresql.org/48/4922/




Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Nathan Bossart
On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
>> It's feeling more natural here to check that we have a superuser-owned
>> backend first, and then do a lookup of the process type.
> 
>> I figured since there's no reason to rely on that behavior, we might as
>> well do a bit of future-proofing in case autovacuum workers are ever not
>> run as InvalidOid.
> 
> I have combined them into this:
> 
> if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
> && pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
> 
> This is both future-proofing and natural, I suppose. Downside of this
> is double checking condition (!OidIsValid(proc->roleId) ||
> superuser_arg(proc->roleId)), but i think that is ok for the sake of
> simplicity.

If we want to retain the check, IMO we might as well combine the first two
blocks like Anthony proposed:

if (!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
{
ProcNumber procNumber = GetNumberFromPGProc(proc);
PGBackendStatus procStatus = 
pgstat_get_beentry_by_proc_number(procNumber);

if (procStatus && procStatus->st_backendType == 
B_AUTOVAC_WORKER &&
!has_privs_of_role(GetUserId(), 
ROLE_PG_SIGNAL_AUTOVAC_WORKER))
return SIGNAL_BACKEND_NOAUTOVAC;
else if (!superuser())
return SIGNAL_BACKEND_NOSUPERUSER;
}

+  
+   pg_signal_autovacuum_worker
+   Allow signaling autovacuum worker backend to cancel or 
terminate
+  

I think we need to be more specific about what pg_cancel_backend() and
pg_terminate_backend() do for autovacuum workers.  The code offers some
clues:

/*
 * SIGINT is used to signal canceling the current table's vacuum; 
SIGTERM
 * means abort and exit cleanly, and SIGQUIT means abandon ship.
 */
pqsignal(SIGINT, StatementCancelHandler);
pqsignal(SIGTERM, die);

+/* --
+ * pgstat_get_backend_type() -
+ *
+ * Return the backend type of the backend for the given proc number.
+ * --
+ */
+BackendType
+pgstat_get_backend_type(ProcNumber procNumber)
+{
+   PgBackendStatus *ret;
+
+   ret = pgstat_get_beentry_by_proc_number(procNumber);
+
+   if (!ret)
+   return B_INVALID;
+
+   return ret->st_backendType;
+}

I'm not sure we really need to introduce a new function for this.  I
avoided using it in my example snippet above.  But, maybe it'll come in
handy down the road...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Nathan Bossart
On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> Posting updated version of this patch with comments above addressed.

I look for a commitfest entry for this one, but couldn't find it.  Would
you mind either creating one or, if I've somehow missed it, pointing me to
the existing entry?

https://commitfest.postgresql.org/48/

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-11 Thread Kirill Reshke
Posting updated version of this patch with comments above addressed.

1) pg_signal_autovacuum -> pg_signal_autovacuum_worker, as there seems
to be no objections to that.

2)
There are comments on how to write if statement:

> In core, do_autovacuum() is only called in a process without
> a role specified

> It's feeling more natural here to check that we have a superuser-owned
> backend first, and then do a lookup of the process type.

> I figured since there's no reason to rely on that behavior, we might as
> well do a bit of future-proofing in case autovacuum workers are ever not
> run as InvalidOid.

I have combined them into this:

if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
&& pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)

This is both future-proofing and natural, I suppose. Downside of this
is double checking condition (!OidIsValid(proc->roleId) ||
superuser_arg(proc->roleId)), but i think that is ok for the sake of
simplicity.

3) pg_signal_autovacuum_worker Oid changed to random one: 8916

4)

> An invalid BackendType is not false, but B_INVALID.
fixed, thanks

5)

 There is pg_read_all_stats as well, so I don't see a big issue in
 requiring to be a member of this role as well for the sake of what's
 proposing here.
>>>
>>> Well, that tells you quite a bit more than just which PIDs correspond to
>>> autovacuum workers, but maybe that's good enough for now.
>>
>> That may be a good initial compromise, for now.

>Sounds good to me. I will update the documentation.

@Anthony if you feel that documentation update adds much value here,
please do. Given that we know autovacuum worker PIDs from
pg_stat_progress_vacuum, I don't know how to reflect something about
pg_stat_autovac_worker in doc, and if it is worth it.

6)
> + INJECTION_POINT("autovacuum-start");
> Perhaps autovacuum-worker-start is more suited here

fixed, thanks

7)

> +# Copyright (c) 2022-2024, PostgreSQL Global Development Group
> [...]
> +# Copyright (c) 2024-2024, PostgreSQL Global Development Group

> These need to be cleaned up.

> +# Makefile for src/test/recovery
> +#
> +# src/test/recovery/Makefile

> This is incorrect, twice.

Cleaned up, thanks!

8)

> Not sure that there is a huge point in checking after a role that
> holds pg_signal_backend.
Ok. Removed.

Then:

> +like($psql_err, qr/ERROR: permission denied to terminate ...
> Checking only the ERRROR, and not the DETAIL should be sufficient
> here.


After removing the pg_signal_backend test case we have only one place
where errors check is done. So, I think we should keep DETAIL here to
ensure detail is correct (it differs from regular backend case).

9)
> +# Test signaling for pg_signal_autovacuum role.
> This needs a better documentation:

Updated. Hope now the test documentation helps to understand it.

10)

> +ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should 
> succeed with pg_signal_autovacuum role");
> Is that enough for the validation?

Added:
ok($node->log_contains(qr/FATAL: terminating autovacuum process due to
administrator command/, $offset),
"Autovacuum terminates when role is granted with pg_signal_autovacuum_worker");

11) references to `passcheck` extension removed. errors messages rephrased.

12) injection_point_detach added.

13)
> + INSERT INTO tab_int SELECT * FROM generate_series(1, 100);
> A good chunk of the test would be spent on that, but you don't need
> that many tuples to trigger an autovacuum worker as the short naptime
> is able to do it. I would recommend to reduce that to a minimum.

+1
Single tuple works.

14)

v3 suffers from segfault:
2024-04-11 11:28:31.116 UTC [147437] 001_signal_autovacuum.pl LOG:
statement: SELECT pg_terminate_backend(147427);
2024-04-11 11:28:31.116 UTC [147427] FATAL:  terminating autovacuum
process due to administrator command
2024-04-11 11:28:31.116 UTC [147410] LOG:  server process (PID 147427)
was terminated by signal 11: Segmentation fault
2024-04-11 11:28:31.116 UTC [147410] LOG:  terminating any other
active server processes
2024-04-11 11:28:31.117 UTC [147410] LOG:  shutting down because
restart_after_crash is off
2024-04-11 11:28:31.121 UTC [147410] LOG:  database system is shut down

The test doesn't fail because pg_terminate_backend actually meets his
point: autovac is killed. But while dying, autovac also receives
segfault. Thats because of injections points:


(gdb) bt
#0  0x56361c3379ea in tas (lock=0x7fbcb9632224 ) at
../../../../src/include/storage/s_lock.h:228
#1  ConditionVariableCancelSleep () at condition_variable.c:238
#2  0x56361c337e4b in ConditionVariableBroadcast
(cv=0x7fbcb66f498c) at condition_variable.c:310
#3  0x56361c330a40 in CleanupProcSignalState (status=, arg=) at procsignal.c:240
#4  0x56361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
#5  0x56361c3288fc in proc_exit_prepare (code=code@entry=1) at ipc.c:198
#6  0x56361c3289bf in proc_exit (code=code@entry=1) at 

Re: Allow non-superuser to cancel superuser tasks.

2024-04-10 Thread Michael Paquier
On Wed, Apr 10, 2024 at 10:00:34AM -0500, Nathan Bossart wrote:
> Isn't it relatively easy to discover this same information today via
> pg_stat_progress_vacuum?  That has the following code:
> 
>   /* Value available to all callers */
>   values[0] = Int32GetDatum(beentry->st_procpid);
>   values[1] = ObjectIdGetDatum(beentry->st_databaseid);
> 
> I guess I'm not quite following why we are worried about leaking whether a
> backend is an autovacuum worker.

Good point.  I've missed that we make no effort currently to hide any
PID information from the progress tables.  And we can guess more
context data because of the per-table split of the progress tables.

This choice comes down to b6fb6471f6af that has introduced the
progress report facility, so this ship has long sailed it seems.  And
it makes my argument kind of moot.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-10 Thread Nathan Bossart
On Wed, Apr 10, 2024 at 07:58:39AM +0900, Michael Paquier wrote:
> On Wed, Apr 10, 2024 at 12:52:19AM +0300, Kirill Reshke wrote:
>> On Tue, 9 Apr 2024 at 08:53, Michael Paquier  wrote:
>>> The thing is that you cannot rely on a lookup of the backend type for
>>> the error information, or you open yourself to letting the caller of
>>> pg_cancel_backend or pg_terminate_backend know if a backend is
>>> controlled by a superuser or if a backend is an autovacuum worker.
>> 
>> Good catch. Thanks.  I think we need to update the error message to not
>> leak backend type info.
> 
> Yep, that's necessary I am afraid.

Isn't it relatively easy to discover this same information today via
pg_stat_progress_vacuum?  That has the following code:

/* Value available to all callers */
values[0] = Int32GetDatum(beentry->st_procpid);
values[1] = ObjectIdGetDatum(beentry->st_databaseid);

I guess I'm not quite following why we are worried about leaking whether a
backend is an autovacuum worker.

>>> The choice of pg_signal_autovacuum is a bit inconsistent, as well,
>>> because autovacuum workers operate like regular backends.  This name
>>> can also be confused with the autovacuum launcher.
>> 
>> Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good
>> enough?
> 
> Sounds fine to me.  Perhaps others have an opinion about that?

WFM

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-09 Thread Michael Paquier
On Wed, Apr 10, 2024 at 12:52:19AM +0300, Kirill Reshke wrote:
> On Tue, 9 Apr 2024 at 08:53, Michael Paquier  wrote:
>> The thing is that you cannot rely on a lookup of the backend type for
>> the error information, or you open yourself to letting the caller of
>> pg_cancel_backend or pg_terminate_backend know if a backend is
>> controlled by a superuser or if a backend is an autovacuum worker.
> 
> Good catch. Thanks.  I think we need to update the error message to not
> leak backend type info.

Yep, that's necessary I am afraid.

>> The choice of pg_signal_autovacuum is a bit inconsistent, as well,
>> because autovacuum workers operate like regular backends.  This name
>> can also be confused with the autovacuum launcher.
> 
> Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good
> enough?

Sounds fine to me.  Perhaps others have an opinion about that?
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-09 Thread Kirill Reshke
Hi, thanks for looking into this.

On Tue, 9 Apr 2024 at 08:53, Michael Paquier  wrote:

> On Mon, Apr 08, 2024 at 05:42:05PM +, Leung, Anthony wrote:
> > Are you suggesting that we check if the backend is B_AUTOVAC in
> > pg_cancel/ terminate_backend? That seems a bit unclean to me since
> > pg_cancel_backend & pg_cancel_backend does not access to the
> > procNumber to check the type of the backend.
> >
> > IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the
> > errmsg / errdetail to not expose that the backend is an AV
> > worker. It'll also be helpful if you can suggest what errdetail we
> > should use here.
>
> The thing is that you cannot rely on a lookup of the backend type for
> the error information, or you open yourself to letting the caller of
> pg_cancel_backend or pg_terminate_backend know if a backend is
> controlled by a superuser or if a backend is an autovacuum worker.
>

Good catch. Thanks.  I think we need to update the error message to not
leak backend type info.

> The choice of pg_signal_autovacuum is a bit inconsistent, as well,
> because autovacuum workers operate like regular backends.  This name
> can also be confused with the autovacuum launcher.

Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good
enough?


Re: Allow non-superuser to cancel superuser tasks.

2024-04-08 Thread Michael Paquier
On Mon, Apr 08, 2024 at 05:42:05PM +, Leung, Anthony wrote:
> Are you suggesting that we check if the backend is B_AUTOVAC in
> pg_cancel/ terminate_backend? That seems a bit unclean to me since
> pg_cancel_backend & pg_cancel_backend does not access to the
> procNumber to check the type of the backend.
> 
> IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the
> errmsg / errdetail to not expose that the backend is an AV
> worker. It'll also be helpful if you can suggest what errdetail we
> should use here.

The thing is that you cannot rely on a lookup of the backend type for
the error information, or you open yourself to letting the caller of
pg_cancel_backend or pg_terminate_backend know if a backend is
controlled by a superuser or if a backend is an autovacuum worker.
And they may have no access to this information by default, except if
the role is a member of pg_read_all_stats able to scan
pg_stat_activity.  An option that I can think of, even if it is not
the most elegant ever, would be list all the possible system users
that can be used in the errdetail under a single SIGNAL_BACKEND_NO*
state.

In the case of your patch it would mean to mention both
pg_signal_backend and pg_signal_autovacuum.

The choice of pg_signal_autovacuum is a bit inconsistent, as well,
because autovacuum workers operate like regular backends.  This name
can also be confused with the autovacuum launcher.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-08 Thread Leung, Anthony
>>> There is pg_read_all_stats as well, so I don't see a big issue in
>>> requiring to be a member of this role as well for the sake of what's
>>> proposing here.
>>
>> Well, that tells you quite a bit more than just which PIDs correspond to
>> autovacuum workers, but maybe that's good enough for now.
>
> That may be a good initial compromise, for now.

Sounds good to me. I will update the documentation.


> And we should try to reshape things so as we get an ERROR like
> "permission denied to terminate process" or "permission denied to
> cancel query" for all the error paths, including autovacuum workers 
> and backends, so as we never leak any information about the backend
> types involved when a role has no permission to issue the signal.
> Perhaps that's the most intuitive thing as well, because autovacuum
> workers are backends. One thing that we could do is to mention both
> pg_signal_backend and pg_signal_autovacuum in the errdetail, and have
> both cases be handled by SIGNAL_BACKEND_NOPERMISSION on failure.

I understand your concern that we should avoid exposing the fact that the 
backend which the user is attempting to terminate is an AV worker unless the 
user has pg_signal_backend privileges and pg_signal_autovacuum privileges. 
But Im not following how we can re-use SIGNAL_BACKEND_NOPERMISSION for this. If 
we return SIGNAL_BACKEND_NOPERMISSION here as the following, it'll stay return 
the "permission denied to terminate / cancel query" errmsg and errdetail in 
pg_cancel/terminate_backend.

/*
 * If the backend is autovacuum worker, allow user with the privileges 
of
 * pg_signal_autovacuum role to signal the backend.
 */
if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == 
B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
return SIGNAL_BACKEND_NOPERMISSION;
}

Are you suggesting that we check if the backend is B_AUTOVAC in pg_cancel/ 
terminate_backend? That seems a bit unclean to me since pg_cancel_backend & 
pg_cancel_backend does not access to the procNumber to check the type of the 
backend.

IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the errmsg / 
errdetail to not expose that the backend is an AV worker. It'll also be helpful 
if you can suggest what errdetail we should use here.

Thanks
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com






Re: Allow non-superuser to cancel superuser tasks.

2024-04-07 Thread Michael Paquier
On Fri, Apr 05, 2024 at 08:07:51PM -0500, Nathan Bossart wrote:
> On Sat, Apr 06, 2024 at 08:56:04AM +0900, Michael Paquier wrote:
>> There is pg_read_all_stats as well, so I don't see a big issue in
>> requiring to be a member of this role as well for the sake of what's
>> proposing here.
> 
> Well, that tells you quite a bit more than just which PIDs correspond to
> autovacuum workers, but maybe that's good enough for now.

That may be a good initial compromise, for now.

>> I'd rather not leak any information at the end for
>> anybody calling pg_signal_backend without access to the stats, so
>> checking the backend type after the role sounds kind of a safer
>> long-term approach for me.
> 
> I'm not following what you mean by this.  Are you suggesting that we should
> keep the existing superuser message for the autovacuum workers?

Mostly.  Just to be clear the patch has the following problem:
=# CREATE ROLE popo LOGIN;
CREATE ROLE
=# CREATE EXTENSION injection_points;
CREATE EXTENSION
=# select injection_points_attach('autovacuum-start', 'wait');
 injection_points_attach
-

(1 row)
=# select pid, backend_type from pg_stat_activity
 where wait_event = 'autovacuum-start' LIMIT 1;
  pid  |   backend_type
---+---
 14163 | autovacuum worker
(1 row)
=> \c postgres popo
You are now connected to database "postgres" as user "popo". 
=> select pg_terminate_backend(14163);
ERROR:  42501: permission denied to terminate autovacuum worker backend
DETAIL:  Only roles with the SUPERUSER attribute or with privileges of
the "pg_signal_autovacuum" role may terminate autovacuum worker
backend
LOCATION:  pg_terminate_backend, signalfuncs.c:267 
=> select backend_type from pg_stat_activity where pid = 14163;
 backend_type
--
 null
(1 row)

And we should try to reshape things so as we get an ERROR like
"permission denied to terminate process" or "permission denied to
cancel query" for all the error paths, including autovacuum workers 
and backends, so as we never leak any information about the backend
types involved when a role has no permission to issue the signal.
Perhaps that's the most intuitive thing as well, because autovacuum
workers are backends.  One thing that we could do is to mention both
pg_signal_backend and pg_signal_autovacuum in the errdetail, and have
both cases be handled by SIGNAL_BACKEND_NOPERMISSION on failure.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-05 Thread Nathan Bossart
On Sat, Apr 06, 2024 at 08:56:04AM +0900, Michael Paquier wrote:
> There is pg_read_all_stats as well, so I don't see a big issue in
> requiring to be a member of this role as well for the sake of what's
> proposing here.

Well, that tells you quite a bit more than just which PIDs correspond to
autovacuum workers, but maybe that's good enough for now.

> I'd rather not leak any information at the end for
> anybody calling pg_signal_backend without access to the stats, so
> checking the backend type after the role sounds kind of a safer
> long-term approach for me.

I'm not following what you mean by this.  Are you suggesting that we should
keep the existing superuser message for the autovacuum workers?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-05 Thread Michael Paquier
On Fri, Apr 05, 2024 at 07:56:56AM -0500, Nathan Bossart wrote:
> On Fri, Apr 05, 2024 at 02:39:05PM +0900, Michael Paquier wrote:
>> One thing that we should definitely not do is letting any user calling
>> pg_signal_backend() know that a given PID maps to an autovacuum
>> worker.  This information is hidden in pg_stat_activity.  And
>> actually, doesn't the patch leak this information to all users when
>> calling pg_signal_backend with random PID numbers because of the fact
>> that SIGNAL_BACKEND_NOAUTOVACUUM exists?  Any role could guess which
>> PIDs are used by an autovacuum worker because of the granularity
>> required for the error related to pg_signal_autovacuum.
> 
> Hm.  I hadn't considered that angle.  IIUC right now they'll just get the
> generic superuser error for autovacuum workers.  I don't know how concerned
> to be about users distinguishing autovacuum workers from other superuser
> backends, _but_ if roles with pg_signal_autovacuum can't even figure out
> the PIDs for the autovacuum workers, then this feature seems kind-of
> useless.  Perhaps we should allow roles with privileges of
> pg_signal_autovacuum to see the autovacuum workers in pg_stat_activity.

There is pg_read_all_stats as well, so I don't see a big issue in
requiring to be a member of this role as well for the sake of what's
proposing here.  I'd rather not leak any information at the end for
anybody calling pg_signal_backend without access to the stats, so
checking the backend type after the role sounds kind of a safer
long-term approach for me.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-05 Thread Nathan Bossart
On Fri, Apr 05, 2024 at 02:39:05PM +0900, Michael Paquier wrote:
> + /*
> +  * If the backend is autovacuum worker, allow user with the privileges 
> of
> +  * pg_signal_autovacuum role to signal the backend.
> +  */
> + if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == 
> B_AUTOVAC_WORKER)
> + {
> + if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
> + return SIGNAL_BACKEND_NOAUTOVACUUM;
> + }
> 
> I was wondering why this is not done after we've checked that we have
> a superuser-owned backend, and this is giving me a pause.  @Nathan,
> why do you think we should not rely on the roleId for an autovacuum
> worker?  In core, do_autovacuum() is only called in a process without
> a role specified, and I've noticed your remark here:
> https://www.postgresql.org/message-id/20240401202146.GA2354284@nathanxps13
> It's feeling more natural here to check that we have a superuser-owned
> backend first, and then do a lookup of the process type.

I figured since there's no reason to rely on that behavior, we might as
well do a bit of future-proofing in case autovacuum workers are ever not
run as InvalidOid.  It'd be easy enough to fix this code if that ever
happened, so I'm not too worried about this.

> One thing that we should definitely not do is letting any user calling
> pg_signal_backend() know that a given PID maps to an autovacuum
> worker.  This information is hidden in pg_stat_activity.  And
> actually, doesn't the patch leak this information to all users when
> calling pg_signal_backend with random PID numbers because of the fact
> that SIGNAL_BACKEND_NOAUTOVACUUM exists?  Any role could guess which
> PIDs are used by an autovacuum worker because of the granularity
> required for the error related to pg_signal_autovacuum.

Hm.  I hadn't considered that angle.  IIUC right now they'll just get the
generic superuser error for autovacuum workers.  I don't know how concerned
to be about users distinguishing autovacuum workers from other superuser
backends, _but_ if roles with pg_signal_autovacuum can't even figure out
the PIDs for the autovacuum workers, then this feature seems kind-of
useless.  Perhaps we should allow roles with privileges of
pg_signal_autovacuum to see the autovacuum workers in pg_stat_activity.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Michael Paquier
On Fri, Apr 05, 2024 at 12:03:05AM +, Leung, Anthony wrote:
> Adding tap test for pg_signal_autovacuum using injection points as a
> separate patch. I also made a minor change on the original patch.

+   ret = pgstat_get_beentry_by_proc_number(procNumber);
+
+   if (!ret)
+   return false;

An invalid BackendType is not false, but B_INVALID.

+{ oid => '6312', oid_symbol => 'ROLE_PG_SIGNAL_AUTOVACUUM',

OIDs in patches under development should use a value in the range
8000-.  Newly-assigned OIDs are renumbered after the feature
freeze.

+   /*
+* If the backend is autovacuum worker, allow user with the privileges 
of
+* pg_signal_autovacuum role to signal the backend.
+*/
+   if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == 
B_AUTOVAC_WORKER)
+   {
+   if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
+   return SIGNAL_BACKEND_NOAUTOVACUUM;
+   }

I was wondering why this is not done after we've checked that we have
a superuser-owned backend, and this is giving me a pause.  @Nathan,
why do you think we should not rely on the roleId for an autovacuum
worker?  In core, do_autovacuum() is only called in a process without
a role specified, and I've noticed your remark here:
https://www.postgresql.org/message-id/20240401202146.GA2354284@nathanxps13
It's feeling more natural here to check that we have a superuser-owned
backend first, and then do a lookup of the process type.

One thing that we should definitely not do is letting any user calling
pg_signal_backend() know that a given PID maps to an autovacuum
worker.  This information is hidden in pg_stat_activity.  And
actually, doesn't the patch leak this information to all users when
calling pg_signal_backend with random PID numbers because of the fact
that SIGNAL_BACKEND_NOAUTOVACUUM exists?  Any role could guess which
PIDs are used by an autovacuum worker because of the granularity
required for the error related to pg_signal_autovacuum.

+   INJECTION_POINT("autovacuum-start");
Perhaps autovacuum-worker-start is more suited here.  I am not sure
that the beginning of do_autovacuum() is the optimal location, as what
matters is that we've done InitPostgres() to be able to grab the PID
from pg_stat_activity.  This location does the job.

+if ($ENV{enable_injection_points} ne 'yes')
+{
+   plan skip_all => 'Injection points not supported by this build';
+}
[...]
+$node->safe_psql('postgres',
+   "SELECT injection_points_attach('autovacuum-start', 'wait');");
[...]
+# Wait until the autovacuum worker starts
+$node->wait_for_event('autovacuum worker', 'autovacuum-start');

This integration with injection points looks correct to me.

+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
[...]
+# Copyright (c) 2024-2024, PostgreSQL Global Development Group

These need to be cleaned up.

+# Makefile for src/test/recovery
+#
+# src/test/recovery/Makefile

This is incorrect, twice.  No problems for me with using a new path in
src/test/ for that kind of tests.  There are no similar locations.

+INSERT INTO tab_int SELECT * FROM generate_series(1, 100);
A good chunk of the test would be spent on that, but you don't need
that many tuples to trigger an autovacuum worker as the short naptime
is able to do it.  I would recommend to reduce that to a minimum.

+# User with signal_backend_role cannot terminate autovacuum worker

Not sure that there is a huge point in checking after a role that
holds pg_signal_backend.  An autovacuum worker is not a backend.  Only
the case of a role not member of pg_signal_autovacuum should be
enough.

+# Test signaling for pg_signal_autovacuum role. 

This needs a better documentation: the purpose of the test is to
signal an autovacuum worker, aka it uses an injection point to ensure
that the worker for the whole duration of the test.

It seems to me that it would be a better practice to wakeup the
injection point and detach it before being done with the worker.
That's not mandatory but it would encourage the correct flow if this
code is copy-pasted around to other tests.

+like($psql_err, qr/ERROR:  permission denied to terminate ...

Checking only the ERRROR, and not the DETAIL should be sufficient
here.

+# User with pg_signal_backend can terminate autovacuum worker 
+my $terminate_with_pg_signal_av = $node->psql('postgres', qq( 
+SET ROLE signal_autovacuum_role;
+SELECT pg_terminate_backend($av_pid);
+), stdout => \$psql_out, stderr => \$psql_err);
+
+ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should 
succeed with pg_signal_autovacuum role");

Is that enough for the validation?  How about checking some pattern in
the server logs from an offset before running this last query?
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Andrey M. Borodin



> On 5 Apr 2024, at 05:03, Leung, Anthony  wrote:
> 
> Adding tap test for pg_signal_autovacuum using injection points as a separate 
> patch. I also made a minor change on the original patch.

The test looks good, but:
1. remove references to passcheck :)
2. detach injection point when it's not needed anymore

Thanks!


Best regards, Andrey Borodin.



Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Leung, Anthony
Adding tap test for pg_signal_autovacuum using injection points as a separate 
patch. I also made a minor change on the original patch.

Thanks.

--
Anthony 
Amazon Web Services: https://aws.amazon.com



v3-0001-Introduce-pg_signal_autovacuum-role-to-signal-autovacuum-worker.patch
Description: v3-0001-Introduce-pg_signal_autovacuum-role-to-signal-autovacuum-worker.patch


v3-0002-Add-tap-test-for-pg-signal-autovacuum-role.patch
Description: v3-0002-Add-tap-test-for-pg-signal-autovacuum-role.patch


Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Leung, Anthony
I made some updates based on the feedbacks in v2. This patch only contains the 
code change for allowing the signaling to av worker with pg_signal_autovacuum. 
I will send a separate patch for the tap test shortly.

Thanks

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com 



v2-0001-Introduce-pg_signal_autovacuum-role-to-cancel-or-termiante-autovacuum-worker.patch
Description: v2-0001-Introduce-pg_signal_autovacuum-role-to-cancel-or-termiante-autovacuum-worker.patch


Re: Allow non-superuser to cancel superuser tasks.

2024-04-04 Thread Nathan Bossart
On Thu, Apr 04, 2024 at 12:30:51AM +, Leung, Anthony wrote:
>>  if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
>>  !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
>>  return SIGNAL_BACKEND_NOAUTOVACUUM;
> 
> I tried to add them above the existing code. When I test it locally, a
> user without pg_signal_autovacuum will actually fail at this block
> because the user is not superuser and !OidIsValid(proc->roleId) is also
> true in the following:

Good catch.

> This is what Im planning to do - If the backend is autovacuum worker and
> the user is not superuser or has pg_signal_autovacuum role, we return the
> new value and provide the relevant error message
> 
>   /*
>* If the backend is autovacuum worker, allow user with privileges of 
> the 
>* pg_signal_autovacuum role to signal the backend.
>*/
>   if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
>   {
>   if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) 
> || !superuser())
>   return SIGNAL_BACKEND_NOAUTOVACUUM;
>   }
>   /*
>* Only allow superusers to signal superuser-owned backends.  Any 
> process
>* not advertising a role might have the importance of a superuser-owned
>* backend, so treat it that way.
>   */
>   else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
>!superuser())
>   {
>   return SIGNAL_BACKEND_NOSUPERUSER;
>   }
>   /* Users can signal backends they have role membership in. */
>   else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
>!has_privs_of_role(GetUserId(), 
> ROLE_PG_SIGNAL_BACKEND))
>   {
>   return SIGNAL_BACKEND_NOPERMISSION;
>   }

There's no need for the explicit superuser() check in the
pg_signal_autovacuum section.  That's built into has_privs_of_role()
already.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-03 Thread Michael Paquier
On Tue, Apr 02, 2024 at 04:35:28PM +0500, Andrey M. Borodin wrote:
> We can add tests just like [0] with injection points.
> I mean replace that "sleep 1" with something like
> "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
> Currently we have no infrastructure to wait for autovacuum of
> particular table, but I think it's doable.
> Also I do not like that test is changing system-wide autovac
> settings, AFAIR these settings can be set for particular table.

Yeah, hardcoded sleeps are not really acceptable.  On fast machines
they eat in global runtime making the whole slower, impacting the CI.
On slow machines, that's not going to be stable and we have a lot of
buildfarm animals starved on CPU, like the ones running valgrind or
just because their environment is slow (one of my animals runs on a
RPI, for example).  Note that slow machines have a lot of value
because they're usually better at catching race conditions.  Injection
points would indeed make the tests more deterministic by controlling
the waits and wakeups you'd like to have in the patch's tests.

eeefd4280f6e would be a good example of how to implement a test.
--
Michael


signature.asc
Description: PGP signature


Re: Allow non-superuser to cancel superuser tasks.

2024-04-03 Thread Leung, Anthony
Update - the condition should be && 

if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) 
&& !superuser())
return SIGNAL_BACKEND_NOAUTOVACUUM;
}

Thanks
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com



Re: Allow non-superuser to cancel superuser tasks.

2024-04-03 Thread Leung, Anthony
> I don't think we should rely on !OidIsValid(proc->roleId) for signaling
> autovacuum workers.  That might not always be true, and I don't see any
> need to rely on that otherwise.  IMHO we should just add a few lines before
> the existing code, which doesn't need to be changed at all:
> 
>   if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
>   !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
>   return SIGNAL_BACKEND_NOAUTOVACUUM;

I tried to add them above the existing code. When I test it locally, a user 
without pg_signal_autovacuum will actually fail at this block because the user 
is not superuser and !OidIsValid(proc->roleId) is also true in the following:

/*
 * Only allow superusers to signal superuser-owned backends.  Any 
process
 * not advertising a role might have the importance of a superuser-owned
 * backend, so treat it that way.
 */
if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
!superuser())
return SIGNAL_BACKEND_NOSUPERUSER;

This is what Im planning to do - If the backend is autovacuum worker and the 
user is not superuser or has pg_signal_autovacuum role, we return the new value 
and provide the relevant error message 

  /*
 * If the backend is autovacuum worker, allow user with privileges of 
the 
   * pg_signal_autovacuum role to signal the backend.
 */
if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) 
|| !superuser())
return SIGNAL_BACKEND_NOAUTOVACUUM;
}
/*
 * Only allow superusers to signal superuser-owned backends.  Any 
process
 * not advertising a role might have the importance of a superuser-owned
 * backend, so treat it that way.
*/
else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
 !superuser())
{
return SIGNAL_BACKEND_NOSUPERUSER;
}
/* Users can signal backends they have role membership in. */
else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
 !has_privs_of_role(GetUserId(), 
ROLE_PG_SIGNAL_BACKEND))
{
return SIGNAL_BACKEND_NOPERMISSION;
}


> We can add tests just like [0] with injection points.
> I mean replace that "sleep 1" with something like 
> "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
> Currently we have no infrastructure to wait for autovacuum of particular 
> table, but I think it's doable.
> Also I do not like that test is changing system-wide autovac settings, AFAIR 
> these settings can be set for particular table.

Thanks for the suggestion. I will take a look at this. Let me also separate the 
test into a different patch file.

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com






Re: Allow non-superuser to cancel superuser tasks.

2024-04-02 Thread Andrey M. Borodin



> On 2 Apr 2024, at 01:21, Nathan Bossart  wrote:
> 
> I haven't looked too closely, but I'm pretty skeptical that the test suite
> in your patch would be stable.  Unfortunately, I don't have any better
> ideas at the moment besides not adding a test for this new role.

We can add tests just like [0] with injection points.
I mean replace that "sleep 1" with something like 
"$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of particular table, 
but I think it's doable.
Also I do not like that test is changing system-wide autovac settings, AFAIR 
these settings can be set for particular table.

Thanks!


Best regards, Andrey Borodin.

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eeefd4280f6



Re: Allow non-superuser to cancel superuser tasks.

2024-04-01 Thread Nathan Bossart
On Mon, Apr 01, 2024 at 02:29:29PM +, Leung, Anthony wrote:
> I've attached the updated patch with some improvements.

Thanks!

+  
+   pg_signal_autovacuum
+   Allow terminating backend running autovacuum
+  

I think we should be more precise here by calling out the exact types of
workers:

"Allow signaling autovacuum worker processes to..."

-if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
-!superuser())
-return SIGNAL_BACKEND_NOSUPERUSER;
-
-/* Users can signal backends they have role membership in. */
-if (!has_privs_of_role(GetUserId(), proc->roleId) &&
-!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
-return SIGNAL_BACKEND_NOPERMISSION;
+if (!superuser())
+{
+if (!OidIsValid(proc->roleId))
+{
+/*
+ * We only allow user with pg_signal_autovacuum role to terminate
+ * autovacuum worker as an exception. 
+ */
+if (!(pg_stat_is_backend_autovac_worker(proc->backendId) &&
+has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM)))
+return SIGNAL_BACKEND_NOSUPERUSER;
+}
+else
+{
+if (superuser_arg(proc->roleId))
+return SIGNAL_BACKEND_NOSUPERUSER;
+
+/* Users can signal backends they have role membership in. */
+if (!has_privs_of_role(GetUserId(), proc->roleId) &&
+!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
+return SIGNAL_BACKEND_NOPERMISSION;
+}
+}

I don't think we should rely on !OidIsValid(proc->roleId) for signaling
autovacuum workers.  That might not always be true, and I don't see any
need to rely on that otherwise.  IMHO we should just add a few lines before
the existing code, which doesn't need to be changed at all:

if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
return SIGNAL_BACKEND_NOAUTOVACUUM;

I also think we need to return something besides SIGNAL_BACKEND_NOSUPERUSER
in this case.  Specifically, we probably need to introduce a new value and
provide the relevant error messages in pg_cancel_backend() and
pg_terminate_backend().

+/* --
+ * pg_stat_is_backend_autovac_worker() -
+ *
+ * Return whether the backend of the given backend id is of type autovacuum 
worker.
+ */
+bool
+pg_stat_is_backend_autovac_worker(BackendId beid)
+{
+PgBackendStatus *ret;
+
+Assert(beid != InvalidBackendId);
+
+ret = pgstat_get_beentry_by_backend_id(beid);
+
+if (!ret)
+return false;
+
+return ret->st_backendType == B_AUTOVAC_WORKER;
+}

Can we have this function return the backend type so that we don't have to
create a new function for every possible type?  That might be handy in the
future.

I haven't looked too closely, but I'm pretty skeptical that the test suite
in your patch would be stable.  Unfortunately, I don't have any better
ideas at the moment besides not adding a test for this new role.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-04-01 Thread Leung, Anthony
I took the liberty of continuing to work on this after chatting with Nathan.

I've attached the updated patch with some improvements.

Thanks.

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com







v1-0001-Introduce-pg_signal_autovacuum-role-to-allow-non-sup.patch
Description: v1-0001-Introduce-pg_signal_autovacuum-role-to-allow-non-sup.patch


Re: Allow non-superuser to cancel superuser tasks.

2024-03-09 Thread Leung, Anthony
Another comment that I forgot to mention is that we should also make the 
documentation change in doc/src/sgml/user-manag.sgml for this new predefined 
role

Thanks.

-- 
Anthony Leung
Amazon Web Services: https://aws.amazon.com



Re: Allow non-superuser to cancel superuser tasks.

2024-03-09 Thread Leung, Anthony
Hi, 

I'm new to reviewing postgres patches, but I took an interest in reviewing this 
patch as recommended by Nathan.

I have the following comments:

>   if (!superuser()) {
>   if (!OidIsValid(proc->roleId)) {
>   LocalPgBackendStatus *local_beentry;
>   local_beentry = 
> pgstat_get_local_beentry_by_backend_id(proc->backendId);
>
>   if (!(local_beentry && 
> local_beentry->backendStatus.st_backendType == B_AUTOVAC_WORKER && 
>   has_privs_of_role(GetUserId(), 
> ROLE_PG_SIGNAL_AUTOVACUUM)))
>   return SIGNAL_BACKEND_NOSUPERUSER;
>   } else {
>   if (superuser_arg(proc->roleId))
>   return SIGNAL_BACKEND_NOSUPERUSER;
>
>   /* Users can signal backends they have role membership 
> in. */
>   if (!has_privs_of_role(GetUserId(), proc->roleId) &&
>   !has_privs_of_role(GetUserId(), 
> ROLE_PG_SIGNAL_BACKEND))
>   return SIGNAL_BACKEND_NOPERMISSION;
>   }
>   }
>
1. I would suggest not to do nested if blocks since it's becoming harder to 
read. Also, does it make sense to have a utilities function in backend_status.c 
to check if a backend of a given backend id is of a certain backend_type. And 
should we check if proc->backendId is invalid?

>ALTER SYSTEM SET autovacuum_vacuum_cost_limit TO 1;
>ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO 100;
>ALTER SYSTEM SET autovacuum_naptime TO 1;
2. Could we set these parameters at the beginning of the test before 
$node->start with $node->append_conf ? That way we can avoid restarting the 
node and doing the sleep later on.

> my $res_pid = $node_primary->safe_psql(
>.   'regress',
>   "SELECT pid FROM pg_stat_activity WHERE backend_type = 'autovacuum 
> worker' and datname = 'regress';"
> );
>
> my ($res_reg_psa_1, $stdout_reg_psa_1, $stderr_reg_psa_1)  = 
> $node_primary->psql('regress', qq[
 SET ROLE psa_reg_role_1;
>   SELECT pg_terminate_backend($res_pid);
>  ]);
>
> ok($res_reg_psa_1 != 0, "should fail for non pg_signal_autovacuum");
> like($stderr_reg_psa_1, qr/Only roles with the SUPERUSER attribute may 
> terminate processes of roles with the SUPERUSER attribute./, "matches");
>
> my ($res_reg_psa_2, $stdout_reg_psa_2, $stderr_reg_psa_2) = 
> $node_primary->psql('regress', qq[
>SET ROLE psa_reg_role_2;
>SELECT pg_terminate_backend($res_pid);
> ]");
3. Some nits on styling 

4. According to Postgres styles, I believe open brackets should be in a new 
line 



Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Nathan Bossart
On Tue, Feb 27, 2024 at 11:59:00PM +0500, Kirill Reshke wrote:
> Do we need to test the pg_cancel_backend vs autovacuum case at all?
> I think we do. Would it be better to split work into 2 patches: first one
> with tests against current logic, and second
> one with some changes/enhancements which allows to cancel running autovac
> to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?

If we need to add tests for pg_signal_backend, I think it's reasonable to
keep those in a separate patch from pg_signal_autovacuum.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Nathan Bossart
On Tue, Feb 27, 2024 at 01:22:31AM +0500, Kirill Reshke wrote:
> Also, tap tests for functionality added. I'm not sure where to place them,
> so I placed them in a separate directory in `src/test/`
> Seems that regression tests for this feature are not possible, am i right?

It might be difficult to create reliable tests for pg_signal_autovacuum.
If we can, it would probably be easiest to do with a TAP test.

> Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
> Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or
> should this role have such little scope...

-1.  I don't see why giving a role privileges of pg_signal_autovacuum
should also give them the ability to signal all other non-superuser
backends.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Kirill Reshke
On Tue, 27 Feb 2024 at 01:22, Kirill Reshke  wrote:

>
>
> On Mon, 26 Feb 2024 at 20:10, Nathan Bossart 
> wrote:
>
>> On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
>> > I see 2 possible ways to implement this. The first one is to have hool
>> in
>> > pg_signal_backend, and define a hook in extension which can do the
>> thing.
>> > The second one is to have a predefined role. Something like a
>> > `pg_signal_autovacuum` role which can signal running autovac to cancel.
>> But
>> > I don't see how we can handle specific `application_name` with this
>> > solution.
>>
>> pg_signal_autovacuum seems useful given commit 3a9b18b.
>>
>> --
>> Nathan Bossart
>> Amazon Web Services: https://aws.amazon.com
>
>
> Thank you for your response.
> Please find a patch attached.
>
> In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid
> from unused_oids script output.
> Also, tap tests for functionality added. I'm not sure where to place them,
> so I placed them in a separate directory in `src/test/`
> Seems that regression tests for this feature are not possible, am i right?
> Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
> Should pg_signal_autovacuum have power of pg_signal_backend (implicity)?
> Or should this role have such little scope...
>
> Have a little thought on this, will share.
Do we need to test the pg_cancel_backend vs autovacuum case at all?
I think we do. Would it be better to split work into 2 patches: first one
with tests against current logic, and second
one with some changes/enhancements which allows to cancel running autovac
to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?


Re: Allow non-superuser to cancel superuser tasks.

2024-02-26 Thread Kirill Reshke
On Mon, 26 Feb 2024 at 20:10, Nathan Bossart 
wrote:

> On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
> > I see 2 possible ways to implement this. The first one is to have hool in
> > pg_signal_backend, and define a hook in extension which can do the thing.
> > The second one is to have a predefined role. Something like a
> > `pg_signal_autovacuum` role which can signal running autovac to cancel.
> But
> > I don't see how we can handle specific `application_name` with this
> > solution.
>
> pg_signal_autovacuum seems useful given commit 3a9b18b.
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com


Thank you for your response.
Please find a patch attached.

In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid from
unused_oids script output.
Also, tap tests for functionality added. I'm not sure where to place them,
so I placed them in a separate directory in `src/test/`
Seems that regression tests for this feature are not possible, am i right?
Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or
should this role have such little scope...


v1-0001-Allow-non-superuser-to-cancel-superuser-tasks.patch
Description: Binary data


Re: Allow non-superuser to cancel superuser tasks.

2024-02-26 Thread Nathan Bossart
On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
> I see 2 possible ways to implement this. The first one is to have hool in
> pg_signal_backend, and define a hook in extension which can do the thing.
> The second one is to have a predefined role. Something like a
> `pg_signal_autovacuum` role which can signal running autovac to cancel. But
> I don't see how we can handle specific `application_name` with this
> solution.

pg_signal_autovacuum seems useful given commit 3a9b18b.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Allow non-superuser to cancel superuser tasks.

2024-02-25 Thread Kirill Reshke
Hi hackers!

In our Cloud we have a patch, which allows non-superuser role ('mdb_admin')
to do some superuser things.
In particular, we have a patch that allows mdb admin to cancel the
autovacuum process and some other processes (processes with
application_name = 'MDB'), see the attachment.
This is needed to allow non-superuser roles to run pg_repack and to cancel
pg_repack.
We need to cancel running autovac to run pg_repack (because of locks), and
we need to cancel pg_repack sometimes also.

I want to reduce our internal patch size and transfer this logic to
extension or to core.
I have found similar threads [1] and [2], but, as far as I understand, they
do not solve this particular case.
I see 2 possible ways to implement this. The first one is to have hool in
pg_signal_backend, and define a hook in extension which can do the thing.
The second one is to have a predefined role. Something like a
`pg_signal_autovacuum` role which can signal running autovac to cancel. But
I don't see how we can handle specific `application_name` with this
solution.

[1]
https://www.postgresql.org/message-id/flat/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC%40enterprisedb.com
[2]
https://www.postgresql.org/message-id/flat/20220722203735.GB3996698%40nathanxps13


v1-0001-cloud-mdb_admin-patch-part-to-illustrate.patch
Description: Binary data