Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 The difference between the autovacuum-run vacuum and the cron-run vacuum
 is that the one running out of cron will just keep holding the lock
 until it's actually able to truncate the end of the relation, no?  I
 recall discussion previously that we need a way to either support that
 in autovacuum for (a configurable set of) regular relations or come up
 with a solution that doesn't require that lock.

 AFAICT, in master, there is no difference in truncation between auto and
 manual vacuum. What we do is attempt to acquire the truncation lock for up
 to 5 seconds, giving up after that. Once we do have the lock, we check to
 see how many pages we can actually truncate. During that check, we test
 every ~20ms or so to see if someone else is waiting on our exclusive lock;
 if they are we stop counting and will only truncate the relation up to that
 point.

I don't think this is true, and I don't think it's been true for a
long time, if ever.  The difference between a manual vacuum and
autovacuum is that autovacuum commits suicide when it conflicts with
somebody else's lock request, and a manual vacuum doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Jim Nasby

On 1/28/15 7:45 PM, Stephen Frost wrote:

Jim,

* Jim Nasby (jim.na...@bluetreble.com) wrote:

On 12/23/14 12:52 PM, Stephen Frost wrote:

Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there *are* such cases, but getting more
information about those cases and exactly what solution *does* work
would really help us improve autovacuum to address those use-cases.


(going through some old email...)

The two cases I've dealt with recently are:

- Tables with a fair update/delete rate that should always stay small

The problem with these tables is that if anything happens to upset vacuuming, you can 
end up with a significantly larger than expected table that's now essentially 
impossible to shrink. This could be caused by a single long-running transaction 
that happens to be in play when autovac kicks off, or for other reasons. Even 
once you manage to get all the tuples off the end of the heap it can still be 
extremely difficult to grab the lock you need to truncate it. Running a vacuum 
every minute from cron seems to help control this. Sadly, your indexes still 
get bloated, so occasionally you want to re-cluster too.


The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.


AFAICT, in master, there is no difference in truncation between auto and manual 
vacuum. What we do is attempt to acquire the truncation lock for up to 5 
seconds, giving up after that. Once we do have the lock, we check to see how 
many pages we can actually truncate. During that check, we test every ~20ms or 
so to see if someone else is waiting on our exclusive lock; if they are we stop 
counting and will only truncate the relation up to that point.

So what this boils down to is that it's very hard to truncate a busy relation, 
and your best bet is simply to keep retrying.
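The behavior described above can be modelled roughly as follows. This is an illustrative Python sketch, not the actual lazy_truncate_heap()/count_nondeletable_pages() C code; the lock and relation callbacks are invented stand-ins, and only the ~5 s lock-acquisition window and the early bail-out when someone is waiting on the lock are taken from the description above.

```python
import time

LOCK_ACQUIRE_WINDOW = 5.0   # give up on the exclusive lock after ~5 seconds
RETRY_INTERVAL = 0.02       # poll roughly every 20 ms

def try_truncate(rel, try_lock, someone_waiting, truncate_to):
    """Simplified model of VACUUM's truncation phase (illustrative only)."""
    # Phase 1: try to acquire the exclusive lock, giving up after ~5 s.
    deadline = time.monotonic() + LOCK_ACQUIRE_WINDOW
    while not try_lock(rel):
        if time.monotonic() >= deadline:
            return None          # give up; a later vacuum may succeed
        time.sleep(RETRY_INTERVAL)

    # Phase 2: count empty pages backwards from the end of the heap.
    # If someone starts waiting on our lock, stop counting and truncate
    # only as far as we got.
    keep = rel["pages"]
    for page in reversed(range(rel["pages"])):
        if not rel["empty"][page]:
            break
        keep = page
        if someone_waiting(rel):
            break
    truncate_to(rel, keep)
    return keep
```

With a quiet relation the empty tail is fully released; if a waiter shows up mid-count, only the pages counted so far are released, which is why repeated vacuums gradually shrink such a table.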


- Preemptively vacuuming during off-hours

Many sites have either nightly or weekend periods of reduced load. Such sites 
can gain a great benefit from scheduling preemptive vacuums to reduce the odds 
of disruptive vacuuming activity during heavy activity periods. This is 
especially true when it comes to a scan_all vacuum of a large table; having 
autovac do one of those at a peak period can really hose things.


Having preferable times for autovacuum to run vacuums would certainly
be nice to support this use-case.

All that said, I'm not against a role attribute which allows the user to
vacuum/analyze anything.  I do think that's a bit different from the
existing effort to reduce the activities which require superuser as with
the vacuum/analyze case you *could* have a single role that's a member
of every role that owns the relations which you want to vacuum/analyze.
I grant that it's a bit awkward though.


Yeah, I was mostly just providing some use cases. I'm not opposed to a separate 
vacuum/analyze permission, but don't see a huge need for it either. Typically I 
set this stuff up as a cron on the server itself, utilizing an account that 
does ident authentication. I figure if someone manages to compromise that then 
they probably have root on the box anyway, which is obviously game over.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Jim Nasby

On 1/29/15 4:02 PM, Robert Haas wrote:

On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby jim.na...@bluetreble.com wrote:

The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.


AFAICT, in master, there is no difference in truncation between auto and
manual vacuum. What we do is attempt to acquire the truncation lock for up
to 5 seconds, giving up after that. Once we do have the lock, we check to
see how many pages we can actually truncate. During that check, we test
every ~20ms or so to see if someone else is waiting on our exclusive lock;
if they are we stop counting and will only truncate the relation up to that
point.


I don't think this is true, and I don't think it's been true for a
long time, if ever.  The difference between a manual vacuum and
autovacuum is that autovacuum commits suicide when it conflicts with
somebody else's lock request, and a manual vacuum doesn't.


Any idea where we set that up? The call stack is (note I'm ignoring vacuum full) 
autovacuum_do_vac_analyze() -> vacuum() -> vacuum_rel() -> lazy_vacuum_rel() -> 
lazy_truncate_heap() (which also calls count_nondeletable_pages()), and I don't see any 
IsAutoVacuumWorkerProcess() calls in lazy_truncate_heap() or count_nondeletable_pages(). So 
AFAICT truncation operates the same regardless of how the vacuum was started. We also 
explicitly set things like statement_timeout to 0 in autovac, so I don't think that would 
be responsible for this...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 1/29/15 4:02 PM, Robert Haas wrote:
 I don't think this is true, and I don't think it's been true for a
 long time, if ever.  The difference between a manual vacuum and
 autovacuum is that autovacuum commits suicide when it conflicts with
 somebody else's lock request, and a manual vacuum doesn't.

 Any idea where we set that up?

The relevant logic is in the deadlock detector, which will cancel an
autovacuum transaction if it is blocking somebody else's lock request.

regards, tom lane




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-28 Thread Stephen Frost
Jim,

* Jim Nasby (jim.na...@bluetreble.com) wrote:
 On 12/23/14 12:52 PM, Stephen Frost wrote:
 Autovacuum can certainly run vacuum/analyze on a few tables every 12
 hours, so I'm not really following where you see autovacuum being unable
 to cope.  I agree that there *are* such cases, but getting more
 information about those cases and exactly what solution *does* work
 would really help us improve autovacuum to address those use-cases.
 
 (going through some old email...)
 
 The two cases I've dealt with recently are:
 
 - Tables with a fair update/delete rate that should always stay small
 
 The problem with these tables is that if anything happens to upset vacuuming, you 
 can end up with a significantly larger than expected table that's now 
 essentially impossible to shrink. This could be caused by a single 
 long-running transaction that happens to be in play when autovac kicks off, 
 or for other reasons. Even once you manage to get all the tuples off the end 
 of the heap it can still be extremely difficult to grab the lock you need to 
 truncate it. Running a vacuum every minute from cron seems to help control 
 this. Sadly, your indexes still get bloated, so occasionally you want to 
 re-cluster too.

The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.

 - Preemptively vacuuming during off-hours
 
 Many sites have either nightly or weekend periods of reduced load. Such sites 
 can gain a great benefit from scheduling preemptive vacuums to reduce the 
 odds of disruptive vacuuming activity during heavy activity periods. This is 
 especially true when it comes to a scan_all vacuum of a large table; having 
 autovac do one of those at a peak period can really hose things.

Having preferable times for autovacuum to run vacuums would certainly
be nice to support this use-case.

All that said, I'm not against a role attribute which allows the user to
vacuum/analyze anything.  I do think that's a bit different from the
existing effort to reduce the activities which require superuser as with
the vacuum/analyze case you *could* have a single role that's a member
of every role that owns the relations which you want to vacuum/analyze.
I grant that it's a bit awkward though.

Thanks,

Stephen




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-26 Thread Jim Nasby

On 12/23/14 12:52 PM, Stephen Frost wrote:

* José Luis Tallón (jltal...@adv-solutions.net) wrote:

On 12/23/2014 05:29 PM, Stephen Frost wrote:

  The capabilities would be:
 * MAINTENANCE --- Ability to run
  VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
  ANALYZE (including SET LOCAL statistics_target TO 1),

 There's likely to be discussion about these from the perspective that
 you really shouldn't need to run them all that much.  Why isn't
 autovacuum able to handle this?


For some (arguably ill-devised) use cases of INSERT -> SELECT
aggregate -> DELETE (third-party, closed-source app, massive insert
rate), at the very least, autovacuum can't possibly cope with the
change rate in some tables, given that there are many other
interactive queries running.

Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
tables every 12h or so fixes the performance problem for the
particular queries without impacting the other users too much ---
the tables and indexes in question have been moved to a separate
tablespace/disk volume of their own.

Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there *are* such cases, but getting more
information about those cases and exactly what solution *does* work
would really help us improve autovacuum to address those use-cases.


(going through some old email...)

The two cases I've dealt with recently are:

- Tables with a fair update/delete rate that should always stay small

The problem with these tables is that if anything happens to upset vacuuming, you can 
end up with a significantly larger than expected table that's now essentially 
impossible to shrink. This could be caused by a single long-running transaction 
that happens to be in play when autovac kicks off, or for other reasons. Even 
once you manage to get all the tuples off the end of the heap it can still be 
extremely difficult to grab the lock you need to truncate it. Running a vacuum 
every minute from cron seems to help control this. Sadly, your indexes still 
get bloated, so occasionally you want to re-cluster too.

- Preemptively vacuuming during off-hours

Many sites have either nightly or weekend periods of reduced load. Such sites 
can gain a great benefit from scheduling preemptive vacuums to reduce the odds 
of disruptive vacuuming activity during heavy activity periods. This is 
especially true when it comes to a scan_all vacuum of a large table; having 
autovac do one of those at a peak period can really hose things.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-25 Thread Robert Haas
On Tue, Dec 23, 2014 at 11:20 AM, José Luis Tallón
jltal...@adv-solutions.net wrote:
 I've found myself needing two role "capabilities" as of late, when
 thinking about restricting some roles to the bare minimum
 permissions needed to perform their duties ... as opposed to having a
 superuser role devoted to these tasks.

 The capabilities would be:
 * MAINTENANCE --- Ability to run
 VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
 ANALYZE (including SET LOCAL statistics_target TO 1),
 REINDEX CONCURRENTLY  (but not the blocking, regular, one)
 REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
 COPY ???

 Rationale: delegate the routine maintenance tasks to a low privilege
 role, which can't do harm (apart from some performance degradation) ---
 hence the "no exclusive locking operations" requirement.

I think the problem here is that, while almost everybody would
probably agree that something like this is useful, three hackers in a
room will have four or five different opinions on how to set the
boundaries around it.  I for example wouldn't feel too bad about
grouping VACUUM and ANALYZE under the same umbrella, but certainly
would be surprised to see all of the other stuff included.  But you've
got a different idea that is clearly valid, and somebody else might
want yet another thing.  We can avoid those problems by making the
capabilities finer-grained, but of course then you end up with lots
and lots of them, which is annoying too.

 * IMPERSONATE --- Ability to do SET AUTHORIZATION TO some_role; and RESET
 AUTHORIZATION
 This might be further refined to provide a way to say "This role is
 authorized to impersonate role1 but no other"

I can't see this providing any meaningful security improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread Stephen Frost
* José Luis Tallón (jltal...@adv-solutions.net) wrote:
 I've found myself needing two role "capabilities" as of late,
 when thinking about restricting some roles to the bare minimum
 permissions needed to perform their duties ... as opposed to
 having a superuser role devoted to these tasks.

Excellent.  We've been looking at the same considerations.

 The capabilities would be:
 * MAINTENANCE --- Ability to run
 VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
 ANALYZE (including SET LOCAL statistics_target TO 1),

There's likely to be discussion about these from the perspective that
you really shouldn't need to run them all that much.  Why isn't
autovacuum able to handle this?

 REINDEX CONCURRENTLY  (but not the blocking, regular, one)
 REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)

These are interesting, but would these make sense at the role level?
Both of these commands explicitly take specific relations to operate
against, after all.

 COPY ???

The question around this one goes back to the CREATE DIRECTORY
discussion that happened this fall.  I'm still hopeful that we can do
*something* in this area, but I'm not sure what that's going to end up
looking like.  The problem with COPY is that it's either trivial to use
it to become a superuser, or insanely difficult to secure sufficiently.

 Rationale: delegate the routine maintenance tasks to a low
 privilege role, which can't do harm (apart from some performance
 degradation) --- hence the "no exclusive locking operations"
 requirement.

This makes sense for the reindex/refresh cases, though "no harm" might
be over-stating it.

 * IMPERSONATE --- Ability to do SET AUTHORIZATION TO some_role;
 and RESET AUTHORIZATION
  This might be further refined to provide a way to say "This role
 is authorized to impersonate role1 but no other"
 Rationale: for use by connection poolers (esp. pgBouncer), where
 the role used for connection would only have the LOGIN and
 IMPERSONATE privileges. The remaining operations would be authorized
 against the supplanted role (i.e. ability to create tables/indexes
 or views, perform DML and/or DDL, etc)
 AFAIK, a superuser role is needed for this purpose currently.

No..  You can have 'no-inherit' roles which you can use for exactly this
purpose.  The initial login role can have no rights on the database,
except to SET ROLE to other roles which have been granted to it.

You should never have your pgBouncer or other pooling connection logging
in as a superuser.

Thanks,

Stephen




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread José Luis Tallón

On 12/23/2014 05:29 PM, Stephen Frost wrote:

* José Luis Tallón (jltal...@adv-solutions.net) wrote:

  I've found myself needing two role "capabilities" as of late,
when thinking about restricting some roles to the bare minimum
permissions needed to perform their duties ... as opposed to
having a superuser role devoted to these tasks.

Excellent.  We've been looking at the same considerations.


 The capabilities would be:
* MAINTENANCE --- Ability to run
 VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
 ANALYZE (including SET LOCAL statistics_target TO 1),

There's likely to be discussion about these from the perspective that
you really shouldn't need to run them all that much.  Why isn't
autovacuum able to handle this?


For some (arguably ill-devised) use cases of INSERT -> SELECT aggregate 
-> DELETE (third-party, closed-source app, massive insert rate), at the 
very least, autovacuum can't possibly cope with the change rate in some 
tables, given that there are many other interactive queries running.


Manually performing VACUUM / VACUUM ANALYZE on the (few) affected tables 
every 12h or so fixes the performance problem for the particular queries 
without impacting the other users too much --- the tables and indexes in 
question have been moved to a separate tablespace/disk volume of their own.



In short, this addresses situations where some tables have a much higher 
update rate than the rest of the database, so that performance degrades 
over time --- the application became unusable after about 6 days' worth 
of updates until the manual vacuums were set up.



 REINDEX CONCURRENTLY  (but not the blocking, regular, one)
 REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)

These are interesting, but would these make sense at the role level?
Both of these commands explicitly take specific relations to operate
against, after all.


Yup. Let's imagine a cron job invoking psql in order to perform a 
maintenance routine.
The particular command(s) can be generated on the fly by querying the 
catalog and then sent in one go to be run sequentially by a single 
backend, as a crude form of rate limiting/quality-of-service of sorts 
(renice -p or even ionice -p seems quite inadequate).
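A minimal sketch of that cron-driven approach (Python; the catalog rows and relation names here are hypothetical, and a real script would obtain them by querying pg_class or pg_matviews through psql):

```python
def maintenance_commands(catalog_rows):
    """Build non-blocking maintenance statements from (schema, name, kind)
    tuples, to be fed to a single backend and run sequentially."""
    cmds = []
    for schema, name, kind in catalog_rows:
        target = f'"{schema}"."{name}"'
        if kind == "table":
            cmds.append(f"VACUUM (ANALYZE) {target};")
        elif kind == "matview":
            cmds.append(f"REFRESH MATERIALIZED VIEW CONCURRENTLY {target};")
    return cmds
```

Running the generated statements through one connection is what provides the crude serialization/rate limiting described above.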


This automation becomes impossible to do if the object owners differ 
(only the owner or a superuser can perform these operations AFAICS -- 
there is no mention of it in the current documentation) unless the DBA 
makes the maintenance role a member of every other role ... which 
quickly becomes a problem.
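The membership workaround mentioned above can of course be scripted, which also shows why it scales poorly; a hedged sketch with hypothetical role names:

```python
def membership_grants(table_owners, maintenance_role="maint"):
    """Emit one GRANT per distinct owning role so that `maintenance_role`
    can vacuum/reindex everything. Every new owner means another grant to
    keep track of, which is the burden discussed above."""
    return [f"GRANT {owner} TO {maintenance_role};"
            for owner in sorted(set(table_owners))
            if owner != maintenance_role]
```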



 COPY ???

The question around this one goes back to the CREATE DIRECTORY
discussion that happened this fall.  I'm still hopeful that we can do
*something* in this area, but I'm not sure what that's going to end up
looking like.  The problem with COPY is that it's either trivial to use
it to become a superuser, or insanely difficult to secure sufficiently.


Yes. That's the reason for the question marks  :-\
Some "dump to CSV, then load somewhere else" jobs might benefit 
from this feature, but I'm not sure the convenience is worth the risk.



 Rationale: delegate the routine maintenance tasks to a low
privilege role, which can't do harm (apart from some performance
degradation) --- hence the "no exclusive locking operations" 
requirement.

This makes sense for the reindex/refresh cases, though "no harm" might
be over-stating it.


Well, it's performance degradation vs. DoS due to massive (exclusive) 
locking  :S
At least restricting it to one backend (connection_limit=1) provides 
some rate limiting.



* IMPERSONATE --- Ability to do SET AUTHORIZATION TO some_role;
and RESET AUTHORIZATION
  This might be further refined to provide a way to say "This role
is authorized to impersonate role1 but no other"
 Rationale: for use by connection poolers (esp. pgBouncer), where
the role used for connection would only have the LOGIN and
IMPERSONATE privileges. The remaining operations would be authorized
against the supplanted role (i.e. ability to create tables/indexes
or views, perform DML and/or DDL, etc)
 AFAIK, a superuser role is needed for this purpose currently.

No..  You can have 'no-inherit' roles which you can use for exactly this
purpose.  The initial login role can have no rights on the database,
except to SET ROLE to other roles which have been granted to it.


Hmm the current documentation states that: "The specified role_name 
must be a role that the current session user is a member of."
I can see use cases where making the login role a member of every other 
used role quickly becomes a burden, and that's the main driver for this 
feature (I'm thinking about multiple app servers running several 
applications each, minimum two roles per application)



You should never have your pgBouncer or other pooling connection logging
in as a superuser.


At least the default pgBouncer config explicitly says (albeit for 8.2)

Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread David G Johnston
José Luis Tallón-2 wrote
 On 12/23/2014 05:29 PM, Stephen Frost wrote:
 * José Luis Tallón (jltallon@) wrote:
 * IMPERSONATE --- Ability to do SET AUTHORIZATION TO some_role;
 and RESET AUTHORIZATION
  This might be further refined to provide a way to say "This role
 is authorized to impersonate role1 but no other"
  Rationale: for use by connection poolers (esp. pgBouncer), where
 the role used for connection would only have the LOGIN and
 IMPERSONATE privileges. The remaining operations would be authorized
 against the supplanted role (i.e. ability to create tables/indexes
 or views, perform DML and/or DDL, etc)
  AFAIK, a superuser role is needed for this purpose currently.
 No..  You can have 'no-inherit' roles which you can use for exactly this
 purpose.  The initial login role can have no rights on the database,
 except to SET ROLE to other roles which have been granted to it.
 
 Hmm the current documentation states that: "The specified role_name 
 must be a role that the current session user is a member of."
 I can see use cases where making the login role a member of every other 
 used role quickly becomes a burden, and that's the main driver for this 
 feature (I'm thinking about multiple app servers running several 
 applications each, minimum two roles per application)

So you want to say:

GRANT IMPERSONATE TO bouncer; -- covers the "ALL" requirement

instead of

GRANT victim1 TO bouncer;
GRANT victim2 TO bouncer;
etc...

-- these would still be used to cover the "limited users" requirement
?

Seems contrary to the principle of least privilege goal...

I'd rather there be better, more user friendly, SQL-based APIs to the
permissions system that would facilitate performing and reviewing grants.

If something like IMPERSONATE was added I would strongly suggest a
corresponding [NO]IMPERSONATE for CREATE USER so that the admin can make
specific roles unimpersonable - and also make SUPERUSER roles unimpersonable
by rule.
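A sketch of the check being proposed here (illustrative Python; the field names and the set of IMPERSONATE grants are invented, since no such attribute exists):

```python
def may_impersonate(granted_targets, target):
    """Refuse impersonation of superusers by rule and of roles created
    NOIMPERSONATE; otherwise allow it only if this session holds the
    (hypothetical) IMPERSONATE grant on the target role."""
    if target["superuser"] or not target["impersonable"]:
        return False
    return target["name"] in granted_targets
```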

David J.








Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread José Luis Tallón

On 12/23/2014 07:01 PM, David G Johnston wrote:
Hmm the current documentation states that: "The specified 
role_name must be a role that the current session user is a member 
of." I can see use cases where making the login role a member of every 
other used role quickly becomes a burden, and that's the main driver 
for this feature (I'm thinking about multiple app servers running 
several applications each, minimum two roles per application)

So you want to say:

GRANT IMPERSONATE TO bouncer; --covers the ALL requirement


Yes, and exclusively for this purpose.


instead of

GRANT victim1 TO bouncer;
GRANT victim2 TO bouncer;
etc...

-- these would still be used to cover the limited users requirement
?


Yup.


Seems contrary to the principle of least privilege goal...


We still wouldn't grant any CREATE DATABASE, CREATE TABLESPACE, 
CREATE/LOAD EXTENSION, CREATE LANGUAGE, etc
 (and the ability to create/use/manipulate data within the database 
will still be constrained by the impersonated login)



I'd rather there be better, more user friendly, SQL-based APIs to the
permissions system that would facilitate performing and reviewing grants.

+1. All suggestions welcome.

If something like IMPERSONATE was added I would strongly suggest a
corresponding [NO]IMPERSONATE for CREATE USER so that the admin can make
specific roles unimpersonable

Indeed, I had thought about this too.


- and also make SUPERUSER roles unimpersonable by rule.


Yes, of course. Otherwise, the distinction would not make any sense.


Thanks,

J.L.





Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread Stephen Frost
* José Luis Tallón (jltal...@adv-solutions.net) wrote:
 On 12/23/2014 05:29 PM, Stephen Frost wrote:
  The capabilities would be:
 * MAINTENANCE --- Ability to run
  VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
  ANALYZE (including SET LOCAL statistics_target TO 1),
 There's likely to be discussion about these from the perspective that
 you really shouldn't need to run them all that much.  Why isn't
 autovacuum able to handle this?
 
 For some (arguably ill-devised) use cases of INSERT -> SELECT
 aggregate -> DELETE (third-party, closed-source app, massive insert
 rate), at the very least, autovacuum can't possibly cope with the
 change rate in some tables, given that there are many other
 interactive queries running.
 
 Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
 tables every 12h or so fixes the performance problem for the
 particular queries without impacting the other users too much ---
 the tables and indexes in question have been moved to a separate
 tablespace/disk volume of their own.

Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there *are* such cases, but getting more
information about those cases and exactly what solution *does* work
would really help us improve autovacuum to address those use-cases.

 In short, this addresses situations where some tables have a much
 higher update rate than the rest of the database so that performance
 degrades with time --- the application became unusable after about 6
 days' worth of updates until the manual vacuums were set up

This really looks like a configuration issue with autovacuum..  Perhaps
you need to make it more aggressive than the default and have it run
more threads?  Have you turned the autovacuum logging up all the way?
Is autovacuum giving up due to locking?

  REINDEX CONCURRENTLY  (but not the blocking, regular, one)
  REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
 These are interesting, but would these make sense at the role level?
 Both of these commands explicitly take specific relations to operate
 against, after all.
 
 Yup. Let's imagine a cron job invoking psql in order to perform a
 maintenance routine.

If they make sense at a relation level then they should be
relation-level GRANT'd permissions, not role-level attributes.

 The particular command(s) can be generated on the fly by querying
 the catalog and then sent in one go to be run sequentially by
 a single backend as a crude form of rate
 limiting/quality-of-service of sorts (renice -p or even ionice
 -p seems quite inadequate).

This sounds like it's something that we might want an autovacuum-like
background process to handle..  Some kind of auto-reindex-concurrently.
There are already plans to deal with updating of materialized views, as
I understand it.

 This automation becomes impossible to do if the object owners differ
 (only the owner or a superuser can perform these operations AFAICS
 -- there is no mention of it in the current documentation) unless
 the DBA makes the maintenance role a member of every other role ...
 which quickly becomes a problem.

I agree that having a maintenance role which is a member of every other
role isn't a very good solution.

  COPY ???
 The question around this one goes back to the CREATE DIRECTORY
 discussion that happened this fall.  I'm still hopeful that we can do
 *something* in this area, but I'm not sure what that's going to end up
 looking like.  The problem with COPY is that it's either trivial to use
 it to become a superuser, or insanely difficult to secure sufficiently.
 
 Yes. That's the reason for the question marks  :-\
 Some dump to csv then load somewhere else kind of jobs might
 benefit from this feature, but I'm not sure the convenience is worth
 the risk.

I've run into quite a few processes which would really benefit from
this, and would even be safe to use (the processes running the COPY
commands don't have any rights on the directories except through PG),
but it's not clear if that use-case is sufficiently broad for the
feature to be worthwhile..  At least, some feel it isn't.  Can you
describe your use-case more and perhaps the needle will move on that
point?
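For context, a sketch of the restricted form versus the usual workaround (paths and table names illustrative): server-side COPY writes files with the server process's OS privileges, which is why it is superuser-only today.

```sql
-- Superuser-only today: the file is created by the server process
-- itself, on the server's filesystem.
COPY big_table TO '/srv/exports/big_table.csv' WITH (FORMAT csv, HEADER);
```

By contrast, psql's `\copy big_table to 'big_table.csv' csv header` runs the transfer through the client, with the client's OS privileges, and needs no special role attribute.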

 * IMPERSONATE --- Ability to do SET AUTHORIZATION TO some_role;
 and RESET AUTHORIZATION
  This might be further refined to provide a way to say This role
 is authorized to impersonate role1 but no other
  Rationale: for use by connection poolers (esp. pgBouncer), where
 the role used for connection would only have the LOGIN and
 IMPERSONATE privileges. The remaining operations would be authorized
 against the supplanted role (i.e. ability to create tables/indexes
 or views, perform DML and/or DDL, etc)
  AFAIK, a superuser role is needed for this purpose currently.
 No..  You can have 'no-inherit' roles which you can use for exactly this purpose.
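A minimal sketch of the no-inherit pattern Stephen describes (role names and password are illustrative):

```sql
-- The pooler's login role does not use app_role's privileges
-- implicitly (NOINHERIT), but may switch to it explicitly.
CREATE ROLE app_role NOLOGIN;
CREATE ROLE bouncer LOGIN NOINHERIT PASSWORD 'secret';
GRANT app_role TO bouncer;

-- In a pooled session:
SET ROLE app_role;   -- subsequent operations authorized as app_role
RESET ROLE;          -- back to bouncer's own (minimal) privileges
```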
 

Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread José Luis Tallón

On 12/23/2014 07:01 PM, David G Johnston wrote:

[snip]
So you want to say:

GRANT IMPERSONATE TO bouncer; --covers the ALL requirement

instead of

GRANT victim1 TO bouncer;
GRANT victim2 TO bouncer;
etc...

-- these would still be used to cover the limited users requirement
?


GRANT IMPERSONATE ON actual_role TO login_role

would actually get us closer to how some other databases do it, now 
that I think of it. This could be just syntactic sugar.

Might definitely ease migrations, if nothing else.


I appreciate the feedback. Thanks!


/ J.L.



Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread Stephen Frost
* David G Johnston (david.g.johns...@gmail.com) wrote:
 I'd rather there be better, more user friendly, SQL-based APIs to the
 permissions system that would facilitate performing and reviewing grants.

This would be *really* nice, I agree.  I've heard tell of people writing
functions that go through the catalog based on a given user and spit
back everything that they have permissions to.  Would be really nice if
we had those kinds of functions built-in.
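A rough sketch of the sort of hand-rolled catalog query people write today (the role name is illustrative):

```sql
-- List every table a given role can SELECT from.
SELECT schemaname, tablename
FROM pg_tables
WHERE has_table_privilege('some_role',
                          format('%I.%I', schemaname, tablename),
                          'SELECT');
```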

 If something like IMPERSONATE was added I would strongly suggest a
 corresponding [NO]IMPERSONATE for CREATE USER so that the admin can make
 specific roles unimpersonable - and also make SUPERUSER roles unimpersonable
 by rule.

I agree that this would be necessary..  but strikes me as less of a
complete solution than what the existing pg_auth_members approach grants
you.

Perhaps a better idea would be to simply make the bouncer unnecessary by
having an in-PG connection pooler type of system.  That's been discussed
previously and shot down but it's still one of those things that's on my
wish-list for PG.

Thanks,

Stephen




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread Stephen Frost
* José Luis Tallón (jltal...@adv-solutions.net) wrote:
 On 12/23/2014 07:01 PM, David G Johnston wrote:
 [snip]
 So you want to say:
 
 GRANT IMPERSONATE TO bouncer; --covers the ALL requirement
 
 instead of
 
 GRANT victim1 TO bouncer;
 GRANT victim2 TO bouncer;
 etc...
 
 -- these would still be used to cover the limited users requirement
 ?
 
 GRANT IMPERSONATE ON actual_role TO login_role
 
 would actually get us closer to how some other databases do, now
 that I think of it. This could be just some syntactic sugar.
 Might definitely ease migrations, if nothing else.

Uh, how is this different from GRANT actual_role TO login_role, with use
of noinherit..?

Thanks,

Stephen




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2014-12-23 Thread José Luis Tallón

On 12/23/2014 07:52 PM, Stephen Frost wrote:

[snip]
Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
tables every 12h or so fixes the performance problem for the
particular queries without impacting the other users too much ---
the tables and indexes in question have been moved to a separate
tablespace/disk volume of their own.
Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there *are* such cases, but getting more
information about those cases and exactly what solution *does* work
would really help us improve autovacuum to address those use-cases.


I'll try to. I don't have direct access, and the use case is quite an 
edge case, to be fair.

Plus, the configuration and hardware leaves quite a bit to be desired...

...but it's a real use case and the solution (even if only treating the 
symptoms) is quite straight-forward and easy.

In short, this addresses situations where some tables have a much
higher update rate than the rest of the database so that performance
degrades with time --- the application became unusable after about 6
days' worth of updates until the manual vacuums were set up.

This really looks like a configuration issue with autovacuum..  Perhaps
you need to make it more aggressive than the default and have it run
more threads?


Yes to both --- up to the point where it actually affected performance a bit.
But basically only a few tables exhibited this behaviour among several 
hundreds in this particular situation.
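When only a handful of tables misbehave like this, per-table autovacuum storage parameters are usually the first thing to try (the table name and values are illustrative, not a recommendation):

```sql
-- Make autovacuum fire much earlier and run unthrottled on one hot
-- table, without touching the cluster-wide defaults.
ALTER TABLE hot_table SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum at ~1% dead tuples
    autovacuum_vacuum_cost_delay   = 0      -- no cost-based throttling
);
```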



Have you turned the autovacuum logging up all the way?
Is autovacuum giving up due to locking?
Not one of my systems, and I don't have access to it anymore, but I 
don't think this was the reason.


However, having some hundred million deleted rows piling up every few 
hours increases the load quite a bit. For the record, the 
(closed-source) application issued the DELETEs on the table itself, so 
partitioning + TRUNCATE child_part was not applicable.



In any case, I was aiming at making these kinds of operations possible and 
easier --- regardless of whether they are solving the right problem or 
not, or whether there exists an optimal solution --- since I have seen 
some real life solutions that could benefit from it.
I agree that routine index maintenance is a better match for this 
feature, though :)

 REINDEX CONCURRENTLY  (but not the blocking, regular, one)
 REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)

These are interesting, but would these make sense at the role level?
Both of these commands explicitly take specific relations to operate
against, after all.

Yup. Let's imagine a cron job invoking psql in order to perform
maintenance routine.

If they make sense at a relation level then they should be
relation-level GRANT'd permissions, not role-level attributes.


Same as before.
Let's imagine this coupled with REINDEX SCHEMA CONCURRENTLY ... or 
simply when constructing the list of tables dynamically and there is no 
other use for such a grant.
Arguably, this isn't that much of a problem if there exists a way to 
easily revoke all such permissions from all objects in one go (just like 
recently discussed in another thread)
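For what it's worth, a one-shot revocation along the lines discussed might look like this today (role and schema names illustrative):

```sql
-- Revoke everything the maintenance role was granted in one schema,
-- then clean up any remaining grants tied to it.
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM maint_role;
DROP OWNED BY maint_role;  -- also revokes privileges granted to the role
```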



The particular command(s) can be generated on-the-fly by querying
the catalog and then send them in one go to be run sequentially by
the one backend as a crude form of rate
limiting/quality-of-service of sorts (renice -p or even ionice
-p seems quite inadequate).

This sounds like it's something that we might want an autovacuum-like
background process to handle..  Some kind of auto-reindex-concurrently.
There are already plans to deal with updating of materialized views, as
I understand it.


While I can definitely see it for materialized views (they *are* 
views, after all), this pattern potentially gets us adding everything 
but the kitchen sink inside the database.
FWIW, it's only a matter of providing a mechanism for maintenance 
routines to use very unprivileged users to perform their duties on the 
whole cluster without having to explicitly grant permissions and/or 
include them in another, regular, role.
Please keep in mind that these roles [having only LOGIN and 
MAINTENANCE] would NOT be able to perform any DML or DDL whatsoever, nor 
any queries (unless explicitly granted permission for SELECTs).



[snip]

Yes. That's the reason for the question marks  :-\
Some dump to csv then load somewhere else kind of jobs might
benefit from this feature, but I'm not sure the convenience is worth
the risk.

I've run into quite a few processes which would really benefit from
this, and would even be safe to use (the processes running the COPY
commands don't have any rights on the directories except through PG),
but it's not clear if that use-case is sufficiently broad for the
feature to be worthwhile..  At least, some feel it isn't.  Can you
describe your use-case more and perhaps the needle will move on