subject:"Re\: \[HACKERS\] \[RFC\] Interface of Row Level Security"

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-10 Thread Kohei KaiGai

Sorry for my late reply.

2012/6/6 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 22:33 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 I can live with any behaviour, as long as it doesn't depends on details
 of the query plan. My vote would be for always using the role which was
 active at statement creation time (i.e. at PREPARE/DECLARE time) for
 RLS purposes though. That way, we could treat the role as being IMMUTABLE
 I think.

 I have same opinion at the point where the active role should be 
 consistent,
 but a bit different in details. I think, it should be a timing at
 beginning of execution,
 not plan creation time, like as existing permission checks are applied
 on InitPlan()
 or its callee.

 From a consistency POV, I'd agree. But binding the active role after
 planning means giving up a lot of query optimization opportunities. It
 will prevent the role from participating in constant folding, which
 translates to potentially enormous loss of performance. Take the following
 policy function as an example

  (role = 'some_role' AND expensive_function(row)) OR
  (role = 'other_role' AND row.field = 'some_value')

 and assume that there's an index on 'field'. Now, if role is known to be
 'other_row', the planner can eliminate the expensive_function() from the
 plan, and satisfy row.field = 'some_value' with an index scan. Without
 that knowledge, you'll probably get a sequential scan.

 I do value consistency, but in this case the benefit of not being
 consistent with other privilege checks outweighs the drawbacks I think.

I never deny some possibility to have special optimization using some
information to be determined at planner stage, in the near future.
However, I wonder the first version of this patch should include all
the desirable features at once.

 It seems to me, caution of the problem is current_user is performing out
 of the snapshot controls. According to the definition of stable function,
 its result should not be changed during a particular scan.

 Yeah, well, ISTM that our definition of STABLE doesn't really take functions
 whose return value depends on GUCs into account.

 Probably, has_XXX_privilege() should work with reliable data source being
 safe from user-id switching. But it is a tough work to enhance whole of the
 GUC stuff for snapshot aware…

 We don't need that for every GUC, though. I'd say its sufficient to somehow
 provide the policy function with a stable value of the active role (whatever
 active ends up meaning). If we do that, we can simply document that making
 the policy function depend on other GUCs is a seriously bad idea.

 We *do* need that stable active role value though, since we cannot very
 well advise against using current_role in the policy function without 
 providing
 an alternativ…

I have no preference about implementation detail of this. As long as we can
have the actually stable user-id until v9.3 release, it is sufficient for me.

 To reiterate, my point is that we won't get away with zero planner changes
 even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
 costs of RLSBYPASS whether we actually add that feature or not. Which makes
 not adding it look like a pretty bad deal to me.

 It seems to me, 95% of our opinion is common, except for a few detailed 
 stuff.
 I never dislike the idea of RLSBYPASS permission, but I'd like to investigate
 this idea with more time, until v9.3 being closed.
 At least, nobody opposed to allow superusers to bypass RLS policy.
 So, it can be a minimum startup point, isn't it? RLS patch will takes 
 hundreds
 of lines due to syntax support or query rewriting stuff. It is always good 
 idea
 to focus on the core functionality at the starting point.

 Absolutely. But at the same time, it's important to check that the design 
 allows
 the additional features to be added later easily. In the case of RLS, I'm
 worried that decreeing that it's the role at execution time, not planning
 time, that counts, we're painting ourselves into a corner. I view RLSBYPASS
 as a good sanity check for that. Another one is that 
 GRANT-with-filter-function
 idea. Both seem to fall into place quite naturally if handled while planning -
 you simply have to add decide which additional WHERE clause to add, if any.
 Without any negative performance effects from policies which don't apply to
 the current role, if the optimizer does it's job. Getting the same level of
 performance when policies are added unconditionally seems quite impossible.
 You might be able to make up for some losses by caching STABLE values (which
 amounts to another constant folding pass), but you won't ever be able to make 
 up
 for a lost index scan opportunity or other optimizations which change the 
 structure
 of the whole query plan.

If we have a mechanism to check permissions at planner stage, it should perform
as if it works at executor stage, as existing permission mechanism doing.
Thus, here is no inconsistency even if

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Kohei KaiGai

2012/6/5 Tom Lane t...@sss.pgh.pa.us:
 Florian Pflug f...@phlo.org writes:
 On Jun4, 2012, at 18:38 , Kohei KaiGai wrote:
 2012/6/4 Florian Pflug f...@phlo.org:
 Without something like RLSBYPASS, the DBA needs to have intimate
 knowledge about the different RLS policies to e.g. guarantee that his
 backups aren't missing crucial information, or that the replication
 system indeed replicates all rows.

 It seems to me you can define a function which implements site-
 specific security requirement (E.g backup should not be prevented
 by RLS policy), then include it as a part of RLS policy
 (or implicitly added by extensions, like sepgsql tries to do).

 Sure. But that requires each and every application which uses RLS
 to provide support for special backup (or replication, or whatever)
 privileges. And it requires the DBA to know how to assign these
 privileges to certain roles for each any every application in question.
 For shops which uses a lot of different applications, all with their
 own RLS policy, that can quickly get out of hand.

 Plus, a bug in one of these RLS policies has the potential to render
 backups incomplete.

 I agree with Florian here: if there is no way to disable RLS for
 backups, the database will be un-administratable.  RLSBYPASS is not
 necessarily the only or best way to provide such an override, but we
 have to have something that is simple, foolproof, and *not* dependent
 on the details of any one RLS policy.

 I suspect that KaiGai-san's objection basically comes down to not
 wanting to have what amounts to a backdoor in RLS policies.  However,
 what Florian is saying is that you have to have a backdoor anyway,
 unless you'd like to be at risk of losing data because it wasn't
 backed up.  You can either have one well-understood, well-documented,
 well-tested backdoor, or you can have an ad-hoc backdoor in each RLS
 policy.  Nobody can think that the latter approach is preferable.

At least, database superusers shall bypass the RLS policy; it is a well
understandable behavior and an approach to minimize the back-door;
and allows to get complete database backup.

It is easy to add a special privilege mechanism to bypass RLS policy
later, however, not easy in opposite side. It seems to me a reasonable
start-up to allow only superusers to bypass RLS policy.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Florian Pflug

On Jun5, 2012, at 10:22 , Kohei KaiGai wrote:
 2012/6/5 Tom Lane t...@sss.pgh.pa.us:
 I suspect that KaiGai-san's objection basically comes down to not
 wanting to have what amounts to a backdoor in RLS policies.  However,
 what Florian is saying is that you have to have a backdoor anyway,
 unless you'd like to be at risk of losing data because it wasn't
 backed up.  You can either have one well-understood, well-documented,
 well-tested backdoor, or you can have an ad-hoc backdoor in each RLS
 policy.  Nobody can think that the latter approach is preferable.

 At least, database superusers shall bypass the RLS policy; it is a well
 understandable behavior and an approach to minimize the back-door;
 and allows to get complete database backup.

I don't think we want to force people to run stuff with superuser
privileges just for the sake of bypassing a RLS policy. On the whole,
that reduces the overall security, not adds to it.

 It is easy to add a special privilege mechanism to bypass RLS policy
 later, however, not easy in opposite side. It seems to me a reasonable
 start-up to allow only superusers to bypass RLS policy.

What's to be gained by that? Once there's *any* way to bypass a RLS
policy, you'll have to deal with the plan invalidation issues you
mentioned anyway. ISTM that complexity-wide, you don't save much by not
having RLSBYPASS (or something similar), but feature-wise you lose at
lot...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Kohei KaiGai

2012/6/5 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 10:22 , Kohei KaiGai wrote:
 2012/6/5 Tom Lane t...@sss.pgh.pa.us:
 I suspect that KaiGai-san's objection basically comes down to not
 wanting to have what amounts to a backdoor in RLS policies.  However,
 what Florian is saying is that you have to have a backdoor anyway,
 unless you'd like to be at risk of losing data because it wasn't
 backed up.  You can either have one well-understood, well-documented,
 well-tested backdoor, or you can have an ad-hoc backdoor in each RLS
 policy.  Nobody can think that the latter approach is preferable.

 At least, database superusers shall bypass the RLS policy; it is a well
 understandable behavior and an approach to minimize the back-door;
 and allows to get complete database backup.

 I don't think we want to force people to run stuff with superuser
 privileges just for the sake of bypassing a RLS policy. On the whole,
 that reduces the overall security, not adds to it.

I can understand your opinion. But I'm still uncertain whether the new
permission is the best idea we can offer...

 It is easy to add a special privilege mechanism to bypass RLS policy
 later, however, not easy in opposite side. It seems to me a reasonable
 start-up to allow only superusers to bypass RLS policy.

 What's to be gained by that? Once there's *any* way to bypass a RLS
 policy, you'll have to deal with the plan invalidation issues you
 mentioned anyway. ISTM that complexity-wide, you don't save much by not
 having RLSBYPASS (or something similar), but feature-wise you lose at
 lot...

All we need to change is selection of the function to be appended
automatically. In case when superusers are allowed to bypass RLS
policy, OR has_superuser_privilege() shall be appended to the
user given clause, however, OR has_table_privilege() shall be
appended instead in case of RLSBYPASS permission.
(Note that has_table_privilege() always true for superusers.)

I think it does not require to add a mechanism to invalidate
prepared-statement, because all the checks are applied on
executor stage. And these functions can be stable functions,
so I believe some enhancements at planner will correctly wipe
out prior to query execution at the next step.
Conditional RLS policy in planner stage seems to me problematic
so I don't want to include this feature on the first version.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Florian Pflug

On Jun5, 2012, at 11:43 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 What's to be gained by that? Once there's *any* way to bypass a RLS
 policy, you'll have to deal with the plan invalidation issues you
 mentioned anyway. ISTM that complexity-wide, you don't save much by not
 having RLSBYPASS (or something similar), but feature-wise you lose at
 lot…

 All we need to change is selection of the function to be appended
 automatically. In case when superusers are allowed to bypass RLS
 policy, OR has_superuser_privilege() shall be appended to the
 user given clause, however, OR has_table_privilege() shall be
 appended instead in case of RLSBYPASS permission.
 (Note that has_table_privilege() always true for superusers.)
 
 I think it does not require to add a mechanism to invalidate
 prepared-statement, because all the checks are applied on
 executor stage. And these functions can be stable functions,
 so I believe some enhancements at planner will correctly wipe
 out prior to query execution at the next step.

That won't work, since the current role can change any time mid-query,
at least if you're using cursors. Here's an example

First, setup a table
 CREATE TABLE data AS SELECT id FROM generate_series(1,100) AS id;
 GRANT SELECT ON data TO PUBLIC;

Then create a cursor which doesn't materialize its result
 BEGIN;
 DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR
  SELECT id, current_user FROM data;
 SET ROLE=anybody;
 FETCH mycursor;
 id | current_user 
+--
  1 | anybody
(1 row)
 SET role=somebody;
 FETCH mycursor;
 id | current_user 
+--
  2 | somebody
(1 row)
 ROLLBACK;

And now, for comparison, a cursor which does materialize its result
 BEGIN;
 DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR
  SELECT id, current_user FROM data ORDER BY id;
 SET ROLE=anybody;
 FETCH mycursor;
 id | current_user 
+--
  1 | anybody
(1 row)

 SET role=somebody;
 FETCH mycursor;
 id | current_user 
+--
  2 | anybody
(1 row)

Imagine that the current_user function call was actually part of the
RLS policy. Then the example above shows that *which* role we check
access against (the one active at the first fetch, or currently active
one) depends on which plan the optimizer happens to pick. This, IMHO,
is not acceptable for a security feature like RLS.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Kohei KaiGai

2012/6/5 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 11:43 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 What's to be gained by that? Once there's *any* way to bypass a RLS
 policy, you'll have to deal with the plan invalidation issues you
 mentioned anyway. ISTM that complexity-wide, you don't save much by not
 having RLSBYPASS (or something similar), but feature-wise you lose at
 lot…

 All we need to change is selection of the function to be appended
 automatically. In case when superusers are allowed to bypass RLS
 policy, OR has_superuser_privilege() shall be appended to the
 user given clause, however, OR has_table_privilege() shall be
 appended instead in case of RLSBYPASS permission.
 (Note that has_table_privilege() always true for superusers.)

 I think it does not require to add a mechanism to invalidate
 prepared-statement, because all the checks are applied on
 executor stage. And these functions can be stable functions,
 so I believe some enhancements at planner will correctly wipe
 out prior to query execution at the next step.

 That won't work, since the current role can change any time mid-query,
 at least if you're using cursors. Here's an example

 First, setup a table
 CREATE TABLE data AS SELECT id FROM generate_series(1,100) AS id;
 GRANT SELECT ON data TO PUBLIC;

 Then create a cursor which doesn't materialize its result
 BEGIN;
 DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR
  SELECT id, current_user FROM data;
 SET ROLE=anybody;
 FETCH mycursor;
  id | current_user
 +--
  1 | anybody
 (1 row)
 SET role=somebody;
 FETCH mycursor;
  id | current_user
 +--
  2 | somebody
 (1 row)
 ROLLBACK;

 And now, for comparison, a cursor which does materialize its result
 BEGIN;
 DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR
  SELECT id, current_user FROM data ORDER BY id;
 SET ROLE=anybody;
 FETCH mycursor;
  id | current_user
 +--
  1 | anybody
 (1 row)

 SET role=somebody;
 FETCH mycursor;
  id | current_user
 +--
  2 | anybody
 (1 row)

 Imagine that the current_user function call was actually part of the
 RLS policy. Then the example above shows that *which* role we check
 access against (the one active at the first fetch, or currently active
 one) depends on which plan the optimizer happens to pick. This, IMHO,
 is not acceptable for a security feature like RLS.

Which is your expected behavior in this example?

In case when the query is internally materialized, the current_user function
is *already* executed prior to switch user-id, thus, the result set internally
stored in was already filtered out using older user-id.

It seems to me, caution of the problem is current_user is performing out
of the snapshot controls. According to the definition of stable function,
its result should not be changed during a particular scan.
At least, it is not a fundamental problem of RLS implementation, although
it needs to take an enhancement something like effective user-id per
snapshot basis.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Florian Pflug

On Jun5, 2012, at 15:07 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 11:43 , Kohei KaiGai wrote:
 I think it does not require to add a mechanism to invalidate
 prepared-statement, because all the checks are applied on
 executor stage. And these functions can be stable functions,
 so I believe some enhancements at planner will correctly wipe
 out prior to query execution at the next step.

 That won't work, since the current role can change any time mid-query,
 at least if you're using cursors. Here's an example

snipped example

 Imagine that the current_user function call was actually part of the
 RLS policy. Then the example above shows that *which* role we check
 access against (the one active at the first fetch, or currently active
 one) depends on which plan the optimizer happens to pick. This, IMHO,
 is not acceptable for a security feature like RLS.

 Which is your expected behavior in this example?

I can live with any behaviour, as long as it doesn't depends on details
of the query plan. My vote would be for always using the role which was
active at statement creation time (i.e. at PREPARE/DECLARE time) for
RLS purposes though. That way, we could treat the role as being IMMUTABLE
I think.

 In case when the query is internally materialized, the current_user function
 is *already* executed prior to switch user-id, thus, the result set internally
 stored in was already filtered out using older user-id.

Exactly. But whether or not the query, some parts of it, or even only some
parts of the policy function are materialized is outside the control of the
user and may change at any time.

 It seems to me, caution of the problem is current_user is performing out
 of the snapshot controls. According to the definition of stable function,
 its result should not be changed during a particular scan.

Yeah, well, ISTM that our definition of STABLE doesn't really take functions
whose return value depends on GUCs into account. 

 At least, it is not a fundamental problem of RLS implementation, although
 it needs to take an enhancement something like effective user-id per
 snapshot basis.

Right. We need something like statement_role which returns the role the
statement is to be executed as (However we define that, though as I said
above my vote is for it being the role which created the statement). To make
that work, however, the plan cache needs to store that role, since the
statement might be re-planned after its initial creation, but before it is
executed. At which point the amount of complexity saved by not implementing
RLSBYPASS is very nearly zero.

To reiterate, my point is that we won't get away with zero planner changes
even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
costs of RLSBYPASS whether we actually add that feature or not. Which makes
not adding it look like a pretty bad deal to me.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Kohei KaiGai

2012/6/5 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 15:07 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 On Jun5, 2012, at 11:43 , Kohei KaiGai wrote:
 I think it does not require to add a mechanism to invalidate
 prepared-statement, because all the checks are applied on
 executor stage. And these functions can be stable functions,
 so I believe some enhancements at planner will correctly wipe
 out prior to query execution at the next step.

 That won't work, since the current role can change any time mid-query,
 at least if you're using cursors. Here's an example

 snipped example

 Imagine that the current_user function call was actually part of the
 RLS policy. Then the example above shows that *which* role we check
 access against (the one active at the first fetch, or currently active
 one) depends on which plan the optimizer happens to pick. This, IMHO,
 is not acceptable for a security feature like RLS.

 Which is your expected behavior in this example?

 I can live with any behaviour, as long as it doesn't depends on details
 of the query plan. My vote would be for always using the role which was
 active at statement creation time (i.e. at PREPARE/DECLARE time) for
 RLS purposes though. That way, we could treat the role as being IMMUTABLE
 I think.

I have same opinion at the point where the active role should be consistent,
but a bit different in details. I think, it should be a timing at
beginning of execution,
not plan creation time, like as existing permission checks are applied
on InitPlan()
or its callee.
So, I believe EXECUTE is more right place to check it rather than PREPARE.
Although FETCH command actually handles query execution, do we ought to
consider DECLARE do both of query plan and execution from the perspective
of model?

 It seems to me, caution of the problem is current_user is performing out
 of the snapshot controls. According to the definition of stable function,
 its result should not be changed during a particular scan.

 Yeah, well, ISTM that our definition of STABLE doesn't really take functions
 whose return value depends on GUCs into account.

Probably, has_XXX_privilege() should work with reliable data source being
safe from user-id switching. But it is a tough work to enhance whole of the
GUC stuff for snapshot aware...

 At least, it is not a fundamental problem of RLS implementation, although
 it needs to take an enhancement something like effective user-id per
 snapshot basis.

 Right. We need something like statement_role which returns the role the
 statement is to be executed as (However we define that, though as I said
 above my vote is for it being the role which created the statement). To make
 that work, however, the plan cache needs to store that role, since the
 statement might be re-planned after its initial creation, but before it is
 executed. At which point the amount of complexity saved by not implementing
 RLSBYPASS is very nearly zero.

I'm not sure why the plan cache needs to be stored with a particular user-id.
As long as we check permissions at executor stage, all we need to do is
implementation of a function to check permission based on a particular
user-id at the timing of executor initialization.

And, I think it can be a separated efforts from the first version of RLS.

 To reiterate, my point is that we won't get away with zero planner changes
 even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
 costs of RLSBYPASS whether we actually add that feature or not. Which makes
 not adding it look like a pretty bad deal to me.

It seems to me, 95% of our opinion is common, except for a few detailed stuff.
I never dislike the idea of RLSBYPASS permission, but I'd like to investigate
this idea with more time, until v9.3 being closed.
At least, nobody opposed to allow superusers to bypass RLS policy.
So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds
of lines due to syntax support or query rewriting stuff. It is always good idea
to focus on the core functionality at the starting point.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-05 Thread Florian Pflug

On Jun5, 2012, at 22:33 , Kohei KaiGai wrote:
 2012/6/5 Florian Pflug f...@phlo.org:
 I can live with any behaviour, as long as it doesn't depends on details
 of the query plan. My vote would be for always using the role which was
 active at statement creation time (i.e. at PREPARE/DECLARE time) for
 RLS purposes though. That way, we could treat the role as being IMMUTABLE
 I think.

 I have same opinion at the point where the active role should be consistent,
 but a bit different in details. I think, it should be a timing at
 beginning of execution,
 not plan creation time, like as existing permission checks are applied
 on InitPlan()
 or its callee.

From a consistency POV, I'd agree. But binding the active role after
planning means giving up a lot of query optimization opportunities. It
will prevent the role from participating in constant folding, which 
translates to potentially enormous loss of performance. Take the following
policy function as an example

  (role = 'some_role' AND expensive_function(row)) OR
  (role = 'other_role' AND row.field = 'some_value')

and assume that there's an index on 'field'. Now, if role is known to be
'other_row', the planner can eliminate the expensive_function() from the
plan, and satisfy row.field = 'some_value' with an index scan. Without
that knowledge, you'll probably get a sequential scan.

I do value consistency, but in this case the benefit of not being
consistent with other privilege checks outweighs the drawbacks I think.

 It seems to me, caution of the problem is current_user is performing out
 of the snapshot controls. According to the definition of stable function,
 its result should not be changed during a particular scan.
 
 Yeah, well, ISTM that our definition of STABLE doesn't really take functions
 whose return value depends on GUCs into account.

 Probably, has_XXX_privilege() should work with reliable data source being
 safe from user-id switching. But it is a tough work to enhance whole of the
 GUC stuff for snapshot aware…

We don't need that for every GUC, though. I'd say its sufficient to somehow
provide the policy function with a stable value of the active role (whatever
active ends up meaning). If we do that, we can simply document that making
the policy function depend on other GUCs is a seriously bad idea.

We *do* need that stable active role value though, since we cannot very
well advise against using current_role in the policy function without providing
an alternativ…

 At least, it is not a fundamental problem of RLS implementation, although
 it needs to take an enhancement something like effective user-id per
 snapshot basis.
 
 Right. We need something like statement_role which returns the role the
 statement is to be executed as (However we define that, though as I said
 above my vote is for it being the role which created the statement). To make
 that work, however, the plan cache needs to store that role, since the
 statement might be re-planned after its initial creation, but before it is
 executed. At which point the amount of complexity saved by not implementing
 RLSBYPASS is very nearly zero.

 I'm not sure why the plan cache needs to be stored with a particular user-id.
 As long as we check permissions at executor stage, all we need to do is
 implementation of a function to check permission based on a particular
 user-id at the timing of executor initialization.

Hm, yeah, if you defer everything to execution time, that's true. 

 To reiterate, my point is that we won't get away with zero planner changes
 even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
 costs of RLSBYPASS whether we actually add that feature or not. Which makes
 not adding it look like a pretty bad deal to me.

 It seems to me, 95% of our opinion is common, except for a few detailed stuff.
 I never dislike the idea of RLSBYPASS permission, but I'd like to investigate
 this idea with more time, until v9.3 being closed.
 At least, nobody opposed to allow superusers to bypass RLS policy.
 So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds
 of lines due to syntax support or query rewriting stuff. It is always good 
 idea
 to focus on the core functionality at the starting point.

Absolutely. But at the same time, it's important to check that the design allows
the additional features to be added later easily. In the case of RLS, I'm
worried that decreeing that it's the role at execution time, not planning
time, that counts, we're painting ourselves into a corner. I view RLSBYPASS
as a good sanity check for that. Another one is that GRANT-with-filter-function
idea. Both seem to fall into place quite naturally if handled while planning -
you simply have to add decide which additional WHERE clause to add, if any.
Without any negative performance effects from policies which don't apply to
the current role, if the optimizer does it's job. Getting the same level of
performance when policies are added unconditionally

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Robert Haas

On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2012/6/1 Robert Haas robertmh...@gmail.com:
 On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It may be an option to separate the case into two; a situation to execute
 the given query immediately just after optimization and never reused,
 and others.

 Yes.  Simon suggested exactly this a while back, and I agree with him
 that we ought to have it.

 It is good for me, also.

 Then, if so, we will be able to push the stuff corresponding to
 RLSBYPASS into the query optimization, and works transparently
 for users.

 You're still going to need a way to make sure that the cluster can be
 dumped properly.  RLSBYPASS accomplishes that; your scheme doesn't.

 If something like has_superuser_privilege() OR is automatically
 appended to user given clauses, it makes sure whole of the database
 cluster is dumped.
 That also means any permission checks are delayed to executor stage
 (except for the case on simple query protocol, I mentioned above),
 thus it simplifies the condition to invalidate prepared statement.

 One problem I noticed last night around RLSBYPASS approach is:
 it can take much number of invalidation whenever current user-id is
 switched. Not only SET AUTHORIZATION, but SECURITY DEFINER
 function's invocation also. I'm not sure whether this invalidation
 storm is reasonable level, or not.

 Is it unavailable to handle this type of implicit superuser checks
 with existing EXECUTE statement?
 I tried to run EXPLAIN with similar case.

 postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y  $1 OR $2;
 PREPARE
 postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas'));
                       QUERY PLAN
 
  Seq Scan on tbl  (cost=0.00..21.60 rows=1160 width=40)
 (1 row

 However,

 postgres=# PREPARE p2(int) AS SELECT * FROM tbl
                   WHERE y  $1 OR current_user in ('kaigai','rhaas');
 PREPARE
 postgres=# EXPLAIN EXECUTE p2(10);
                                 QUERY PLAN
 -
  Seq Scan on tbl  (cost=0.00..30.30 rows=394 width=40)
   Filter: ((y  10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[])))
 (2 rows)

 Please assume the second condition something like superuser
 checks in addition to the main security policy.
 It implies an idea to replace a certain portion of clause that
 consists of only constant values and stable / immutable functions
 by shadow parameter to be calculated at executor stage, makes
 sense to wipe out RLS policy for superusers. In addition, it requires
 no new invalidation mechanism to prepared statements.

I'm not sure how best to handle the invalidation issue... but I still
think that relying on the query planner to use theorem-proving to
simplify RLS conditions is going to be problematic.  You can certainly
try it ad see how it comes out...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Kohei KaiGai

2012/6/4 Robert Haas robertmh...@gmail.com:
 On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2012/6/1 Robert Haas robertmh...@gmail.com:
 On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It may be an option to separate the case into two; a situation to execute
 the given query immediately just after optimization and never reused,
 and others.

 Yes.  Simon suggested exactly this a while back, and I agree with him
 that we ought to have it.

 It is good for me, also.

 Then, if so, we will be able to push the stuff corresponding to
 RLSBYPASS into the query optimization, and works transparently
 for users.

 You're still going to need a way to make sure that the cluster can be
 dumped properly.  RLSBYPASS accomplishes that; your scheme doesn't.

 If something like has_superuser_privilege() OR is automatically
 appended to user given clauses, it makes sure whole of the database
 cluster is dumped.
 That also means any permission checks are delayed to executor stage
 (except for the case on simple query protocol, I mentioned above),
 thus it simplifies the condition to invalidate prepared statement.

 One problem I noticed last night around RLSBYPASS approach is:
 it can take much number of invalidation whenever current user-id is
 switched. Not only SET AUTHORIZATION, but SECURITY DEFINER
 function's invocation also. I'm not sure whether this invalidation
 storm is reasonable level, or not.

 Is it unavailable to handle this type of implicit superuser checks
 with existing EXECUTE statement?
 I tried to run EXPLAIN with similar case.

 postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y  $1 OR $2;
 PREPARE
 postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas'));
                       QUERY PLAN
 
  Seq Scan on tbl  (cost=0.00..21.60 rows=1160 width=40)
 (1 row

 However,

 postgres=# PREPARE p2(int) AS SELECT * FROM tbl
                   WHERE y  $1 OR current_user in ('kaigai','rhaas');
 PREPARE
 postgres=# EXPLAIN EXECUTE p2(10);
                                 QUERY PLAN
 -
  Seq Scan on tbl  (cost=0.00..30.30 rows=394 width=40)
   Filter: ((y  10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[])))
 (2 rows)

 Please assume the second condition something like superuser
 checks in addition to the main security policy.
 It implies an idea to replace a certain portion of clause that
 consists of only constant values and stable / immutable functions
 by shadow parameter to be calculated at executor stage, makes
 sense to wipe out RLS policy for superusers. In addition, it requires
 no new invalidation mechanism to prepared statements.

 I'm not sure how best to handle the invalidation issue... but I still
 think that relying on the query planner to use theorem-proving to
 simplify RLS conditions is going to be problematic.  You can certainly
 try it ad see how it comes out...

I think, the enhancement of planner can be handled independently
from RLS policy stuff, and we should avoid to contain many complex
stuff into one functionality.
At least, it will work correctly with idea to append implicit condition
(OR has_superuser_privilege()) without any invalidation mechanism,
although its performance penalty is not negligible; without planner-
enhancement.
How about your opinion?

I'm worry about future maintenance issues, once we have
RLSBYPASS permission or something user visible...

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Florian Pflug

On Jun4, 2012, at 17:38 , Kohei KaiGai wrote:
 I'm worry about future maintenance issues, once we have
 RLSBYPASS permission or something user visible…

I fear that without a generic way to disable RLS regardless which
RLS policy function is in effect, we're creating a huge maintenance
issue for DBAs. In a lot of shops, the DBA is responsible for a large
number of databases, each potentially using a completely different
approach to RLS and hence a completely different policy function.

Without something like RLSBYPASS, the DBA needs to have intimate
knowledge about the different RLS policies to e.g. guarantee that his
backups aren't missing crucial information, or that the replication
system indeed replicates all rows.

With RLSBYPASS, all he needs to do is grant one privilege to his
replication or backup user. The rest can be left to the development
or support team for a specific application.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Kohei KaiGai

2012/6/4 Florian Pflug f...@phlo.org:
 On Jun4, 2012, at 17:38 , Kohei KaiGai wrote:
 I'm worry about future maintenance issues, once we have
 RLSBYPASS permission or something user visible…

 I fear that without a generic way to disable RLS regardless which
 RLS policy function is in effect, we're creating a huge maintenance
 issue for DBAs. In a lot of shops, the DBA is responsible for a large
 number of databases, each potentially using a completely different
 approach to RLS and hence a completely different policy function.

Here is two problems around RLSBYPASS. The first is we have
no idea to handle invalidation of prepared-statement when current
user is switched, right now. The second is we can have another
way to describe same RLS policy without PG original enhancement
towards permission mechanism...

 Without something like RLSBYPASS, the DBA needs to have intimate
 knowledge about the different RLS policies to e.g. guarantee that his
 backups aren't missing crucial information, or that the replication
 system indeed replicates all rows.

 With RLSBYPASS, all he needs to do is grant one privilege to his
 replication or backup user. The rest can be left to the development
 or support team for a specific application.

It seems to me you can define a function which implements site-
specific security requirement (E.g backup should not be prevented
by RLS policy), then include it as a part of RLS policy
(or implicitly added by extensions, like sepgsql tries to do).

These are the reason why I hesitate to go ahead with RLSBYPASS
permission.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Tom Lane

Kohei KaiGai kai...@kaigai.gr.jp writes:
 Here is two problems around RLSBYPASS. The first is we have
 no idea to handle invalidation of prepared-statement when current
 user is switched, right now.

How is that specifically the fault of RLSBYPASS?  *Any* of the schemes
you're proposing for inlined RLS checks will have problems with userID
switching.

My guess is we'd have to treat the effective userID as part of the
plancache lookup key to make it safe to inline anything related to RLS.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Kohei KaiGai

2012/6/4 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Here is two problems around RLSBYPASS. The first is we have
 no idea to handle invalidation of prepared-statement when current
 user is switched, right now.

 How is that specifically the fault of RLSBYPASS?  *Any* of the schemes
 you're proposing for inlined RLS checks will have problems with userID
 switching.

Really? I don't find out a scenario that cause a problem with user-id
switching in case when RLS policy is *unconditionally* appended then
evaluated on executor stage. I'd like to see the scenario.

 My guess is we'd have to treat the effective userID as part of the
 plancache lookup key to make it safe to inline anything related to RLS.

It might be a solution, if we append individual RLS policy at the
planner stage, depending on user-id.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Florian Pflug

On Jun4, 2012, at 18:38 , Kohei KaiGai wrote:
 2012/6/4 Florian Pflug f...@phlo.org:
 Without something like RLSBYPASS, the DBA needs to have intimate
 knowledge about the different RLS policies to e.g. guarantee that his
 backups aren't missing crucial information, or that the replication
 system indeed replicates all rows.
 
 With RLSBYPASS, all he needs to do is grant one privilege to his
 replication or backup user. The rest can be left to the development
 or support team for a specific application.

 It seems to me you can define a function which implements site-
 specific security requirement (E.g backup should not be prevented
 by RLS policy), then include it as a part of RLS policy
 (or implicitly added by extensions, like sepgsql tries to do).

Sure. But that requires each and every application which uses RLS
to provide support for special backup (or replication, or whatever)
privileges. And it requires the DBA to know how to assign these
privileges to certain roles for each any every application in question.
For shops which uses a lot of different applications, all with their
own RLS policy, that can quickly get out of hand.

Plus, a bug in one of these RLS policies has the potential to render
backups incomplete.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-04 Thread Tom Lane

Florian Pflug f...@phlo.org writes:
 On Jun4, 2012, at 18:38 , Kohei KaiGai wrote:
 2012/6/4 Florian Pflug f...@phlo.org:
 Without something like RLSBYPASS, the DBA needs to have intimate
 knowledge about the different RLS policies to e.g. guarantee that his
 backups aren't missing crucial information, or that the replication
 system indeed replicates all rows.

 It seems to me you can define a function which implements site-
 specific security requirement (E.g backup should not be prevented
 by RLS policy), then include it as a part of RLS policy
 (or implicitly added by extensions, like sepgsql tries to do).

 Sure. But that requires each and every application which uses RLS
 to provide support for special backup (or replication, or whatever)
 privileges. And it requires the DBA to know how to assign these
 privileges to certain roles for each any every application in question.
 For shops which uses a lot of different applications, all with their
 own RLS policy, that can quickly get out of hand.

 Plus, a bug in one of these RLS policies has the potential to render
 backups incomplete.

I agree with Florian here: if there is no way to disable RLS for
backups, the database will be un-administratable.  RLSBYPASS is not
necessarily the only or best way to provide such an override, but we
have to have something that is simple, foolproof, and *not* dependent
on the details of any one RLS policy.

I suspect that KaiGai-san's objection basically comes down to not
wanting to have what amounts to a backdoor in RLS policies.  However,
what Florian is saying is that you have to have a backdoor anyway,
unless you'd like to be at risk of losing data because it wasn't
backed up.  You can either have one well-understood, well-documented,
well-tested backdoor, or you can have an ad-hoc backdoor in each RLS
policy.  Nobody can think that the latter approach is preferable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-01 Thread Robert Haas

On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It may be an option to separate the case into two; a situation to execute
 the given query immediately just after optimization and never reused,
 and others.

Yes.  Simon suggested exactly this a while back, and I agree with him
that we ought to have it.

 Even though the second situation, it may give us better query execution
 plan, if we try to reconstruct query plan just before executor with
 assumption that expects immutable / stable function can be replaced
 by constant value prior to execution.
 In other words, this idea tries to query optimization again on EXECUTE
 statement against to its nature, to replace immutable / stable functions
 by constant value, and to generate wiser execute plan.
 At least, it may make sense to have a flag on prepared statement to
 indicate whether it has possible better plan with this re-construction.

This sounds complex and inefficient to me.

 Then, if so, we will be able to push the stuff corresponding to
 RLSBYPASS into the query optimization, and works transparently
 for users.

You're still going to need a way to make sure that the cluster can be
dumped properly.  RLSBYPASS accomplishes that; your scheme doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-06-01 Thread Kohei KaiGai

2012/6/1 Robert Haas robertmh...@gmail.com:
 On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It may be an option to separate the case into two; a situation to execute
 the given query immediately just after optimization and never reused,
 and others.

 Yes.  Simon suggested exactly this a while back, and I agree with him
 that we ought to have it.

It is good for me, also.

 Then, if so, we will be able to push the stuff corresponding to
 RLSBYPASS into the query optimization, and works transparently
 for users.

 You're still going to need a way to make sure that the cluster can be
 dumped properly.  RLSBYPASS accomplishes that; your scheme doesn't.

If something like has_superuser_privilege() OR is automatically
appended to user given clauses, it makes sure whole of the database
cluster is dumped.
That also means any permission checks are delayed to executor stage
(except for the case on simple query protocol, I mentioned above),
thus it simplifies the condition to invalidate prepared statement.

One problem I noticed last night around RLSBYPASS approach is:
it can take much number of invalidation whenever current user-id is
switched. Not only SET AUTHORIZATION, but SECURITY DEFINER
function's invocation also. I'm not sure whether this invalidation
storm is reasonable level, or not.

Is it unavailable to handle this type of implicit superuser checks
with existing EXECUTE statement?
I tried to run EXPLAIN with similar case.

postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y  $1 OR $2;
PREPARE
postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas'));
   QUERY PLAN

 Seq Scan on tbl  (cost=0.00..21.60 rows=1160 width=40)
(1 row

However,

postgres=# PREPARE p2(int) AS SELECT * FROM tbl
   WHERE y  $1 OR current_user in ('kaigai','rhaas');
PREPARE
postgres=# EXPLAIN EXECUTE p2(10);
 QUERY PLAN
-
 Seq Scan on tbl  (cost=0.00..30.30 rows=394 width=40)
   Filter: ((y  10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[])))
(2 rows)

Please assume the second condition something like superuser
checks in addition to the main security policy.
It implies an idea to replace a certain portion of clause that
consists of only constant values and stable / immutable functions
by shadow parameter to be calculated at executor stage, makes
sense to wipe out RLS policy for superusers. In addition, it requires
no new invalidation mechanism to prepared statements.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-31 Thread Yeb Havinga


On 2012-05-30 21:26, Kohei KaiGai wrote:

If we would have an ideal optimizer, I'd still like the optimizer to
wipe out redundant clauses transparently, rather than RLSBYPASS
permissions, because it just controls all-or-nothing stuff.
For example, if tuples are categorized to unclassified, classified or
secret, and RLS policy is configured as:
   ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
'classified')) OR (X IN 'unclassified)),
superuser can see all the tuples, and alice and bob can see
up to classified tuples.
Is it really hard to wipe out redundant condition at planner stage?
If current_user is obviously 'kaigai', it seems to me the left-side of
this clause can be wiped out at the planner stage.


The query's RLS policy would be simpler if the RLS policy function that 
returns the WHERE clause would take the user as argument, so its result 
does not contain user conditionals.


IF (current_user IN ('alice', 'bob')
THEN
  RETURN X IN ('unclassified', 'classified'))
ELSE
  RETURN X IN ('unclassified')
END IF;


regards,
Yeb



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-31 Thread Kohei KaiGai

2012/5/31 Yeb Havinga yebhavi...@gmail.com:
 On 2012-05-30 21:26, Kohei KaiGai wrote:

 If we would have an ideal optimizer, I'd still like the optimizer to
 wipe out redundant clauses transparently, rather than RLSBYPASS
 permissions, because it just controls all-or-nothing stuff.
 For example, if tuples are categorized to unclassified, classified or
 secret, and RLS policy is configured as:
   ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
 'classified')) OR (X IN 'unclassified)),
 superuser can see all the tuples, and alice and bob can see
 up to classified tuples.
 Is it really hard to wipe out redundant condition at planner stage?
 If current_user is obviously 'kaigai', it seems to me the left-side of
 this clause can be wiped out at the planner stage.


 The query's RLS policy would be simpler if the RLS policy function that
 returns the WHERE clause would take the user as argument, so its result does
 not contain user conditionals.

 IF (current_user IN ('alice', 'bob')
 THEN
  RETURN X IN ('unclassified', 'classified'))
 ELSE
  RETURN X IN ('unclassified')
 END IF;

Yes, it is a happy case. But the point I'm concern about is, the conditions
to branch cases are not limited to current user-id.
The RLS policy shall be appended at planner stage, so prepared statement
needs to be invalidated whenever its prerequisites are changed.
For example, someone may assign a function that returns RLS policy
depending on whether the current hour is even-number of odd-number.
It implies that we cannot predicate all the cases to invalidate prepared
statement with RLS policy, because of too much flexibility.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-31 Thread Robert Haas

On Wed, May 30, 2012 at 3:26 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 My preference is RLSBYPASS permission rather than the approach
 with functions that return policy clause at run-time, because it needs
 to invalidate prepared statement at random timing.
 In case of this function approach, the RLS policy shall be generated
 on planner stage, and we cannot have any assumption to the criteria
 of RLS policy. A function might generate RLS policy regarding to the
 current user id. Yes, it is straightforward. The prepared statement
 should be invalidate whenever current user-id got switched.
 However, someone may define a function that generate RLS policy
 depending on the value of client_min_messages for example.
 Do we need to invalidate prepared statement whenever GUC get
 updated? I think it is overkill. We cannot predicate all the criteria
 user want to control the RLS policy using the functions.
 So, RLSBYPASS permission is more simple way to limit number of
 situations to invalidate prepared statements.

That's a good point.

 If we would have an ideal optimizer, I'd still like the optimizer to
 wipe out redundant clauses transparently, rather than RLSBYPASS
 permissions, because it just controls all-or-nothing stuff.
 For example, if tuples are categorized to unclassified, classified or
 secret, and RLS policy is configured as:
  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
 'classified')) OR (X IN 'unclassified)),
 superuser can see all the tuples, and alice and bob can see
 up to classified tuples.
 Is it really hard to wipe out redundant condition at planner stage?
 If current_user is obviously 'kaigai', it seems to me the left-side of
 this clause can be wiped out at the planner stage.
 Do I consider the issue too simple?

Yes.  :-)

There are two problems.  First, if using the extended query protocol
(e.g. pgbench -M prepared) you can prepare a statement just once and
then execute it multiple times.  In this case, stable-functions cannot
be constant-folded at plan time, because they are only guaranteed to
remain constant for a *single* execution of the query, not for all
executions of the query.  So any optimization in this area would have
to be limited to cases where the simple query protocol is used.  I
think that might still be worth doing, but it's a significant
limitation, to be sure.  Second, at present, there is no guarantee
that the snapshot used for planning the query is the same as the
snapshot used for executing the query, though commit
d573e239f03506920938bf0be56c868d9c3416da made that happen in some
common cases.  If we were to do constant-folding of stable functions
using the planner snapshot, it would represent a behavior change from
previous releases.  I am not clear whether that has any real-world
consequences that we should be worried about.  It seems to me that the
path of least resistance might be to refactor the portal stuff so that
we can provide a uniform guarantee that, when using the simple query
protocol, the planner and executor snapshots will be the same ... but
I might be wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-31 Thread Kohei KaiGai

2012/5/31 Robert Haas robertmh...@gmail.com:
 If we would have an ideal optimizer, I'd still like the optimizer to
 wipe out redundant clauses transparently, rather than RLSBYPASS
 permissions, because it just controls all-or-nothing stuff.
 For example, if tuples are categorized to unclassified, classified or
 secret, and RLS policy is configured as:
  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
 'classified')) OR (X IN 'unclassified)),
 superuser can see all the tuples, and alice and bob can see
 up to classified tuples.
 Is it really hard to wipe out redundant condition at planner stage?
 If current_user is obviously 'kaigai', it seems to me the left-side of
 this clause can be wiped out at the planner stage.
 Do I consider the issue too simple?

 Yes.  :-)

 There are two problems.  First, if using the extended query protocol
 (e.g. pgbench -M prepared) you can prepare a statement just once and
 then execute it multiple times.  In this case, stable-functions cannot
 be constant-folded at plan time, because they are only guaranteed to
 remain constant for a *single* execution of the query, not for all
 executions of the query.  So any optimization in this area would have
 to be limited to cases where the simple query protocol is used.  I
 think that might still be worth doing, but it's a significant
 limitation, to be sure.  Second, at present, there is no guarantee
 that the snapshot used for planning the query is the same as the
 snapshot used for executing the query, though commit
 d573e239f03506920938bf0be56c868d9c3416da made that happen in some
 common cases.  If we were to do constant-folding of stable functions
 using the planner snapshot, it would represent a behavior change from
 previous releases.  I am not clear whether that has any real-world
 consequences that we should be worried about.  It seems to me that the
 path of least resistance might be to refactor the portal stuff so that
 we can provide a uniform guarantee that, when using the simple query
 protocol, the planner and executor snapshots will be the same ... but
 I might be wrong.

It may be an option to separate the case into two; a situation to execute
the given query immediately just after optimization and never reused,
and others.
Even though the second situation, it may give us better query execution
plan, if we try to reconstruct query plan just before executor with
assumption that expects immutable / stable function can be replaced
by constant value prior to execution.
In other words, this idea tries to query optimization again on EXECUTE
statement against to its nature, to replace immutable / stable functions
by constant value, and to generate wiser execute plan.
At least, it may make sense to have a flag on prepared statement to
indicate whether it has possible better plan with this re-construction.

Then, if so, we will be able to push the stuff corresponding to
RLSBYPASS into the query optimization, and works transparently
for users.

Isn't it feasible to implement?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-31 Thread Kohei KaiGai

2012/5/31 Kohei KaiGai kai...@kaigai.gr.jp:
 2012/5/31 Robert Haas robertmh...@gmail.com:
 If we would have an ideal optimizer, I'd still like the optimizer to
 wipe out redundant clauses transparently, rather than RLSBYPASS
 permissions, because it just controls all-or-nothing stuff.
 For example, if tuples are categorized to unclassified, classified or
 secret, and RLS policy is configured as:
  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
 'classified')) OR (X IN 'unclassified)),
 superuser can see all the tuples, and alice and bob can see
 up to classified tuples.
 Is it really hard to wipe out redundant condition at planner stage?
 If current_user is obviously 'kaigai', it seems to me the left-side of
 this clause can be wiped out at the planner stage.
 Do I consider the issue too simple?

 Yes.  :-)

 There are two problems.  First, if using the extended query protocol
 (e.g. pgbench -M prepared) you can prepare a statement just once and
 then execute it multiple times.  In this case, stable-functions cannot
 be constant-folded at plan time, because they are only guaranteed to
 remain constant for a *single* execution of the query, not for all
 executions of the query.  So any optimization in this area would have
 to be limited to cases where the simple query protocol is used.  I
 think that might still be worth doing, but it's a significant
 limitation, to be sure.  Second, at present, there is no guarantee
 that the snapshot used for planning the query is the same as the
 snapshot used for executing the query, though commit
 d573e239f03506920938bf0be56c868d9c3416da made that happen in some
 common cases.  If we were to do constant-folding of stable functions
 using the planner snapshot, it would represent a behavior change from
 previous releases.  I am not clear whether that has any real-world
 consequences that we should be worried about.  It seems to me that the
 path of least resistance might be to refactor the portal stuff so that
 we can provide a uniform guarantee that, when using the simple query
 protocol, the planner and executor snapshots will be the same ... but
 I might be wrong.

 It may be an option to separate the case into two; a situation to execute
 the given query immediately just after optimization and never reused,
 and others.
 Even though the second situation, it may give us better query execution
 plan, if we try to reconstruct query plan just before executor with
 assumption that expects immutable / stable function can be replaced
 by constant value prior to execution.
 In other words, this idea tries to query optimization again on EXECUTE
 statement against to its nature, to replace immutable / stable functions
 by constant value, and to generate wiser execute plan.
 At least, it may make sense to have a flag on prepared statement to
 indicate whether it has possible better plan with this re-construction.

 Then, if so, we will be able to push the stuff corresponding to
 RLSBYPASS into the query optimization, and works transparently
 for users.

 Isn't it feasible to implement?

If we could replace a particular term that consists of constant values
and stable / immutable functions only by parameter references,
it may enable to handle the term as if a constant value, but actual
calculation is delayed to executor stage.

For example, according to this idea,
  PREPARE p1(int) AS SELECT * FROM tbl WHERE
  current_user in ('alice','bob') AND X  $1;
shall be internally rewritten to,
  PREPARE p1(int) AS SELECT * FROM tbl WHERE
  $2 AND X$1;

then, $2 is implicitly calculated just before execution of this prepared
statement. The snapshot to be used for this calculation is same with
executor's one. It seems to me it is a feasible idea with less invasive
implementation to existing planner.

Does it make sense to describe exceptional condition using regular
clause, instead of special permission?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-30 Thread Kohei KaiGai

2012/5/29 Robert Haas robertmh...@gmail.com:
 On Fri, May 25, 2012 at 5:08 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I think it is a good idea not to apply RLS when current user has
 superuser privilege from perspective of security model consistency,
 but it is inconsistent to check privileges underlying tables.

 Seems like a somewhat random wart, if it's just an exception for
 superusers.  I think we need to do better than that.  For example, at
 my last company, sales reps A and B were permitted to see all
 customers of the company, but sales reps C, D, E, F, G, H, I, and J
 were permitted to see only their own accounts.  Those sorts of
 policies need to be easy to implement.

 Probably, if sales_rep column records its responsible repo, its
 security policy is able to be described as:
  (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep())

 Yes, but that's a pain to optimize.  When A or B tries to select from
 the table, the query optimizer has to realize that my_sales_rep() is
 stable, inline it, do constant simplification and throw away the
 entire OR clause.  Note that this won't work today, because we only
 constant-fold immutable functions, not stable ones.  Then, since there
 are no remaining security quals, we have to realize that we actually
 don't need the security_barrier subquery RTE at all, and optimize that
 away as well.  Maybe we can make all of that work, and maybe we should
 make all of it work, but it's fairly complex.  The advantage of having
 the function return the qual rather than contain the qual is that all
 of that goes away.  The function can choose to return nothing (no RLS
 for this user) or it can choose to return something (which will likely
 be simpler than what it would have needed to return out of the chute).
  One disadvantage is that we have to parse the returned qual instead
 of just sucking in a node-tree.

 Anyway, I don't feel super-strongly about this particular idea, so if
 I'm the only one who likes it, fine, but that having been said, I
 think users are going to want a *declarative* way to control which
 policies are applied to which users.  Suppose Bob is a sales rep who
 is only allowed to see his own customers, but then one day, we decide
 we trust Bob after all, so we want to let him see everything.  We
 could go back and update the IN (...) list in the security policy
 function, but that's an ugly and unscalable nuisance, especially if
 we've got 10,000 users.  It's much nicer to be able to just grant bob
 a permission using some kind of, well, GRANT command.  That's what
 we're doing, after all.  Alastair's proposal of making the security
 policy a property of the GRANT is one way of tackling that, and the
 RLSBYPASS permission I proposed elsewhere is another.  Something along
 these lines seems likely to improve performance (by replacing a query
 optimization problem with a syscache lookup) as well as ease-of-use.

My preference is RLSBYPASS permission rather than the approach
with functions that return policy clause at run-time, because it needs
to invalidate prepared statement at random timing.
In case of this function approach, the RLS policy shall be generated
on planner stage, and we cannot have any assumption to the criteria
of RLS policy. A function might generate RLS policy regarding to the
current user id. Yes, it is straightforward. The prepared statement
should be invalidate whenever current user-id got switched.
However, someone may define a function that generate RLS policy
depending on the value of client_min_messages for example.
Do we need to invalidate prepared statement whenever GUC get
updated? I think it is overkill. We cannot predicate all the criteria
user want to control the RLS policy using the functions.
So, RLSBYPASS permission is more simple way to limit number of
situations to invalidate prepared statements.


If we would have an ideal optimizer, I'd still like the optimizer to
wipe out redundant clauses transparently, rather than RLSBYPASS
permissions, because it just controls all-or-nothing stuff.
For example, if tuples are categorized to unclassified, classified or
secret, and RLS policy is configured as:
  ((current_user IN ('alice', 'bob') AND X IN ('unclassified',
'classified')) OR (X IN 'unclassified)),
superuser can see all the tuples, and alice and bob can see
up to classified tuples.
Is it really hard to wipe out redundant condition at planner stage?
If current_user is obviously 'kaigai', it seems to me the left-side of
this clause can be wiped out at the planner stage.
Do I consider the issue too simple?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May25, 2012, at 23:32 , Kohei KaiGai wrote:
 However, it should not be applied on triggers being set on PK tables,
 because it allows to modify or delete primary-key being referenced by
 invisible foreign-key from the viewpoint of operators.
 I think, it makes sense to have exceptional cases; to make sure
 foreign-key constraint at the baseline.

Good point. If we simply ignore RLS policies when executing parent-side
FK constraint triggers, however, we'll happily UPDATE or DELETE invisible
rows should the constraint be defined as CASCADE or SET NULL. Which is,
um, not exactly desirable…

FK triggers already deal with a related issue, namely a child row being
invisible to a REPEATABLE READ transaction which attempts to update the
parent row. We handle this case by always executing queries in
parent-side FK constraint triggers with a newly created READ COMMITTED
snapshot, but verifying that all matching rows are also visible under
the transaction's actual REPEATABLE READ snapshot. If some are not, we
throw a serialization error.

FK constraints in the presence of RLS should do the same, I think. When
executing a parent-side constraint trigger we should *not* apply the
RLS policy as a filter, but instead check that all returned rows are
indeed visible according to the RLS policy. If some are not we throw,
say, a policy enforcement error.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Kohei KaiGai

2012/5/27 Alastair Turner b...@ctrlf5.co.za:
 Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25,
 2012 at 11:08 PM:
 If we assume RLS is applied when user has
 no privileges on tables, the current ExecCheckRTEPerms()
 always raises an error towards unprivileged users, prior to
 execution of queries.
 Isn't it preferable behavior to allow unprivileged users to
 reference a table (or columns) when it has RLS policy?

 Rather than assuming the the RLS checks will be applied when there are
 no privileges it would make sense to regard RLS as a limitation on the
 scope of a particular privilege. This makes RLS a property of a
 particular grant of a privilege rather than of the table. Viewed this
 way it is possible to control which users are subject to restrictions
 imposed by the RLS function, which will avoid RLS overhead for
 operations which don't require it while allowing checks for those that
 do, provide a mechanism exempting object owners from RLS checks and
 make it possible to avoid pg_dump calling user defined code.

 This suggests an alternative declaration syntax, to use the salesman example:

 GRANT SELECT ON TABLE client_detail TO super_sales_group;
 GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH
 (QUALIFICATION FUNCTION sales_view_check);

 and since more easily usable security features will be used more
 effectively, a possible future inlining of the condition:

 GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE
 sales_rep = my_sales_rep();

It seems to me an interesting proposition, because I didn't thought such kind of
statement to provide row-level policies.

We have a common issue for the idea that check privileges of underlying tables;
when we should check the privileges and make a decision whether RLS policy is
appended, or not.
Due to query optimization reason, the RLS policy should be appended prior to
the query optimization. At least, we want to utilize RLS policy for index scans,
rather than sequential scan.

On the other hand, all the permission checks are currently applied at executor
stage, not planner stage. Table / column-level privileges are also applied at
head of the executor. It is headache, if we go ahead with an idea to integrate
RLS and existing privilege checks.
One exception is simplify_function() that tries to expand enough simple SQL
functions if current user has ACL_EXECUTE privilege at planner stage.

One other issue is whether we should allow any users with enough privileges
to bypass RLS, or only superusers. I'm still not sure how the existing checks
perform with RLS.
If and when Alice has SELECT permission on column X and Y of TBL with RLS,
but her query tries to reference X,Y and Z. In this case, existing privilege
mechanism shall raise an error, but the criteria with underlying table allows to
run this query. It seems to me it is a straightforward criteria to
limit superusers
to bypass RLS...

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Kohei KaiGai

2012/5/28 Florian Pflug f...@phlo.org:
 On May28, 2012, at 02:46 , Noah Misch wrote:
 On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote:
 Since the security barrier flag carries a potentially hefty performance
 penalty, I think it should be optional. Application which don't allow
 SQL-level access to the database might still benefit from row-level 
 security,
 because it saves them from having to manually add the WHERE clause to every
 statement, or having to wrap all their tables with views. Yet without direct
 SQL-level access, the security barrier thing isn't really necessary, so
 it'd be nice if they wouldn't have to pay for it. How about

  ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier)

 Backward compatibility concerns limited our options when retrofitting the
 security_barrier treatment for views, but I'd rather not add a knob 
 completely
 disabling it in the context of a brand new feature.  A better avenue is to
 enhance our facilities for identifying safe query fragments.  For example,
 ALTER FUNCTION ... LEAKPROOF is superuser-only.  Adding a way for a table
 owner to similarly trust functions for the purpose of his own tables would
 help close the gap that motivates such an all-or-nothing knob.

 Hm, I'm not sure a per-function flag is really the solution here. Neither,
 however, is a per-RLS flag as your arguments made me realize. There really are
 three orthogonal concepts here, all of which allow security barriers to be
 ignored, namely

  A) Trusting the use not to exploit leaks, i.e. what you call a trusted query
     generator

  B) There being no leaks, i.e. all functions being LEAKPROOF

  C) Not caring about leaks, i.e. the security_barrier flag

 Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled
 by the security_barrier flag. However, as you pointed out, it's a bit of a
 dubious concept and only really necessary for backwards compatibility because
 it reflects pre-9.2 behaviour.

 Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly 
 the
 wrong tool for the job, so consider my suggestion withdrawn. What we actually
 want, I think, is a per-role flag which marks a role as leak trusted. 
 Queries
 issued by such a role would behave as if all functions are LEAKPROOF, since 
 even
 if there is a leak, the role is trusted not to exploit it.

It seems to me we are confusing about security-barrier and leakproof flag.
The purpose of leakproof flag is to ensure the function is safe to execute,
so it is allowed to pushed down the function into a sub-query with security-
barrier. It works to decide whether the user given qualifier is safe to
push-down. The RLS policy itself is placed within a sub-query from the
beginning, thus not needed them to be leakproof function.

Existing view with security-barrier flag cannot prevent information leaks,
even when the owner set leaky function on its qualifiers. However, it is
owner's responsibility. In a similar fashion, I don't think RLS works fine
in case when owner set leaky function as its policy by himself.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May29, 2012, at 13:59 , Kohei KaiGai wrote:
 2012/5/28 Florian Pflug f...@phlo.org:
 Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled
 by the security_barrier flag. However, as you pointed out, it's a bit of a
 dubious concept and only really necessary for backwards compatibility because
 it reflects pre-9.2 behaviour.
 
 Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly 
 the
 wrong tool for the job, so consider my suggestion withdrawn. What we actually
 want, I think, is a per-role flag which marks a role as leak trusted. 
 Queries
 issued by such a role would behave as if all functions are LEAKPROOF, since 
 even
 if there is a leak, the role is trusted not to exploit it.
 
 It seems to me we are confusing about security-barrier and leakproof flag.
 The purpose of leakproof flag is to ensure the function is safe to execute,
 so it is allowed to pushed down the function into a sub-query with security-
 barrier. It works to decide whether the user given qualifier is safe to
 push-down. The RLS policy itself is placed within a sub-query from the
 beginning, thus not needed them to be leakproof function.

My suggestion (which I then withdrew) was for that per-policy flag to decide
whether the sub-query which applies the RLS policy acts as a security barrier
or not. Thus, with the flag set, only leakproof predicates would haven been
allowed to be pushed down into that subquery, whereas without the flag every
predicate would haven been a candidate for being pushed down. The flag was
never intended to mark the RLS policy itself as leakproof or leaky - that,
as you said, makes little sense.

My motivation for suggesting that flag was to prevent people who want RLS,
yet aren't concerned about leaks, from having to pay the performance penalty
associated with not pushing down predicates.

Noah's comments, however, made me realize that whether one cares about
potential leaks is usually not a per-table property, but rather a property
of the user executing the query. Some users (like the middle-ware that sits
on top of your database) you might trust to not exploit leaks, while wanting
the tightest security possible for others. Which made me suggest a per-role
flag which essentially overrides the security barrier stuff. Explaining that
behaviour as behave as if all functions are LEAKPROOF might haven been a
tad confusion, though. Maybe a better explanation is behave as if no
sub-query has the security barrier flag set, or even don't let security
concerns prevent predicate push-down.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May29, 2012, at 13:37 , Kohei KaiGai wrote:
 2012/5/27 Alastair Turner b...@ctrlf5.co.za:
 Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25,
 2012 at 11:08 PM:
 If we assume RLS is applied when user has
 no privileges on tables, the current ExecCheckRTEPerms()
 always raises an error towards unprivileged users, prior to
 execution of queries.
 Isn't it preferable behavior to allow unprivileged users to
 reference a table (or columns) when it has RLS policy?

I don't think so. Per-table and per-column permission checks should still
apply independent from any RLS policy. You can always grant SELECT access
to PUBLIC if want to rely solely on the RLS policy...

 Rather than assuming the the RLS checks will be applied when there are
 no privileges it would make sense to regard RLS as a limitation on the
 scope of a particular privilege. This makes RLS a property of a
 particular grant of a privilege rather than of the table. Viewed this
 way it is possible to control which users are subject to restrictions
 imposed by the RLS function, which will avoid RLS overhead for
 operations which don't require it while allowing checks for those that
 do, provide a mechanism exempting object owners from RLS checks and
 make it possible to avoid pg_dump calling user defined code.
 
 This suggests an alternative declaration syntax, to use the salesman example:
 
 GRANT SELECT ON TABLE client_detail TO super_sales_group;
 GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH
 (QUALIFICATION FUNCTION sales_view_check);
 
 and since more easily usable security features will be used more
 effectively, a possible future inlining of the condition:
 
 GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE
 sales_rep = my_sales_rep();
 
 It seems to me an interesting proposition, because I didn't thought such kind 
 of
 statement to provide row-level policies.

Note, though, that due to role inheritance, multiple such QUALIFICATION 
FUNCTIONS
could be in scope for a single table. In that case, I guess you'd have to
OR them together, i.e. hide a row only if all of them return false.

 We have a common issue for the idea that check privileges of underlying 
 tables;
 when we should check the privileges and make a decision whether RLS policy is
 appended, or not.
 Due to query optimization reason, the RLS policy should be appended prior to
 the query optimization. At least, we want to utilize RLS policy for index 
 scans,
 rather than sequential scan.

Agreed. With the current design, the RLS policy has to be applied before 
planning,
not during execution.

 One other issue is whether we should allow any users with enough privileges
 to bypass RLS, or only superusers. I'm still not sure how the existing checks
 perform with RLS.

Every user who has the power to disable the RLS policy should also at least be
able to circumvent it temporarily, I think. This includes superusers, the table
owner (since he's presumably allowed to do ALTER TABLE .. RESET ROW POLICY),
and maybe the database owner.

 If and when Alice has SELECT permission on column X and Y of TBL with RLS,
 but her query tries to reference X,Y and Z. In this case, existing privilege
 mechanism shall raise an error, but the criteria with underlying table allows 
 to
 run this query. It seems to me it is a straightforward criteria to
 limit superusers
 to bypass RLS…

I'm not sure I understand. Do you mean the Alice references only columns X
and Y in her query, but the RLS policy adds the reference to column Z?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Kohei KaiGai

2012/5/29 Florian Pflug f...@phlo.org:
 On May29, 2012, at 13:59 , Kohei KaiGai wrote:
 2012/5/28 Florian Pflug f...@phlo.org:
 Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled
 by the security_barrier flag. However, as you pointed out, it's a bit of a
 dubious concept and only really necessary for backwards compatibility 
 because
 it reflects pre-9.2 behaviour.

 Concept A is what I was aiming for. Per the above, a per-RLS flag is 
 clearly the
 wrong tool for the job, so consider my suggestion withdrawn. What we 
 actually
 want, I think, is a per-role flag which marks a role as leak trusted. 
 Queries
 issued by such a role would behave as if all functions are LEAKPROOF, since 
 even
 if there is a leak, the role is trusted not to exploit it.

 It seems to me we are confusing about security-barrier and leakproof flag.
 The purpose of leakproof flag is to ensure the function is safe to execute,
 so it is allowed to pushed down the function into a sub-query with security-
 barrier. It works to decide whether the user given qualifier is safe to
 push-down. The RLS policy itself is placed within a sub-query from the
 beginning, thus not needed them to be leakproof function.

 My suggestion (which I then withdrew) was for that per-policy flag to decide
 whether the sub-query which applies the RLS policy acts as a security barrier
 or not. Thus, with the flag set, only leakproof predicates would haven been
 allowed to be pushed down into that subquery, whereas without the flag every
 predicate would haven been a candidate for being pushed down. The flag was
 never intended to mark the RLS policy itself as leakproof or leaky - that,
 as you said, makes little sense.

 My motivation for suggesting that flag was to prevent people who want RLS,
 yet aren't concerned about leaks, from having to pay the performance penalty
 associated with not pushing down predicates.

I think it is a reasonable selection. For example, it make sense in case when
users obviously don't have privilege to create a function and don't care about
estimation of invisible values using iteration of proving.
The owner is the only person who can determine whether it is harmless, or not.

 Noah's comments, however, made me realize that whether one cares about
 potential leaks is usually not a per-table property, but rather a property
 of the user executing the query. Some users (like the middle-ware that sits
 on top of your database) you might trust to not exploit leaks, while wanting
 the tightest security possible for others. Which made me suggest a per-role
 flag which essentially overrides the security barrier stuff. Explaining that
 behaviour as behave as if all functions are LEAKPROOF might haven been a
 tad confusion, though. Maybe a better explanation is behave as if no
 sub-query has the security barrier flag set, or even don't let security
 concerns prevent predicate push-down.

Hmm... It might make sense to allow table-owner to set up suitable grade
between security and performance. However, isn't it a feature to be
discussed in the 2nd commit-fest? I think we can construct this type of
adjustment on the basis of minimum functionality.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May29, 2012, at 16:13 , Kohei KaiGai wrote:
 2012/5/29 Florian Pflug f...@phlo.org:
 My motivation for suggesting that flag was to prevent people who want RLS,
 yet aren't concerned about leaks, from having to pay the performance penalty
 associated with not pushing down predicates.
 
 I think it is a reasonable selection. For example, it make sense in case when
 users obviously don't have privilege to create a function and don't care about
 estimation of invisible values using iteration of proving.
 The owner is the only person who can determine whether it is harmless, or not.

Nice to know that we're on the same page here.

 Noah's comments, however, made me realize that whether one cares about
 potential leaks is usually not a per-table property, but rather a property
 of the user executing the query. Some users (like the middle-ware that sits
 on top of your database) you might trust to not exploit leaks, while wanting
 the tightest security possible for others. Which made me suggest a per-role
 flag which essentially overrides the security barrier stuff. Explaining that
 behaviour as behave as if all functions are LEAKPROOF might haven been a
 tad confusion, though. Maybe a better explanation is behave as if no
 sub-query has the security barrier flag set, or even don't let security
 concerns prevent predicate push-down.
 
 Hmm... It might make sense to allow table-owner to set up suitable grade
 between security and performance. However, isn't it a feature to be
 discussed in the 2nd commit-fest? I think we can construct this type of
 adjustment on the basis of minimum functionality.

Agreed. If the flag is per-role, not per-policy, the feature is orthogonal
to the whole RLS feature. So yeah, let's postpone it to a later date.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Kohei KaiGai

2012/5/29 Florian Pflug f...@phlo.org:
 On May29, 2012, at 13:37 , Kohei KaiGai wrote:
 2012/5/27 Alastair Turner b...@ctrlf5.co.za:
 Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25,
 2012 at 11:08 PM:
 If we assume RLS is applied when user has
 no privileges on tables, the current ExecCheckRTEPerms()
 always raises an error towards unprivileged users, prior to
 execution of queries.
 Isn't it preferable behavior to allow unprivileged users to
 reference a table (or columns) when it has RLS policy?

 I don't think so. Per-table and per-column permission checks should still
 apply independent from any RLS policy. You can always grant SELECT access
 to PUBLIC if want to rely solely on the RLS policy...

Frankly, I have same opinion. Existing per-table and column permission
checks should perform independently from RLS. What I wanted to explain
is the combination of them makes confusion for us.

 Rather than assuming the the RLS checks will be applied when there are
 no privileges it would make sense to regard RLS as a limitation on the
 scope of a particular privilege. This makes RLS a property of a
 particular grant of a privilege rather than of the table. Viewed this
 way it is possible to control which users are subject to restrictions
 imposed by the RLS function, which will avoid RLS overhead for
 operations which don't require it while allowing checks for those that
 do, provide a mechanism exempting object owners from RLS checks and
 make it possible to avoid pg_dump calling user defined code.

 This suggests an alternative declaration syntax, to use the salesman 
 example:

 GRANT SELECT ON TABLE client_detail TO super_sales_group;
 GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH
 (QUALIFICATION FUNCTION sales_view_check);

 and since more easily usable security features will be used more
 effectively, a possible future inlining of the condition:

 GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE
 sales_rep = my_sales_rep();

 It seems to me an interesting proposition, because I didn't thought such 
 kind of
 statement to provide row-level policies.

 Note, though, that due to role inheritance, multiple such QUALIFICATION 
 FUNCTIONS
 could be in scope for a single table. In that case, I guess you'd have to
 OR them together, i.e. hide a row only if all of them return false.

Nice catch! Indeed, it makes RLS policy to be applied complex, and hard to apply
index-scan.

 We have a common issue for the idea that check privileges of underlying 
 tables;
 when we should check the privileges and make a decision whether RLS policy is
 appended, or not.
 Due to query optimization reason, the RLS policy should be appended prior to
 the query optimization. At least, we want to utilize RLS policy for index 
 scans,
 rather than sequential scan.

 Agreed. With the current design, the RLS policy has to be applied before 
 planning,
 not during execution.

 One other issue is whether we should allow any users with enough privileges
 to bypass RLS, or only superusers. I'm still not sure how the existing checks
 perform with RLS.

 Every user who has the power to disable the RLS policy should also at least be
 able to circumvent it temporarily, I think. This includes superusers, the 
 table
 owner (since he's presumably allowed to do ALTER TABLE .. RESET ROW POLICY),
 and maybe the database owner.

I'm not inclined this criteria, because existing privilege system
allows to restrict
owner's behavior, even though it can be invalidated by owner itself.
In other words, the privilege mechanism performs according to the current
setting unless it is not disabled by owner itself.
On the other hand, superuser is allowed to ignore the security setting.
In RLS situation, it is corresponding to run the query without RLS policy.

 If and when Alice has SELECT permission on column X and Y of TBL with RLS,
 but her query tries to reference X,Y and Z. In this case, existing privilege
 mechanism shall raise an error, but the criteria with underlying table 
 allows to
 run this query. It seems to me it is a straightforward criteria to
 limit superusers
 to bypass RLS…

 I'm not sure I understand. Do you mean the Alice references only columns X
 and Y in her query, but the RLS policy adds the reference to column Z?

No, in this scenario, Alice tries to reference X,Y and Z, then it
should be violated.
I wanted to show is; If RLS performs only when current user has no table/column
level privilege, it makes strange behavior.
Here is no matter if RLS works independently from per table / column
permissions.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Robert Haas

On Tue, May 29, 2012 at 9:12 AM, Florian Pflug f...@phlo.org wrote:
 On May29, 2012, at 13:37 , Kohei KaiGai wrote:
 2012/5/27 Alastair Turner b...@ctrlf5.co.za:
 Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25,
 2012 at 11:08 PM:
 If we assume RLS is applied when user has
 no privileges on tables, the current ExecCheckRTEPerms()
 always raises an error towards unprivileged users, prior to
 execution of queries.
 Isn't it preferable behavior to allow unprivileged users to
 reference a table (or columns) when it has RLS policy?

 I don't think so. Per-table and per-column permission checks should still
 apply independent from any RLS policy. You can always grant SELECT access
 to PUBLIC if want to rely solely on the RLS policy...

I strongly agree with this.  Permissions checks should be applied
first, so that we can boot anyone who has no business being there at
all as early in the process as possible.  As a second step, if you
have a right to the query the table, row-level security should
intervene to control which rows you can see.

One idea might be to have a grantable permission that permits the RLS
policy to be bypassed.  So, if a user has only SELECT permission, they
can select from the table, but the RLS policy will apply.  If they
have both SELECT and RLSBYPASS (probably not what we really want to
call it) permission, then they can select from the table and the RLS
policy will be skipped.  This means that superusers automatically skip
all RLS policies (which seems right) and table owners skip them by
default (but could revoke their own privileges) and other people can
skip them if the table owner (or the superuser) grants them the
appropriate privilege on the table involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May29, 2012, at 16:34 , Robert Haas wrote:
 One idea might be to have a grantable permission that permits the RLS
 policy to be bypassed.  So, if a user has only SELECT permission, they
 can select from the table, but the RLS policy will apply.  If they
 have both SELECT and RLSBYPASS (probably not what we really want to
 call it) permission, then they can select from the table and the RLS
 policy will be skipped.  This means that superusers automatically skip
 all RLS policies (which seems right) and table owners skip them by
 default (but could revoke their own privileges) and other people can
 skip them if the table owner (or the superuser) grants them the
 appropriate privilege on the table involved.

I like it. Seems to support all use-cases I can come up with, and extends
existing privilege semantics in a natural way.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Kohei KaiGai

2012/5/29 Robert Haas robertmh...@gmail.com:
 One idea might be to have a grantable permission that permits the RLS
 policy to be bypassed.  So, if a user has only SELECT permission, they
 can select from the table, but the RLS policy will apply.  If they
 have both SELECT and RLSBYPASS (probably not what we really want to
 call it) permission, then they can select from the table and the RLS
 policy will be skipped.  This means that superusers automatically skip
 all RLS policies (which seems right) and table owners skip them by
 default (but could revoke their own privileges) and other people can
 skip them if the table owner (or the superuser) grants them the
 appropriate privilege on the table involved.

Isn't it unavailable to describe using RLS policy?
In case when 'alice' and 'bob' should bypass RLS policy on a certain table,
we will be able to describe it as follows:
(current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z))

I have one concern the current_user in (...) is not wiped out at the planner
stage, although its evaluation result is obvious prior to execution.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Robert Haas

On Fri, May 25, 2012 at 5:08 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I think it is a good idea not to apply RLS when current user has
 superuser privilege from perspective of security model consistency,
 but it is inconsistent to check privileges underlying tables.

 Seems like a somewhat random wart, if it's just an exception for
 superusers.  I think we need to do better than that.  For example, at
 my last company, sales reps A and B were permitted to see all
 customers of the company, but sales reps C, D, E, F, G, H, I, and J
 were permitted to see only their own accounts.  Those sorts of
 policies need to be easy to implement.

 Probably, if sales_rep column records its responsible repo, its
 security policy is able to be described as:
  (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep())

Yes, but that's a pain to optimize.  When A or B tries to select from
the table, the query optimizer has to realize that my_sales_rep() is
stable, inline it, do constant simplification and throw away the
entire OR clause.  Note that this won't work today, because we only
constant-fold immutable functions, not stable ones.  Then, since there
are no remaining security quals, we have to realize that we actually
don't need the security_barrier subquery RTE at all, and optimize that
away as well.  Maybe we can make all of that work, and maybe we should
make all of it work, but it's fairly complex.  The advantage of having
the function return the qual rather than contain the qual is that all
of that goes away.  The function can choose to return nothing (no RLS
for this user) or it can choose to return something (which will likely
be simpler than what it would have needed to return out of the chute).
 One disadvantage is that we have to parse the returned qual instead
of just sucking in a node-tree.

Anyway, I don't feel super-strongly about this particular idea, so if
I'm the only one who likes it, fine, but that having been said, I
think users are going to want a *declarative* way to control which
policies are applied to which users.  Suppose Bob is a sales rep who
is only allowed to see his own customers, but then one day, we decide
we trust Bob after all, so we want to let him see everything.  We
could go back and update the IN (...) list in the security policy
function, but that's an ugly and unscalable nuisance, especially if
we've got 10,000 users.  It's much nicer to be able to just grant bob
a permission using some kind of, well, GRANT command.  That's what
we're doing, after all.  Alastair's proposal of making the security
policy a property of the GRANT is one way of tackling that, and the
RLSBYPASS permission I proposed elsewhere is another.  Something along
these lines seems likely to improve performance (by replacing a query
optimization problem with a syscache lookup) as well as ease-of-use.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Robert Haas

On Tue, May 29, 2012 at 10:57 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2012/5/29 Robert Haas robertmh...@gmail.com:
 One idea might be to have a grantable permission that permits the RLS
 policy to be bypassed.  So, if a user has only SELECT permission, they
 can select from the table, but the RLS policy will apply.  If they
 have both SELECT and RLSBYPASS (probably not what we really want to
 call it) permission, then they can select from the table and the RLS
 policy will be skipped.  This means that superusers automatically skip
 all RLS policies (which seems right) and table owners skip them by
 default (but could revoke their own privileges) and other people can
 skip them if the table owner (or the superuser) grants them the
 appropriate privilege on the table involved.

 Isn't it unavailable to describe using RLS policy?
 In case when 'alice' and 'bob' should bypass RLS policy on a certain table,
 we will be able to describe it as follows:
    (current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z))

 I have one concern the current_user in (...) is not wiped out at the planner
 stage, although its evaluation result is obvious prior to execution.

Yes, that's one problem with doing it that way.  The fact that the
superuser is not guaranteed-excluded is another; that can of course be
fixed by adding a special-case hack for superusers, but IMHO this is
more elegant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-29 Thread Florian Pflug

On May29, 2012, at 16:57 , Kohei KaiGai wrote:
 2012/5/29 Robert Haas robertmh...@gmail.com:
 One idea might be to have a grantable permission that permits the RLS
 policy to be bypassed.  So, if a user has only SELECT permission, they
 can select from the table, but the RLS policy will apply.  If they
 have both SELECT and RLSBYPASS (probably not what we really want to
 call it) permission, then they can select from the table and the RLS
 policy will be skipped.  This means that superusers automatically skip
 all RLS policies (which seems right) and table owners skip them by
 default (but could revoke their own privileges) and other people can
 skip them if the table owner (or the superuser) grants them the
 appropriate privilege on the table involved.

 Isn't it unavailable to describe using RLS policy?
 In case when 'alice' and 'bob' should bypass RLS policy on a certain table,
 we will be able to describe it as follows:
(current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z))
 
 I have one concern the current_user in (...) is not wiped out at the planner
 stage, although its evaluation result is obvious prior to execution.

I think you'd simply pretend the RLS policy isn't there (i.e., don't rewrite
the table reference into a sub-query) if the user has the RLSBYPASS privilege,
not add another clause which selectively neuters the RLS policy.

Wait a moment.. now I get it.

You're saying that Robert's RLSBYPASS privilege is redundant, since one can
always add a current_user IN ... clause to the RLS policy function. That might
be the case theoretically, but I still believe that Robert's suggestion has a
lot of benefits. First, due to role inheritance, a simply clause like the above
isn't sufficient. Second, it doesn't make superusers and table owner bypass
RLS by default. Which I think is a good idea (bypassing by default, that is),
because it prevents broken backups (for those using pg_dump) and data exports.
It doesn't cost anything in terms of security, because those users could disable
the RLS policy away if they are so inclined.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-28 Thread Florian Pflug

On May28, 2012, at 02:46 , Noah Misch wrote:
 On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote:
 Since the security barrier flag carries a potentially hefty performance
 penalty, I think it should be optional. Application which don't allow
 SQL-level access to the database might still benefit from row-level security,
 because it saves them from having to manually add the WHERE clause to every
 statement, or having to wrap all their tables with views. Yet without direct
 SQL-level access, the security barrier thing isn't really necessary, so
 it'd be nice if they wouldn't have to pay for it. How about
 
  ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier)
 
 Backward compatibility concerns limited our options when retrofitting the
 security_barrier treatment for views, but I'd rather not add a knob completely
 disabling it in the context of a brand new feature.  A better avenue is to
 enhance our facilities for identifying safe query fragments.  For example,
 ALTER FUNCTION ... LEAKPROOF is superuser-only.  Adding a way for a table
 owner to similarly trust functions for the purpose of his own tables would
 help close the gap that motivates such an all-or-nothing knob.

Hm, I'm not sure a per-function flag is really the solution here. Neither,
however, is a per-RLS flag as your arguments made me realize. There really are
three orthogonal concepts here, all of which allow security barriers to be
ignored, namely

  A) Trusting the use not to exploit leaks, i.e. what you call a trusted query
 generator

  B) There being no leaks, i.e. all functions being LEAKPROOF

  C) Not caring about leaks, i.e. the security_barrier flag

Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled
by the security_barrier flag. However, as you pointed out, it's a bit of a
dubious concept and only really necessary for backwards compatibility because
it reflects pre-9.2 behaviour.

Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly the
wrong tool for the job, so consider my suggestion withdrawn. What we actually
want, I think, is a per-role flag which marks a role as leak trusted. Queries
issued by such a role would behave as if all functions are LEAKPROOF, since even
if there is a leak, the role is trusted not to exploit it.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-27 Thread Alastair Turner

Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25,
2012 at 11:08 PM:
 If we assume RLS is applied when user has
 no privileges on tables, the current ExecCheckRTEPerms()
 always raises an error towards unprivileged users, prior to
 execution of queries.
 Isn't it preferable behavior to allow unprivileged users to
 reference a table (or columns) when it has RLS policy?

Rather than assuming the the RLS checks will be applied when there are
no privileges it would make sense to regard RLS as a limitation on the
scope of a particular privilege. This makes RLS a property of a
particular grant of a privilege rather than of the table. Viewed this
way it is possible to control which users are subject to restrictions
imposed by the RLS function, which will avoid RLS overhead for
operations which don't require it while allowing checks for those that
do, provide a mechanism exempting object owners from RLS checks and
make it possible to avoid pg_dump calling user defined code.

This suggests an alternative declaration syntax, to use the salesman example:

GRANT SELECT ON TABLE client_detail TO super_sales_group;
GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH
(QUALIFICATION FUNCTION sales_view_check);

and since more easily usable security features will be used more
effectively, a possible future inlining of the condition:

GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE
sales_rep = my_sales_rep();

Regards,

Alastair.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-27 Thread Noah Misch

On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote:
 On May24, 2012, at 18:42 , Kohei KaiGai wrote:
  As we discussed, it causes a problem with approach to append
  additional qualifiers to where clause implicitly, because it does
  not solve the matter corresponding to the order to execute
  qualifiers. So, I'm inclined to the approach to replace reference
  to tables with security policy by sub-queries with security barrier
  flag.
 
 Since the security barrier flag carries a potentially hefty performance
 penalty, I think it should be optional. Application which don't allow
 SQL-level access to the database might still benefit from row-level security,
 because it saves them from having to manually add the WHERE clause to every
 statement, or having to wrap all their tables with views. Yet without direct
 SQL-level access, the security barrier thing isn't really necessary, so
 it'd be nice if they wouldn't have to pay for it. How about
 
   ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier)

The conventional case for a RLS facility is to wholly implement a security
policy, so security_barrier should be the default.  Using the same facility to
implement a security policy in cooperation with a trusted query generator is
the variant case.

Backward compatibility concerns limited our options when retrofitting the
security_barrier treatment for views, but I'd rather not add a knob completely
disabling it in the context of a brand new feature.  A better avenue is to
enhance our facilities for identifying safe query fragments.  For example,
ALTER FUNCTION ... LEAKPROOF is superuser-only.  Adding a way for a table
owner to similarly trust functions for the purpose of his own tables would
help close the gap that motivates such an all-or-nothing knob.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-25 Thread Kohei KaiGai

2012/5/24 Robert Haas robertmh...@gmail.com:
 On Thu, May 24, 2012 at 6:11 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Perhaps when we see that RLS
 applies, we should replace the reference to the original table with a
 subquery RTE that has the security_barrier flag set - essentially
 treating a table with RLS as if it were a security view.

 I become to think it is a better approach than tracking origin of each
 qualifiers. One problem is case handling on update or delete statement.

 It may be possible to rewrite the update / delete query as follows:

 From:
  UPDATE tbl SET X = X + 1 WHERE f_leak(Y)
 To:
  UPDATE tbl SET X = X + 1 WHERE ctid = (
      SELECT * FROM (
          SELECT ctid FROM tbl WHERE uname = getpgusername()  == (*)
 should have security-barrier
      ) AS tbl_subqry WHERE f_leak(Y)
  );

 Expanded sub-queries will have security-barrier flag, so it enforces
 the uname = getpgusername() being checked earlier than f_leak(Y).
 We may need to measure the performance impact due to the reform.

 The problem with this is that it introduces an extra instance of tbl
 into the query - there are now two rather than one.  UPDATE .. FROM is
 supposed to be a way to avoid this, but it's insufficiently general to
 handle all the cases (e.g. UPDATE a LEFT JOIN b can't be written using
 the existing syntax).  Anyway we want to avoid inserting self-joins
 for performance reasons if at all possible.  It should be easy to do
 that in the case of SELECT; UPDATE and DELETE may need a bit more
 work.

I'll try to investigate a way to solve the matter without twice scan.
Right now, I have no reasonable ideas. Please give us suggestion
if you have something...

 I think, this situation is similar to a case when we reference a view
 without privileges to underlying tables. If Bob set up a view with
 something tricky function, it allows Bob to reference credentials
 of users who reference the view.
 More or less, it might be a problem when a user try to invoke
 a user defined function declared by others.
 (Thus, sepgsql policy does not allow users to invoke a function
 declared by another one in different domain; without DBA's checks.)

 This is true, but there are still some new threat models.  For
 example, currently, pg_dump isn't going to run any user-defined code
 just because you do SELECT * FROM table, but that will change with
 this patch.  Note that pg_dump need not actually select from views,
 only tables.

 I think it is a good idea not to apply RLS when current user has
 superuser privilege from perspective of security model consistency,
 but it is inconsistent to check privileges underlying tables.

 Seems like a somewhat random wart, if it's just an exception for
 superusers.  I think we need to do better than that.  For example, at
 my last company, sales reps A and B were permitted to see all
 customers of the company, but sales reps C, D, E, F, G, H, I, and J
 were permitted to see only their own accounts.  Those sorts of
 policies need to be easy to implement.

Probably, if sales_rep column records its responsible repo, its
security policy is able to be described as:
  (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep())

Indeed, the design to check underlying table seems to me like
column-level privileges towards table-level privileges, since it
is checked only when user does not have requires privileges
on whole of the table.
However, I have no idea to modify ExecCheckRTEPerms()
regarding to RLS. If we assume RLS is applied when user has
no privileges on tables, the current ExecCheckRTEPerms()
always raises an error towards unprivileged users, prior to
execution of queries.
Isn't it preferable behavior to allow unprivileged users to
reference a table (or columns) when it has RLS policy?

I think, table and column level privilege should be checked
individually, in addition to row-level security policy.

 Another idea is to set things up so that the RLS policy function isn't
 applied to each row directly; instead, it's invoked once per query and
 *returns* a WHERE clause.  This would be a lot more powerful than the
 proposed design, because now the table owner can write a function that
 imposes quals on some people but not others, which seems very useful.

 Sorry, I don't favor this idea. Even if table owner set up a function to
 generate additional qualifiers, it also has no guarantee the qualifiers
 are invoked prior to user-given one.
 It seems to me this approach will have same problem...

 It's not intended to solve the qual-ordering problem, just to allow
 additional policy flexibility.

At the beginning, I thought it takes complex code to parse
where-clause being provided as security policy, so it is the
reason why I was inclined to give a function, instead of a clause.
But I noticed we already have similar code at CreateTrigger()
to handle it.
Does it give policy flexibility?

 It's not clear to me that there is any need for built-in server
 functionality here.  If the table

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-25 Thread Kohei KaiGai

2012/5/24 Robert Haas robertmh...@gmail.com:
 On Thu, May 24, 2012 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to
 created references to rows which are invisible to you, or should FOREIGN KEY
 constraints be exempt from security policies? I'd say they shouldn't be, 
 i.e.
 the policy WHERE clause should be added to constraint checking queries like
 usual. But maybe I'm missing some reason why that'd be undesirable…

 I agree. The row level security policy should not be applied during FK checks
 (or other internal stuff; to be harmless). At the previous discussion, it was
 issued that iteration of FK/PK proving enables malicious one to estimate
 existence of invisible tuple and its key value, although they cannot see the
 actual values. It is well documented limitation, thus, user should not use 
 row-
 level security (or should not use natural key) if they cannot accept
 this limitation.

 You say I agree, but it seems to me that you and Florian are in fact
 taking opposite positions.

Sorry, I misread what he described.

 FWIW, I'm inclined to think that you should NOT be able to create a
 row that references an invisible row.  You might end up with that
 situation anyway, because we don't know what the semantics of the
 security policy are: rows might become visible or invisible after the
 fact, and we can't police that.  But I think that if you take the
 opposite position that the select queries inside fkey triggers ought
 to be exempt from security policy, then you need to build some new
 mechanism to make that happen, which seems like extra work for no
 benefit.

I think it is fair enough for RI_FKey_check_ins and RI_FKey_check_upd;
RLS policy inside these trigger function will exhibit to create a row that
references invisible row.

However, it should not be applied on triggers being set on PK tables,
because it allows to modify or delete primary-key being referenced by
invisible foreign-key from the viewpoint of operators.
I think, it makes sense to have exceptional cases; to make sure
foreign-key constraint at the baseline.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-25 Thread Kohei KaiGai

2012/5/25 Florian Pflug f...@phlo.org:
 On May24, 2012, at 19:25 , Robert Haas wrote:
 FWIW, I'm inclined to think that you should NOT be able to create a
 row that references an invisible row.  You might end up with that
 situation anyway, because we don't know what the semantics of the
 security policy are: rows might become visible or invisible after the
 fact, and we can't police that.

 Right. I just realized, however, that there's another case which wasn't
 considered yet, which is how to handle the initial check during
 ALTER TABEL ADD CONSTRAINT. I'm thinking that it's fine to only consider
 visible rows in the parent table there too, but we should be checking
 all rows in the child table. The easiest way would be to restrict
 ALTER TABLE ADD CONSTRAINT to the table owner for tables with RLS
 (it seems that currently the REFERENCES privilege is sufficient), and
 make the table owner exempt from RLS on that table. The latter means you'd
 need at least two roles to use RLS, but anyone security-conscious enough
 to use RLS will probably not user the same role for DDL and DML operations
 anyway...

I think, the RLS policy should perform as a view with qualifiers.
It means database constraints have to be kept from the viewpoint of
any clients, so, any orphan foreign-keys or any rows violating check
constraint should not exist.

In case when a user insert a row with foreign-key, it is an exceptional
case. Even if he cannot insert a foreign-key that references invisible
primary-key, it never breaks database constraints from the viewpoint
of other folks.

My standpoint deals with database constraints as first class customer
that should be always kept in the lowest level, even though a part of
results would be filtered out due to RLS.
Isn't it a simple enough criteria?

 But I think that if you take the
 opposite position that the select queries inside fkey triggers ought
 to be exempt from security policy, then you need to build some new
 mechanism to make that happen, which seems like extra work for no
 benefit.

 Hm, interesting angle. Continuing this thought, without any extra work,
 UNIQUE and EXCLUSION constraints *will* be enforced regardless of row
 visibility, because their implementation isn't SPI-based but instead
 detects conflicts while inserting tuples into the index.

 For being so obviously inconsistent in its treatment of UNIQUE and
 EXCLUSION constraints vs. FK constraints, this feels surprisingly
 right. So, to prevent design by accident, here's an attempt to explain
 that divergence.

 For UNIQUE and EXCLUSION constraints, the most conservative assumption
 possible is that all rows are visible, since that leads to the most
 rejections. With that assumption, no matter what the actual policy is,
 the data returned by a query will always satisfy the constraint. Plus,
 the constraint is still sensible because it neither rejects nor allows
 all rows. So that conservative assumption is the one we make, i.e. we
 ignore RLS visibility when checking those kinds of constraints.

 For FK constraints, OTOH, the most conservative assumption is that
 no rows are visible. But that is meaningless, since it will simply reject
 all possible rows. Having thus no chance of enforcing the constraint
 ourselves under all possible policies, the best we can do is to at least
 make it possible for the constraint to work correctly for as many policies
 as possible. Now, if we go with KaiGai's suggestion of skipping RLS
 while checking FK constraints, the only policy that the constraint will
 work correctly for is one which doesn't actually hide any parent rows.
 Whereas if we apply RLS checks while checking FK constraints, all policies
 which behave consistently for parent and child rows (i.e. don't hide the
 former but show the latter) will work correctly. We thus go with the
 second option, since the class of working policies is larger.

-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Kohei KaiGai

2012/5/23 Robert Haas robertmh...@gmail.com:
 On Wed, May 23, 2012 at 3:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I wanted to have discussion to handle this problem.

 Unlike leaky-view problem, we don't need to worry about unexpected
 qualifier distribution on either side of join, because a scan on table
 never contains any join. Thus, all we need to care about is order of
 qualifiers chained on a particular scan node; being reordered by
 the cost to invoke functions.

 How about an idea to track FuncExpr come from the security policy
 and enforce 0 on its cost? Regular functions never reach zero
 cost, so the security policy must be re-ordered to the head.

 Hmm.  That would disregard the relative costs of multiple qualifiers
 all of which were injected by the security policy, which I suspect is
 not a good idea.
  Furthermore, I think that we should not assume that
 there is no join involved.  I would expect a fairly popular RLS qual
 to be something of the form WHERE NOT EXISTS (SELECT 1 FROM hide_me
 WHERE hide_me.pk = thistab.pk).

Please ignore the approach to track cost value of qualifiers.
I believe it does not work well without something fundamental updates.

 Perhaps when we see that RLS
 applies, we should replace the reference to the original table with a
 subquery RTE that has the security_barrier flag set - essentially
 treating a table with RLS as if it were a security view.

I become to think it is a better approach than tracking origin of each
qualifiers. One problem is case handling on update or delete statement.

It may be possible to rewrite the update / delete query as follows:

From:
  UPDATE tbl SET X = X + 1 WHERE f_leak(Y)
To:
  UPDATE tbl SET X = X + 1 WHERE ctid = (
  SELECT * FROM (
  SELECT ctid FROM tbl WHERE uname = getpgusername()  == (*)
should have security-barrier
  ) AS tbl_subqry WHERE f_leak(Y)
  );

Expanded sub-queries will have security-barrier flag, so it enforces
the uname = getpgusername() being checked earlier than f_leak(Y).
We may need to measure the performance impact due to the reform.

 Also, suppose that Bob applies an RLS policy to a table, and, later,
 Alice selects from the table.  How do we keep Bob from usurping
 Alice's privileges?  If we insist that Bob's RLS policy function runs
 as Bob, then it defeats inlining; but if it runs as Alice, then Bob
 can steal Alice's credentials.  One idea is to apply the security
 policy only if Alice's access to the table is granted by Bob.  That
 way, if Alice is (for example) the superuser, she's immune to RLS.
 But that doesn't seem to completely solve the problem, because Alice
 might merely be some other relatively unprivileged user and we still
 don't want Bob to be able to walk off with her access.

I think, this situation is similar to a case when we reference a view
without privileges to underlying tables. If Bob set up a view with
something tricky function, it allows Bob to reference credentials
of users who reference the view.
More or less, it might be a problem when a user try to invoke
a user defined function declared by others.
(Thus, sepgsql policy does not allow users to invoke a function
declared by another one in different domain; without DBA's checks.)

I think it is a good idea not to apply RLS when current user has
superuser privilege from perspective of security model consistency,
but it is inconsistent to check privileges underlying tables.

 Another idea is to set things up so that the RLS policy function isn't
 applied to each row directly; instead, it's invoked once per query and
 *returns* a WHERE clause.  This would be a lot more powerful than the
 proposed design, because now the table owner can write a function that
 imposes quals on some people but not others, which seems very useful.

Sorry, I don't favor this idea. Even if table owner set up a function to
generate additional qualifiers, it also has no guarantee the qualifiers
are invoked prior to user-given one.
It seems to me this approach will have same problem...

 Also, if the point here is to provide security for tables not views,
 it seems like you really need to have (at least a design for) RLS
 security on insert/update/delete operations.  Just adding the same
 filter condition might be adequate for deletes, but I don't think it
 works at all for inserts.  And for updates, what prevents the user from
 updating columns he shouldn't, or updating them to key values he
 shouldn't be able to use?

 If we also apply the security policy to newer version of tuples on
 update and insert, one idea is to inject a before-row-(update|insert)
 trigger to check whether it satisfies the security policy.
 For same reason, the trigger should be executed at the end of
 trigger chain.

 It's not clear to me that there is any need for built-in server
 functionality here.  If the table owner wants to enforce some sort of
 policy regarding INSERT or UPDATE or DELETE, they can already do that
 today just by attaching a

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Florian Pflug

On May23, 2012, at 22:12 , Robert Haas wrote:
 Also, suppose that Bob applies an RLS policy to a table, and, later,
 Alice selects from the table.  How do we keep Bob from usurping
 Alice's privileges?

That problem isn't restricted to RLW, though. Bob could just as well
have booby-trapped any other SQL object, e.g. could have added
bobs_malicious_function() to a view Alice selects from, or attached
it as a trigger to a table Alice inserts to.

It would be nice if there was (optional) protection against these
kinds of attacks, but it's not really something that RLS should be
burdened with.

BTW, I've wondered in the past how tight our protection against some
trying to take over the postgres role during pg_dump is. On the surface,
it seems that we're safe because pg_dump doesn't invoke user-defined
functions except output functions (which require superuser privileges
to modify). But that's not exactly a solid line of defense...

 If we also apply the security policy to newer version of tuples on
 update and insert, one idea is to inject a before-row-(update|insert)
 trigger to check whether it satisfies the security policy.
 For same reason, the trigger should be executed at the end of
 trigger chain.
 
 It's not clear to me that there is any need for built-in server
 functionality here.  If the table owner wants to enforce some sort of
 policy regarding INSERT or UPDATE or DELETE, they can already do that
 today just by attaching a trigger to the table.  And they can enforce
 whatever policy they like that way.  Before designing any new
 mechanism, what's wrong with the existing one?

Yeah, applying the security policy to the new row (for UPDATES
and INSERTS) seems weird - the policy determines what you can see,
not what you can store, which might be two different things.

But the security policy should still apply to the old rows, i.e.
you shouldn't be after to UPDATE or DELETE rows you cannot see, no?

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Kohei KaiGai

2012/5/24 Florian Pflug f...@phlo.org:
 If we also apply the security policy to newer version of tuples on
 update and insert, one idea is to inject a before-row-(update|insert)
 trigger to check whether it satisfies the security policy.
 For same reason, the trigger should be executed at the end of
 trigger chain.

 It's not clear to me that there is any need for built-in server
 functionality here.  If the table owner wants to enforce some sort of
 policy regarding INSERT or UPDATE or DELETE, they can already do that
 today just by attaching a trigger to the table.  And they can enforce
 whatever policy they like that way.  Before designing any new
 mechanism, what's wrong with the existing one?

 Yeah, applying the security policy to the new row (for UPDATES
 and INSERTS) seems weird - the policy determines what you can see,
 not what you can store, which might be two different things.

 But the security policy should still apply to the old rows, i.e.
 you shouldn't be after to UPDATE or DELETE rows you cannot see, no?

The case of INSERT / DELETE are simple; All we need to apply is
checks on either new or old tuples.

In case of UPDATE, we need to check on the old tuple whether use can
see, and on the new tuple whether use can store them.
Indeed, these are different checks, however, it seems like a black hole
if the new tuple is allowed to write but no reader privileges.
I expect most use cases choose same policy on reader timing and
writer times at UPDATE statement.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Florian Pflug

On May24, 2012, at 12:43 , Kohei KaiGai wrote:
 The case of INSERT / DELETE are simple; All we need to apply is
 checks on either new or old tuples.
 
 In case of UPDATE, we need to check on the old tuple whether use can
 see, and on the new tuple whether use can store them.
 Indeed, these are different checks, however, it seems like a black hole
 if the new tuple is allowed to write but no reader privileges.
 I expect most use cases choose same policy on reader timing and
 writer times at UPDATE statement.

I don't think preventing block holes is sensible here - it might,
in fact, be *just* what the user wants.

Imagine a messaging system. A reasonable RLS policy would be to allow
a user to see messages addressed to him. Yet you wouldn't want to prevent
her from creating messages to other people - cause what good is a messaging
system that only allows you to send messages to yourself. What you
probably *would* want to do, though, is to check that she did put herself in
as the sender when she creates a message. And you'd probably wanna forbit
updates entirely. So you'd have

  - A RLS policy that checks current_user = ANY(recipients)
  - An ON INSERT trigger which checks current_user = sender
  - An ON UPDATE trigger which errors out

If RLS policy applies to INSERTEed rows also, how would you do that?

Another example, although in the realm of filesystem permissions, is Mac OS X.
Per default, every user has a Drop Box folder, which anybody can write to, yet
only the owner can read. This allows you to easily transfer files from one
user to another without allowing a third party to read it.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Kohei KaiGai

2012/5/24 Florian Pflug f...@phlo.org:
 On May24, 2012, at 12:43 , Kohei KaiGai wrote:
 The case of INSERT / DELETE are simple; All we need to apply is
 checks on either new or old tuples.

 In case of UPDATE, we need to check on the old tuple whether use can
 see, and on the new tuple whether use can store them.
 Indeed, these are different checks, however, it seems like a black hole
 if the new tuple is allowed to write but no reader privileges.
 I expect most use cases choose same policy on reader timing and
 writer times at UPDATE statement.

 I don't think preventing block holes is sensible here - it might,
 in fact, be *just* what the user wants.

 Imagine a messaging system. A reasonable RLS policy would be to allow
 a user to see messages addressed to him. Yet you wouldn't want to prevent
 her from creating messages to other people - cause what good is a messaging
 system that only allows you to send messages to yourself. What you
 probably *would* want to do, though, is to check that she did put herself in
 as the sender when she creates a message. And you'd probably wanna forbit
 updates entirely. So you'd have

  - A RLS policy that checks current_user = ANY(recipients)
  - An ON INSERT trigger which checks current_user = sender
  - An ON UPDATE trigger which errors out

 If RLS policy applies to INSERTEed rows also, how would you do that?

 Another example, although in the realm of filesystem permissions, is Mac OS X.
 Per default, every user has a Drop Box folder, which anybody can write to, 
 yet
 only the owner can read. This allows you to easily transfer files from one
 user to another without allowing a third party to read it.

Indeed, you are right. We have no special reason why to enforce same rules
on both of reader and writer stage on UPDATE statement.

So, the proposed interface might be revised as follows:
  ALTER TABLE tblname ADD SECURITY POLICY
  func_name(args, ...) [FOR SELECT |
 INSERT |
 [BEFORE|AFTER] UPDATE |
 DELETE];

In case of INSERT or AFTER UPDATE, I assume the check shall be applied
on the tail of before-row triggers.

(*) I don't check whether it conflicts syntax or not yet.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Florian Pflug

On May24, 2012, at 16:19 , Kohei KaiGai wrote:
 So, the proposed interface might be revised as follows:
  ALTER TABLE tblname ADD SECURITY POLICY
  func_name(args, ...) [FOR SELECT |
 INSERT |
 [BEFORE|AFTER] UPDATE 
 |
 DELETE];
 
 In case of INSERT or AFTER UPDATE, I assume the check shall be applied
 on the tail of before-row triggers.

I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE
UPDATE case to regular triggers, for two reasons

First, it's conceptually much simpler, since the policy always just adds
an implicit WHERE clause, period. This of course assumes that DELETE and
(BEFORE) UPDATE simply skips rows for which the policy function returns false,
instead of reporting 'permission denied' or something. But that's the most
reasonable behaviour anyway, I think, because otherwise you'd make batch
UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk
of tripping over some invisible row and getting and error.

And second, it avoids mimicking functionality that is already provided
by an existing feature, namely triggers.

People will have to deal with the trigger ordering issue, but that's nothing
new, and I bet most people have a system in place for that. I usually prefix
my trigger names with 'a_' to 'z_', for example, to make the ordering explicit.

Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to
created references to rows which are invisible to you, or should FOREIGN KEY
constraints be exempt from security policies? I'd say they shouldn't be, i.e.
the policy WHERE clause should be added to constraint checking queries like
usual. But maybe I'm missing some reason why that'd be undesirable…

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Kohei KaiGai

2012/5/24 Florian Pflug f...@phlo.org:
 On May24, 2012, at 16:19 , Kohei KaiGai wrote:
 So, the proposed interface might be revised as follows:
  ALTER TABLE tblname ADD SECURITY POLICY
      func_name(args, ...) [FOR SELECT |
                                                         INSERT |
                                                         [BEFORE|AFTER] 
 UPDATE |
                                                         DELETE];

 In case of INSERT or AFTER UPDATE, I assume the check shall be applied
 on the tail of before-row triggers.

 I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE
 UPDATE case to regular triggers, for two reasons

 First, it's conceptually much simpler, since the policy always just adds
 an implicit WHERE clause, period. This of course assumes that DELETE and
 (BEFORE) UPDATE simply skips rows for which the policy function returns false,
 instead of reporting 'permission denied' or something. But that's the most
 reasonable behaviour anyway, I think, because otherwise you'd make batch
 UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk
 of tripping over some invisible row and getting and error.

I definitely agree with starting a new feature from simple implementation.

Although I'm inclined to the approach to replace references to tables with
security policy by sub-queries with security barrier flag, instead of adding
qualifiers of where clause to avoid the leaky-view scenario, it will make its
implementation mush simpler.

 Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to
 created references to rows which are invisible to you, or should FOREIGN KEY
 constraints be exempt from security policies? I'd say they shouldn't be, i.e.
 the policy WHERE clause should be added to constraint checking queries like
 usual. But maybe I'm missing some reason why that'd be undesirable…

I agree. The row level security policy should not be applied during FK checks
(or other internal stuff; to be harmless). At the previous discussion, it was
issued that iteration of FK/PK proving enables malicious one to estimate
existence of invisible tuple and its key value, although they cannot see the
actual values. It is well documented limitation, thus, user should not use row-
level security (or should not use natural key) if they cannot accept
this limitation.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Kohei KaiGai

I'd like to summarize the current design being discussed.

syntax:
  ALTER TABLE tblname WITH ROW LEVEL SECURITY
  ( condition clause ) [FOR (SELECT | UPDATE | DELETE)];
  ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY;

I tried to patch the parser/gram.y, but here was syntax conflicts
on ADD / DROP sub-command.
And, I noticed ROW LEVEL SECURITY allows to implement
without adding new keyword, unlike SECURITY POLICY.

As we discussed, it causes a problem with approach to append
additional qualifiers to where clause implicitly, because it does
not solve the matter corresponding to the order to execute
qualifiers. So, I'm inclined to the approach to replace reference
to tables with security policy by sub-queries with security barrier
flag.
For example, if tbl has security policy, this query shall be rewritten
internally, as follows:

original)
  SELECT * FROM tbl WHERE X  20 AND f_leak(Y);

rewritten)
  SELECT * FROM (
SELECT * FROM tbl WHERE uname = getpgusername()
  ) AS tbl_subqry WHERE X  20 AND f_leak(Y);

The sub-query shall have security-barrier flag, so f_leak() is never
pushed down but X  20 will be pushed down because of leakproof
attribute of the function.

It is a bit complex at UPDATE or DELETE statement, but
what I try to do is same.

  original)
UPDATE tbl SET X = X + 1 WHERE X  20 AND f_leak(Y);

  rewritten)
UPDATE tbl SET X = X + 1 WHERE ctid = (
SELECT ctid FROM (
SELECT ctid, * FROM uname = getpgusername()
) AS tbl_subqry WHERE X  20 AND f_leak(Y)
);

That guarantees the security policy (uname = getpgusername())
is evaluated prior to user given conditions.

One thing still I'm thinking is whether the security policy should
be provided as a function or a clause. Enough simple sql function
is inlined at simplify_function(), so here is no meaningful difference.
I was afraid of code complexity, but all we should do is to append
configured clause on the where clause of sub-query inside.

|  ALTER TABLE tblname WITH ROW LEVEL SECURITY
|  ( condition clause ) [FOR (SELECT | UPDATE | DELETE)];

So, I tried to put condition clause instead of a function, right now.


Regarding to FK constraints, I don't think it is a situation to
apply row-level security policy towards internal queries.
So, I plan to disable during FK checks.


One other issue we didn't have discussed is table inheritance.
In case when a table TBLP has a child table TBLC and only
TBLC has its security policy, what security policy should be
applied when we run SELECT * FROM TBLP.
My preference is, the security policy is only applied to scan on
TBLC, not TBLP. It is not desirable behavior that visible tuples
are different from way to reference a certain table.
In addition, if and when TBLP and TBLC have their own policy
individually, what is a desirable behavior?
I think, the security policy of both TBLP and TBLC should be
applied on TBLC; in other words, it applies the security policy
of all the parent tables to scan on child table.

Any comments please. Thanks,

2012/5/24 Kohei KaiGai kai...@kaigai.gr.jp:
 2012/5/24 Florian Pflug f...@phlo.org:
 On May24, 2012, at 16:19 , Kohei KaiGai wrote:
 So, the proposed interface might be revised as follows:
  ALTER TABLE tblname ADD SECURITY POLICY
      func_name(args, ...) [FOR SELECT |
                                                         INSERT |
                                                         [BEFORE|AFTER] 
 UPDATE |
                                                         DELETE];

 In case of INSERT or AFTER UPDATE, I assume the check shall be applied
 on the tail of before-row triggers.

 I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE
 UPDATE case to regular triggers, for two reasons

 First, it's conceptually much simpler, since the policy always just adds
 an implicit WHERE clause, period. This of course assumes that DELETE and
 (BEFORE) UPDATE simply skips rows for which the policy function returns 
 false,
 instead of reporting 'permission denied' or something. But that's the most
 reasonable behaviour anyway, I think, because otherwise you'd make batch
 UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk
 of tripping over some invisible row and getting and error.

 I definitely agree with starting a new feature from simple implementation.

 Although I'm inclined to the approach to replace references to tables with
 security policy by sub-queries with security barrier flag, instead of adding
 qualifiers of where clause to avoid the leaky-view scenario, it will make its
 implementation mush simpler.

 Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to
 created references to rows which are invisible to you, or should FOREIGN KEY
 constraints be exempt from security policies? I'd say they shouldn't be, i.e.
 the policy WHERE clause should be added to constraint checking queries like
 usual. But maybe I'm missing some reason why that'd be

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Robert Haas

On Thu, May 24, 2012 at 6:11 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Perhaps when we see that RLS
 applies, we should replace the reference to the original table with a
 subquery RTE that has the security_barrier flag set - essentially
 treating a table with RLS as if it were a security view.

 I become to think it is a better approach than tracking origin of each
 qualifiers. One problem is case handling on update or delete statement.

 It may be possible to rewrite the update / delete query as follows:

 From:
  UPDATE tbl SET X = X + 1 WHERE f_leak(Y)
 To:
  UPDATE tbl SET X = X + 1 WHERE ctid = (
      SELECT * FROM (
          SELECT ctid FROM tbl WHERE uname = getpgusername()  == (*)
 should have security-barrier
      ) AS tbl_subqry WHERE f_leak(Y)
  );

 Expanded sub-queries will have security-barrier flag, so it enforces
 the uname = getpgusername() being checked earlier than f_leak(Y).
 We may need to measure the performance impact due to the reform.

The problem with this is that it introduces an extra instance of tbl
into the query - there are now two rather than one.  UPDATE .. FROM is
supposed to be a way to avoid this, but it's insufficiently general to
handle all the cases (e.g. UPDATE a LEFT JOIN b can't be written using
the existing syntax).  Anyway we want to avoid inserting self-joins
for performance reasons if at all possible.  It should be easy to do
that in the case of SELECT; UPDATE and DELETE may need a bit more
work.

 I think, this situation is similar to a case when we reference a view
 without privileges to underlying tables. If Bob set up a view with
 something tricky function, it allows Bob to reference credentials
 of users who reference the view.
 More or less, it might be a problem when a user try to invoke
 a user defined function declared by others.
 (Thus, sepgsql policy does not allow users to invoke a function
 declared by another one in different domain; without DBA's checks.)

This is true, but there are still some new threat models.  For
example, currently, pg_dump isn't going to run any user-defined code
just because you do SELECT * FROM table, but that will change with
this patch.  Note that pg_dump need not actually select from views,
only tables.

 I think it is a good idea not to apply RLS when current user has
 superuser privilege from perspective of security model consistency,
 but it is inconsistent to check privileges underlying tables.

Seems like a somewhat random wart, if it's just an exception for
superusers.  I think we need to do better than that.  For example, at
my last company, sales reps A and B were permitted to see all
customers of the company, but sales reps C, D, E, F, G, H, I, and J
were permitted to see only their own accounts.  Those sorts of
policies need to be easy to implement.

 Another idea is to set things up so that the RLS policy function isn't
 applied to each row directly; instead, it's invoked once per query and
 *returns* a WHERE clause.  This would be a lot more powerful than the
 proposed design, because now the table owner can write a function that
 imposes quals on some people but not others, which seems very useful.

 Sorry, I don't favor this idea. Even if table owner set up a function to
 generate additional qualifiers, it also has no guarantee the qualifiers
 are invoked prior to user-given one.
 It seems to me this approach will have same problem...

It's not intended to solve the qual-ordering problem, just to allow
additional policy flexibility.

 It's not clear to me that there is any need for built-in server
 functionality here.  If the table owner wants to enforce some sort of
 policy regarding INSERT or UPDATE or DELETE, they can already do that
 today just by attaching a trigger to the table.  And they can enforce
 whatever policy they like that way.  Before designing any new
 mechanism, what's wrong with the existing one?

 Yes, we don't need any new invent to check the value of new tuples.
 But it should be done after all the user-defined triggers. Existing
 trigger does not have a mechanism to enforce order to be invoked.
 So, what I really implement is a mechanism to inject some pseudo
 triggers at tail of the Trigger array.

Start the trigger names with the letter z.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Robert Haas

On Thu, May 24, 2012 at 6:20 AM, Florian Pflug f...@phlo.org wrote:
 But the security policy should still apply to the old rows, i.e.
 you shouldn't be after to UPDATE or DELETE rows you cannot see, no?

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Robert Haas

On Thu, May 24, 2012 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to
 created references to rows which are invisible to you, or should FOREIGN KEY
 constraints be exempt from security policies? I'd say they shouldn't be, i.e.
 the policy WHERE clause should be added to constraint checking queries like
 usual. But maybe I'm missing some reason why that'd be undesirable…

 I agree. The row level security policy should not be applied during FK checks
 (or other internal stuff; to be harmless). At the previous discussion, it was
 issued that iteration of FK/PK proving enables malicious one to estimate
 existence of invisible tuple and its key value, although they cannot see the
 actual values. It is well documented limitation, thus, user should not use 
 row-
 level security (or should not use natural key) if they cannot accept
 this limitation.

You say I agree, but it seems to me that you and Florian are in fact
taking opposite positions.

FWIW, I'm inclined to think that you should NOT be able to create a
row that references an invisible row.  You might end up with that
situation anyway, because we don't know what the semantics of the
security policy are: rows might become visible or invisible after the
fact, and we can't police that.  But I think that if you take the
opposite position that the select queries inside fkey triggers ought
to be exempt from security policy, then you need to build some new
mechanism to make that happen, which seems like extra work for no
benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Florian Pflug

On May24, 2012, at 18:42 , Kohei KaiGai wrote:
 I'd like to summarize the current design being discussed.
 
 syntax:
  ALTER TABLE tblname WITH ROW LEVEL SECURITY
  ( condition clause ) [FOR (SELECT | UPDATE | DELETE)];
  ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY;
 
 I tried to patch the parser/gram.y, but here was syntax conflicts
 on ADD / DROP sub-command.
 And, I noticed ROW LEVEL SECURITY allows to implement
 without adding new keyword, unlike SECURITY POLICY.

Let the bike-shedding begin ;-)

ALTER TABLE … WITH sounds a bit weird. What about

  ALTER TABLE tblname SET ROW POLICY condition FOR { SELECT | UPDATE | 
DELETE }
  ALTER TABLE tblname RESET ROW POLICY

 As we discussed, it causes a problem with approach to append
 additional qualifiers to where clause implicitly, because it does
 not solve the matter corresponding to the order to execute
 qualifiers. So, I'm inclined to the approach to replace reference
 to tables with security policy by sub-queries with security barrier
 flag.

Since the security barrier flag carries a potentially hefty performance
penalty, I think it should be optional. Application which don't allow
SQL-level access to the database might still benefit from row-level security,
because it saves them from having to manually add the WHERE clause to every
statement, or having to wrap all their tables with views. Yet without direct
SQL-level access, the security barrier thing isn't really necessary, so
it'd be nice if they wouldn't have to pay for it. How about

  ALTER TABLE … SET ROW POLICY … WITH (security_barrier)

 One thing still I'm thinking is whether the security policy should
 be provided as a function or a clause. Enough simple sql function
 is inlined at simplify_function(), so here is no meaningful difference.
 I was afraid of code complexity, but all we should do is to append
 configured clause on the where clause of sub-query inside.
 
 |  ALTER TABLE tblname WITH ROW LEVEL SECURITY
 |  ( condition clause ) [FOR (SELECT | UPDATE | DELETE)];
 
 So, I tried to put condition clause instead of a function, right now.

A single function seems much easier implementation-wise, since you wouldn't
need to store an arbitrary expression in the catalog, just an oid. It also
delegates the dependency tracking problem to the function. It also simplies
the grammar, because the FOR …  clause cannot be mistaken to belong to
the condition clause.

 One other issue we didn't have discussed is table inheritance.
 In case when a table TBLP has a child table TBLC and only
 TBLC has its security policy, what security policy should be
 applied when we run SELECT * FROM TBLP.
 My preference is, the security policy is only applied to scan on
 TBLC, not TBLP.

Agreed.

 It is not desirable behavior that visible tuples
 are different from way to reference a certain table.
 In addition, if and when TBLP and TBLC have their own policy
 individually, what is a desirable behavior?
 I think, the security policy of both TBLP and TBLC should be
 applied on TBLC; in other words, it applies the security policy
 of all the parent tables to scan on child table.

I think security policies should only apply to the table they're
declared for, not their child tables. Mostly because that is how
triggers operate, and security policies and triggers will often
be used together, so having their semantics regarding inheritance
be the same seems to be the least surprising option.

Also, if policies are inherited, how would you define a policy
which applies only to the parent, not to the child?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-24 Thread Florian Pflug

On May24, 2012, at 19:25 , Robert Haas wrote:
 FWIW, I'm inclined to think that you should NOT be able to create a
 row that references an invisible row.  You might end up with that
 situation anyway, because we don't know what the semantics of the
 security policy are: rows might become visible or invisible after the
 fact, and we can't police that.

Right. I just realized, however, that there's another case which wasn't
considered yet, which is how to handle the initial check during
ALTER TABEL ADD CONSTRAINT. I'm thinking that it's fine to only consider
visible rows in the parent table there too, but we should be checking
all rows in the child table. The easiest way would be to restrict
ALTER TABLE ADD CONSTRAINT to the table owner for tables with RLS
(it seems that currently the REFERENCES privilege is sufficient), and
make the table owner exempt from RLS on that table. The latter means you'd
need at least two roles to use RLS, but anyone security-conscious enough
to use RLS will probably not user the same role for DDL and DML operations
anyway...

 But I think that if you take the
 opposite position that the select queries inside fkey triggers ought
 to be exempt from security policy, then you need to build some new
 mechanism to make that happen, which seems like extra work for no
 benefit.

Hm, interesting angle. Continuing this thought, without any extra work,
UNIQUE and EXCLUSION constraints *will* be enforced regardless of row
visibility, because their implementation isn't SPI-based but instead
detects conflicts while inserting tuples into the index.

For being so obviously inconsistent in its treatment of UNIQUE and
EXCLUSION constraints vs. FK constraints, this feels surprisingly
right. So, to prevent design by accident, here's an attempt to explain
that divergence.

For UNIQUE and EXCLUSION constraints, the most conservative assumption
possible is that all rows are visible, since that leads to the most
rejections. With that assumption, no matter what the actual policy is,
the data returned by a query will always satisfy the constraint. Plus,
the constraint is still sensible because it neither rejects nor allows
all rows. So that conservative assumption is the one we make, i.e. we
ignore RLS visibility when checking those kinds of constraints.

For FK constraints, OTOH, the most conservative assumption is that
no rows are visible. But that is meaningless, since it will simply reject
all possible rows. Having thus no chance of enforcing the constraint
ourselves under all possible policies, the best we can do is to at least
make it possible for the constraint to work correctly for as many policies
as possible. Now, if we go with KaiGai's suggestion of skipping RLS
while checking FK constraints, the only policy that the constraint will
work correctly for is one which doesn't actually hide any parent rows.
Whereas if we apply RLS checks while checking FK constraints, all policies
which behave consistently for parent and child rows (i.e. don't hide the
former but show the latter) will work correctly. We thus go with the
second option, since the class of working policies is larger.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-23 Thread Tom Lane

Kohei KaiGai kai...@kaigai.gr.jp writes:
 Let me have a discussion to get preferable interface for row-level security.
 My planned feature will perform to append additional conditions to WHERE
 clause implicitly, to restrict tuples being visible for the current user.
 For example, when row-level policy uname = getpgusername() is configured
 on the table T1, the following query:
 select * from T1 where X  20;
 should be rewritten to:
 select * from T1 where (X  20) AND (uname = getpgusername());

Hm.  Simple and fairly noninvasive, but ... would this not be subject to
the same sorts of information-leak hazards that were addressed in the
security views feature?  That is, I see no guarantee that the RLS
condition will be evaluated before any conditions supplied by the user.
So it seems easy to get information out of rows the RLS policy is
supposed to prevent access to.  It would be far more secure to just
use a security view to apply the RLS condition.

Also, if the point here is to provide security for tables not views,
it seems like you really need to have (at least a design for) RLS
security on insert/update/delete operations.  Just adding the same
filter condition might be adequate for deletes, but I don't think it
works at all for inserts.  And for updates, what prevents the user from
updating columns he shouldn't, or updating them to key values he
shouldn't be able to use?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-23 Thread Alastair Turner

On Wed, May 23, 2012 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Let me have a discussion to get preferable interface for row-level security.
 My planned feature will perform to append additional conditions to WHERE
 clause implicitly, to restrict tuples being visible for the current user.
 For example, when row-level policy uname = getpgusername() is configured
 on the table T1, the following query:
     select * from T1 where X  20;
 should be rewritten to:
     select * from T1 where (X  20) AND (uname = getpgusername());

 Hm.  Simple and fairly noninvasive, but ... would this not be subject to
 the same sorts of information-leak hazards that were addressed in the
 security views feature?  That is, I see no guarantee that the RLS
 condition will be evaluated before any conditions supplied by the user.
 So it seems easy to get information out of rows the RLS policy is
 supposed to prevent access to.  It would be far more secure to just
 use a security view to apply the RLS condition.

Since adding a condition to the where clause is a relatively simple
operation (compared to the full potential scope of a view) could the
RLS rewrite of the query create a CTE with the additional condition[s]
rather than adding condition[s] to the user-supplied query? This would
provide the forced ordering of the evaluating the conditions, thereby
avoiding many of the potential points of leakage.

Bell.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-23 Thread Kohei KaiGai

2012/5/23 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Let me have a discussion to get preferable interface for row-level security.
 My planned feature will perform to append additional conditions to WHERE
 clause implicitly, to restrict tuples being visible for the current user.
 For example, when row-level policy uname = getpgusername() is configured
 on the table T1, the following query:
     select * from T1 where X  20;
 should be rewritten to:
     select * from T1 where (X  20) AND (uname = getpgusername());

 Hm.  Simple and fairly noninvasive, but ... would this not be subject to
 the same sorts of information-leak hazards that were addressed in the
 security views feature?  That is, I see no guarantee that the RLS
 condition will be evaluated before any conditions supplied by the user.
 So it seems easy to get information out of rows the RLS policy is
 supposed to prevent access to.  It would be far more secure to just
 use a security view to apply the RLS condition.

I wanted to have discussion to handle this problem.

Unlike leaky-view problem, we don't need to worry about unexpected
qualifier distribution on either side of join, because a scan on table
never contains any join. Thus, all we need to care about is order of
qualifiers chained on a particular scan node; being reordered by
the cost to invoke functions.

How about an idea to track FuncExpr come from the security policy
and enforce 0 on its cost? Regular functions never reach zero
cost, so the security policy must be re-ordered to the head.

 Also, if the point here is to provide security for tables not views,
 it seems like you really need to have (at least a design for) RLS
 security on insert/update/delete operations.  Just adding the same
 filter condition might be adequate for deletes, but I don't think it
 works at all for inserts.  And for updates, what prevents the user from
 updating columns he shouldn't, or updating them to key values he
 shouldn't be able to use?

If we also apply the security policy to newer version of tuples on
update and insert, one idea is to inject a before-row-(update|insert)
trigger to check whether it satisfies the security policy.
For same reason, the trigger should be executed at the end of
trigger chain.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-23 Thread Kohei KaiGai

2012/5/23 Alastair Turner b...@ctrlf5.co.za:
 On Wed, May 23, 2012 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Let me have a discussion to get preferable interface for row-level security.
 My planned feature will perform to append additional conditions to WHERE
 clause implicitly, to restrict tuples being visible for the current user.
 For example, when row-level policy uname = getpgusername() is configured
 on the table T1, the following query:
     select * from T1 where X  20;
 should be rewritten to:
     select * from T1 where (X  20) AND (uname = getpgusername());

 Hm.  Simple and fairly noninvasive, but ... would this not be subject to
 the same sorts of information-leak hazards that were addressed in the
 security views feature?  That is, I see no guarantee that the RLS
 condition will be evaluated before any conditions supplied by the user.
 So it seems easy to get information out of rows the RLS policy is
 supposed to prevent access to.  It would be far more secure to just
 use a security view to apply the RLS condition.

 Since adding a condition to the where clause is a relatively simple
 operation (compared to the full potential scope of a view) could the
 RLS rewrite of the query create a CTE with the additional condition[s]
 rather than adding condition[s] to the user-supplied query? This would
 provide the forced ordering of the evaluating the conditions, thereby
 avoiding many of the potential points of leakage.

An interesting idea. However, I cannot imagine how does it works on
update or delete statement.

For select statement, it will get better performance to rewrite reference
to a particular table by a subquery with security_barrier flag than CTE,
because it allows to push down leakproof functions.

Could you tell me your idea for more details?
An example will help me understand well.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Interface of Row Level Security

2012-05-23 Thread Robert Haas

On Wed, May 23, 2012 at 3:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I wanted to have discussion to handle this problem.

 Unlike leaky-view problem, we don't need to worry about unexpected
 qualifier distribution on either side of join, because a scan on table
 never contains any join. Thus, all we need to care about is order of
 qualifiers chained on a particular scan node; being reordered by
 the cost to invoke functions.

 How about an idea to track FuncExpr come from the security policy
 and enforce 0 on its cost? Regular functions never reach zero
 cost, so the security policy must be re-ordered to the head.

Hmm.  That would disregard the relative costs of multiple qualifiers
all of which were injected by the security policy, which I suspect is
not a good idea.  Furthermore, I think that we should not assume that
there is no join involved.  I would expect a fairly popular RLS qual
to be something of the form WHERE NOT EXISTS (SELECT 1 FROM hide_me
WHERE hide_me.pk = thistab.pk).  Perhaps when we see that RLS
applies, we should replace the reference to the original table with a
subquery RTE that has the security_barrier flag set - essentially
treating a table with RLS as if it were a security view.

Also, suppose that Bob applies an RLS policy to a table, and, later,
Alice selects from the table.  How do we keep Bob from usurping
Alice's privileges?  If we insist that Bob's RLS policy function runs
as Bob, then it defeats inlining; but if it runs as Alice, then Bob
can steal Alice's credentials.  One idea is to apply the security
policy only if Alice's access to the table is granted by Bob.  That
way, if Alice is (for example) the superuser, she's immune to RLS.
But that doesn't seem to completely solve the problem, because Alice
might merely be some other relatively unprivileged user and we still
don't want Bob to be able to walk off with her access.

Another idea is to set things up so that the RLS policy function isn't
applied to each row directly; instead, it's invoked once per query and
*returns* a WHERE clause.  This would be a lot more powerful than the
proposed design, because now the table owner can write a function that
imposes quals on some people but not others, which seems very useful.

 Also, if the point here is to provide security for tables not views,
 it seems like you really need to have (at least a design for) RLS
 security on insert/update/delete operations.  Just adding the same
 filter condition might be adequate for deletes, but I don't think it
 works at all for inserts.  And for updates, what prevents the user from
 updating columns he shouldn't, or updating them to key values he
 shouldn't be able to use?

 If we also apply the security policy to newer version of tuples on
 update and insert, one idea is to inject a before-row-(update|insert)
 trigger to check whether it satisfies the security policy.
 For same reason, the trigger should be executed at the end of
 trigger chain.

It's not clear to me that there is any need for built-in server
functionality here.  If the table owner wants to enforce some sort of
policy regarding INSERT or UPDATE or DELETE, they can already do that
today just by attaching a trigger to the table.  And they can enforce
whatever policy they like that way.  Before designing any new
mechanism, what's wrong with the existing one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

63 matches

Mail list logo