Re: [HACKERS] [RFC] Interface of Row Level Security
Sorry for my late reply. 2012/6/6 Florian Pflug f...@phlo.org: On Jun5, 2012, at 22:33 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: I can live with any behaviour, as long as it doesn't depends on details of the query plan. My vote would be for always using the role which was active at statement creation time (i.e. at PREPARE/DECLARE time) for RLS purposes though. That way, we could treat the role as being IMMUTABLE I think. I have same opinion at the point where the active role should be consistent, but a bit different in details. I think, it should be a timing at beginning of execution, not plan creation time, like as existing permission checks are applied on InitPlan() or its callee. From a consistency POV, I'd agree. But binding the active role after planning means giving up a lot of query optimization opportunities. It will prevent the role from participating in constant folding, which translates to potentially enormous loss of performance. Take the following policy function as an example (role = 'some_role' AND expensive_function(row)) OR (role = 'other_role' AND row.field = 'some_value') and assume that there's an index on 'field'. Now, if role is known to be 'other_row', the planner can eliminate the expensive_function() from the plan, and satisfy row.field = 'some_value' with an index scan. Without that knowledge, you'll probably get a sequential scan. I do value consistency, but in this case the benefit of not being consistent with other privilege checks outweighs the drawbacks I think. I never deny some possibility to have special optimization using some information to be determined at planner stage, in the near future. However, I wonder the first version of this patch should include all the desirable features at once. It seems to me, caution of the problem is current_user is performing out of the snapshot controls. According to the definition of stable function, its result should not be changed during a particular scan. Yeah, well, ISTM that our definition of STABLE doesn't really take functions whose return value depends on GUCs into account. Probably, has_XXX_privilege() should work with reliable data source being safe from user-id switching. But it is a tough work to enhance whole of the GUC stuff for snapshot aware… We don't need that for every GUC, though. I'd say its sufficient to somehow provide the policy function with a stable value of the active role (whatever active ends up meaning). If we do that, we can simply document that making the policy function depend on other GUCs is a seriously bad idea. We *do* need that stable active role value though, since we cannot very well advise against using current_role in the policy function without providing an alternativ… I have no preference about implementation detail of this. As long as we can have the actually stable user-id until v9.3 release, it is sufficient for me. To reiterate, my point is that we won't get away with zero planner changes even without RLSBYPASS. In fact, we'll be paying *most* of the complexity costs of RLSBYPASS whether we actually add that feature or not. Which makes not adding it look like a pretty bad deal to me. It seems to me, 95% of our opinion is common, except for a few detailed stuff. I never dislike the idea of RLSBYPASS permission, but I'd like to investigate this idea with more time, until v9.3 being closed. At least, nobody opposed to allow superusers to bypass RLS policy. So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds of lines due to syntax support or query rewriting stuff. It is always good idea to focus on the core functionality at the starting point. Absolutely. But at the same time, it's important to check that the design allows the additional features to be added later easily. In the case of RLS, I'm worried that decreeing that it's the role at execution time, not planning time, that counts, we're painting ourselves into a corner. I view RLSBYPASS as a good sanity check for that. Another one is that GRANT-with-filter-function idea. Both seem to fall into place quite naturally if handled while planning - you simply have to add decide which additional WHERE clause to add, if any. Without any negative performance effects from policies which don't apply to the current role, if the optimizer does it's job. Getting the same level of performance when policies are added unconditionally seems quite impossible. You might be able to make up for some losses by caching STABLE values (which amounts to another constant folding pass), but you won't ever be able to make up for a lost index scan opportunity or other optimizations which change the structure of the whole query plan. If we have a mechanism to check permissions at planner stage, it should perform as if it works at executor stage, as existing permission mechanism doing. Thus, here is no inconsistency even if
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/5 Tom Lane t...@sss.pgh.pa.us: Florian Pflug f...@phlo.org writes: On Jun4, 2012, at 18:38 , Kohei KaiGai wrote: 2012/6/4 Florian Pflug f...@phlo.org: Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. It seems to me you can define a function which implements site- specific security requirement (E.g backup should not be prevented by RLS policy), then include it as a part of RLS policy (or implicitly added by extensions, like sepgsql tries to do). Sure. But that requires each and every application which uses RLS to provide support for special backup (or replication, or whatever) privileges. And it requires the DBA to know how to assign these privileges to certain roles for each any every application in question. For shops which uses a lot of different applications, all with their own RLS policy, that can quickly get out of hand. Plus, a bug in one of these RLS policies has the potential to render backups incomplete. I agree with Florian here: if there is no way to disable RLS for backups, the database will be un-administratable. RLSBYPASS is not necessarily the only or best way to provide such an override, but we have to have something that is simple, foolproof, and *not* dependent on the details of any one RLS policy. I suspect that KaiGai-san's objection basically comes down to not wanting to have what amounts to a backdoor in RLS policies. However, what Florian is saying is that you have to have a backdoor anyway, unless you'd like to be at risk of losing data because it wasn't backed up. You can either have one well-understood, well-documented, well-tested backdoor, or you can have an ad-hoc backdoor in each RLS policy. Nobody can think that the latter approach is preferable. At least, database superusers shall bypass the RLS policy; it is a well understandable behavior and an approach to minimize the back-door; and allows to get complete database backup. It is easy to add a special privilege mechanism to bypass RLS policy later, however, not easy in opposite side. It seems to me a reasonable start-up to allow only superusers to bypass RLS policy. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun5, 2012, at 10:22 , Kohei KaiGai wrote: 2012/6/5 Tom Lane t...@sss.pgh.pa.us: I suspect that KaiGai-san's objection basically comes down to not wanting to have what amounts to a backdoor in RLS policies. However, what Florian is saying is that you have to have a backdoor anyway, unless you'd like to be at risk of losing data because it wasn't backed up. You can either have one well-understood, well-documented, well-tested backdoor, or you can have an ad-hoc backdoor in each RLS policy. Nobody can think that the latter approach is preferable. At least, database superusers shall bypass the RLS policy; it is a well understandable behavior and an approach to minimize the back-door; and allows to get complete database backup. I don't think we want to force people to run stuff with superuser privileges just for the sake of bypassing a RLS policy. On the whole, that reduces the overall security, not adds to it. It is easy to add a special privilege mechanism to bypass RLS policy later, however, not easy in opposite side. It seems to me a reasonable start-up to allow only superusers to bypass RLS policy. What's to be gained by that? Once there's *any* way to bypass a RLS policy, you'll have to deal with the plan invalidation issues you mentioned anyway. ISTM that complexity-wide, you don't save much by not having RLSBYPASS (or something similar), but feature-wise you lose at lot... best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/5 Florian Pflug f...@phlo.org: On Jun5, 2012, at 10:22 , Kohei KaiGai wrote: 2012/6/5 Tom Lane t...@sss.pgh.pa.us: I suspect that KaiGai-san's objection basically comes down to not wanting to have what amounts to a backdoor in RLS policies. However, what Florian is saying is that you have to have a backdoor anyway, unless you'd like to be at risk of losing data because it wasn't backed up. You can either have one well-understood, well-documented, well-tested backdoor, or you can have an ad-hoc backdoor in each RLS policy. Nobody can think that the latter approach is preferable. At least, database superusers shall bypass the RLS policy; it is a well understandable behavior and an approach to minimize the back-door; and allows to get complete database backup. I don't think we want to force people to run stuff with superuser privileges just for the sake of bypassing a RLS policy. On the whole, that reduces the overall security, not adds to it. I can understand your opinion. But I'm still uncertain whether the new permission is the best idea we can offer... It is easy to add a special privilege mechanism to bypass RLS policy later, however, not easy in opposite side. It seems to me a reasonable start-up to allow only superusers to bypass RLS policy. What's to be gained by that? Once there's *any* way to bypass a RLS policy, you'll have to deal with the plan invalidation issues you mentioned anyway. ISTM that complexity-wide, you don't save much by not having RLSBYPASS (or something similar), but feature-wise you lose at lot... All we need to change is selection of the function to be appended automatically. In case when superusers are allowed to bypass RLS policy, OR has_superuser_privilege() shall be appended to the user given clause, however, OR has_table_privilege() shall be appended instead in case of RLSBYPASS permission. (Note that has_table_privilege() always true for superusers.) I think it does not require to add a mechanism to invalidate prepared-statement, because all the checks are applied on executor stage. And these functions can be stable functions, so I believe some enhancements at planner will correctly wipe out prior to query execution at the next step. Conditional RLS policy in planner stage seems to me problematic so I don't want to include this feature on the first version. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun5, 2012, at 11:43 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: What's to be gained by that? Once there's *any* way to bypass a RLS policy, you'll have to deal with the plan invalidation issues you mentioned anyway. ISTM that complexity-wide, you don't save much by not having RLSBYPASS (or something similar), but feature-wise you lose at lot… All we need to change is selection of the function to be appended automatically. In case when superusers are allowed to bypass RLS policy, OR has_superuser_privilege() shall be appended to the user given clause, however, OR has_table_privilege() shall be appended instead in case of RLSBYPASS permission. (Note that has_table_privilege() always true for superusers.) I think it does not require to add a mechanism to invalidate prepared-statement, because all the checks are applied on executor stage. And these functions can be stable functions, so I believe some enhancements at planner will correctly wipe out prior to query execution at the next step. That won't work, since the current role can change any time mid-query, at least if you're using cursors. Here's an example First, setup a table CREATE TABLE data AS SELECT id FROM generate_series(1,100) AS id; GRANT SELECT ON data TO PUBLIC; Then create a cursor which doesn't materialize its result BEGIN; DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR SELECT id, current_user FROM data; SET ROLE=anybody; FETCH mycursor; id | current_user +-- 1 | anybody (1 row) SET role=somebody; FETCH mycursor; id | current_user +-- 2 | somebody (1 row) ROLLBACK; And now, for comparison, a cursor which does materialize its result BEGIN; DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR SELECT id, current_user FROM data ORDER BY id; SET ROLE=anybody; FETCH mycursor; id | current_user +-- 1 | anybody (1 row) SET role=somebody; FETCH mycursor; id | current_user +-- 2 | anybody (1 row) Imagine that the current_user function call was actually part of the RLS policy. Then the example above shows that *which* role we check access against (the one active at the first fetch, or currently active one) depends on which plan the optimizer happens to pick. This, IMHO, is not acceptable for a security feature like RLS. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/5 Florian Pflug f...@phlo.org: On Jun5, 2012, at 11:43 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: What's to be gained by that? Once there's *any* way to bypass a RLS policy, you'll have to deal with the plan invalidation issues you mentioned anyway. ISTM that complexity-wide, you don't save much by not having RLSBYPASS (or something similar), but feature-wise you lose at lot… All we need to change is selection of the function to be appended automatically. In case when superusers are allowed to bypass RLS policy, OR has_superuser_privilege() shall be appended to the user given clause, however, OR has_table_privilege() shall be appended instead in case of RLSBYPASS permission. (Note that has_table_privilege() always true for superusers.) I think it does not require to add a mechanism to invalidate prepared-statement, because all the checks are applied on executor stage. And these functions can be stable functions, so I believe some enhancements at planner will correctly wipe out prior to query execution at the next step. That won't work, since the current role can change any time mid-query, at least if you're using cursors. Here's an example First, setup a table CREATE TABLE data AS SELECT id FROM generate_series(1,100) AS id; GRANT SELECT ON data TO PUBLIC; Then create a cursor which doesn't materialize its result BEGIN; DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR SELECT id, current_user FROM data; SET ROLE=anybody; FETCH mycursor; id | current_user +-- 1 | anybody (1 row) SET role=somebody; FETCH mycursor; id | current_user +-- 2 | somebody (1 row) ROLLBACK; And now, for comparison, a cursor which does materialize its result BEGIN; DECLARE mycursor NO SCROLL CURSOR WITHOUT HOLD FOR SELECT id, current_user FROM data ORDER BY id; SET ROLE=anybody; FETCH mycursor; id | current_user +-- 1 | anybody (1 row) SET role=somebody; FETCH mycursor; id | current_user +-- 2 | anybody (1 row) Imagine that the current_user function call was actually part of the RLS policy. Then the example above shows that *which* role we check access against (the one active at the first fetch, or currently active one) depends on which plan the optimizer happens to pick. This, IMHO, is not acceptable for a security feature like RLS. Which is your expected behavior in this example? In case when the query is internally materialized, the current_user function is *already* executed prior to switch user-id, thus, the result set internally stored in was already filtered out using older user-id. It seems to me, caution of the problem is current_user is performing out of the snapshot controls. According to the definition of stable function, its result should not be changed during a particular scan. At least, it is not a fundamental problem of RLS implementation, although it needs to take an enhancement something like effective user-id per snapshot basis. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun5, 2012, at 15:07 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: On Jun5, 2012, at 11:43 , Kohei KaiGai wrote: I think it does not require to add a mechanism to invalidate prepared-statement, because all the checks are applied on executor stage. And these functions can be stable functions, so I believe some enhancements at planner will correctly wipe out prior to query execution at the next step. That won't work, since the current role can change any time mid-query, at least if you're using cursors. Here's an example snipped example Imagine that the current_user function call was actually part of the RLS policy. Then the example above shows that *which* role we check access against (the one active at the first fetch, or currently active one) depends on which plan the optimizer happens to pick. This, IMHO, is not acceptable for a security feature like RLS. Which is your expected behavior in this example? I can live with any behaviour, as long as it doesn't depends on details of the query plan. My vote would be for always using the role which was active at statement creation time (i.e. at PREPARE/DECLARE time) for RLS purposes though. That way, we could treat the role as being IMMUTABLE I think. In case when the query is internally materialized, the current_user function is *already* executed prior to switch user-id, thus, the result set internally stored in was already filtered out using older user-id. Exactly. But whether or not the query, some parts of it, or even only some parts of the policy function are materialized is outside the control of the user and may change at any time. It seems to me, caution of the problem is current_user is performing out of the snapshot controls. According to the definition of stable function, its result should not be changed during a particular scan. Yeah, well, ISTM that our definition of STABLE doesn't really take functions whose return value depends on GUCs into account. At least, it is not a fundamental problem of RLS implementation, although it needs to take an enhancement something like effective user-id per snapshot basis. Right. We need something like statement_role which returns the role the statement is to be executed as (However we define that, though as I said above my vote is for it being the role which created the statement). To make that work, however, the plan cache needs to store that role, since the statement might be re-planned after its initial creation, but before it is executed. At which point the amount of complexity saved by not implementing RLSBYPASS is very nearly zero. To reiterate, my point is that we won't get away with zero planner changes even without RLSBYPASS. In fact, we'll be paying *most* of the complexity costs of RLSBYPASS whether we actually add that feature or not. Which makes not adding it look like a pretty bad deal to me. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/5 Florian Pflug f...@phlo.org: On Jun5, 2012, at 15:07 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: On Jun5, 2012, at 11:43 , Kohei KaiGai wrote: I think it does not require to add a mechanism to invalidate prepared-statement, because all the checks are applied on executor stage. And these functions can be stable functions, so I believe some enhancements at planner will correctly wipe out prior to query execution at the next step. That won't work, since the current role can change any time mid-query, at least if you're using cursors. Here's an example snipped example Imagine that the current_user function call was actually part of the RLS policy. Then the example above shows that *which* role we check access against (the one active at the first fetch, or currently active one) depends on which plan the optimizer happens to pick. This, IMHO, is not acceptable for a security feature like RLS. Which is your expected behavior in this example? I can live with any behaviour, as long as it doesn't depends on details of the query plan. My vote would be for always using the role which was active at statement creation time (i.e. at PREPARE/DECLARE time) for RLS purposes though. That way, we could treat the role as being IMMUTABLE I think. I have same opinion at the point where the active role should be consistent, but a bit different in details. I think, it should be a timing at beginning of execution, not plan creation time, like as existing permission checks are applied on InitPlan() or its callee. So, I believe EXECUTE is more right place to check it rather than PREPARE. Although FETCH command actually handles query execution, do we ought to consider DECLARE do both of query plan and execution from the perspective of model? It seems to me, caution of the problem is current_user is performing out of the snapshot controls. According to the definition of stable function, its result should not be changed during a particular scan. Yeah, well, ISTM that our definition of STABLE doesn't really take functions whose return value depends on GUCs into account. Probably, has_XXX_privilege() should work with reliable data source being safe from user-id switching. But it is a tough work to enhance whole of the GUC stuff for snapshot aware... At least, it is not a fundamental problem of RLS implementation, although it needs to take an enhancement something like effective user-id per snapshot basis. Right. We need something like statement_role which returns the role the statement is to be executed as (However we define that, though as I said above my vote is for it being the role which created the statement). To make that work, however, the plan cache needs to store that role, since the statement might be re-planned after its initial creation, but before it is executed. At which point the amount of complexity saved by not implementing RLSBYPASS is very nearly zero. I'm not sure why the plan cache needs to be stored with a particular user-id. As long as we check permissions at executor stage, all we need to do is implementation of a function to check permission based on a particular user-id at the timing of executor initialization. And, I think it can be a separated efforts from the first version of RLS. To reiterate, my point is that we won't get away with zero planner changes even without RLSBYPASS. In fact, we'll be paying *most* of the complexity costs of RLSBYPASS whether we actually add that feature or not. Which makes not adding it look like a pretty bad deal to me. It seems to me, 95% of our opinion is common, except for a few detailed stuff. I never dislike the idea of RLSBYPASS permission, but I'd like to investigate this idea with more time, until v9.3 being closed. At least, nobody opposed to allow superusers to bypass RLS policy. So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds of lines due to syntax support or query rewriting stuff. It is always good idea to focus on the core functionality at the starting point. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun5, 2012, at 22:33 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: I can live with any behaviour, as long as it doesn't depends on details of the query plan. My vote would be for always using the role which was active at statement creation time (i.e. at PREPARE/DECLARE time) for RLS purposes though. That way, we could treat the role as being IMMUTABLE I think. I have same opinion at the point where the active role should be consistent, but a bit different in details. I think, it should be a timing at beginning of execution, not plan creation time, like as existing permission checks are applied on InitPlan() or its callee. From a consistency POV, I'd agree. But binding the active role after planning means giving up a lot of query optimization opportunities. It will prevent the role from participating in constant folding, which translates to potentially enormous loss of performance. Take the following policy function as an example (role = 'some_role' AND expensive_function(row)) OR (role = 'other_role' AND row.field = 'some_value') and assume that there's an index on 'field'. Now, if role is known to be 'other_row', the planner can eliminate the expensive_function() from the plan, and satisfy row.field = 'some_value' with an index scan. Without that knowledge, you'll probably get a sequential scan. I do value consistency, but in this case the benefit of not being consistent with other privilege checks outweighs the drawbacks I think. It seems to me, caution of the problem is current_user is performing out of the snapshot controls. According to the definition of stable function, its result should not be changed during a particular scan. Yeah, well, ISTM that our definition of STABLE doesn't really take functions whose return value depends on GUCs into account. Probably, has_XXX_privilege() should work with reliable data source being safe from user-id switching. But it is a tough work to enhance whole of the GUC stuff for snapshot aware… We don't need that for every GUC, though. I'd say its sufficient to somehow provide the policy function with a stable value of the active role (whatever active ends up meaning). If we do that, we can simply document that making the policy function depend on other GUCs is a seriously bad idea. We *do* need that stable active role value though, since we cannot very well advise against using current_role in the policy function without providing an alternativ… At least, it is not a fundamental problem of RLS implementation, although it needs to take an enhancement something like effective user-id per snapshot basis. Right. We need something like statement_role which returns the role the statement is to be executed as (However we define that, though as I said above my vote is for it being the role which created the statement). To make that work, however, the plan cache needs to store that role, since the statement might be re-planned after its initial creation, but before it is executed. At which point the amount of complexity saved by not implementing RLSBYPASS is very nearly zero. I'm not sure why the plan cache needs to be stored with a particular user-id. As long as we check permissions at executor stage, all we need to do is implementation of a function to check permission based on a particular user-id at the timing of executor initialization. Hm, yeah, if you defer everything to execution time, that's true. To reiterate, my point is that we won't get away with zero planner changes even without RLSBYPASS. In fact, we'll be paying *most* of the complexity costs of RLSBYPASS whether we actually add that feature or not. Which makes not adding it look like a pretty bad deal to me. It seems to me, 95% of our opinion is common, except for a few detailed stuff. I never dislike the idea of RLSBYPASS permission, but I'd like to investigate this idea with more time, until v9.3 being closed. At least, nobody opposed to allow superusers to bypass RLS policy. So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds of lines due to syntax support or query rewriting stuff. It is always good idea to focus on the core functionality at the starting point. Absolutely. But at the same time, it's important to check that the design allows the additional features to be added later easily. In the case of RLS, I'm worried that decreeing that it's the role at execution time, not planning time, that counts, we're painting ourselves into a corner. I view RLSBYPASS as a good sanity check for that. Another one is that GRANT-with-filter-function idea. Both seem to fall into place quite naturally if handled while planning - you simply have to add decide which additional WHERE clause to add, if any. Without any negative performance effects from policies which don't apply to the current role, if the optimizer does it's job. Getting the same level of performance when policies are added unconditionally
Re: [HACKERS] [RFC] Interface of Row Level Security
On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2012/6/1 Robert Haas robertmh...@gmail.com: On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Yes. Simon suggested exactly this a while back, and I agree with him that we ought to have it. It is good for me, also. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. You're still going to need a way to make sure that the cluster can be dumped properly. RLSBYPASS accomplishes that; your scheme doesn't. If something like has_superuser_privilege() OR is automatically appended to user given clauses, it makes sure whole of the database cluster is dumped. That also means any permission checks are delayed to executor stage (except for the case on simple query protocol, I mentioned above), thus it simplifies the condition to invalidate prepared statement. One problem I noticed last night around RLSBYPASS approach is: it can take much number of invalidation whenever current user-id is switched. Not only SET AUTHORIZATION, but SECURITY DEFINER function's invocation also. I'm not sure whether this invalidation storm is reasonable level, or not. Is it unavailable to handle this type of implicit superuser checks with existing EXECUTE statement? I tried to run EXPLAIN with similar case. postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y $1 OR $2; PREPARE postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas')); QUERY PLAN Seq Scan on tbl (cost=0.00..21.60 rows=1160 width=40) (1 row However, postgres=# PREPARE p2(int) AS SELECT * FROM tbl WHERE y $1 OR current_user in ('kaigai','rhaas'); PREPARE postgres=# EXPLAIN EXECUTE p2(10); QUERY PLAN - Seq Scan on tbl (cost=0.00..30.30 rows=394 width=40) Filter: ((y 10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[]))) (2 rows) Please assume the second condition something like superuser checks in addition to the main security policy. It implies an idea to replace a certain portion of clause that consists of only constant values and stable / immutable functions by shadow parameter to be calculated at executor stage, makes sense to wipe out RLS policy for superusers. In addition, it requires no new invalidation mechanism to prepared statements. I'm not sure how best to handle the invalidation issue... but I still think that relying on the query planner to use theorem-proving to simplify RLS conditions is going to be problematic. You can certainly try it ad see how it comes out... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Robert Haas robertmh...@gmail.com: On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2012/6/1 Robert Haas robertmh...@gmail.com: On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Yes. Simon suggested exactly this a while back, and I agree with him that we ought to have it. It is good for me, also. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. You're still going to need a way to make sure that the cluster can be dumped properly. RLSBYPASS accomplishes that; your scheme doesn't. If something like has_superuser_privilege() OR is automatically appended to user given clauses, it makes sure whole of the database cluster is dumped. That also means any permission checks are delayed to executor stage (except for the case on simple query protocol, I mentioned above), thus it simplifies the condition to invalidate prepared statement. One problem I noticed last night around RLSBYPASS approach is: it can take much number of invalidation whenever current user-id is switched. Not only SET AUTHORIZATION, but SECURITY DEFINER function's invocation also. I'm not sure whether this invalidation storm is reasonable level, or not. Is it unavailable to handle this type of implicit superuser checks with existing EXECUTE statement? I tried to run EXPLAIN with similar case. postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y $1 OR $2; PREPARE postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas')); QUERY PLAN Seq Scan on tbl (cost=0.00..21.60 rows=1160 width=40) (1 row However, postgres=# PREPARE p2(int) AS SELECT * FROM tbl WHERE y $1 OR current_user in ('kaigai','rhaas'); PREPARE postgres=# EXPLAIN EXECUTE p2(10); QUERY PLAN - Seq Scan on tbl (cost=0.00..30.30 rows=394 width=40) Filter: ((y 10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[]))) (2 rows) Please assume the second condition something like superuser checks in addition to the main security policy. It implies an idea to replace a certain portion of clause that consists of only constant values and stable / immutable functions by shadow parameter to be calculated at executor stage, makes sense to wipe out RLS policy for superusers. In addition, it requires no new invalidation mechanism to prepared statements. I'm not sure how best to handle the invalidation issue... but I still think that relying on the query planner to use theorem-proving to simplify RLS conditions is going to be problematic. You can certainly try it ad see how it comes out... I think, the enhancement of planner can be handled independently from RLS policy stuff, and we should avoid to contain many complex stuff into one functionality. At least, it will work correctly with idea to append implicit condition (OR has_superuser_privilege()) without any invalidation mechanism, although its performance penalty is not negligible; without planner- enhancement. How about your opinion? I'm worry about future maintenance issues, once we have RLSBYPASS permission or something user visible... Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun4, 2012, at 17:38 , Kohei KaiGai wrote: I'm worry about future maintenance issues, once we have RLSBYPASS permission or something user visible… I fear that without a generic way to disable RLS regardless which RLS policy function is in effect, we're creating a huge maintenance issue for DBAs. In a lot of shops, the DBA is responsible for a large number of databases, each potentially using a completely different approach to RLS and hence a completely different policy function. Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. With RLSBYPASS, all he needs to do is grant one privilege to his replication or backup user. The rest can be left to the development or support team for a specific application. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Florian Pflug f...@phlo.org: On Jun4, 2012, at 17:38 , Kohei KaiGai wrote: I'm worry about future maintenance issues, once we have RLSBYPASS permission or something user visible… I fear that without a generic way to disable RLS regardless which RLS policy function is in effect, we're creating a huge maintenance issue for DBAs. In a lot of shops, the DBA is responsible for a large number of databases, each potentially using a completely different approach to RLS and hence a completely different policy function. Here is two problems around RLSBYPASS. The first is we have no idea to handle invalidation of prepared-statement when current user is switched, right now. The second is we can have another way to describe same RLS policy without PG original enhancement towards permission mechanism... Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. With RLSBYPASS, all he needs to do is grant one privilege to his replication or backup user. The rest can be left to the development or support team for a specific application. It seems to me you can define a function which implements site- specific security requirement (E.g backup should not be prevented by RLS policy), then include it as a part of RLS policy (or implicitly added by extensions, like sepgsql tries to do). These are the reason why I hesitate to go ahead with RLSBYPASS permission. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
Kohei KaiGai kai...@kaigai.gr.jp writes: Here is two problems around RLSBYPASS. The first is we have no idea to handle invalidation of prepared-statement when current user is switched, right now. How is that specifically the fault of RLSBYPASS? *Any* of the schemes you're proposing for inlined RLS checks will have problems with userID switching. My guess is we'd have to treat the effective userID as part of the plancache lookup key to make it safe to inline anything related to RLS. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Tom Lane t...@sss.pgh.pa.us: Kohei KaiGai kai...@kaigai.gr.jp writes: Here is two problems around RLSBYPASS. The first is we have no idea to handle invalidation of prepared-statement when current user is switched, right now. How is that specifically the fault of RLSBYPASS? *Any* of the schemes you're proposing for inlined RLS checks will have problems with userID switching. Really? I don't find out a scenario that cause a problem with user-id switching in case when RLS policy is *unconditionally* appended then evaluated on executor stage. I'd like to see the scenario. My guess is we'd have to treat the effective userID as part of the plancache lookup key to make it safe to inline anything related to RLS. It might be a solution, if we append individual RLS policy at the planner stage, depending on user-id. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun4, 2012, at 18:38 , Kohei KaiGai wrote: 2012/6/4 Florian Pflug f...@phlo.org: Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. With RLSBYPASS, all he needs to do is grant one privilege to his replication or backup user. The rest can be left to the development or support team for a specific application. It seems to me you can define a function which implements site- specific security requirement (E.g backup should not be prevented by RLS policy), then include it as a part of RLS policy (or implicitly added by extensions, like sepgsql tries to do). Sure. But that requires each and every application which uses RLS to provide support for special backup (or replication, or whatever) privileges. And it requires the DBA to know how to assign these privileges to certain roles for each any every application in question. For shops which uses a lot of different applications, all with their own RLS policy, that can quickly get out of hand. Plus, a bug in one of these RLS policies has the potential to render backups incomplete. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
Florian Pflug f...@phlo.org writes: On Jun4, 2012, at 18:38 , Kohei KaiGai wrote: 2012/6/4 Florian Pflug f...@phlo.org: Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. It seems to me you can define a function which implements site- specific security requirement (E.g backup should not be prevented by RLS policy), then include it as a part of RLS policy (or implicitly added by extensions, like sepgsql tries to do). Sure. But that requires each and every application which uses RLS to provide support for special backup (or replication, or whatever) privileges. And it requires the DBA to know how to assign these privileges to certain roles for each any every application in question. For shops which uses a lot of different applications, all with their own RLS policy, that can quickly get out of hand. Plus, a bug in one of these RLS policies has the potential to render backups incomplete. I agree with Florian here: if there is no way to disable RLS for backups, the database will be un-administratable. RLSBYPASS is not necessarily the only or best way to provide such an override, but we have to have something that is simple, foolproof, and *not* dependent on the details of any one RLS policy. I suspect that KaiGai-san's objection basically comes down to not wanting to have what amounts to a backdoor in RLS policies. However, what Florian is saying is that you have to have a backdoor anyway, unless you'd like to be at risk of losing data because it wasn't backed up. You can either have one well-understood, well-documented, well-tested backdoor, or you can have an ad-hoc backdoor in each RLS policy. Nobody can think that the latter approach is preferable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Yes. Simon suggested exactly this a while back, and I agree with him that we ought to have it. Even though the second situation, it may give us better query execution plan, if we try to reconstruct query plan just before executor with assumption that expects immutable / stable function can be replaced by constant value prior to execution. In other words, this idea tries to query optimization again on EXECUTE statement against to its nature, to replace immutable / stable functions by constant value, and to generate wiser execute plan. At least, it may make sense to have a flag on prepared statement to indicate whether it has possible better plan with this re-construction. This sounds complex and inefficient to me. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. You're still going to need a way to make sure that the cluster can be dumped properly. RLSBYPASS accomplishes that; your scheme doesn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/1 Robert Haas robertmh...@gmail.com: On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Yes. Simon suggested exactly this a while back, and I agree with him that we ought to have it. It is good for me, also. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. You're still going to need a way to make sure that the cluster can be dumped properly. RLSBYPASS accomplishes that; your scheme doesn't. If something like has_superuser_privilege() OR is automatically appended to user given clauses, it makes sure whole of the database cluster is dumped. That also means any permission checks are delayed to executor stage (except for the case on simple query protocol, I mentioned above), thus it simplifies the condition to invalidate prepared statement. One problem I noticed last night around RLSBYPASS approach is: it can take much number of invalidation whenever current user-id is switched. Not only SET AUTHORIZATION, but SECURITY DEFINER function's invocation also. I'm not sure whether this invalidation storm is reasonable level, or not. Is it unavailable to handle this type of implicit superuser checks with existing EXECUTE statement? I tried to run EXPLAIN with similar case. postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y $1 OR $2; PREPARE postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas')); QUERY PLAN Seq Scan on tbl (cost=0.00..21.60 rows=1160 width=40) (1 row However, postgres=# PREPARE p2(int) AS SELECT * FROM tbl WHERE y $1 OR current_user in ('kaigai','rhaas'); PREPARE postgres=# EXPLAIN EXECUTE p2(10); QUERY PLAN - Seq Scan on tbl (cost=0.00..30.30 rows=394 width=40) Filter: ((y 10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[]))) (2 rows) Please assume the second condition something like superuser checks in addition to the main security policy. It implies an idea to replace a certain portion of clause that consists of only constant values and stable / immutable functions by shadow parameter to be calculated at executor stage, makes sense to wipe out RLS policy for superusers. In addition, it requires no new invalidation mechanism to prepared statements. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On 2012-05-30 21:26, Kohei KaiGai wrote: If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. The query's RLS policy would be simpler if the RLS policy function that returns the WHERE clause would take the user as argument, so its result does not contain user conditionals. IF (current_user IN ('alice', 'bob') THEN RETURN X IN ('unclassified', 'classified')) ELSE RETURN X IN ('unclassified') END IF; regards, Yeb -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/31 Yeb Havinga yebhavi...@gmail.com: On 2012-05-30 21:26, Kohei KaiGai wrote: If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. The query's RLS policy would be simpler if the RLS policy function that returns the WHERE clause would take the user as argument, so its result does not contain user conditionals. IF (current_user IN ('alice', 'bob') THEN RETURN X IN ('unclassified', 'classified')) ELSE RETURN X IN ('unclassified') END IF; Yes, it is a happy case. But the point I'm concern about is, the conditions to branch cases are not limited to current user-id. The RLS policy shall be appended at planner stage, so prepared statement needs to be invalidated whenever its prerequisites are changed. For example, someone may assign a function that returns RLS policy depending on whether the current hour is even-number of odd-number. It implies that we cannot predicate all the cases to invalidate prepared statement with RLS policy, because of too much flexibility. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Wed, May 30, 2012 at 3:26 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: My preference is RLSBYPASS permission rather than the approach with functions that return policy clause at run-time, because it needs to invalidate prepared statement at random timing. In case of this function approach, the RLS policy shall be generated on planner stage, and we cannot have any assumption to the criteria of RLS policy. A function might generate RLS policy regarding to the current user id. Yes, it is straightforward. The prepared statement should be invalidate whenever current user-id got switched. However, someone may define a function that generate RLS policy depending on the value of client_min_messages for example. Do we need to invalidate prepared statement whenever GUC get updated? I think it is overkill. We cannot predicate all the criteria user want to control the RLS policy using the functions. So, RLSBYPASS permission is more simple way to limit number of situations to invalidate prepared statements. That's a good point. If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. Do I consider the issue too simple? Yes. :-) There are two problems. First, if using the extended query protocol (e.g. pgbench -M prepared) you can prepare a statement just once and then execute it multiple times. In this case, stable-functions cannot be constant-folded at plan time, because they are only guaranteed to remain constant for a *single* execution of the query, not for all executions of the query. So any optimization in this area would have to be limited to cases where the simple query protocol is used. I think that might still be worth doing, but it's a significant limitation, to be sure. Second, at present, there is no guarantee that the snapshot used for planning the query is the same as the snapshot used for executing the query, though commit d573e239f03506920938bf0be56c868d9c3416da made that happen in some common cases. If we were to do constant-folding of stable functions using the planner snapshot, it would represent a behavior change from previous releases. I am not clear whether that has any real-world consequences that we should be worried about. It seems to me that the path of least resistance might be to refactor the portal stuff so that we can provide a uniform guarantee that, when using the simple query protocol, the planner and executor snapshots will be the same ... but I might be wrong. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/31 Robert Haas robertmh...@gmail.com: If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. Do I consider the issue too simple? Yes. :-) There are two problems. First, if using the extended query protocol (e.g. pgbench -M prepared) you can prepare a statement just once and then execute it multiple times. In this case, stable-functions cannot be constant-folded at plan time, because they are only guaranteed to remain constant for a *single* execution of the query, not for all executions of the query. So any optimization in this area would have to be limited to cases where the simple query protocol is used. I think that might still be worth doing, but it's a significant limitation, to be sure. Second, at present, there is no guarantee that the snapshot used for planning the query is the same as the snapshot used for executing the query, though commit d573e239f03506920938bf0be56c868d9c3416da made that happen in some common cases. If we were to do constant-folding of stable functions using the planner snapshot, it would represent a behavior change from previous releases. I am not clear whether that has any real-world consequences that we should be worried about. It seems to me that the path of least resistance might be to refactor the portal stuff so that we can provide a uniform guarantee that, when using the simple query protocol, the planner and executor snapshots will be the same ... but I might be wrong. It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Even though the second situation, it may give us better query execution plan, if we try to reconstruct query plan just before executor with assumption that expects immutable / stable function can be replaced by constant value prior to execution. In other words, this idea tries to query optimization again on EXECUTE statement against to its nature, to replace immutable / stable functions by constant value, and to generate wiser execute plan. At least, it may make sense to have a flag on prepared statement to indicate whether it has possible better plan with this re-construction. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. Isn't it feasible to implement? Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/31 Kohei KaiGai kai...@kaigai.gr.jp: 2012/5/31 Robert Haas robertmh...@gmail.com: If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. Do I consider the issue too simple? Yes. :-) There are two problems. First, if using the extended query protocol (e.g. pgbench -M prepared) you can prepare a statement just once and then execute it multiple times. In this case, stable-functions cannot be constant-folded at plan time, because they are only guaranteed to remain constant for a *single* execution of the query, not for all executions of the query. So any optimization in this area would have to be limited to cases where the simple query protocol is used. I think that might still be worth doing, but it's a significant limitation, to be sure. Second, at present, there is no guarantee that the snapshot used for planning the query is the same as the snapshot used for executing the query, though commit d573e239f03506920938bf0be56c868d9c3416da made that happen in some common cases. If we were to do constant-folding of stable functions using the planner snapshot, it would represent a behavior change from previous releases. I am not clear whether that has any real-world consequences that we should be worried about. It seems to me that the path of least resistance might be to refactor the portal stuff so that we can provide a uniform guarantee that, when using the simple query protocol, the planner and executor snapshots will be the same ... but I might be wrong. It may be an option to separate the case into two; a situation to execute the given query immediately just after optimization and never reused, and others. Even though the second situation, it may give us better query execution plan, if we try to reconstruct query plan just before executor with assumption that expects immutable / stable function can be replaced by constant value prior to execution. In other words, this idea tries to query optimization again on EXECUTE statement against to its nature, to replace immutable / stable functions by constant value, and to generate wiser execute plan. At least, it may make sense to have a flag on prepared statement to indicate whether it has possible better plan with this re-construction. Then, if so, we will be able to push the stuff corresponding to RLSBYPASS into the query optimization, and works transparently for users. Isn't it feasible to implement? If we could replace a particular term that consists of constant values and stable / immutable functions only by parameter references, it may enable to handle the term as if a constant value, but actual calculation is delayed to executor stage. For example, according to this idea, PREPARE p1(int) AS SELECT * FROM tbl WHERE current_user in ('alice','bob') AND X $1; shall be internally rewritten to, PREPARE p1(int) AS SELECT * FROM tbl WHERE $2 AND X$1; then, $2 is implicitly calculated just before execution of this prepared statement. The snapshot to be used for this calculation is same with executor's one. It seems to me it is a feasible idea with less invasive implementation to existing planner. Does it make sense to describe exceptional condition using regular clause, instead of special permission? Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/29 Robert Haas robertmh...@gmail.com: On Fri, May 25, 2012 at 5:08 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Seems like a somewhat random wart, if it's just an exception for superusers. I think we need to do better than that. For example, at my last company, sales reps A and B were permitted to see all customers of the company, but sales reps C, D, E, F, G, H, I, and J were permitted to see only their own accounts. Those sorts of policies need to be easy to implement. Probably, if sales_rep column records its responsible repo, its security policy is able to be described as: (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep()) Yes, but that's a pain to optimize. When A or B tries to select from the table, the query optimizer has to realize that my_sales_rep() is stable, inline it, do constant simplification and throw away the entire OR clause. Note that this won't work today, because we only constant-fold immutable functions, not stable ones. Then, since there are no remaining security quals, we have to realize that we actually don't need the security_barrier subquery RTE at all, and optimize that away as well. Maybe we can make all of that work, and maybe we should make all of it work, but it's fairly complex. The advantage of having the function return the qual rather than contain the qual is that all of that goes away. The function can choose to return nothing (no RLS for this user) or it can choose to return something (which will likely be simpler than what it would have needed to return out of the chute). One disadvantage is that we have to parse the returned qual instead of just sucking in a node-tree. Anyway, I don't feel super-strongly about this particular idea, so if I'm the only one who likes it, fine, but that having been said, I think users are going to want a *declarative* way to control which policies are applied to which users. Suppose Bob is a sales rep who is only allowed to see his own customers, but then one day, we decide we trust Bob after all, so we want to let him see everything. We could go back and update the IN (...) list in the security policy function, but that's an ugly and unscalable nuisance, especially if we've got 10,000 users. It's much nicer to be able to just grant bob a permission using some kind of, well, GRANT command. That's what we're doing, after all. Alastair's proposal of making the security policy a property of the GRANT is one way of tackling that, and the RLSBYPASS permission I proposed elsewhere is another. Something along these lines seems likely to improve performance (by replacing a query optimization problem with a syscache lookup) as well as ease-of-use. My preference is RLSBYPASS permission rather than the approach with functions that return policy clause at run-time, because it needs to invalidate prepared statement at random timing. In case of this function approach, the RLS policy shall be generated on planner stage, and we cannot have any assumption to the criteria of RLS policy. A function might generate RLS policy regarding to the current user id. Yes, it is straightforward. The prepared statement should be invalidate whenever current user-id got switched. However, someone may define a function that generate RLS policy depending on the value of client_min_messages for example. Do we need to invalidate prepared statement whenever GUC get updated? I think it is overkill. We cannot predicate all the criteria user want to control the RLS policy using the functions. So, RLSBYPASS permission is more simple way to limit number of situations to invalidate prepared statements. If we would have an ideal optimizer, I'd still like the optimizer to wipe out redundant clauses transparently, rather than RLSBYPASS permissions, because it just controls all-or-nothing stuff. For example, if tuples are categorized to unclassified, classified or secret, and RLS policy is configured as: ((current_user IN ('alice', 'bob') AND X IN ('unclassified', 'classified')) OR (X IN 'unclassified)), superuser can see all the tuples, and alice and bob can see up to classified tuples. Is it really hard to wipe out redundant condition at planner stage? If current_user is obviously 'kaigai', it seems to me the left-side of this clause can be wiped out at the planner stage. Do I consider the issue too simple? Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May25, 2012, at 23:32 , Kohei KaiGai wrote: However, it should not be applied on triggers being set on PK tables, because it allows to modify or delete primary-key being referenced by invisible foreign-key from the viewpoint of operators. I think, it makes sense to have exceptional cases; to make sure foreign-key constraint at the baseline. Good point. If we simply ignore RLS policies when executing parent-side FK constraint triggers, however, we'll happily UPDATE or DELETE invisible rows should the constraint be defined as CASCADE or SET NULL. Which is, um, not exactly desirable… FK triggers already deal with a related issue, namely a child row being invisible to a REPEATABLE READ transaction which attempts to update the parent row. We handle this case by always executing queries in parent-side FK constraint triggers with a newly created READ COMMITTED snapshot, but verifying that all matching rows are also visible under the transaction's actual REPEATABLE READ snapshot. If some are not, we throw a serialization error. FK constraints in the presence of RLS should do the same, I think. When executing a parent-side constraint trigger we should *not* apply the RLS policy as a filter, but instead check that all returned rows are indeed visible according to the RLS policy. If some are not we throw, say, a policy enforcement error. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/27 Alastair Turner b...@ctrlf5.co.za: Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25, 2012 at 11:08 PM: If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? Rather than assuming the the RLS checks will be applied when there are no privileges it would make sense to regard RLS as a limitation on the scope of a particular privilege. This makes RLS a property of a particular grant of a privilege rather than of the table. Viewed this way it is possible to control which users are subject to restrictions imposed by the RLS function, which will avoid RLS overhead for operations which don't require it while allowing checks for those that do, provide a mechanism exempting object owners from RLS checks and make it possible to avoid pg_dump calling user defined code. This suggests an alternative declaration syntax, to use the salesman example: GRANT SELECT ON TABLE client_detail TO super_sales_group; GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH (QUALIFICATION FUNCTION sales_view_check); and since more easily usable security features will be used more effectively, a possible future inlining of the condition: GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE sales_rep = my_sales_rep(); It seems to me an interesting proposition, because I didn't thought such kind of statement to provide row-level policies. We have a common issue for the idea that check privileges of underlying tables; when we should check the privileges and make a decision whether RLS policy is appended, or not. Due to query optimization reason, the RLS policy should be appended prior to the query optimization. At least, we want to utilize RLS policy for index scans, rather than sequential scan. On the other hand, all the permission checks are currently applied at executor stage, not planner stage. Table / column-level privileges are also applied at head of the executor. It is headache, if we go ahead with an idea to integrate RLS and existing privilege checks. One exception is simplify_function() that tries to expand enough simple SQL functions if current user has ACL_EXECUTE privilege at planner stage. One other issue is whether we should allow any users with enough privileges to bypass RLS, or only superusers. I'm still not sure how the existing checks perform with RLS. If and when Alice has SELECT permission on column X and Y of TBL with RLS, but her query tries to reference X,Y and Z. In this case, existing privilege mechanism shall raise an error, but the criteria with underlying table allows to run this query. It seems to me it is a straightforward criteria to limit superusers to bypass RLS... Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/28 Florian Pflug f...@phlo.org: On May28, 2012, at 02:46 , Noah Misch wrote: On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote: Since the security barrier flag carries a potentially hefty performance penalty, I think it should be optional. Application which don't allow SQL-level access to the database might still benefit from row-level security, because it saves them from having to manually add the WHERE clause to every statement, or having to wrap all their tables with views. Yet without direct SQL-level access, the security barrier thing isn't really necessary, so it'd be nice if they wouldn't have to pay for it. How about ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier) Backward compatibility concerns limited our options when retrofitting the security_barrier treatment for views, but I'd rather not add a knob completely disabling it in the context of a brand new feature. A better avenue is to enhance our facilities for identifying safe query fragments. For example, ALTER FUNCTION ... LEAKPROOF is superuser-only. Adding a way for a table owner to similarly trust functions for the purpose of his own tables would help close the gap that motivates such an all-or-nothing knob. Hm, I'm not sure a per-function flag is really the solution here. Neither, however, is a per-RLS flag as your arguments made me realize. There really are three orthogonal concepts here, all of which allow security barriers to be ignored, namely A) Trusting the use not to exploit leaks, i.e. what you call a trusted query generator B) There being no leaks, i.e. all functions being LEAKPROOF C) Not caring about leaks, i.e. the security_barrier flag Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled by the security_barrier flag. However, as you pointed out, it's a bit of a dubious concept and only really necessary for backwards compatibility because it reflects pre-9.2 behaviour. Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly the wrong tool for the job, so consider my suggestion withdrawn. What we actually want, I think, is a per-role flag which marks a role as leak trusted. Queries issued by such a role would behave as if all functions are LEAKPROOF, since even if there is a leak, the role is trusted not to exploit it. It seems to me we are confusing about security-barrier and leakproof flag. The purpose of leakproof flag is to ensure the function is safe to execute, so it is allowed to pushed down the function into a sub-query with security- barrier. It works to decide whether the user given qualifier is safe to push-down. The RLS policy itself is placed within a sub-query from the beginning, thus not needed them to be leakproof function. Existing view with security-barrier flag cannot prevent information leaks, even when the owner set leaky function on its qualifiers. However, it is owner's responsibility. In a similar fashion, I don't think RLS works fine in case when owner set leaky function as its policy by himself. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May29, 2012, at 13:59 , Kohei KaiGai wrote: 2012/5/28 Florian Pflug f...@phlo.org: Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled by the security_barrier flag. However, as you pointed out, it's a bit of a dubious concept and only really necessary for backwards compatibility because it reflects pre-9.2 behaviour. Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly the wrong tool for the job, so consider my suggestion withdrawn. What we actually want, I think, is a per-role flag which marks a role as leak trusted. Queries issued by such a role would behave as if all functions are LEAKPROOF, since even if there is a leak, the role is trusted not to exploit it. It seems to me we are confusing about security-barrier and leakproof flag. The purpose of leakproof flag is to ensure the function is safe to execute, so it is allowed to pushed down the function into a sub-query with security- barrier. It works to decide whether the user given qualifier is safe to push-down. The RLS policy itself is placed within a sub-query from the beginning, thus not needed them to be leakproof function. My suggestion (which I then withdrew) was for that per-policy flag to decide whether the sub-query which applies the RLS policy acts as a security barrier or not. Thus, with the flag set, only leakproof predicates would haven been allowed to be pushed down into that subquery, whereas without the flag every predicate would haven been a candidate for being pushed down. The flag was never intended to mark the RLS policy itself as leakproof or leaky - that, as you said, makes little sense. My motivation for suggesting that flag was to prevent people who want RLS, yet aren't concerned about leaks, from having to pay the performance penalty associated with not pushing down predicates. Noah's comments, however, made me realize that whether one cares about potential leaks is usually not a per-table property, but rather a property of the user executing the query. Some users (like the middle-ware that sits on top of your database) you might trust to not exploit leaks, while wanting the tightest security possible for others. Which made me suggest a per-role flag which essentially overrides the security barrier stuff. Explaining that behaviour as behave as if all functions are LEAKPROOF might haven been a tad confusion, though. Maybe a better explanation is behave as if no sub-query has the security barrier flag set, or even don't let security concerns prevent predicate push-down. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May29, 2012, at 13:37 , Kohei KaiGai wrote: 2012/5/27 Alastair Turner b...@ctrlf5.co.za: Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25, 2012 at 11:08 PM: If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? I don't think so. Per-table and per-column permission checks should still apply independent from any RLS policy. You can always grant SELECT access to PUBLIC if want to rely solely on the RLS policy... Rather than assuming the the RLS checks will be applied when there are no privileges it would make sense to regard RLS as a limitation on the scope of a particular privilege. This makes RLS a property of a particular grant of a privilege rather than of the table. Viewed this way it is possible to control which users are subject to restrictions imposed by the RLS function, which will avoid RLS overhead for operations which don't require it while allowing checks for those that do, provide a mechanism exempting object owners from RLS checks and make it possible to avoid pg_dump calling user defined code. This suggests an alternative declaration syntax, to use the salesman example: GRANT SELECT ON TABLE client_detail TO super_sales_group; GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH (QUALIFICATION FUNCTION sales_view_check); and since more easily usable security features will be used more effectively, a possible future inlining of the condition: GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE sales_rep = my_sales_rep(); It seems to me an interesting proposition, because I didn't thought such kind of statement to provide row-level policies. Note, though, that due to role inheritance, multiple such QUALIFICATION FUNCTIONS could be in scope for a single table. In that case, I guess you'd have to OR them together, i.e. hide a row only if all of them return false. We have a common issue for the idea that check privileges of underlying tables; when we should check the privileges and make a decision whether RLS policy is appended, or not. Due to query optimization reason, the RLS policy should be appended prior to the query optimization. At least, we want to utilize RLS policy for index scans, rather than sequential scan. Agreed. With the current design, the RLS policy has to be applied before planning, not during execution. One other issue is whether we should allow any users with enough privileges to bypass RLS, or only superusers. I'm still not sure how the existing checks perform with RLS. Every user who has the power to disable the RLS policy should also at least be able to circumvent it temporarily, I think. This includes superusers, the table owner (since he's presumably allowed to do ALTER TABLE .. RESET ROW POLICY), and maybe the database owner. If and when Alice has SELECT permission on column X and Y of TBL with RLS, but her query tries to reference X,Y and Z. In this case, existing privilege mechanism shall raise an error, but the criteria with underlying table allows to run this query. It seems to me it is a straightforward criteria to limit superusers to bypass RLS… I'm not sure I understand. Do you mean the Alice references only columns X and Y in her query, but the RLS policy adds the reference to column Z? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/29 Florian Pflug f...@phlo.org: On May29, 2012, at 13:59 , Kohei KaiGai wrote: 2012/5/28 Florian Pflug f...@phlo.org: Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled by the security_barrier flag. However, as you pointed out, it's a bit of a dubious concept and only really necessary for backwards compatibility because it reflects pre-9.2 behaviour. Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly the wrong tool for the job, so consider my suggestion withdrawn. What we actually want, I think, is a per-role flag which marks a role as leak trusted. Queries issued by such a role would behave as if all functions are LEAKPROOF, since even if there is a leak, the role is trusted not to exploit it. It seems to me we are confusing about security-barrier and leakproof flag. The purpose of leakproof flag is to ensure the function is safe to execute, so it is allowed to pushed down the function into a sub-query with security- barrier. It works to decide whether the user given qualifier is safe to push-down. The RLS policy itself is placed within a sub-query from the beginning, thus not needed them to be leakproof function. My suggestion (which I then withdrew) was for that per-policy flag to decide whether the sub-query which applies the RLS policy acts as a security barrier or not. Thus, with the flag set, only leakproof predicates would haven been allowed to be pushed down into that subquery, whereas without the flag every predicate would haven been a candidate for being pushed down. The flag was never intended to mark the RLS policy itself as leakproof or leaky - that, as you said, makes little sense. My motivation for suggesting that flag was to prevent people who want RLS, yet aren't concerned about leaks, from having to pay the performance penalty associated with not pushing down predicates. I think it is a reasonable selection. For example, it make sense in case when users obviously don't have privilege to create a function and don't care about estimation of invisible values using iteration of proving. The owner is the only person who can determine whether it is harmless, or not. Noah's comments, however, made me realize that whether one cares about potential leaks is usually not a per-table property, but rather a property of the user executing the query. Some users (like the middle-ware that sits on top of your database) you might trust to not exploit leaks, while wanting the tightest security possible for others. Which made me suggest a per-role flag which essentially overrides the security barrier stuff. Explaining that behaviour as behave as if all functions are LEAKPROOF might haven been a tad confusion, though. Maybe a better explanation is behave as if no sub-query has the security barrier flag set, or even don't let security concerns prevent predicate push-down. Hmm... It might make sense to allow table-owner to set up suitable grade between security and performance. However, isn't it a feature to be discussed in the 2nd commit-fest? I think we can construct this type of adjustment on the basis of minimum functionality. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May29, 2012, at 16:13 , Kohei KaiGai wrote: 2012/5/29 Florian Pflug f...@phlo.org: My motivation for suggesting that flag was to prevent people who want RLS, yet aren't concerned about leaks, from having to pay the performance penalty associated with not pushing down predicates. I think it is a reasonable selection. For example, it make sense in case when users obviously don't have privilege to create a function and don't care about estimation of invisible values using iteration of proving. The owner is the only person who can determine whether it is harmless, or not. Nice to know that we're on the same page here. Noah's comments, however, made me realize that whether one cares about potential leaks is usually not a per-table property, but rather a property of the user executing the query. Some users (like the middle-ware that sits on top of your database) you might trust to not exploit leaks, while wanting the tightest security possible for others. Which made me suggest a per-role flag which essentially overrides the security barrier stuff. Explaining that behaviour as behave as if all functions are LEAKPROOF might haven been a tad confusion, though. Maybe a better explanation is behave as if no sub-query has the security barrier flag set, or even don't let security concerns prevent predicate push-down. Hmm... It might make sense to allow table-owner to set up suitable grade between security and performance. However, isn't it a feature to be discussed in the 2nd commit-fest? I think we can construct this type of adjustment on the basis of minimum functionality. Agreed. If the flag is per-role, not per-policy, the feature is orthogonal to the whole RLS feature. So yeah, let's postpone it to a later date. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/29 Florian Pflug f...@phlo.org: On May29, 2012, at 13:37 , Kohei KaiGai wrote: 2012/5/27 Alastair Turner b...@ctrlf5.co.za: Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25, 2012 at 11:08 PM: If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? I don't think so. Per-table and per-column permission checks should still apply independent from any RLS policy. You can always grant SELECT access to PUBLIC if want to rely solely on the RLS policy... Frankly, I have same opinion. Existing per-table and column permission checks should perform independently from RLS. What I wanted to explain is the combination of them makes confusion for us. Rather than assuming the the RLS checks will be applied when there are no privileges it would make sense to regard RLS as a limitation on the scope of a particular privilege. This makes RLS a property of a particular grant of a privilege rather than of the table. Viewed this way it is possible to control which users are subject to restrictions imposed by the RLS function, which will avoid RLS overhead for operations which don't require it while allowing checks for those that do, provide a mechanism exempting object owners from RLS checks and make it possible to avoid pg_dump calling user defined code. This suggests an alternative declaration syntax, to use the salesman example: GRANT SELECT ON TABLE client_detail TO super_sales_group; GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH (QUALIFICATION FUNCTION sales_view_check); and since more easily usable security features will be used more effectively, a possible future inlining of the condition: GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE sales_rep = my_sales_rep(); It seems to me an interesting proposition, because I didn't thought such kind of statement to provide row-level policies. Note, though, that due to role inheritance, multiple such QUALIFICATION FUNCTIONS could be in scope for a single table. In that case, I guess you'd have to OR them together, i.e. hide a row only if all of them return false. Nice catch! Indeed, it makes RLS policy to be applied complex, and hard to apply index-scan. We have a common issue for the idea that check privileges of underlying tables; when we should check the privileges and make a decision whether RLS policy is appended, or not. Due to query optimization reason, the RLS policy should be appended prior to the query optimization. At least, we want to utilize RLS policy for index scans, rather than sequential scan. Agreed. With the current design, the RLS policy has to be applied before planning, not during execution. One other issue is whether we should allow any users with enough privileges to bypass RLS, or only superusers. I'm still not sure how the existing checks perform with RLS. Every user who has the power to disable the RLS policy should also at least be able to circumvent it temporarily, I think. This includes superusers, the table owner (since he's presumably allowed to do ALTER TABLE .. RESET ROW POLICY), and maybe the database owner. I'm not inclined this criteria, because existing privilege system allows to restrict owner's behavior, even though it can be invalidated by owner itself. In other words, the privilege mechanism performs according to the current setting unless it is not disabled by owner itself. On the other hand, superuser is allowed to ignore the security setting. In RLS situation, it is corresponding to run the query without RLS policy. If and when Alice has SELECT permission on column X and Y of TBL with RLS, but her query tries to reference X,Y and Z. In this case, existing privilege mechanism shall raise an error, but the criteria with underlying table allows to run this query. It seems to me it is a straightforward criteria to limit superusers to bypass RLS… I'm not sure I understand. Do you mean the Alice references only columns X and Y in her query, but the RLS policy adds the reference to column Z? No, in this scenario, Alice tries to reference X,Y and Z, then it should be violated. I wanted to show is; If RLS performs only when current user has no table/column level privilege, it makes strange behavior. Here is no matter if RLS works independently from per table / column permissions. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Tue, May 29, 2012 at 9:12 AM, Florian Pflug f...@phlo.org wrote: On May29, 2012, at 13:37 , Kohei KaiGai wrote: 2012/5/27 Alastair Turner b...@ctrlf5.co.za: Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25, 2012 at 11:08 PM: If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? I don't think so. Per-table and per-column permission checks should still apply independent from any RLS policy. You can always grant SELECT access to PUBLIC if want to rely solely on the RLS policy... I strongly agree with this. Permissions checks should be applied first, so that we can boot anyone who has no business being there at all as early in the process as possible. As a second step, if you have a right to the query the table, row-level security should intervene to control which rows you can see. One idea might be to have a grantable permission that permits the RLS policy to be bypassed. So, if a user has only SELECT permission, they can select from the table, but the RLS policy will apply. If they have both SELECT and RLSBYPASS (probably not what we really want to call it) permission, then they can select from the table and the RLS policy will be skipped. This means that superusers automatically skip all RLS policies (which seems right) and table owners skip them by default (but could revoke their own privileges) and other people can skip them if the table owner (or the superuser) grants them the appropriate privilege on the table involved. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May29, 2012, at 16:34 , Robert Haas wrote: One idea might be to have a grantable permission that permits the RLS policy to be bypassed. So, if a user has only SELECT permission, they can select from the table, but the RLS policy will apply. If they have both SELECT and RLSBYPASS (probably not what we really want to call it) permission, then they can select from the table and the RLS policy will be skipped. This means that superusers automatically skip all RLS policies (which seems right) and table owners skip them by default (but could revoke their own privileges) and other people can skip them if the table owner (or the superuser) grants them the appropriate privilege on the table involved. I like it. Seems to support all use-cases I can come up with, and extends existing privilege semantics in a natural way. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/29 Robert Haas robertmh...@gmail.com: One idea might be to have a grantable permission that permits the RLS policy to be bypassed. So, if a user has only SELECT permission, they can select from the table, but the RLS policy will apply. If they have both SELECT and RLSBYPASS (probably not what we really want to call it) permission, then they can select from the table and the RLS policy will be skipped. This means that superusers automatically skip all RLS policies (which seems right) and table owners skip them by default (but could revoke their own privileges) and other people can skip them if the table owner (or the superuser) grants them the appropriate privilege on the table involved. Isn't it unavailable to describe using RLS policy? In case when 'alice' and 'bob' should bypass RLS policy on a certain table, we will be able to describe it as follows: (current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z)) I have one concern the current_user in (...) is not wiped out at the planner stage, although its evaluation result is obvious prior to execution. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Fri, May 25, 2012 at 5:08 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Seems like a somewhat random wart, if it's just an exception for superusers. I think we need to do better than that. For example, at my last company, sales reps A and B were permitted to see all customers of the company, but sales reps C, D, E, F, G, H, I, and J were permitted to see only their own accounts. Those sorts of policies need to be easy to implement. Probably, if sales_rep column records its responsible repo, its security policy is able to be described as: (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep()) Yes, but that's a pain to optimize. When A or B tries to select from the table, the query optimizer has to realize that my_sales_rep() is stable, inline it, do constant simplification and throw away the entire OR clause. Note that this won't work today, because we only constant-fold immutable functions, not stable ones. Then, since there are no remaining security quals, we have to realize that we actually don't need the security_barrier subquery RTE at all, and optimize that away as well. Maybe we can make all of that work, and maybe we should make all of it work, but it's fairly complex. The advantage of having the function return the qual rather than contain the qual is that all of that goes away. The function can choose to return nothing (no RLS for this user) or it can choose to return something (which will likely be simpler than what it would have needed to return out of the chute). One disadvantage is that we have to parse the returned qual instead of just sucking in a node-tree. Anyway, I don't feel super-strongly about this particular idea, so if I'm the only one who likes it, fine, but that having been said, I think users are going to want a *declarative* way to control which policies are applied to which users. Suppose Bob is a sales rep who is only allowed to see his own customers, but then one day, we decide we trust Bob after all, so we want to let him see everything. We could go back and update the IN (...) list in the security policy function, but that's an ugly and unscalable nuisance, especially if we've got 10,000 users. It's much nicer to be able to just grant bob a permission using some kind of, well, GRANT command. That's what we're doing, after all. Alastair's proposal of making the security policy a property of the GRANT is one way of tackling that, and the RLSBYPASS permission I proposed elsewhere is another. Something along these lines seems likely to improve performance (by replacing a query optimization problem with a syscache lookup) as well as ease-of-use. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Tue, May 29, 2012 at 10:57 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2012/5/29 Robert Haas robertmh...@gmail.com: One idea might be to have a grantable permission that permits the RLS policy to be bypassed. So, if a user has only SELECT permission, they can select from the table, but the RLS policy will apply. If they have both SELECT and RLSBYPASS (probably not what we really want to call it) permission, then they can select from the table and the RLS policy will be skipped. This means that superusers automatically skip all RLS policies (which seems right) and table owners skip them by default (but could revoke their own privileges) and other people can skip them if the table owner (or the superuser) grants them the appropriate privilege on the table involved. Isn't it unavailable to describe using RLS policy? In case when 'alice' and 'bob' should bypass RLS policy on a certain table, we will be able to describe it as follows: (current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z)) I have one concern the current_user in (...) is not wiped out at the planner stage, although its evaluation result is obvious prior to execution. Yes, that's one problem with doing it that way. The fact that the superuser is not guaranteed-excluded is another; that can of course be fixed by adding a special-case hack for superusers, but IMHO this is more elegant. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May29, 2012, at 16:57 , Kohei KaiGai wrote: 2012/5/29 Robert Haas robertmh...@gmail.com: One idea might be to have a grantable permission that permits the RLS policy to be bypassed. So, if a user has only SELECT permission, they can select from the table, but the RLS policy will apply. If they have both SELECT and RLSBYPASS (probably not what we really want to call it) permission, then they can select from the table and the RLS policy will be skipped. This means that superusers automatically skip all RLS policies (which seems right) and table owners skip them by default (but could revoke their own privileges) and other people can skip them if the table owner (or the superuser) grants them the appropriate privilege on the table involved. Isn't it unavailable to describe using RLS policy? In case when 'alice' and 'bob' should bypass RLS policy on a certain table, we will be able to describe it as follows: (current_user in ('alice', 'bob') OR rls_policy_this_table(X, Y, Z)) I have one concern the current_user in (...) is not wiped out at the planner stage, although its evaluation result is obvious prior to execution. I think you'd simply pretend the RLS policy isn't there (i.e., don't rewrite the table reference into a sub-query) if the user has the RLSBYPASS privilege, not add another clause which selectively neuters the RLS policy. Wait a moment.. now I get it. You're saying that Robert's RLSBYPASS privilege is redundant, since one can always add a current_user IN ... clause to the RLS policy function. That might be the case theoretically, but I still believe that Robert's suggestion has a lot of benefits. First, due to role inheritance, a simply clause like the above isn't sufficient. Second, it doesn't make superusers and table owner bypass RLS by default. Which I think is a good idea (bypassing by default, that is), because it prevents broken backups (for those using pg_dump) and data exports. It doesn't cost anything in terms of security, because those users could disable the RLS policy away if they are so inclined. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May28, 2012, at 02:46 , Noah Misch wrote: On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote: Since the security barrier flag carries a potentially hefty performance penalty, I think it should be optional. Application which don't allow SQL-level access to the database might still benefit from row-level security, because it saves them from having to manually add the WHERE clause to every statement, or having to wrap all their tables with views. Yet without direct SQL-level access, the security barrier thing isn't really necessary, so it'd be nice if they wouldn't have to pay for it. How about ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier) Backward compatibility concerns limited our options when retrofitting the security_barrier treatment for views, but I'd rather not add a knob completely disabling it in the context of a brand new feature. A better avenue is to enhance our facilities for identifying safe query fragments. For example, ALTER FUNCTION ... LEAKPROOF is superuser-only. Adding a way for a table owner to similarly trust functions for the purpose of his own tables would help close the gap that motivates such an all-or-nothing knob. Hm, I'm not sure a per-function flag is really the solution here. Neither, however, is a per-RLS flag as your arguments made me realize. There really are three orthogonal concepts here, all of which allow security barriers to be ignored, namely A) Trusting the use not to exploit leaks, i.e. what you call a trusted query generator B) There being no leaks, i.e. all functions being LEAKPROOF C) Not caring about leaks, i.e. the security_barrier flag Concept B is handled adequately by the LEAKPROOF flag. Concept C is handled by the security_barrier flag. However, as you pointed out, it's a bit of a dubious concept and only really necessary for backwards compatibility because it reflects pre-9.2 behaviour. Concept A is what I was aiming for. Per the above, a per-RLS flag is clearly the wrong tool for the job, so consider my suggestion withdrawn. What we actually want, I think, is a per-role flag which marks a role as leak trusted. Queries issued by such a role would behave as if all functions are LEAKPROOF, since even if there is a leak, the role is trusted not to exploit it. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
Excerpts from Kohei KaiGai kai...@kaigai.gr.jp wrote on Fri, May 25, 2012 at 11:08 PM: If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? Rather than assuming the the RLS checks will be applied when there are no privileges it would make sense to regard RLS as a limitation on the scope of a particular privilege. This makes RLS a property of a particular grant of a privilege rather than of the table. Viewed this way it is possible to control which users are subject to restrictions imposed by the RLS function, which will avoid RLS overhead for operations which don't require it while allowing checks for those that do, provide a mechanism exempting object owners from RLS checks and make it possible to avoid pg_dump calling user defined code. This suggests an alternative declaration syntax, to use the salesman example: GRANT SELECT ON TABLE client_detail TO super_sales_group; GRANT SELECT TABLE ON client_detail TO limited_sales_group WITH (QUALIFICATION FUNCTION sales_view_check); and since more easily usable security features will be used more effectively, a possible future inlining of the condition: GRANT SELECT ON TABLE client_detail TO limited_sales_group WHERE sales_rep = my_sales_rep(); Regards, Alastair. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 07:31:37PM +0200, Florian Pflug wrote: On May24, 2012, at 18:42 , Kohei KaiGai wrote: As we discussed, it causes a problem with approach to append additional qualifiers to where clause implicitly, because it does not solve the matter corresponding to the order to execute qualifiers. So, I'm inclined to the approach to replace reference to tables with security policy by sub-queries with security barrier flag. Since the security barrier flag carries a potentially hefty performance penalty, I think it should be optional. Application which don't allow SQL-level access to the database might still benefit from row-level security, because it saves them from having to manually add the WHERE clause to every statement, or having to wrap all their tables with views. Yet without direct SQL-level access, the security barrier thing isn't really necessary, so it'd be nice if they wouldn't have to pay for it. How about ALTER TABLE ? SET ROW POLICY ? WITH (security_barrier) The conventional case for a RLS facility is to wholly implement a security policy, so security_barrier should be the default. Using the same facility to implement a security policy in cooperation with a trusted query generator is the variant case. Backward compatibility concerns limited our options when retrofitting the security_barrier treatment for views, but I'd rather not add a knob completely disabling it in the context of a brand new feature. A better avenue is to enhance our facilities for identifying safe query fragments. For example, ALTER FUNCTION ... LEAKPROOF is superuser-only. Adding a way for a table owner to similarly trust functions for the purpose of his own tables would help close the gap that motivates such an all-or-nothing knob. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Robert Haas robertmh...@gmail.com: On Thu, May 24, 2012 at 6:11 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view. I become to think it is a better approach than tracking origin of each qualifiers. One problem is case handling on update or delete statement. It may be possible to rewrite the update / delete query as follows: From: UPDATE tbl SET X = X + 1 WHERE f_leak(Y) To: UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT * FROM ( SELECT ctid FROM tbl WHERE uname = getpgusername() == (*) should have security-barrier ) AS tbl_subqry WHERE f_leak(Y) ); Expanded sub-queries will have security-barrier flag, so it enforces the uname = getpgusername() being checked earlier than f_leak(Y). We may need to measure the performance impact due to the reform. The problem with this is that it introduces an extra instance of tbl into the query - there are now two rather than one. UPDATE .. FROM is supposed to be a way to avoid this, but it's insufficiently general to handle all the cases (e.g. UPDATE a LEFT JOIN b can't be written using the existing syntax). Anyway we want to avoid inserting self-joins for performance reasons if at all possible. It should be easy to do that in the case of SELECT; UPDATE and DELETE may need a bit more work. I'll try to investigate a way to solve the matter without twice scan. Right now, I have no reasonable ideas. Please give us suggestion if you have something... I think, this situation is similar to a case when we reference a view without privileges to underlying tables. If Bob set up a view with something tricky function, it allows Bob to reference credentials of users who reference the view. More or less, it might be a problem when a user try to invoke a user defined function declared by others. (Thus, sepgsql policy does not allow users to invoke a function declared by another one in different domain; without DBA's checks.) This is true, but there are still some new threat models. For example, currently, pg_dump isn't going to run any user-defined code just because you do SELECT * FROM table, but that will change with this patch. Note that pg_dump need not actually select from views, only tables. I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Seems like a somewhat random wart, if it's just an exception for superusers. I think we need to do better than that. For example, at my last company, sales reps A and B were permitted to see all customers of the company, but sales reps C, D, E, F, G, H, I, and J were permitted to see only their own accounts. Those sorts of policies need to be easy to implement. Probably, if sales_rep column records its responsible repo, its security policy is able to be described as: (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep()) Indeed, the design to check underlying table seems to me like column-level privileges towards table-level privileges, since it is checked only when user does not have requires privileges on whole of the table. However, I have no idea to modify ExecCheckRTEPerms() regarding to RLS. If we assume RLS is applied when user has no privileges on tables, the current ExecCheckRTEPerms() always raises an error towards unprivileged users, prior to execution of queries. Isn't it preferable behavior to allow unprivileged users to reference a table (or columns) when it has RLS policy? I think, table and column level privilege should be checked individually, in addition to row-level security policy. Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful. Sorry, I don't favor this idea. Even if table owner set up a function to generate additional qualifiers, it also has no guarantee the qualifiers are invoked prior to user-given one. It seems to me this approach will have same problem... It's not intended to solve the qual-ordering problem, just to allow additional policy flexibility. At the beginning, I thought it takes complex code to parse where-clause being provided as security policy, so it is the reason why I was inclined to give a function, instead of a clause. But I noticed we already have similar code at CreateTrigger() to handle it. Does it give policy flexibility? It's not clear to me that there is any need for built-in server functionality here. If the table
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Robert Haas robertmh...@gmail.com: On Thu, May 24, 2012 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… I agree. The row level security policy should not be applied during FK checks (or other internal stuff; to be harmless). At the previous discussion, it was issued that iteration of FK/PK proving enables malicious one to estimate existence of invisible tuple and its key value, although they cannot see the actual values. It is well documented limitation, thus, user should not use row- level security (or should not use natural key) if they cannot accept this limitation. You say I agree, but it seems to me that you and Florian are in fact taking opposite positions. Sorry, I misread what he described. FWIW, I'm inclined to think that you should NOT be able to create a row that references an invisible row. You might end up with that situation anyway, because we don't know what the semantics of the security policy are: rows might become visible or invisible after the fact, and we can't police that. But I think that if you take the opposite position that the select queries inside fkey triggers ought to be exempt from security policy, then you need to build some new mechanism to make that happen, which seems like extra work for no benefit. I think it is fair enough for RI_FKey_check_ins and RI_FKey_check_upd; RLS policy inside these trigger function will exhibit to create a row that references invisible row. However, it should not be applied on triggers being set on PK tables, because it allows to modify or delete primary-key being referenced by invisible foreign-key from the viewpoint of operators. I think, it makes sense to have exceptional cases; to make sure foreign-key constraint at the baseline. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/25 Florian Pflug f...@phlo.org: On May24, 2012, at 19:25 , Robert Haas wrote: FWIW, I'm inclined to think that you should NOT be able to create a row that references an invisible row. You might end up with that situation anyway, because we don't know what the semantics of the security policy are: rows might become visible or invisible after the fact, and we can't police that. Right. I just realized, however, that there's another case which wasn't considered yet, which is how to handle the initial check during ALTER TABEL ADD CONSTRAINT. I'm thinking that it's fine to only consider visible rows in the parent table there too, but we should be checking all rows in the child table. The easiest way would be to restrict ALTER TABLE ADD CONSTRAINT to the table owner for tables with RLS (it seems that currently the REFERENCES privilege is sufficient), and make the table owner exempt from RLS on that table. The latter means you'd need at least two roles to use RLS, but anyone security-conscious enough to use RLS will probably not user the same role for DDL and DML operations anyway... I think, the RLS policy should perform as a view with qualifiers. It means database constraints have to be kept from the viewpoint of any clients, so, any orphan foreign-keys or any rows violating check constraint should not exist. In case when a user insert a row with foreign-key, it is an exceptional case. Even if he cannot insert a foreign-key that references invisible primary-key, it never breaks database constraints from the viewpoint of other folks. My standpoint deals with database constraints as first class customer that should be always kept in the lowest level, even though a part of results would be filtered out due to RLS. Isn't it a simple enough criteria? But I think that if you take the opposite position that the select queries inside fkey triggers ought to be exempt from security policy, then you need to build some new mechanism to make that happen, which seems like extra work for no benefit. Hm, interesting angle. Continuing this thought, without any extra work, UNIQUE and EXCLUSION constraints *will* be enforced regardless of row visibility, because their implementation isn't SPI-based but instead detects conflicts while inserting tuples into the index. For being so obviously inconsistent in its treatment of UNIQUE and EXCLUSION constraints vs. FK constraints, this feels surprisingly right. So, to prevent design by accident, here's an attempt to explain that divergence. For UNIQUE and EXCLUSION constraints, the most conservative assumption possible is that all rows are visible, since that leads to the most rejections. With that assumption, no matter what the actual policy is, the data returned by a query will always satisfy the constraint. Plus, the constraint is still sensible because it neither rejects nor allows all rows. So that conservative assumption is the one we make, i.e. we ignore RLS visibility when checking those kinds of constraints. For FK constraints, OTOH, the most conservative assumption is that no rows are visible. But that is meaningless, since it will simply reject all possible rows. Having thus no chance of enforcing the constraint ourselves under all possible policies, the best we can do is to at least make it possible for the constraint to work correctly for as many policies as possible. Now, if we go with KaiGai's suggestion of skipping RLS while checking FK constraints, the only policy that the constraint will work correctly for is one which doesn't actually hide any parent rows. Whereas if we apply RLS checks while checking FK constraints, all policies which behave consistently for parent and child rows (i.e. don't hide the former but show the latter) will work correctly. We thus go with the second option, since the class of working policies is larger. -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/23 Robert Haas robertmh...@gmail.com: On Wed, May 23, 2012 at 3:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I wanted to have discussion to handle this problem. Unlike leaky-view problem, we don't need to worry about unexpected qualifier distribution on either side of join, because a scan on table never contains any join. Thus, all we need to care about is order of qualifiers chained on a particular scan node; being reordered by the cost to invoke functions. How about an idea to track FuncExpr come from the security policy and enforce 0 on its cost? Regular functions never reach zero cost, so the security policy must be re-ordered to the head. Hmm. That would disregard the relative costs of multiple qualifiers all of which were injected by the security policy, which I suspect is not a good idea. Furthermore, I think that we should not assume that there is no join involved. I would expect a fairly popular RLS qual to be something of the form WHERE NOT EXISTS (SELECT 1 FROM hide_me WHERE hide_me.pk = thistab.pk). Please ignore the approach to track cost value of qualifiers. I believe it does not work well without something fundamental updates. Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view. I become to think it is a better approach than tracking origin of each qualifiers. One problem is case handling on update or delete statement. It may be possible to rewrite the update / delete query as follows: From: UPDATE tbl SET X = X + 1 WHERE f_leak(Y) To: UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT * FROM ( SELECT ctid FROM tbl WHERE uname = getpgusername() == (*) should have security-barrier ) AS tbl_subqry WHERE f_leak(Y) ); Expanded sub-queries will have security-barrier flag, so it enforces the uname = getpgusername() being checked earlier than f_leak(Y). We may need to measure the performance impact due to the reform. Also, suppose that Bob applies an RLS policy to a table, and, later, Alice selects from the table. How do we keep Bob from usurping Alice's privileges? If we insist that Bob's RLS policy function runs as Bob, then it defeats inlining; but if it runs as Alice, then Bob can steal Alice's credentials. One idea is to apply the security policy only if Alice's access to the table is granted by Bob. That way, if Alice is (for example) the superuser, she's immune to RLS. But that doesn't seem to completely solve the problem, because Alice might merely be some other relatively unprivileged user and we still don't want Bob to be able to walk off with her access. I think, this situation is similar to a case when we reference a view without privileges to underlying tables. If Bob set up a view with something tricky function, it allows Bob to reference credentials of users who reference the view. More or less, it might be a problem when a user try to invoke a user defined function declared by others. (Thus, sepgsql policy does not allow users to invoke a function declared by another one in different domain; without DBA's checks.) I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful. Sorry, I don't favor this idea. Even if table owner set up a function to generate additional qualifiers, it also has no guarantee the qualifiers are invoked prior to user-given one. It seems to me this approach will have same problem... Also, if the point here is to provide security for tables not views, it seems like you really need to have (at least a design for) RLS security on insert/update/delete operations. Just adding the same filter condition might be adequate for deletes, but I don't think it works at all for inserts. And for updates, what prevents the user from updating columns he shouldn't, or updating them to key values he shouldn't be able to use? If we also apply the security policy to newer version of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether it satisfies the security policy. For same reason, the trigger should be executed at the end of trigger chain. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a
Re: [HACKERS] [RFC] Interface of Row Level Security
On May23, 2012, at 22:12 , Robert Haas wrote: Also, suppose that Bob applies an RLS policy to a table, and, later, Alice selects from the table. How do we keep Bob from usurping Alice's privileges? That problem isn't restricted to RLW, though. Bob could just as well have booby-trapped any other SQL object, e.g. could have added bobs_malicious_function() to a view Alice selects from, or attached it as a trigger to a table Alice inserts to. It would be nice if there was (optional) protection against these kinds of attacks, but it's not really something that RLS should be burdened with. BTW, I've wondered in the past how tight our protection against some trying to take over the postgres role during pg_dump is. On the surface, it seems that we're safe because pg_dump doesn't invoke user-defined functions except output functions (which require superuser privileges to modify). But that's not exactly a solid line of defense... If we also apply the security policy to newer version of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether it satisfies the security policy. For same reason, the trigger should be executed at the end of trigger chain. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? Yeah, applying the security policy to the new row (for UPDATES and INSERTS) seems weird - the policy determines what you can see, not what you can store, which might be two different things. But the security policy should still apply to the old rows, i.e. you shouldn't be after to UPDATE or DELETE rows you cannot see, no? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: If we also apply the security policy to newer version of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether it satisfies the security policy. For same reason, the trigger should be executed at the end of trigger chain. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? Yeah, applying the security policy to the new row (for UPDATES and INSERTS) seems weird - the policy determines what you can see, not what you can store, which might be two different things. But the security policy should still apply to the old rows, i.e. you shouldn't be after to UPDATE or DELETE rows you cannot see, no? The case of INSERT / DELETE are simple; All we need to apply is checks on either new or old tuples. In case of UPDATE, we need to check on the old tuple whether use can see, and on the new tuple whether use can store them. Indeed, these are different checks, however, it seems like a black hole if the new tuple is allowed to write but no reader privileges. I expect most use cases choose same policy on reader timing and writer times at UPDATE statement. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 12:43 , Kohei KaiGai wrote: The case of INSERT / DELETE are simple; All we need to apply is checks on either new or old tuples. In case of UPDATE, we need to check on the old tuple whether use can see, and on the new tuple whether use can store them. Indeed, these are different checks, however, it seems like a black hole if the new tuple is allowed to write but no reader privileges. I expect most use cases choose same policy on reader timing and writer times at UPDATE statement. I don't think preventing block holes is sensible here - it might, in fact, be *just* what the user wants. Imagine a messaging system. A reasonable RLS policy would be to allow a user to see messages addressed to him. Yet you wouldn't want to prevent her from creating messages to other people - cause what good is a messaging system that only allows you to send messages to yourself. What you probably *would* want to do, though, is to check that she did put herself in as the sender when she creates a message. And you'd probably wanna forbit updates entirely. So you'd have - A RLS policy that checks current_user = ANY(recipients) - An ON INSERT trigger which checks current_user = sender - An ON UPDATE trigger which errors out If RLS policy applies to INSERTEed rows also, how would you do that? Another example, although in the realm of filesystem permissions, is Mac OS X. Per default, every user has a Drop Box folder, which anybody can write to, yet only the owner can read. This allows you to easily transfer files from one user to another without allowing a third party to read it. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 12:43 , Kohei KaiGai wrote: The case of INSERT / DELETE are simple; All we need to apply is checks on either new or old tuples. In case of UPDATE, we need to check on the old tuple whether use can see, and on the new tuple whether use can store them. Indeed, these are different checks, however, it seems like a black hole if the new tuple is allowed to write but no reader privileges. I expect most use cases choose same policy on reader timing and writer times at UPDATE statement. I don't think preventing block holes is sensible here - it might, in fact, be *just* what the user wants. Imagine a messaging system. A reasonable RLS policy would be to allow a user to see messages addressed to him. Yet you wouldn't want to prevent her from creating messages to other people - cause what good is a messaging system that only allows you to send messages to yourself. What you probably *would* want to do, though, is to check that she did put herself in as the sender when she creates a message. And you'd probably wanna forbit updates entirely. So you'd have - A RLS policy that checks current_user = ANY(recipients) - An ON INSERT trigger which checks current_user = sender - An ON UPDATE trigger which errors out If RLS policy applies to INSERTEed rows also, how would you do that? Another example, although in the realm of filesystem permissions, is Mac OS X. Per default, every user has a Drop Box folder, which anybody can write to, yet only the owner can read. This allows you to easily transfer files from one user to another without allowing a third party to read it. Indeed, you are right. We have no special reason why to enforce same rules on both of reader and writer stage on UPDATE statement. So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. (*) I don't check whether it conflicts syntax or not yet. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skips rows for which the policy function returns false, instead of reporting 'permission denied' or something. But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting and error. And second, it avoids mimicking functionality that is already provided by an existing feature, namely triggers. People will have to deal with the trigger ordering issue, but that's nothing new, and I bet most people have a system in place for that. I usually prefix my trigger names with 'a_' to 'z_', for example, to make the ordering explicit. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skips rows for which the policy function returns false, instead of reporting 'permission denied' or something. But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting and error. I definitely agree with starting a new feature from simple implementation. Although I'm inclined to the approach to replace references to tables with security policy by sub-queries with security barrier flag, instead of adding qualifiers of where clause to avoid the leaky-view scenario, it will make its implementation mush simpler. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… I agree. The row level security policy should not be applied during FK checks (or other internal stuff; to be harmless). At the previous discussion, it was issued that iteration of FK/PK proving enables malicious one to estimate existence of invisible tuple and its key value, although they cannot see the actual values. It is well documented limitation, thus, user should not use row- level security (or should not use natural key) if they cannot accept this limitation. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
I'd like to summarize the current design being discussed. syntax: ALTER TABLE tblname WITH ROW LEVEL SECURITY ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY; I tried to patch the parser/gram.y, but here was syntax conflicts on ADD / DROP sub-command. And, I noticed ROW LEVEL SECURITY allows to implement without adding new keyword, unlike SECURITY POLICY. As we discussed, it causes a problem with approach to append additional qualifiers to where clause implicitly, because it does not solve the matter corresponding to the order to execute qualifiers. So, I'm inclined to the approach to replace reference to tables with security policy by sub-queries with security barrier flag. For example, if tbl has security policy, this query shall be rewritten internally, as follows: original) SELECT * FROM tbl WHERE X 20 AND f_leak(Y); rewritten) SELECT * FROM ( SELECT * FROM tbl WHERE uname = getpgusername() ) AS tbl_subqry WHERE X 20 AND f_leak(Y); The sub-query shall have security-barrier flag, so f_leak() is never pushed down but X 20 will be pushed down because of leakproof attribute of the function. It is a bit complex at UPDATE or DELETE statement, but what I try to do is same. original) UPDATE tbl SET X = X + 1 WHERE X 20 AND f_leak(Y); rewritten) UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT ctid FROM ( SELECT ctid, * FROM uname = getpgusername() ) AS tbl_subqry WHERE X 20 AND f_leak(Y) ); That guarantees the security policy (uname = getpgusername()) is evaluated prior to user given conditions. One thing still I'm thinking is whether the security policy should be provided as a function or a clause. Enough simple sql function is inlined at simplify_function(), so here is no meaningful difference. I was afraid of code complexity, but all we should do is to append configured clause on the where clause of sub-query inside. | ALTER TABLE tblname WITH ROW LEVEL SECURITY | ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; So, I tried to put condition clause instead of a function, right now. Regarding to FK constraints, I don't think it is a situation to apply row-level security policy towards internal queries. So, I plan to disable during FK checks. One other issue we didn't have discussed is table inheritance. In case when a table TBLP has a child table TBLC and only TBLC has its security policy, what security policy should be applied when we run SELECT * FROM TBLP. My preference is, the security policy is only applied to scan on TBLC, not TBLP. It is not desirable behavior that visible tuples are different from way to reference a certain table. In addition, if and when TBLP and TBLC have their own policy individually, what is a desirable behavior? I think, the security policy of both TBLP and TBLC should be applied on TBLC; in other words, it applies the security policy of all the parent tables to scan on child table. Any comments please. Thanks, 2012/5/24 Kohei KaiGai kai...@kaigai.gr.jp: 2012/5/24 Florian Pflug f...@phlo.org: On May24, 2012, at 16:19 , Kohei KaiGai wrote: So, the proposed interface might be revised as follows: ALTER TABLE tblname ADD SECURITY POLICY func_name(args, ...) [FOR SELECT | INSERT | [BEFORE|AFTER] UPDATE | DELETE]; In case of INSERT or AFTER UPDATE, I assume the check shall be applied on the tail of before-row triggers. I'd go with just SELECT, UPDATE, DELETE, and leave the INSERT and BEFORE UPDATE case to regular triggers, for two reasons First, it's conceptually much simpler, since the policy always just adds an implicit WHERE clause, period. This of course assumes that DELETE and (BEFORE) UPDATE simply skips rows for which the policy function returns false, instead of reporting 'permission denied' or something. But that's the most reasonable behaviour anyway, I think, because otherwise you'd make batch UPDATEs and DELETEs pretty much unusable, 'cause there'd always be the risk of tripping over some invisible row and getting and error. I definitely agree with starting a new feature from simple implementation. Although I'm inclined to the approach to replace references to tables with security policy by sub-queries with security barrier flag, instead of adding qualifiers of where clause to avoid the leaky-view scenario, it will make its implementation mush simpler. Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 6:11 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view. I become to think it is a better approach than tracking origin of each qualifiers. One problem is case handling on update or delete statement. It may be possible to rewrite the update / delete query as follows: From: UPDATE tbl SET X = X + 1 WHERE f_leak(Y) To: UPDATE tbl SET X = X + 1 WHERE ctid = ( SELECT * FROM ( SELECT ctid FROM tbl WHERE uname = getpgusername() == (*) should have security-barrier ) AS tbl_subqry WHERE f_leak(Y) ); Expanded sub-queries will have security-barrier flag, so it enforces the uname = getpgusername() being checked earlier than f_leak(Y). We may need to measure the performance impact due to the reform. The problem with this is that it introduces an extra instance of tbl into the query - there are now two rather than one. UPDATE .. FROM is supposed to be a way to avoid this, but it's insufficiently general to handle all the cases (e.g. UPDATE a LEFT JOIN b can't be written using the existing syntax). Anyway we want to avoid inserting self-joins for performance reasons if at all possible. It should be easy to do that in the case of SELECT; UPDATE and DELETE may need a bit more work. I think, this situation is similar to a case when we reference a view without privileges to underlying tables. If Bob set up a view with something tricky function, it allows Bob to reference credentials of users who reference the view. More or less, it might be a problem when a user try to invoke a user defined function declared by others. (Thus, sepgsql policy does not allow users to invoke a function declared by another one in different domain; without DBA's checks.) This is true, but there are still some new threat models. For example, currently, pg_dump isn't going to run any user-defined code just because you do SELECT * FROM table, but that will change with this patch. Note that pg_dump need not actually select from views, only tables. I think it is a good idea not to apply RLS when current user has superuser privilege from perspective of security model consistency, but it is inconsistent to check privileges underlying tables. Seems like a somewhat random wart, if it's just an exception for superusers. I think we need to do better than that. For example, at my last company, sales reps A and B were permitted to see all customers of the company, but sales reps C, D, E, F, G, H, I, and J were permitted to see only their own accounts. Those sorts of policies need to be easy to implement. Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful. Sorry, I don't favor this idea. Even if table owner set up a function to generate additional qualifiers, it also has no guarantee the qualifiers are invoked prior to user-given one. It seems to me this approach will have same problem... It's not intended to solve the qual-ordering problem, just to allow additional policy flexibility. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? Yes, we don't need any new invent to check the value of new tuples. But it should be done after all the user-defined triggers. Existing trigger does not have a mechanism to enforce order to be invoked. So, what I really implement is a mechanism to inject some pseudo triggers at tail of the Trigger array. Start the trigger names with the letter z. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 6:20 AM, Florian Pflug f...@phlo.org wrote: But the security policy should still apply to the old rows, i.e. you shouldn't be after to UPDATE or DELETE rows you cannot see, no? Agreed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Thu, May 24, 2012 at 12:00 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: Another issue, BTW, are FOREIGN KEY constraints. Should you be allowed to created references to rows which are invisible to you, or should FOREIGN KEY constraints be exempt from security policies? I'd say they shouldn't be, i.e. the policy WHERE clause should be added to constraint checking queries like usual. But maybe I'm missing some reason why that'd be undesirable… I agree. The row level security policy should not be applied during FK checks (or other internal stuff; to be harmless). At the previous discussion, it was issued that iteration of FK/PK proving enables malicious one to estimate existence of invisible tuple and its key value, although they cannot see the actual values. It is well documented limitation, thus, user should not use row- level security (or should not use natural key) if they cannot accept this limitation. You say I agree, but it seems to me that you and Florian are in fact taking opposite positions. FWIW, I'm inclined to think that you should NOT be able to create a row that references an invisible row. You might end up with that situation anyway, because we don't know what the semantics of the security policy are: rows might become visible or invisible after the fact, and we can't police that. But I think that if you take the opposite position that the select queries inside fkey triggers ought to be exempt from security policy, then you need to build some new mechanism to make that happen, which seems like extra work for no benefit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 18:42 , Kohei KaiGai wrote: I'd like to summarize the current design being discussed. syntax: ALTER TABLE tblname WITH ROW LEVEL SECURITY ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; ALTER TABLE tblname WITHOUT ROW LEVEL SECURITY; I tried to patch the parser/gram.y, but here was syntax conflicts on ADD / DROP sub-command. And, I noticed ROW LEVEL SECURITY allows to implement without adding new keyword, unlike SECURITY POLICY. Let the bike-shedding begin ;-) ALTER TABLE … WITH sounds a bit weird. What about ALTER TABLE tblname SET ROW POLICY condition FOR { SELECT | UPDATE | DELETE } ALTER TABLE tblname RESET ROW POLICY As we discussed, it causes a problem with approach to append additional qualifiers to where clause implicitly, because it does not solve the matter corresponding to the order to execute qualifiers. So, I'm inclined to the approach to replace reference to tables with security policy by sub-queries with security barrier flag. Since the security barrier flag carries a potentially hefty performance penalty, I think it should be optional. Application which don't allow SQL-level access to the database might still benefit from row-level security, because it saves them from having to manually add the WHERE clause to every statement, or having to wrap all their tables with views. Yet without direct SQL-level access, the security barrier thing isn't really necessary, so it'd be nice if they wouldn't have to pay for it. How about ALTER TABLE … SET ROW POLICY … WITH (security_barrier) One thing still I'm thinking is whether the security policy should be provided as a function or a clause. Enough simple sql function is inlined at simplify_function(), so here is no meaningful difference. I was afraid of code complexity, but all we should do is to append configured clause on the where clause of sub-query inside. | ALTER TABLE tblname WITH ROW LEVEL SECURITY | ( condition clause ) [FOR (SELECT | UPDATE | DELETE)]; So, I tried to put condition clause instead of a function, right now. A single function seems much easier implementation-wise, since you wouldn't need to store an arbitrary expression in the catalog, just an oid. It also delegates the dependency tracking problem to the function. It also simplies the grammar, because the FOR … clause cannot be mistaken to belong to the condition clause. One other issue we didn't have discussed is table inheritance. In case when a table TBLP has a child table TBLC and only TBLC has its security policy, what security policy should be applied when we run SELECT * FROM TBLP. My preference is, the security policy is only applied to scan on TBLC, not TBLP. Agreed. It is not desirable behavior that visible tuples are different from way to reference a certain table. In addition, if and when TBLP and TBLC have their own policy individually, what is a desirable behavior? I think, the security policy of both TBLP and TBLC should be applied on TBLC; in other words, it applies the security policy of all the parent tables to scan on child table. I think security policies should only apply to the table they're declared for, not their child tables. Mostly because that is how triggers operate, and security policies and triggers will often be used together, so having their semantics regarding inheritance be the same seems to be the least surprising option. Also, if policies are inherited, how would you define a policy which applies only to the parent, not to the child? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On May24, 2012, at 19:25 , Robert Haas wrote: FWIW, I'm inclined to think that you should NOT be able to create a row that references an invisible row. You might end up with that situation anyway, because we don't know what the semantics of the security policy are: rows might become visible or invisible after the fact, and we can't police that. Right. I just realized, however, that there's another case which wasn't considered yet, which is how to handle the initial check during ALTER TABEL ADD CONSTRAINT. I'm thinking that it's fine to only consider visible rows in the parent table there too, but we should be checking all rows in the child table. The easiest way would be to restrict ALTER TABLE ADD CONSTRAINT to the table owner for tables with RLS (it seems that currently the REFERENCES privilege is sufficient), and make the table owner exempt from RLS on that table. The latter means you'd need at least two roles to use RLS, but anyone security-conscious enough to use RLS will probably not user the same role for DDL and DML operations anyway... But I think that if you take the opposite position that the select queries inside fkey triggers ought to be exempt from security policy, then you need to build some new mechanism to make that happen, which seems like extra work for no benefit. Hm, interesting angle. Continuing this thought, without any extra work, UNIQUE and EXCLUSION constraints *will* be enforced regardless of row visibility, because their implementation isn't SPI-based but instead detects conflicts while inserting tuples into the index. For being so obviously inconsistent in its treatment of UNIQUE and EXCLUSION constraints vs. FK constraints, this feels surprisingly right. So, to prevent design by accident, here's an attempt to explain that divergence. For UNIQUE and EXCLUSION constraints, the most conservative assumption possible is that all rows are visible, since that leads to the most rejections. With that assumption, no matter what the actual policy is, the data returned by a query will always satisfy the constraint. Plus, the constraint is still sensible because it neither rejects nor allows all rows. So that conservative assumption is the one we make, i.e. we ignore RLS visibility when checking those kinds of constraints. For FK constraints, OTOH, the most conservative assumption is that no rows are visible. But that is meaningless, since it will simply reject all possible rows. Having thus no chance of enforcing the constraint ourselves under all possible policies, the best we can do is to at least make it possible for the constraint to work correctly for as many policies as possible. Now, if we go with KaiGai's suggestion of skipping RLS while checking FK constraints, the only policy that the constraint will work correctly for is one which doesn't actually hide any parent rows. Whereas if we apply RLS checks while checking FK constraints, all policies which behave consistently for parent and child rows (i.e. don't hide the former but show the latter) will work correctly. We thus go with the second option, since the class of working policies is larger. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
Kohei KaiGai kai...@kaigai.gr.jp writes: Let me have a discussion to get preferable interface for row-level security. My planned feature will perform to append additional conditions to WHERE clause implicitly, to restrict tuples being visible for the current user. For example, when row-level policy uname = getpgusername() is configured on the table T1, the following query: select * from T1 where X 20; should be rewritten to: select * from T1 where (X 20) AND (uname = getpgusername()); Hm. Simple and fairly noninvasive, but ... would this not be subject to the same sorts of information-leak hazards that were addressed in the security views feature? That is, I see no guarantee that the RLS condition will be evaluated before any conditions supplied by the user. So it seems easy to get information out of rows the RLS policy is supposed to prevent access to. It would be far more secure to just use a security view to apply the RLS condition. Also, if the point here is to provide security for tables not views, it seems like you really need to have (at least a design for) RLS security on insert/update/delete operations. Just adding the same filter condition might be adequate for deletes, but I don't think it works at all for inserts. And for updates, what prevents the user from updating columns he shouldn't, or updating them to key values he shouldn't be able to use? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Wed, May 23, 2012 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kohei KaiGai kai...@kaigai.gr.jp writes: Let me have a discussion to get preferable interface for row-level security. My planned feature will perform to append additional conditions to WHERE clause implicitly, to restrict tuples being visible for the current user. For example, when row-level policy uname = getpgusername() is configured on the table T1, the following query: select * from T1 where X 20; should be rewritten to: select * from T1 where (X 20) AND (uname = getpgusername()); Hm. Simple and fairly noninvasive, but ... would this not be subject to the same sorts of information-leak hazards that were addressed in the security views feature? That is, I see no guarantee that the RLS condition will be evaluated before any conditions supplied by the user. So it seems easy to get information out of rows the RLS policy is supposed to prevent access to. It would be far more secure to just use a security view to apply the RLS condition. Since adding a condition to the where clause is a relatively simple operation (compared to the full potential scope of a view) could the RLS rewrite of the query create a CTE with the additional condition[s] rather than adding condition[s] to the user-supplied query? This would provide the forced ordering of the evaluating the conditions, thereby avoiding many of the potential points of leakage. Bell. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/23 Tom Lane t...@sss.pgh.pa.us: Kohei KaiGai kai...@kaigai.gr.jp writes: Let me have a discussion to get preferable interface for row-level security. My planned feature will perform to append additional conditions to WHERE clause implicitly, to restrict tuples being visible for the current user. For example, when row-level policy uname = getpgusername() is configured on the table T1, the following query: select * from T1 where X 20; should be rewritten to: select * from T1 where (X 20) AND (uname = getpgusername()); Hm. Simple and fairly noninvasive, but ... would this not be subject to the same sorts of information-leak hazards that were addressed in the security views feature? That is, I see no guarantee that the RLS condition will be evaluated before any conditions supplied by the user. So it seems easy to get information out of rows the RLS policy is supposed to prevent access to. It would be far more secure to just use a security view to apply the RLS condition. I wanted to have discussion to handle this problem. Unlike leaky-view problem, we don't need to worry about unexpected qualifier distribution on either side of join, because a scan on table never contains any join. Thus, all we need to care about is order of qualifiers chained on a particular scan node; being reordered by the cost to invoke functions. How about an idea to track FuncExpr come from the security policy and enforce 0 on its cost? Regular functions never reach zero cost, so the security policy must be re-ordered to the head. Also, if the point here is to provide security for tables not views, it seems like you really need to have (at least a design for) RLS security on insert/update/delete operations. Just adding the same filter condition might be adequate for deletes, but I don't think it works at all for inserts. And for updates, what prevents the user from updating columns he shouldn't, or updating them to key values he shouldn't be able to use? If we also apply the security policy to newer version of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether it satisfies the security policy. For same reason, the trigger should be executed at the end of trigger chain. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/5/23 Alastair Turner b...@ctrlf5.co.za: On Wed, May 23, 2012 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kohei KaiGai kai...@kaigai.gr.jp writes: Let me have a discussion to get preferable interface for row-level security. My planned feature will perform to append additional conditions to WHERE clause implicitly, to restrict tuples being visible for the current user. For example, when row-level policy uname = getpgusername() is configured on the table T1, the following query: select * from T1 where X 20; should be rewritten to: select * from T1 where (X 20) AND (uname = getpgusername()); Hm. Simple and fairly noninvasive, but ... would this not be subject to the same sorts of information-leak hazards that were addressed in the security views feature? That is, I see no guarantee that the RLS condition will be evaluated before any conditions supplied by the user. So it seems easy to get information out of rows the RLS policy is supposed to prevent access to. It would be far more secure to just use a security view to apply the RLS condition. Since adding a condition to the where clause is a relatively simple operation (compared to the full potential scope of a view) could the RLS rewrite of the query create a CTE with the additional condition[s] rather than adding condition[s] to the user-supplied query? This would provide the forced ordering of the evaluating the conditions, thereby avoiding many of the potential points of leakage. An interesting idea. However, I cannot imagine how does it works on update or delete statement. For select statement, it will get better performance to rewrite reference to a particular table by a subquery with security_barrier flag than CTE, because it allows to push down leakproof functions. Could you tell me your idea for more details? An example will help me understand well. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Wed, May 23, 2012 at 3:45 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I wanted to have discussion to handle this problem. Unlike leaky-view problem, we don't need to worry about unexpected qualifier distribution on either side of join, because a scan on table never contains any join. Thus, all we need to care about is order of qualifiers chained on a particular scan node; being reordered by the cost to invoke functions. How about an idea to track FuncExpr come from the security policy and enforce 0 on its cost? Regular functions never reach zero cost, so the security policy must be re-ordered to the head. Hmm. That would disregard the relative costs of multiple qualifiers all of which were injected by the security policy, which I suspect is not a good idea. Furthermore, I think that we should not assume that there is no join involved. I would expect a fairly popular RLS qual to be something of the form WHERE NOT EXISTS (SELECT 1 FROM hide_me WHERE hide_me.pk = thistab.pk). Perhaps when we see that RLS applies, we should replace the reference to the original table with a subquery RTE that has the security_barrier flag set - essentially treating a table with RLS as if it were a security view. Also, suppose that Bob applies an RLS policy to a table, and, later, Alice selects from the table. How do we keep Bob from usurping Alice's privileges? If we insist that Bob's RLS policy function runs as Bob, then it defeats inlining; but if it runs as Alice, then Bob can steal Alice's credentials. One idea is to apply the security policy only if Alice's access to the table is granted by Bob. That way, if Alice is (for example) the superuser, she's immune to RLS. But that doesn't seem to completely solve the problem, because Alice might merely be some other relatively unprivileged user and we still don't want Bob to be able to walk off with her access. Another idea is to set things up so that the RLS policy function isn't applied to each row directly; instead, it's invoked once per query and *returns* a WHERE clause. This would be a lot more powerful than the proposed design, because now the table owner can write a function that imposes quals on some people but not others, which seems very useful. Also, if the point here is to provide security for tables not views, it seems like you really need to have (at least a design for) RLS security on insert/update/delete operations. Just adding the same filter condition might be adequate for deletes, but I don't think it works at all for inserts. And for updates, what prevents the user from updating columns he shouldn't, or updating them to key values he shouldn't be able to use? If we also apply the security policy to newer version of tuples on update and insert, one idea is to inject a before-row-(update|insert) trigger to check whether it satisfies the security policy. For same reason, the trigger should be executed at the end of trigger chain. It's not clear to me that there is any need for built-in server functionality here. If the table owner wants to enforce some sort of policy regarding INSERT or UPDATE or DELETE, they can already do that today just by attaching a trigger to the table. And they can enforce whatever policy they like that way. Before designing any new mechanism, what's wrong with the existing one? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers