Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 24, 2014 at 12:27 PM, Stephen Frost wrote: > I feel like we are getting to the point of simply talking past each > other and so I'll try anew, and I'll include my understanding of how the > different approaches would address the specific use-case you outlined > up-thread. Thanks, you're right, and this is a good write-up. > Single policy > - > The current implementation approach only allows a single policy to be > included. > [...snip...] > For the case where a sales rep isn't also a partner, you could simplify > this to: > > WHERE > sales_rep_id = 16384 > > but I'm not sure that really buys you much? With the bitmap heap > scan, if one side of the OR ends up not returning anything then it > doesn't contribute to the blocks which have to be scanned. The index > might still need to be scanned, although I think you could avoid even > that with an EXISTS check to see if the user is a partner at all. > That's not to say that a bitmap scan is equivilant to an index scan, but > it's certainly likely to be far better than a sequential scan. True, but the wins could be much larger if one policy is WHERE sales_rep_id = (SELECT oid FROM pg_roles WHERE rolname = current_user) and the other policy is WHERE complexfn(). I'll also throw out a +1 for Dean's comments on this topic. > Multiple, Non-overlapping policies > -- > Preventing the overlap of policies ends up being very complicated if > many dimensions are allowed. For the simple case, perhaps only the > 'current role' dimension is useful. I expect that going down that > route would very quickly lead to requests for other dimensions (client > IP, etc) which is why I'm not a big fan of it, but if that's the > concensus then let's work out the syntax and update the patch and move > on. I think role is good enough. That's the primary identifier for all access-control related decisions, so it should be good enough here, too. I don't really understand your concerns about overlapping policies being complex. If you've got a couple of WHERE clauses, combining them with OR is not hard. Now, the query optimizer may have trouble with it, but on the whole I expect to win more than we lose, by entirely excluding some branches of an OR for users for whom entirely policies can be excluded. > Overall, while I'm interested in defining where this is going in a way > which allows us implement an initial RLS capability while avoiding > future upgrade issues, I am perfectly happy to say that the 9.5 RLS > implementation may not be exactly syntax-compatible with 9.6 or 10.0. Again, I think that's completely non-viable. Are you going to tell people they can't pg_upgrade, and they can't dump-and-reload either, without manual fiddling? There's no way that's going to be accepted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig, * Craig Ringer (cr...@2ndquadrant.com) wrote: > On 06/24/2014 10:30 PM, Alvaro Herrera wrote: > > I haven't been following this thread, but this bit caught my attention. > > I'm not sure I agree that OR is always the right policy either. > > There is a case for a policy that says "forbid these rows to these guys, > > even if they have read permissions from elsewhere". > > That's generally considered a "DENY" policy, a concept borrowed from ACLs. Right. > > If OR is the only > > way to mix multiple policies there might not be a way to implement this. > > I think that's a "later" myself, but we shouldn't design ourselves into > a corner where we can't support deny rules either. Agreed, but I don't want to get so wrapped up in all of this that we end up with a set of requirements so long that we'll never be able to accomplish them all in a single release... Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 06/24/2014 10:30 PM, Alvaro Herrera wrote: > I haven't been following this thread, but this bit caught my attention. > I'm not sure I agree that OR is always the right policy either. > There is a case for a policy that says "forbid these rows to these guys, > even if they have read permissions from elsewhere". That's generally considered a "DENY" policy, a concept borrowed from ACLs. You have access to a resource if: - You have at least one policy that gives you access AND - You have no policies that deny you access > If OR is the only > way to mix multiple policies there might not be a way to implement this. I think that's a "later" myself, but we shouldn't design ourselves into a corner where we can't support deny rules either. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: > Thinking about the examples upthread, a separate issue occurs to me > --- when defining a RLS qual, I think that there has to be a syntax to > specify an alias for the main table, so that correlated subqueries can > refer to it. I'm not sure if that's been mentioned in any of the > discussions so far, but it might be quite hard to define certain quals > without it. Yeah, that thought had occured to me also. Have any suggestions about how to approach that issue? The way triggers have OLD/NEW comes to mind but I'm not sure how easily that'd work. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Thinking about the examples upthread, a separate issue occurs to me --- when defining a RLS qual, I think that there has to be a syntax to specify an alias for the main table, so that correlated subqueries can refer to it. I'm not sure if that's been mentioned in any of the discussions so far, but it might be quite hard to define certain quals without it. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 24 June 2014 17:27, Stephen Frost wrote: > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping > policies > What I was describing upthread was multiple non-overlapping policies. I disagree that this will be more complicated to use. It's a strict superset of the single policy functionality, so if you want to do it all using a single policy then you can. But I think that once the ACLs reach a certain level of complexity, you probably will want to break it up into multiple policies, and I think doing so will make things simpler, not more complicated. Taking a specific, simplistic example, suppose you had 2 groups of users - some are normal users who should only be able to access their own records. For these users, you might have a policy like WHERE person_id = current_user which would be highly selective, and probably use an index scan. Then there might be another group of users who are managers with access to the records of, say, everyone in their department. This might then be a more complex qual along the lines of WHERE person_id IN (SELECT ... FROM person_department WHERE mgr_id = current_user AND ...) which might end up being a hash or merge join, depending on any user-supplied quals. You _could_ combine those into a single policy, but I think it would be much better to have 2 distinct policies, since they're 2 very different queries, for different use cases. Normal users would only be granted permission to use the normal_user_policy. Managers might be granted permission to use either the normal_user_policy or the manager_policy (but not both at the same time). That's a very simplified example. In more realistic situations there are likely to be many more classes of users, and trying to enforce all the logic in a single WHERE clause is likely to get unmanageable, or inefficient if it involves lots of logic hidden away in functions. Allowing multiple, non-overlapping policies allows the problem to be broken up into more manageable pieces, which also makes the planner's job easier, since only a single, simpler policy is in effect in any given query. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert, I feel like we are getting to the point of simply talking past each other and so I'll try anew, and I'll include my understanding of how the different approaches would address the specific use-case you outlined up-thread. Single policy - The current implementation approach only allows a single policy to be included. The concern raised with this approach is that it won't be very performant due to the qual complexity, which you outlined (reformatted a bit) as: WHERE sales_rep_id = (SELECT oid FROM pg_roles WHERE rolname = current_user AND oid IN (SELECT id FROM person WHERE is_sales_rep)) OR partner_id = (SELECT p.org_id FROM pg_roles a, person p WHERE a.rolname = current_user AND a.oid = p.id) Which I take to mean there is a 'person' table which looks like: id, is_sales_rep, org_id and a table which has the RLS qual which looks like: pk_id, sales_rep_id, partner_id Then, if the individual is_sales_rep and it's their account by sales_rep_id, or if the individual's org_id matches the partner_id, they can see the record. Using this example with security barrier views and indexes on person.id, data.pk_id, data.sales_rep_id, and data.partner_id, we'll get a bitmap heap scan across the 'data' table by having the two OR's run as InitPlan 1 and InitPlan 2. Does that address the concern you had around multi-branch OR policies? This works with more than two OR branches also, though of course we need appropriate indexes to make use of a Bitmap Heap Scan. Even with per-user policies, we would define a policy along these lines, for the "sfrost" role: WHERE sales_rep_id = 16384 OR partner_id = 1 Which also ends up doing a Bitmap Heap Scan across the data table. For the case where a sales rep isn't also a partner, you could simplify this to: WHERE sales_rep_id = 16384 but I'm not sure that really buys you much? With the bitmap heap scan, if one side of the OR ends up not returning anything then it doesn't contribute to the blocks which have to be scanned. The index might still need to be scanned, although I think you could avoid even that with an EXISTS check to see if the user is a partner at all. That's not to say that a bitmap scan is equivilant to an index scan, but it's certainly likely to be far better than a sequential scan. Now, if the query is "select * from data_view with pk_id = 1002;", then we get an indexed lookup on the data table based on the PK. That's what I was trying to point out previously regarding leakproof functions (which comprise about half of the boolean functions we provide, if I recall my previous analysis correctly). We also get indexed lookups with "pk_id < 10" or similar as those are also leakproof. Multiple, Overlapping policies -- Per discussion, these would generally be OR'd together. Building up the overall qual which has to include an OR branch for each individual policy qual(s) looks like a complicated bit of work and one which might be better left to the user (and, as just pointed out, the user may actually want AND instead of OR in some cases..). Managing the plan cache in a sensible way is certainly made more complicated by this and might mean that it can't be used at all, which has already been raised as a show-stopper issue. In the example which you provided, while we could represent that the two policies exist (sales representatives vs partners) and that they are to be OR'd together in the catalog, but I don't immediately see how that would change the qual which ends up being added to the query in this case or really improving the overall query plan; at least, not without eliminating one of the OR branches somehow- which I discuss below. Multiple, Non-overlapping policies -- Preventing the overlap of policies ends up being very complicated if many dimensions are allowed. For the simple case, perhaps only the 'current role' dimension is useful. I expect that going down that route would very quickly lead to requests for other dimensions (client IP, etc) which is why I'm not a big fan of it, but if that's the concensus then let's work out the syntax and update the patch and move on. Another option might be to have a qual for each policy which the user can define that indicates if that policy is to be applied or not and then simply pick the first policy for which that qual which returns 'true'. We would require an ordering to be defined in this case, which I believe was an issue up-thread. If we allow all policies matching the quals then we run into the complications mentioned under "Overlapping policies" above. If we decide that per-role policies need to be supported, I very quickly see the need to have "groups" of roles to which a policy is to be applied. This would differ from roles today as they would not be allowed to overlap (otherwise we are into over
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 24, 2014 at 10:30 AM, Alvaro Herrera wrote: > Robert Haas wrote: >> > Right, if we were to support multiple policies on a given table then we >> > would have to support adding and removing them individually, as well as >> > specify when they are to be applied- and what if that "when" overlaps? >> > Do we apply both and only a row which passed them all gets sent to the >> > user? Essentially we'd be defining the RLS policies to be AND'd >> > together, right? Would we want to support both AND-based and OR-based, >> > and allow users to pick what set of conditionals they want applied to >> > their various overlapping RLS policies? >> >> AND is not a sensible policy; it would need to be OR. If you grant >> someone access to two different subsets of the rows in a table, it >> stands to reason that they will expect to have access to all of the >> rows that are in at least one of those subsets. > > I haven't been following this thread, but this bit caught my attention. > I'm not sure I agree that OR is always the right policy either. > There is a case for a policy that says "forbid these rows to these guys, > even if they have read permissions from elsewhere". If OR is the only > way to mix multiple policies there might not be a way to implement this. > So ISTM each policy must be able to indicate what to do -- sort of how > PAM config files allow you to specify "required", "optional" and so > forth for each module. Hmm. Well, that could be useful, but I'm not sure I'd view it as something we absolutely have to have... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert Haas wrote: > > Right, if we were to support multiple policies on a given table then we > > would have to support adding and removing them individually, as well as > > specify when they are to be applied- and what if that "when" overlaps? > > Do we apply both and only a row which passed them all gets sent to the > > user? Essentially we'd be defining the RLS policies to be AND'd > > together, right? Would we want to support both AND-based and OR-based, > > and allow users to pick what set of conditionals they want applied to > > their various overlapping RLS policies? > > AND is not a sensible policy; it would need to be OR. If you grant > someone access to two different subsets of the rows in a table, it > stands to reason that they will expect to have access to all of the > rows that are in at least one of those subsets. I haven't been following this thread, but this bit caught my attention. I'm not sure I agree that OR is always the right policy either. There is a case for a policy that says "forbid these rows to these guys, even if they have read permissions from elsewhere". If OR is the only way to mix multiple policies there might not be a way to implement this. So ISTM each policy must be able to indicate what to do -- sort of how PAM config files allow you to specify "required", "optional" and so forth for each module. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Mon, Jun 23, 2014 at 2:29 PM, Stephen Frost wrote: > What are these policies going to depend on? Will they be allowed to > overlap? I don't see multi-policy support as being very easily added. We discussed the point about overlap upthread, and I gave specific examples. If there's something else you want me to provide here, please be more clear about it. > If there are specific ways to design the syntax which would make it > easier to support multiple policies in the future, I'm all for it. Have > any specific thoughts regarding that? I did propose something already upthread, and then Dean said this: # Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON # TABLE tab TO role --- doesn't work because it conflicts with the # syntax for granting column privileges, so there needs to be a distinct # syntax for this, and I think it ought to ultimately allow things like # # GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; He's got a good point there. I don't know whether the policy should be given inline (e.g. GRANT ... WHERE stuff()) or out-of-line (GRANT ... USING policy1) but it seems like specifying it as some sort of GRANT modifier might make sense. I'm sure there are other ways also, of course. >> >> - Require the user to specify in some way which of the available >> >> policies they want applied, and then apply only that one. >> > >> > I'd want to at least see a way to apply an ordering to the policies >> > being applied, or have PG work out which one is "cheapest" and try that >> > one first. >> >> Cost-based comparison of policies that return different results >> doesn't seem sensible to me. > > I keep coming back to the thought that, really, having multiple > overlapping policies just adds unnecessary complication to the system > for not much gain in real functionality. Being able to specify a policy > per-role might be useful, but that's only one dimension and I can > imagine a lot of other dimensions that one might want to use to control > which policy is used. Well, I don't agree, and I've given examples upthread showing the kinds of scenarios that I'm concerned about, which are drawn from real experiences I've had. It may be that I'm the only one who has had such experiences, of course; or that there aren't enough people who have to justify catering to such use cases. But I'm not sure there's much point in trying to have a conversation about how such a thing could be made to work if you're just going to revert back to "well, we don't really need this anyway" each time I make or refute a technical point. >> I think it would be a VERY bad idea to design the system around the >> assumption that the RLS quals will be much more or less selective than >> the user-supplied quals. That's going to be different in different >> environments. > > Fine- but do you really see the query planner having a problem pushing > down whichever is the more selective qual, if the user-provided qual is > marked as leakproof? I'm not quite sure I understand the scenario you're describing here. Can you provide a tangible example? I expect that most of the things the RLS-limited user might write in the WHERE clause will NOT get pushed down because most functions are not leakproof. However, the issue I'm actually concerned about is whether the *security* qual is simple enough to permit an index-scan. Anything with an OR clause in it probably won't be, and any function call definitely won't be. > I realize that you want multiple policies because you'd like a way for > the RLS qual to be made simpler for certain cases while also having more > complex quals for other cases. What I keep waiting to hear is exactly > how you want to specify which policy is used because that's where it > gets ugly and complicated. I still really don't like the idea of trying > to apply multiple policies inside of a single query execution. See above comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost wrote: > > I'm also of the opinion that this isn't strictly necessary for the > > initial RLS offering in PG- there's a clear way we could migrate > > existing users to a multi-policy system from a single-policy system. > > Sure, to get the performance and optimization benefits that we'd > > presumably have in the multi-policy case they'd need to re-work their > > RLS configuration, but for users who care, they'll likely be very happy > > to do so to gain those benefits. > > I think a lot depends on the syntax we choose. If we choose a syntax > that only makes sense in a single-policy framework, then I think > allowing upgrades to a multi-policy syntax is going to be really > difficult. On the other hand, if we choose a syntax that allows > multiple policies, I suspect we can support multiple policies from the > beginning without much extra effort. What are these policies going to depend on? Will they be allowed to overlap? I don't see multi-policy support as being very easily added. If there are specific ways to design the syntax which would make it easier to support multiple policies in the future, I'm all for it. Have any specific thoughts regarding that? > >> - Require the user to specify in some way which of the available > >> policies they want applied, and then apply only that one. > > > > I'd want to at least see a way to apply an ordering to the policies > > being applied, or have PG work out which one is "cheapest" and try that > > one first. > > Cost-based comparison of policies that return different results > doesn't seem sensible to me. I keep coming back to the thought that, really, having multiple overlapping policies just adds unnecessary complication to the system for not much gain in real functionality. Being able to specify a policy per-role might be useful, but that's only one dimension and I can imagine a lot of other dimensions that one might want to use to control which policy is used. > >> I think exactly the opposite, for the query planning reasons > >> previously stated. I think the policies will quickly get so > >> complicated that they're no longer optimizable. Here's a simple > >> example: > >> > >> - Policy 1 allows the user to access rows for which complexfunc() returns > >> true. > >> - Policy 2 allows the user to access rows for which a = 1. > >> > >> Most users have access only through policy 2, but some have access > >> through policy 1. Users who have access through policy 1 will always > >> get a sequential scan, > > > > This is the thing which I most object to- if the quals being provided at > > any level are leakproof and would be able to reduce the returned set > > sufficiently that an index scan is the best bet, we should be doing > > that. I don't anticipate the RLS quals to be as selective as the > > quals which the user is adding. > > I think it would be a VERY bad idea to design the system around the > assumption that the RLS quals will be much more or less selective than > the user-supplied quals. That's going to be different in different > environments. Fine- but do you really see the query planner having a problem pushing down whichever is the more selective qual, if the user-provided qual is marked as leakproof? I realize that you want multiple policies because you'd like a way for the RLS qual to be made simpler for certain cases while also having more complex quals for other cases. What I keep waiting to hear is exactly how you want to specify which policy is used because that's where it gets ugly and complicated. I still really don't like the idea of trying to apply multiple policies inside of a single query execution. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost wrote: > I'm also of the opinion that this isn't strictly necessary for the > initial RLS offering in PG- there's a clear way we could migrate > existing users to a multi-policy system from a single-policy system. > Sure, to get the performance and optimization benefits that we'd > presumably have in the multi-policy case they'd need to re-work their > RLS configuration, but for users who care, they'll likely be very happy > to do so to gain those benefits. I think a lot depends on the syntax we choose. If we choose a syntax that only makes sense in a single-policy framework, then I think allowing upgrades to a multi-policy syntax is going to be really difficult. On the other hand, if we choose a syntax that allows multiple policies, I suspect we can support multiple policies from the beginning without much extra effort. >> - Require the user to specify in some way which of the available >> policies they want applied, and then apply only that one. > > I'd want to at least see a way to apply an ordering to the policies > being applied, or have PG work out which one is "cheapest" and try that > one first. Cost-based comparison of policies that return different results doesn't seem sensible to me. >> I think exactly the opposite, for the query planning reasons >> previously stated. I think the policies will quickly get so >> complicated that they're no longer optimizable. Here's a simple >> example: >> >> - Policy 1 allows the user to access rows for which complexfunc() returns >> true. >> - Policy 2 allows the user to access rows for which a = 1. >> >> Most users have access only through policy 2, but some have access >> through policy 1. Users who have access through policy 1 will always >> get a sequential scan, > > This is the thing which I most object to- if the quals being provided at > any level are leakproof and would be able to reduce the returned set > sufficiently that an index scan is the best bet, we should be doing > that. I don't anticipate the RLS quals to be as selective as the > quals which the user is adding. I think it would be a VERY bad idea to design the system around the assumption that the RLS quals will be much more or less selective than the user-supplied quals. That's going to be different in different environments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: > On 17 June 2014 20:19, Robert Haas wrote: > > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed > > wrote: > >> Yeah, I was thinking something like this could work, but I would go > >> further. Suppose you had separate GRANTable privileges for direct > >> access to individual tables, bypassing RLS, e.g. > >> > >> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > > > So, is this one new privilege (DIRECT) or four separate new privileges > > that are variants of the existing privileges (DIRECT SELECT, DIRECT > > INSERT, DIRECT UPDATE, DIRECT DELETE)? > > I was thinking it would be 4 new privileges, so that a user could for > example be granted DIRECT SELECT permission on a table, but not DIRECT > UPDATE. Ok. > On reflection though, I think I prefer the approach of allowing > multiple named security policies per table, because it gives the > planner more opportunity to optimize queries against specific RLS > quals, which won't work if the ACL logic is embedded in functions. Having more than one policy for the purpose of performance really doesn't make a huge amount of sense to me. Perhaps someone could explain the use-case with specific example applications where they would benefit from this? Based on the discussion, they would have to be OR'd together in the query as built with any result being marked as success. One could build an SQL function which could be in-lined potentially which does the same if their case is that simple. Being able to define the policy based on some criteria may allow it to be simpler (eg: policy 'a' applies for certain roles, while policy 'b' applies for other roles), but I'm not enthusiastic about that approach because there could be a huge number of permutations to allow. How about another approach- what about having a function which is called (as the table owner, I'm thinking..) that then returns the qual to be included, instead of having to define a specific qual which is included in the catalog? That function could take into consideration the user, table, etc, and return a qual which includes constants to compare rows against for planning purposes. This would have to be done early enough, of course, which might be difficult. For my part, having that capability would be neat, but nothing we're trying to do here would preclude us from adding it later either. > That seems like something that would have to be designed in now, > because it's difficult to see how you could add it later. I don't follow this at all. Going from supporting one qual to supporting multiple seems like it'd be quite straight-forward to add in later? Going the other way would be difficult. > Managing policy names becomes an issue though, because if you have 2 > tables each with 1 policy, but you give them different names, how can > the user querying the data specify that they want policy1 for table1 > and policy2 for table2, possibly in the same query? From my experience, users don't pick the policy any more than they get to pick which set of permissions get applied to them when querying tables (modulo roles, of course, but that's a mechanism for changing users, not for saying which set of permissions you get). All that you describe could be done for regular permissions also, but we don't, and I don't think we get complaints about that because we have roles- which would work just the same for RLS (assuming the RLS policy defined has a role component). Having a function be able to be called to return a qual to be used would be a way to have per-role RLS also, along with providing the flexibility to have per-source-IP, per-connection-type, etc, RLS policies also. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 17 June 2014 20:19, Robert Haas wrote: > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed > wrote: >> Yeah, I was thinking something like this could work, but I would go >> further. Suppose you had separate GRANTable privileges for direct >> access to individual tables, bypassing RLS, e.g. >> >> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > So, is this one new privilege (DIRECT) or four separate new privileges > that are variants of the existing privileges (DIRECT SELECT, DIRECT > INSERT, DIRECT UPDATE, DIRECT DELETE)? > I was thinking it would be 4 new privileges, so that a user could for example be granted DIRECT SELECT permission on a table, but not DIRECT UPDATE. On reflection though, I think I prefer the approach of allowing multiple named security policies per table, because it gives the planner more opportunity to optimize queries against specific RLS quals, which won't work if the ACL logic is embedded in functions. That seems like something that would have to be designed in now, because it's difficult to see how you could add it later. Managing policy names becomes an issue though, because if you have 2 tables each with 1 policy, but you give them different names, how can the user querying the data specify that they want policy1 for table1 and policy2 for table2, possibly in the same query? I think that can be made more manageable by making policies top-level objects that exist independently of any particular tables. So you might do something like: \c - alice CREATE POLICY policy1; CREATE POLICY policy2; ALTER TABLE t1 SET POLICY policy1 TO t1_quals; ALTER TABLE t2 SET POLICY policy1 TO t2_quals; ... GRANT SELECT ON TABLE t1, t2 TO bob USING policy1; GRANT SELECT ON TABLE t1, t2 TO manager; -- Can use any policy, or bypass all policies Then a particular user would typically only have to set their policy once per session, for accessing multiple tables: \c - bob SET rls_policy = policy1; SELECT * FROM t1 JOIN t2; -- OK SET rls_policy = policy2; SELECT * FROM t1; -- ERROR: no permission to access t1 using policy2 or you'd be able to set a default policy for users, so that they wouldn't need to explicitly choose one: ALTER ROLE bob SET rls_policy = policy1; Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON TABLE tab TO role --- doesn't work because it conflicts with the syntax for granting column privileges, so there needs to be a distinct syntax for this, and I think it ought to ultimately allow things like GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1; Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Robert Haas (robertmh...@gmail.com) wrote: > >> [ counting bits in ACLs ] > > Wouldn't it be fairly painless to widen AclMode from 32 bits to 64, > and thereby double the number of available bits? Thanks for commenting on this. I hadn't considered that but I don't see any particular problem with it either.. > In any case, I concur with the position that this feature patch should > be separate from a patch to make additional bitspace available. Certainly. Thanks for your thoughts. Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Stephen Frost writes: > * Robert Haas (robertmh...@gmail.com) wrote: >> [ counting bits in ACLs ] Wouldn't it be fairly painless to widen AclMode from 32 bits to 64, and thereby double the number of available bits? That code was all written before we required platforms to have an int64 primitive type, but of course now we expect that. In any case, I concur with the position that this feature patch should be separate from a patch to make additional bitspace available. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost wrote: > > * Robert Haas (robertmh...@gmail.com) wrote: > >> Technically, there are 4 bits left, and that's what we need for > >> separate privileges. > > > > I'd really hate to chew them all up.. > > Usually it's the patch author who WANTS to chew up all the available > bit space and OTHER people who say no. :-) Ah, well, technically I'm not the patch author here, though I would like to see it happen. :) Still, have to balance these features and capabilities against the future unknown options we might want to add and it certainly doesn't seem terribly nice to chew up all that remain rather than addressing the need to support more. Still, perhaps we can put together a patch for this and then review the implementation and, if we like it and that functionality, we can make the decision about if it should be on this patch to make more bits available. > > Perhaps, or we might come up with some new whiz-bang permission to add > > next year. :/ > > Well, people proposed separate permissions for things like VACUUM and > ANALYZE around the time TRUNCATE was added, and those were rejected on > the grounds that they didn't add enough value to justify wasting bits > on them. I think we see whether there's a workable system that such > that marginal permissions (like TRUNCATE) that won't be checked often > don't have to consume bits. That's an interesting approach but I'm not sure that we need to go a system where we segregate "often-used" bits from "less-used" ones. > > My thoughts on how to address this were to segregate the ACL bits by > > object type. That is to say, the AclMode stored for databases might > > only use bits 0-2 (create/connect/temporary), while tables would use > > bits 0-7 (insert/select/update/delete/references/trigger). This would > > allow us to more easily add more rights at the database and/or > > tablespace level too. > > Yeah, that's another idea. But it really deserves its own thread. > I'm still not convinced we have to do this at all to meet this need, > but that should be argued back and forth on that other thread. Fair enough. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost wrote: > > I agree that we want to support that, if we can do so reasonably. What > > I was trying to get at is simply this- don't we provide that already > > with the leakproof attribute and functions? If we don't have enough > > there to allow index scans then we should be looking to add more, I'm > > thinking. > > So the reason why we got onto this particular topic was because of the > issue of multiple security policies for a single table. Of course, > multiple security policies can always be merged into a single > more-complex policy, but the resulting policy may be so complex that > the query-planner is no longer capable of doing a good job optimizing > it. Yeah, I could see that happening with some use-cases. > >> ALTER TABLE tab ADD POLICY polname WHERE quals; > >> GRANT SELECT (polname) ON TABLE tab TO role; > > > > Right, if we were to support multiple policies on a given table then we > > would have to support adding and removing them individually, as well as > > specify when they are to be applied- and what if that "when" overlaps? > > Do we apply both and only a row which passed them all gets sent to the > > user? Essentially we'd be defining the RLS policies to be AND'd > > together, right? Would we want to support both AND-based and OR-based, > > and allow users to pick what set of conditionals they want applied to > > their various overlapping RLS policies? > > AND is not a sensible policy; it would need to be OR. If you grant > someone access to two different subsets of the rows in a table, it > stands to reason that they will expect to have access to all of the > rows that are in at least one of those subsets. I think I can buy off on this. What that also means is that any 'short-circuiting' that we try to do here would be based on "stop once we get back a 'true'". This could seriously change how we actually implement RLS though as doing it all through query rewrites and making this work with multiple security policies which are OR'd together and yet keeping the optimization and qual push-down and index-based plans is looking pretty daunting. I'm also of the opinion that this isn't strictly necessary for the initial RLS offering in PG- there's a clear way we could migrate existing users to a multi-policy system from a single-policy system. Sure, to get the performance and optimization benefits that we'd presumably have in the multi-policy case they'd need to re-work their RLS configuration, but for users who care, they'll likely be very happy to do so to gain those benefits. Perhaps the question here is- if we implement RLS one way for the single case and then change the implementation all around for the multi case, will we end up breaking the single case? Or destroying the performance for it? I can't see either of those cases being allowed- if and when we support multi, we must still support single and the whole point of multi would be to allow more performant implementations and that solution will require the single case to be at least as performant as what we're proposing to do today, I believe. Or are you thinking that we would never support calling user-defined functions in any RLS scheme because we want to be able to do that optimization? I don't see that being acceptable from a feature standpoint. > Alternatively, we could: > > - Require the user to specify in some way which of the available > policies they want applied, and then apply only that one. I'd want to at least see a way to apply an ordering to the policies being applied, or have PG work out which one is "cheapest" and try that one first. > - Decide that such scenarios constitute misconfiguration. Throw an > error and make the table owner or other relevant local authority fix > it. Having them all be OR'd together feels simpler and easier to work with than trying to provide the user with all the knobs necessary to select which subset of users they want the policy applied to when (user X from IP range a.b.c.d/24 at time 1500). We could probably make it work with exclusion constraints, range types, etc, and perhaps it'd be a reason to bring btree_gist into core (which I'm all for) and make it work with catalog tables, but... just 'yuck' all around, for my part. > > Sounds all rather painful and much better done programatically by the > > user in a language which is suited to that task- eg: pl/pgsql, perl, C, > > or something besides our ALTER syntax + catalog representation. > > I think exactly the opposite, for the query planning reasons > previously stated. I think the policies will quickly get so > complicated that they're no longer optimizable. Here's a simple > example: > > - Policy 1 allows the user to access rows for which complexfunc() returns > true. > - Policy 2 allows the user to access rows for which a = 1. > > Most users have access only through policy 2, but some have access > t
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost wrote: > * Robert Haas (robertmh...@gmail.com) wrote: >> On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost wrote: >> > I had taken it to be a single privilege, but you're right, it could be >> > done for each of those.. I really don't think we have the bits for more >> > than one case here though (if that) without a fair bit of additional >> > rework. I'm not against that rework (and called for it wayyy back when >> > I proposed the TRUNCATE privilege, as I recall) but that's a whole >> > different challenge and no small bit of work.. >> >> Technically, there are 4 bits left, and that's what we need for >> separate privileges. > > I'd really hate to chew them all up.. Usually it's the patch author who WANTS to chew up all the available bit space and OTHER people who say no. :-) >> We last consumed bits in 2008 (for TRUNCATE) and >> 2006 (for GRANT ON DATABASE), so even if we used all of the remaining >> bits it might be another 5 years before anyone has to do that >> refactoring. > > Perhaps, or we might come up with some new whiz-bang permission to add > next year. :/ Well, people proposed separate permissions for things like VACUUM and ANALYZE around the time TRUNCATE was added, and those were rejected on the grounds that they didn't add enough value to justify wasting bits on them. I think we see whether there's a workable system that such that marginal permissions (like TRUNCATE) that won't be checked often don't have to consume bits. >> But even if the refactoring needs to be done now for >> some reason, it's only June, and the last CommitFest doesn't start >> until February 15th. I think we're being way too quick to jump to >> talking about what can and can't be done in time for 9.5. Let's start >> by figuring out how we'd really like it to work and then, if it's too >> ambitious, we can scale it back. > > Alright- perhaps we can discuss what kind of refactoring would be needed > for such a change then, to get a better idea as to the scope of the > change and the level of effort required. > > My thoughts on how to address this were to segregate the ACL bits by > object type. That is to say, the AclMode stored for databases might > only use bits 0-2 (create/connect/temporary), while tables would use > bits 0-7 (insert/select/update/delete/references/trigger). This would > allow us to more easily add more rights at the database and/or > tablespace level too. Yeah, that's another idea. But it really deserves its own thread. I'm still not convinced we have to do this at all to meet this need, but that should be argued back and forth on that other thread. >> My main concern about using only one bit is that someone might want to >> allow a user to bypass RLS on SELECT while still enforcing it for >> data-modifying operations. That seems like a plausible use case to >> me. > > I absolutely agree that it's a real use-case and one which we should > support, just trying to avoid biting off more than can be done between > now and February. Still, if we get things hammered out and more-or-less > agreement on the way forward, getting the code written may move quickly. Nifty. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost wrote: >> Yeah, if we have to ask an external security module a question for >> each row, there's little hope of any real optimization. However, I >> think there will be a significant number of cases where people will >> want filtering clauses that can be realized by doing an index scan >> instead of a sequential scan, and if we end up forcing a sequential >> scan anyway, the feature will be useless to those people. > > I agree that we want to support that, if we can do so reasonably. What > I was trying to get at is simply this- don't we provide that already > with the leakproof attribute and functions? If we don't have enough > there to allow index scans then we should be looking to add more, I'm > thinking. So the reason why we got onto this particular topic was because of the issue of multiple security policies for a single table. Of course, multiple security policies can always be merged into a single more-complex policy, but the resulting policy may be so complex that the query-planner is no longer capable of doing a good job optimizing it. I won't mention here exactly what a certain large commercial database vendor has implemented here; suffice it to say, however, that their design avoids this pitfall, and ours currently does not. > I agree on this point, but I'm still hopeful that we'll be able to get a > good feature into 9.5. There are quite a few resources available for > the 'just programming' part, so the long pole in the tent here is > absolutely hashing out what we want and how it should function. Agreed. > I'd be happy to host or participate in a conference call or similar if > that would be useful to move this along- or we can continue to > communicate via email. There's a bit of a lull in conferences to which > I'm going to right now, so in person is unlikely, unless folks want to > get together somewhere on the east coast (I'd be happy to travel to > Philly, Pittsburgh, NYC, etc, if it'd help..). For me, email is easiest; but there are other options, too. >> > What solution did you come up with for this case, which performed well >> > and was also secure..? >> >> I put the logic in the client. :-( > > Well, that's not helpful here. ;) Sure. The reason I brought it up is to say - hey, look, I had this come up in the real world. What would it take to be able to do actually do it in the database server? And the answer is - something that will handle multiple security policies cleanly. >> But I'm not sure; that >> feels like it's giving something up that might be important. And I >> think that the kinds of syntax we're discussing won't support leaving >> that out of the initial version and adding it later, so if we commit >> to this syntax, we're stuck with that behavior. To avoid that, we'd >> need something like this: >> >> ALTER TABLE tab ADD POLICY polname WHERE quals; >> GRANT SELECT (polname) ON TABLE tab TO role; > > Right, if we were to support multiple policies on a given table then we > would have to support adding and removing them individually, as well as > specify when they are to be applied- and what if that "when" overlaps? > Do we apply both and only a row which passed them all gets sent to the > user? Essentially we'd be defining the RLS policies to be AND'd > together, right? Would we want to support both AND-based and OR-based, > and allow users to pick what set of conditionals they want applied to > their various overlapping RLS policies? AND is not a sensible policy; it would need to be OR. If you grant someone access to two different subsets of the rows in a table, it stands to reason that they will expect to have access to all of the rows that are in at least one of those subsets. If you give someone your car key and your house key, that means they can operate your car or enter your house; it does not mean that they can operate your car but only when it's inside your garage. Alternatively, we could: - Require the user to specify in some way which of the available policies they want applied, and then apply only that one. or - Decide that such scenarios constitute misconfiguration. Throw an error and make the table owner or other relevant local authority fix it. > Sounds all rather painful and much better done programatically by the > user in a language which is suited to that task- eg: pl/pgsql, perl, C, > or something besides our ALTER syntax + catalog representation. I think exactly the opposite, for the query planning reasons previously stated. I think the policies will quickly get so complicated that they're no longer optimizable. Here's a simple example: - Policy 1 allows the user to access rows for which complexfunc() returns true. - Policy 2 allows the user to access rows for which a = 1. Most users have access only through policy 2, but some have access through policy 1. Users who have access through policy 1 will always get a sequential scan, but users who have access through po
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost wrote: > > I had taken it to be a single privilege, but you're right, it could be > > done for each of those.. I really don't think we have the bits for more > > than one case here though (if that) without a fair bit of additional > > rework. I'm not against that rework (and called for it wayyy back when > > I proposed the TRUNCATE privilege, as I recall) but that's a whole > > different challenge and no small bit of work.. > > Technically, there are 4 bits left, and that's what we need for > separate privileges. I'd really hate to chew them all up.. > We last consumed bits in 2008 (for TRUNCATE) and > 2006 (for GRANT ON DATABASE), so even if we used all of the remaining > bits it might be another 5 years before anyone has to do that > refactoring. Perhaps, or we might come up with some new whiz-bang permission to add next year. :/ > But even if the refactoring needs to be done now for > some reason, it's only June, and the last CommitFest doesn't start > until February 15th. I think we're being way too quick to jump to > talking about what can and can't be done in time for 9.5. Let's start > by figuring out how we'd really like it to work and then, if it's too > ambitious, we can scale it back. Alright- perhaps we can discuss what kind of refactoring would be needed for such a change then, to get a better idea as to the scope of the change and the level of effort required. My thoughts on how to address this were to segregate the ACL bits by object type. That is to say, the AclMode stored for databases might only use bits 0-2 (create/connect/temporary), while tables would use bits 0-7 (insert/select/update/delete/references/trigger). This would allow us to more easily add more rights at the database and/or tablespace level too. > My main concern about using only one bit is that someone might want to > allow a user to bypass RLS on SELECT while still enforcing it for > data-modifying operations. That seems like a plausible use case to > me. I absolutely agree that it's a real use-case and one which we should support, just trying to avoid biting off more than can be done between now and February. Still, if we get things hammered out and more-or-less agreement on the way forward, getting the code written may move quickly. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost wrote: > * Robert Haas (robertmh...@gmail.com) wrote: >> On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed >> wrote: >> > Yeah, I was thinking something like this could work, but I would go >> > further. Suppose you had separate GRANTable privileges for direct >> > access to individual tables, bypassing RLS, e.g. >> > >> > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name >> >> So, is this one new privilege (DIRECT) or four separate new privileges >> that are variants of the existing privileges (DIRECT SELECT, DIRECT >> INSERT, DIRECT UPDATE, DIRECT DELETE)? > > I had taken it to be a single privilege, but you're right, it could be > done for each of those.. I really don't think we have the bits for more > than one case here though (if that) without a fair bit of additional > rework. I'm not against that rework (and called for it wayyy back when > I proposed the TRUNCATE privilege, as I recall) but that's a whole > different challenge and no small bit of work.. Technically, there are 4 bits left, and that's what we need for separate privileges. We last consumed bits in 2008 (for TRUNCATE) and 2006 (for GRANT ON DATABASE), so even if we used all of the remaining bits it might be another 5 years before anyone has to do that refactoring. But even if the refactoring needs to be done now for some reason, it's only June, and the last CommitFest doesn't start until February 15th. I think we're being way too quick to jump to talking about what can and can't be done in time for 9.5. Let's start by figuring out how we'd really like it to work and then, if it's too ambitious, we can scale it back. My main concern about using only one bit is that someone might want to allow a user to bypass RLS on SELECT while still enforcing it for data-modifying operations. That seems like a plausible use case to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 17, 2014 at 9:45 PM, Brightwell, Adam wrote: > I absolutely appreciate all of the feedback that has been provided. It has > been educational. To your point above, I started putting together a wiki > page, as Stephen has spoken to, that is meant to capture these concerns and > considerations as well as to capture ideas around solutions. > > https://wiki.postgresql.org/wiki/Row_Security_Considerations > > This page is obviously not complete, but I think it is a good start. > Hopefully this document will help to continue the conversation and assist in > addressing all the concerns that have been brought to the table. As well, I > hope that this document serves to demonstrate our intent and that we *are* > taking these concerns seriously. I assure you that as one of the > individuals who is working towards the acceptance of this feature/patch, I > am very much concerned about meeting the expected standards of quality and > security. Cool, thanks for weighing in. I think that page is a good start. An item that I think should be added there is the potential overlap between security_barrier views and row-level security. How can we reuse code (and SQL syntax?) for existing features like WITH CHECK OPTION instead of writing new code (and inventing new syntax) for very similar concepts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost wrote: > > I understand that there are performance implications. As mentioned to > > Tom, realistically, there's zero way to optimized at least some of these > > use-cases because they require a completely external module (eg: > > SELlinux) to be involved in the decision about who can view what > > records. If we can optimize that, it'd be by a completely different > > approach whereby we pull up the qual higher because we know the whole > > query only involves leakproof functions or similar, allowing us to only > > apply the filter to the final set of records prior to them being > > returned to the user. The point being that such optimizations would > > happen independently and regardless of the quals or user-defined > > functions involved. At the end of the day, I can't think of a better > > optimization for such a case (where we have to ask an external security > > module if a row is acceptable to return to the user) than that. Is > > there something specific you're thinking about that we'd be missing out > > on? > > Yeah, if we have to ask an external security module a question for > each row, there's little hope of any real optimization. However, I > think there will be a significant number of cases where people will > want filtering clauses that can be realized by doing an index scan > instead of a sequential scan, and if we end up forcing a sequential > scan anyway, the feature will be useless to those people. I agree that we want to support that, if we can do so reasonably. What I was trying to get at is simply this- don't we provide that already with the leakproof attribute and functions? If we don't have enough there to allow index scans then we should be looking to add more, I'm thinking. > > Perhaps it's just my experience, but I've been focused on the main core > > feature for quite some time and it feels like we're really close to > > having it there. I agree that a few additional bits would be nice to > > have but these strike me as relatively straight-forward to implement > > overtop of this general construct. I do see value in documenting these > > concerns and will see about making that happen, along with what the > > general viewpoints and thoughts are about how to address the concern. > > I feel like there's quite a bit of work left to do around these > issues. The technical bits may not be too hard, but deciding what we > want will take some thought and discussion. I agree on this point, but I'm still hopeful that we'll be able to get a good feature into 9.5. There are quite a few resources available for the 'just programming' part, so the long pole in the tent here is absolutely hashing out what we want and how it should function. I'd be happy to host or participate in a conference call or similar if that would be useful to move this along- or we can continue to communicate via email. There's a bit of a lull in conferences to which I'm going to right now, so in person is unlikely, unless folks want to get together somewhere on the east coast (I'd be happy to travel to Philly, Pittsburgh, NYC, etc, if it'd help..). > > That's not actualy true today, is it? Given our leak-proof attribute, > > if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()" > > then we would be able to push down the leak-proof function and not > > necessairly run a straight sequential scan, no? Even so, though, we've > > had users who have tested exactly what this patch implements and they've > > been happy with their real-world use-cases. I'm certainly all for > > optimization and would love to see us make this better for everyone, but > > I don't view that as a reason to delay this particular feature which is > > really just bringing us up to parity with other RDMBS's. > > I'm a bit confused here, because your example seems to be totally > different from my example. In my example, somecomplexfunction() will > get pushed down because it's the security qual; that needs to be > inside the security_barrier view, or a malicious user can subvert the > system by getting some other qual evaluated first. In your example, > you seem to be imagining WHERE somecomplexfunction() AND > leakprooffunctionx() as queries sent by the untrusted user, in which > case, yet, the leak-proof one will get pushed down and the other one > will not. Right- my point there was that the leakproof one might allow an index scan to be run. This is all pretty hand-wavey, I admit, so I'll see if I can get more details about how the currently-proposed patch is performing for the users who are testing it and what kind of plans they're seeing. If that falls through, I'll try and build up my own set of realistic-looking (to myself and the users who are testing) example. > > What solution did you come up with for this case, which performed well > > and was also secure..? > > I put the logic in the client. :-( Wel
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed > wrote: > > Yeah, I was thinking something like this could work, but I would go > > further. Suppose you had separate GRANTable privileges for direct > > access to individual tables, bypassing RLS, e.g. > > > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name > > So, is this one new privilege (DIRECT) or four separate new privileges > that are variants of the existing privileges (DIRECT SELECT, DIRECT > INSERT, DIRECT UPDATE, DIRECT DELETE)? I had taken it to be a single privilege, but you're right, it could be done for each of those.. I really don't think we have the bits for more than one case here though (if that) without a fair bit of additional rework. I'm not against that rework (and called for it wayyy back when I proposed the TRUNCATE privilege, as I recall) but that's a whole different challenge and no small bit of work.. > > Actually, given the fact that the majority of users won't be using > > RLS, I would be tempted to invert the above logic and have the new > > privilege be for LIMITED access (via RLS quals). So a user granted the > > normal SELECT privilege would be able to bypass RLS, but a user only > > granted LIMITED SELECT wouldn't. > > Well, for the people who are not using RLS, there's no difference > anyway. I think it matters more what users of RLS will expect from a > command like GRANT SELECT ... and I'm guessing they'll prefer that RLS > always apply unless they very specifically grant the right for RLS to > not apply. I might be wrong, though. The preference from the folks using RLS that I've talked to is absolutely that it be applied by default for all 'normal' (eg: non-pg_dump) sessions. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost wrote: > > This approach was suggested by an existing user testing out this RLS > > approach, to be fair, but it looks pretty sane to me as a way to address > > some of these concerns. Certainly open to other ideas and thoughts though. > > In general, I agree that this is a good approach. I think it will be > difficult to have a GUC with three values, one of which is > superuser-only and the other two of which are not. I don't think > there's any precedent for something like that in the existing > framework, and I think it's likely we'll run into unpleasant corner > cases if we try to graft it in. Also, I think we need to separate > things: whether the system is willing to allow the user to access the > table without RLS, and whether the user is willing to accept RLS if > the system deems it necessary. Good point- I agree that it's best to avoid having to support individual superuser-only only options on a GUC. Also, addressing the issues independently also makes sense to me. > For the first one, two solutions have been proposed. The initial > proposal was to insist on RLS except for the superuser (and maybe the > table owner?). Having a separate grantable privilege, as Dean > suggests, may be better. I'll reply separately to that email also, as > I have a question about what he's proposing. I like the idea of a grantable privilege as it allows the granularity that some users may require (or be frustrated that we don't have it). > For the second one, I think the two most useful behaviors are "normal > mode" - i.e. allow access to the table, applying RLS predicates if > required and not applying them if I am exempt - and "error-instead" > mode - i.e. if my access to this table would be mediated by an RLS > predicate, then throw an error instead. Right, makes sense. > There's a third mode which > might be useful as well, which is "even though I have the *right* to > bypass the RLS predicate on this table, please apply the predicate > anyway". This could be used by the table owner in testing, for > example. Agreed, this sounds very useful too. > Here again, the level of granularity we want to provide is > an interesting question. Having a GUC (e.g. enable_row_level_security > = on, off, force) would be adequate for pg_dump, but allowing the > table name to be qualified in the query, as proposed downthread, would > be more granular, admittedly at some parser cost. I'm personally of > the view that we *at least* need the GUC, because that seems like the > best way to secure pg_dump, and perhaps other applications. We can > and should give pg_dump an--allow-row-level-security flag, I think, > but pg_dump's default behavior should be to configure the system in > such a way that the dump will fail rather than complete with a subset > of the data. This sounds good to me. > I'm less sure whether we should have something that can > be used to qualify table names in particular queries. I think it > would be really useful, but I'm concerned that it will require > creating additional fully-reserved keywords, which are somewhat > painful for users. I've been trying to think of the use-case for this. It certainly *sounds* nice, but on reflection, the use-case for this seems to me to be that you're trying to develop some application which will be constrained by RLS totally and therefore want to flip back-and-forth between "RLS on" and "RLS off" (for the tables involved). When would you really need, in the same query, to have RLS enabled for table X but disabled for table Y? I do like the idea of an *independent* option to (just) COPY which says "give me all the data or error, independent of the GUC for the same purpose". Would be curious to hear what others think of that proposal. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert, However, I believe that > there has been a lack of focus in the development of the patch thus > far in a couple of key areas - first in terms of articulating how it > is different from and better than a writeable security barrier view, > and second on how to manage the security and operational aspects of > having a feature like this. I think that the discussion subsequent to > my June 10th email has let to some good discussion on both points, > which was my intent, but I still think much more time and thought > needs to be spent on those issues if we are to have a feature which is > up to our usual standards. I do apologize to anyone who interpreted > that initial as a pure rant, because it really wasn't intended that > way. Contrariwise, I hope that the people defending this patch will > admit that the issues I am raising are real and focus on whether and > how those concerns can be addressed. I absolutely appreciate all of the feedback that has been provided. It has been educational. To your point above, I started putting together a wiki page, as Stephen has spoken to, that is meant to capture these concerns and considerations as well as to capture ideas around solutions. https://wiki.postgresql.org/wiki/Row_Security_Considerations This page is obviously not complete, but I think it is a good start. Hopefully this document will help to continue the conversation and assist in addressing all the concerns that have been brought to the table. As well, I hope that this document serves to demonstrate our intent and that we *are* taking these concerns seriously. I assure you that as one of the individuals who is working towards the acceptance of this feature/patch, I am very much concerned about meeting the expected standards of quality and security. Thanks, Adam
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost wrote: >> I'm not referring to the proposed implementation particularly; or at >> least not that aspect of it. I don't think trying to run the view >> quals as the defining user is likely to be very appealing, because I >> think it's going to hurt performance, for example by preventing >> function inlining and requiring lots of user-ID switches. > > I understand that there are performance implications. As mentioned to > Tom, realistically, there's zero way to optimized at least some of these > use-cases because they require a completely external module (eg: > SELlinux) to be involved in the decision about who can view what > records. If we can optimize that, it'd be by a completely different > approach whereby we pull up the qual higher because we know the whole > query only involves leakproof functions or similar, allowing us to only > apply the filter to the final set of records prior to them being > returned to the user. The point being that such optimizations would > happen independently and regardless of the quals or user-defined > functions involved. At the end of the day, I can't think of a better > optimization for such a case (where we have to ask an external security > module if a row is acceptable to return to the user) than that. Is > there something specific you're thinking about that we'd be missing out > on? Yeah, if we have to ask an external security module a question for each row, there's little hope of any real optimization. However, I think there will be a significant number of cases where people will want filtering clauses that can be realized by doing an index scan instead of a sequential scan, and if we end up forcing a sequential scan anyway, the feature will be useless to those people. >> But I'm not >> gonna complain if someone wants to mull it over and make a proposal >> for how to make it work. Rather, my concern is that all we've got is >> what might be called the core of the feature; the actual guts of it. >> There are a lot of ancillary details that seem to me to be not worked >> out at all yet, or only half-baked. > > Perhaps it's just my experience, but I've been focused on the main core > feature for quite some time and it feels like we're really close to > having it there. I agree that a few additional bits would be nice to > have but these strike me as relatively straight-forward to implement > overtop of this general construct. I do see value in documenting these > concerns and will see about making that happen, along with what the > general viewpoints and thoughts are about how to address the concern. I feel like there's quite a bit of work left to do around these issues. The technical bits may not be too hard, but deciding what we want will take some thought and discussion. >> > The current approach allows a nearly unlimited level of flexibility, >> > should the user wish it, by being able to run user-defined code. >> > Perhaps that would be considered 'one policy', but it could certainly >> > take under consideration the calling user, the object being queried >> > (if a function is defined per table, or if we provide a way to get >> > that information in the function), etc. >> >> In theory, that's true. But in practice, performance will suck unless >> the security qual is easily optimizable. If your security qual is >> WHERE somecomplexfunction() you're going to have to implement that by >> sequential-scanning the table and evaluating the function for each >> row. > > That's not actualy true today, is it? Given our leak-proof attribute, > if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()" > then we would be able to push down the leak-proof function and not > necessairly run a straight sequential scan, no? Even so, though, we've > had users who have tested exactly what this patch implements and they've > been happy with their real-world use-cases. I'm certainly all for > optimization and would love to see us make this better for everyone, but > I don't view that as a reason to delay this particular feature which is > really just bringing us up to parity with other RDMBS's. I'm a bit confused here, because your example seems to be totally different from my example. In my example, somecomplexfunction() will get pushed down because it's the security qual; that needs to be inside the security_barrier view, or a malicious user can subvert the system by getting some other qual evaluated first. In your example, you seem to be imagining WHERE somecomplexfunction() AND leakprooffunctionx() as queries sent by the untrusted user, in which case, yet, the leak-proof one will get pushed down and the other one will not. >> But that's probably not going to perform very well, because to match >> an index on sales_rep_id, or an index on partner_id, that's going to >> have to get simplified a whole lot, and that's probably not going to >> happen. If we've only got one branch of the OR, I think we
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert, On Tuesday, June 17, 2014, Robert Haas wrote: > > After sending that one (1) email, I was promptly told that "I'm very > disappointed to hear that the mechanical pieces around making RLS easy > for users to use ... is receiving such push-back." The push-back, at > that point in time, consisted of one (1) email. Several more emails > have been sent that time, including the above-quoted text, seeming to > me to imply that the people who are concerned about this feature are > being unreasonable. I don't believe I am the only such person, > although I may be the main one right at the moment, and you may not be > entirely surprised to hear that I don't think I'm being unreasonable. I'm on my phone at the moment but that looks like a quote from me. My email and concern there was regarding the specific suggestion that we could check off the "RLS" capability which users have been asking us to provide nearly since I started with PG by saying that they could use Updatable SB views. I did not intend it as a comment regarding the specific technical concerns raised and have been responding to and trying to address those independently and openly. I've expressed elsewhere on this thread my gratitude that the technical concerns are being brought up now, near the beginning of the cycle, so we can address them. I've been working with others who are interested in RLS on a wiki page to outline and understand the options and identify dependencies and priorities. Hopefully the link will be posted shortly (again, not at a computer right now) and we can get comments back. There are some very specific questions which really need to be addressed and which I've mentioned before (in particular the question of what user the functions in a view definition should run as, both for "normal" views, for SB views, and for when an RLS qual is included and run through that framework, and if doing so would address some of the concerns which have been raised regarding selects running code). Thanks, Stephen
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost wrote: >> I'm in full agreement we should clearly communicate the issues around >> pg_dump in particular, because they can't necessarily be eliminated >> altogether without some major work that's going to take a while to finish. >> And if the work-around is some sort of GUC for killing RLS altogether, >> that's ugly but not unacceptable to me as a short-term fix. > > A GUC which is enable / disable / error-instead may work quiet well, with > error-instead for pg_dump default if people really want it (there would have > to be a way to disable that though, imv). > > Note that enable is default in general, disable would be for superuser only > (or on start-up) to disable everything, and error-instead anyone could use > but it would error instead of implementing RLS when querying an RLS-enabled > table. > > This approach was suggested by an existing user testing out this RLS > approach, to be fair, but it looks pretty sane to me as a way to address > some of these concerns. Certainly open to other ideas and thoughts though. In general, I agree that this is a good approach. I think it will be difficult to have a GUC with three values, one of which is superuser-only and the other two of which are not. I don't think there's any precedent for something like that in the existing framework, and I think it's likely we'll run into unpleasant corner cases if we try to graft it in. Also, I think we need to separate things: whether the system is willing to allow the user to access the table without RLS, and whether the user is willing to accept RLS if the system deems it necessary. For the first one, two solutions have been proposed. The initial proposal was to insist on RLS except for the superuser (and maybe the table owner?). Having a separate grantable privilege, as Dean suggests, may be better. I'll reply separately to that email also, as I have a question about what he's proposing. For the second one, I think the two most useful behaviors are "normal mode" - i.e. allow access to the table, applying RLS predicates if required and not applying them if I am exempt - and "error-instead" mode - i.e. if my access to this table would be mediated by an RLS predicate, then throw an error instead. There's a third mode which might be useful as well, which is "even though I have the *right* to bypass the RLS predicate on this table, please apply the predicate anyway". This could be used by the table owner in testing, for example. Here again, the level of granularity we want to provide is an interesting question. Having a GUC (e.g. enable_row_level_security = on, off, force) would be adequate for pg_dump, but allowing the table name to be qualified in the query, as proposed downthread, would be more granular, admittedly at some parser cost. I'm personally of the view that we *at least* need the GUC, because that seems like the best way to secure pg_dump, and perhaps other applications. We can and should give pg_dump an--allow-row-level-security flag, I think, but pg_dump's default behavior should be to configure the system in such a way that the dump will fail rather than complete with a subset of the data. I'm less sure whether we should have something that can be used to qualify table names in particular queries. I think it would be really useful, but I'm concerned that it will require creating additional fully-reserved keywords, which are somewhat painful for users. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed wrote: > Yeah, I was thinking something like this could work, but I would go > further. Suppose you had separate GRANTable privileges for direct > access to individual tables, bypassing RLS, e.g. > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name So, is this one new privilege (DIRECT) or four separate new privileges that are variants of the existing privileges (DIRECT SELECT, DIRECT INSERT, DIRECT UPDATE, DIRECT DELETE)? > Actually, given the fact that the majority of users won't be using > RLS, I would be tempted to invert the above logic and have the new > privilege be for LIMITED access (via RLS quals). So a user granted the > normal SELECT privilege would be able to bypass RLS, but a user only > granted LIMITED SELECT wouldn't. Well, for the people who are not using RLS, there's no difference anyway. I think it matters more what users of RLS will expect from a command like GRANT SELECT ... and I'm guessing they'll prefer that RLS always apply unless they very specifically grant the right for RLS to not apply. I might be wrong, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Thu, Jun 12, 2014 at 6:33 PM, Gregory Smith wrote: > I'm kind of surprised to see this turn into a hot button all of the sudden > though, because my thought on all that so far has been a giant so what? > This is what PostgreSQL does. [...] > But let's not act like RLS is a scary bogeyman because it introduces a new > way to hack the server or get surprising side-effects. That's expected and > possibly unavoidable behavior in a feature like this, and there are much > worse instances of arbitrary function risk throughout the core code already. I have some technical comments on later emails in this thread, but first let me address this point. In the past, people have sometimes complained that reviewers waited until very late in the cycle to complain about issues which they found problematic. By the time the issues were pointed out, insufficient time remained before feature freeze to get those issues addressed, causing the patch to slip out of the release and provoking developer frustration. It has therefore been requested numerous times by numerous people that potential issues be raised as early as possible. The concerns that I have raised in this thread are not new; I have raised them before. However, we are now at the beginning of a new development cycle, and it seems fair to assume that the people who are working on this patch are hoping very much that something will get committed to 9.5. Since it seems to me that there are unaddressed issues with the design of this patch, I felt that it was a good idea to make sure that those concerns were on the table right from the beginning of the process, rather than waiting until the patch was on the verge of commit or, indeed, already committed. That is why, when this thread was revived on June 10th, I decide that it was a good time to again comment on the design points about which I was concerned. After sending that one (1) email, I was promptly told that "I'm very disappointed to hear that the mechanical pieces around making RLS easy for users to use ... is receiving such push-back." The push-back, at that point in time, consisted of one (1) email. Several more emails have been sent that time, including the above-quoted text, seeming to me to imply that the people who are concerned about this feature are being unreasonable. I don't believe I am the only such person, although I may be the main one right at the moment, and you may not be entirely surprised to hear that I don't think I'm being unreasonable. I will admit that my initial email may have contained just a touch of hyperbole. But I won't admit to more than a touch, and frankly, I think it was warranted. I perfectly well understand that people really, really, really want this feature, and if I hadn't understood that before, I certainly understand it now. However, I believe that there has been a lack of focus in the development of the patch thus far in a couple of key areas - first in terms of articulating how it is different from and better than a writeable security barrier view, and second on how to manage the security and operational aspects of having a feature like this. I think that the discussion subsequent to my June 10th email has let to some good discussion on both points, which was my intent, but I still think much more time and thought needs to be spent on those issues if we are to have a feature which is up to our usual standards. I do apologize to anyone who interpreted that initial as a pure rant, because it really wasn't intended that way. Contrariwise, I hope that the people defending this patch will admit that the issues I am raising are real and focus on whether and how those concerns can be addressed. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Kevin Grittner (kgri...@ymail.com) wrote: > Stephen Frost wrote: > > Any dump not run by a superuser is already in doubt, imv. That > > is a problem we already have which really needs to be addressed, > > but I view that as an independent issue. > > I'm not seeing that. If the user can't dump, you get an error and > pg_dump returns something other than SUCCESS. We've outlined an approach with RLS which would do the same. I'm still of the opinion that, today, we have a problem that only a superuser-run dump has any chance of success (and even if you get it working today it'll probably break tomorrow, and you had better be paying attention). I'd like to fix that situation, but it's an independent effort from this. We've had issues in the past with pg_dump creating things that can't be restored and they're certainly bugs but trying to make that work with a regular user as a whole system backup strategy, today, is just asking for trouble. > > I agree with avoiding adding another superuser-only capability; > > see the other sub-thread about making this a per-user capability. > > It should be possible to design something which does not have this > risk. The risk that pg_dump might create a dump which can't be restored? Agreed, and I'd love to hear your thoughts on the proposal. > What I was saying was that what was being described at that > point wasn't it, and IMV was not acceptable. I think that there > should never by any doubt that a pg_dump run which completes > without error copied all requested tables in their entirety, not a > subset of the rows in the tables. pg_dump needs to be able to have an option to go either way on this case, as I can see value in running pg_dump in "RLS-enforcing" mode, but it could default to "error-if-RLS". > A GUC which only caused an error on the attempt to actually read > specific rows which the user does not have permission to see would > leak too much information. A GUC which caused a SELECT or COPY > from a table to throw an error if the user was not entitled to see > all rows in the table could work. Right- this would be the 'DIRECT SELECT' which would allow bypassing all RLS and therefore mean that the user is allowed to see ALL rows of a table. That's one of the reasons why I agree with Dean's approach, because we really need to know at the outset if the calling user is allowed to extract all rows from a table or not- we can't go looking through the entire table testing each row before we start running the query. > Another thing which could work, > if it can be coded, would be a GUC which would throw an error if > the there were not quals on the query to prohibit seeing rows which > the security conditions would prohibit, whether or not any matching > rows actually existed. If I'm following you correctly, this would be an optimization that allows avoiding RLS in the case where some information about the user causes the overall qual to always return 'true', correct? I'd certainly like to see what happens in that case today and agree that it'd be great to optimize for and perhaps even allow a user for which that is true to not need the 'DIRECT SELECT' privilege, but in practice, I don't think it'll be possible in most cases (certainly not in the case where an external security module is deciding the access) and the optimization may not be worth it. > The latter would match the behavior of > column level security -- you get an error when trying to select a > prohibited column even if there are no rows in the table. Agreed, but that would be a relaxation of the proposed approach and therefore something which could be added later, if it's deemed worthwhile. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Stephen Frost wrote: > Kevin Grittner (kgri...@ymail.com) wrote: >> The proposed approach would leave the validity of any dump which >> was not run as a superuser in doubt. The last thing we need, in >> terms of improving security, is another thing you can't do >> without connecting as a superuser. > > Any dump not run by a superuser is already in doubt, imv. That > is a problem we already have which really needs to be addressed, > but I view that as an independent issue. I'm not seeing that. If the user can't dump, you get an error and pg_dump returns something other than SUCCESS. test=# create user bob; CREATE ROLE test=# create user tom; CREATE ROLE test=# set role bob; SET test=> create table person(person_id int primary key, name text not null, ssn text); CREATE TABLE test=> insert into person values (1, 'Stephen Frost', '123-45-6789'); INSERT 0 1 test=> insert into person values (2, 'Kevin Grittner'); INSERT 0 1 test=> grant select (person_id, name) on person to tom; GRANT test=> \q kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U bob test >bob-backup.sql kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U tom test >tom-backup.sql pg_dump: [archiver (db)] query failed: ERROR: permission denied for relation person pg_dump: [archiver (db)] query was: LOCK TABLE public.person IN ACCESS SHARE MODE kgrittn@Kevin-Desktop:~/pg/master$ echo $? 1 > I agree with avoiding adding another superuser-only capability; > see the other sub-thread about making this a per-user capability. It should be possible to design something which does not have this risk. What I was saying was that what was being described at that point wasn't it, and IMV was not acceptable. I think that there should never by any doubt that a pg_dump run which completes without error copied all requested tables in their entirety, not a subset of the rows in the tables. A GUC which only caused an error on the attempt to actually read specific rows which the user does not have permission to see would leak too much information. A GUC which caused a SELECT or COPY from a table to throw an error if the user was not entitled to see all rows in the table could work. Another thing which could work, if it can be coded, would be a GUC which would throw an error if the there were not quals on the query to prohibit seeing rows which the security conditions would prohibit, whether or not any matching rows actually existed. The latter would match the behavior of column level security -- you get an error when trying to select a prohibited column even if there are no rows in the table. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: > On 13 June 2014 01:13, Stephen Frost wrote: > > This approach was suggested by an existing user testing out this RLS > > approach, to be fair, but it looks pretty sane to me as a way to address > > some of these concerns. Certainly open to other ideas and thoughts though. > > Yeah, I was thinking something like this could work, but I would go > further. Suppose you had separate GRANTable privileges for direct > access to individual tables, bypassing RLS, e.g. > > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name This is certainly an interesting idea and I'm glad we're getting this level of discussion early on in the 9.5 cycle as I'd really like to see a good solution implemented for 9.5. I've been going back-and-forth about this and what's really swaying me right now is that it'd be nearly impossible to determine if a given RLS qual actually allows full access to a table for a given user without going through the entire table and testing the qual against each row. With this GRANT ability, we'd be able to completely avoid calling the RLS quals when the user is granted this right. Not sure offhand how many bits we've got left at the per-table level though; we added TRUNCATE rights not that long ago and this looks like another good right to add, but there are only so many bits available.. At the same time, I do think this is something we could also add later, perhaps after figuring out a good way to extend the set of bits available for privileges on tables. > Combined with the GUC (direct_table_access, say) to request direct > access to all tables. Then with direct_table_access = true/required, a > SELECT from a table would error if the user hadn't been granted the > DIRECT SELECT privilege on all the tables referenced in the query. I can see this working. One thing I'm curious about is if we would want to support this inside of the SELECT statement (or perhaps COPY?) directly, rather than making a user have to flip a GUC back and forth while they're doing something. I can imagine, during testing, a session looking like this: select * from table; @#@!$! set direct_table_access = true; select * from table; select * from table where blah = x; alter table set row level security blah = x; select * from table; select * from table; select * from table; @!#$!@#! set direct_table_access = false; select * from table; ... Would 'select direct' or 'select * from DIRECT table' (or maybe 'ONLY'?) be workable? There's certainly SQL standard concerns to be thought of here which might precldue anything we do with SELECT, but we could support something with COPY. > Tools like pg_dump would require direct_table_access, but there might > be other levels of access that didn't error out. pg_dump would need an option to set direct_table_access or not. Having it ask by default is acceptable to me, but I do think we need to be able to tell it to *not* set that. > I think if I were using RLS, I would definitely want/expect this level > of fine-grained control over permissions on a per-table basis, rather > than the superuser/non-superuser level of control, or having > RLS-exempt users. I agree that it'd be great to have- and we need to make sure we don't paint ourselves into a corner with the initial versions. What I'm worried about is that we're going to end up feature-creeping this to death and ending up with nothing in 9.5. I'll try to get a wiki page going to discuss these items (as mentioned up-thread) and we can look at prioritizing them and looking at what dependencies exist on other parts of the system and seeing what's required for the initial version. > Actually, given the fact that the majority of users won't be using > RLS, I would be tempted to invert the above logic and have the new > privilege be for LIMITED access (via RLS quals). So a user granted the > normal SELECT privilege would be able to bypass RLS, but a user only > granted LIMITED SELECT wouldn't. This I don't agree with- it goes against what is done on existing systems afaik and part of the idea is that you can minimize changes to the applications or users but still be able to curtail what they can see. Making regular SELECTs start erroring if they haven't set some GUC because RLS has been implemented on a given table would be quite annoying, imv. Now, that said, wouldn't the end user be able to control this for their particular environment by setting the GUC accordingly in postgresql.conf? I'd still argue that it should be defaulted to what I view as the 'normal' case, where RLS is applied unless you asked for your queries to error instead, but if a user wants to have it flipped around the other way, they could update their postgresql.conf to make it so. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Kevin, * Kevin Grittner (kgri...@ymail.com) wrote: > Robert Haas wrote: > > Even aside from security exposures, how > > does a non-superuser who runs pg_dump know whether they've got a > > complete backup or a filtered dump that's missing some rows? > > This seems to me to be a killer objection to the feature as > proposed, and points out a huge difference between column level > security and the proposed implementation of row level security. I really hate this notion of "killer objection". It's been discussed (perhaps not seen by all) at least one suggestion for how to address this specific issue and there are other ways in which to address it (having COPY have the same behavior as the GUC being discussed, instead of having a GUC, though I feel like the GUC is a better approach..). > (In fact it is a difference between just about any GRANTed > permission and row level security.) If you try to SELECT * FROM > sometable and you don't have rights to all the columns, you get an > error. A dump would always either work as expected or generate an > error. Provided you know all of the tables and other objects which need to be included in such a partial dump (as a full dump, today, must be run by a superuser to be sure you're actually getting everything anyway...). > The proposed approach would leave the validity of any dump which > was not run as a superuser in doubt. The last thing we need, in > terms of improving security, is another thing you can't do without > connecting as a superuser. Any dump not run by a superuser is already in doubt, imv. That is a problem we already have which really needs to be addressed, but I view that as an independent issue. I agree with avoiding adding another superuser-only capability; see the other sub-thread about making this a per-user capability. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost wrote: > > In this case the user-defined code needs to return a boolean. We don't > > currently do anything to prevent it from having side-effects, no, but > > the same is true with views which incorporate functions. I agree that > > it makes a difference when compared to column-level privileges, but my > > point was that we have provided easier ways to do things which were > > possible using more complicated methods before. Perhaps the risk with > > RLS is higher but these issues look managable to me and the level of > > doubt about our ability to provide this feature in a reasonable and > > principled way that our users will understand surprises me. > > I'm glad the issues look manageable to you, but you haven't really > explained how to manage them. There's been a number of suggestions made and it'd be great to get more feedback on them- running the quals as the table owner, having a GUC which can be set to either run 'as normal' or either ignore RLS (if the user has that right) or error out if RLS would happen, and undoubtably there are other ideas along those same lines to address the pg_dump and other concerns. > For my part, I'm mildly surprised that anyone thinks it's a good idea > to have SELECT * FROM tab to mean different things depending on who is > typing it. Realistically, in the RDBMS realm in which we're in and that we're working to break into- this is absolutely a given and expected. It's new to PostgreSQL, certainly, but it's not uncommon or surprising at all in our industry. > To me, that seems very confusing; how does an unprivileged > user with no ability to assume some other role validate that the row > security policy they've configured works at all and exposes precisely > the intended set of rows? While I see what you're getting at, I'm not convinced it's really all that different from being set up without access to some schema or table which the administrator setting up accounts didn't include for you. Sure, in the case of a schema or table, you can get an error back instead of just not seeing the data, but if you're looking for specific data, chances are pretty good you'll realize the lack of data quickly and ask the same question regarding access. To wit, I've certainly had users ask exactly that question of- "do I have access to all the data in this table?" even when using PG where it's a bit tricky to limit such access. Clearly, the same risk applies when using views and so the question is understandable. Perhaps these were users with more experience in other RDBMS's where it's more common to have RLS, but there are at least a couple cases which I can think of where that wouldn't apply. > Even aside from security exposures, how > does a non-superuser who runs pg_dump know whether they've got a > complete backup or a filtered dump that's missing some rows? This would be addressed with the GUC that's been proposed. As would the previous paragraph, though I wanted to apply to that independently. > I'm not referring to the proposed implementation particularly; or at > least not that aspect of it. I don't think trying to run the view > quals as the defining user is likely to be very appealing, because I > think it's going to hurt performance, for example by preventing > function inlining and requiring lots of user-ID switches. I understand that there are performance implications. As mentioned to Tom, realistically, there's zero way to optimized at least some of these use-cases because they require a completely external module (eg: SELlinux) to be involved in the decision about who can view what records. If we can optimize that, it'd be by a completely different approach whereby we pull up the qual higher because we know the whole query only involves leakproof functions or similar, allowing us to only apply the filter to the final set of records prior to them being returned to the user. The point being that such optimizations would happen independently and regardless of the quals or user-defined functions involved. At the end of the day, I can't think of a better optimization for such a case (where we have to ask an external security module if a row is acceptable to return to the user) than that. Is there something specific you're thinking about that we'd be missing out on? > But I'm not > gonna complain if someone wants to mull it over and make a proposal > for how to make it work. Rather, my concern is that all we've got is > what might be called the core of the feature; the actual guts of it. > There are a lot of ancillary details that seem to me to be not worked > out at all yet, or only half-baked. Perhaps it's just my experience, but I've been focused on the main core feature for quite some time and it feels like we're really close to having it there. I agree that a few additional bits would be nice to have but these strike me as relative
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Craig Ringer writes: > > I agree, and now that the urgency of trying to deliver this for 9.4 is > > over it's worth seeing if we can just run as table owner. > > > Failing that, we could take the approach a certain other RDBMS does and > > make the ability to define row security quals a GRANTable right > > initially held only by the superuser. > > Hmm ... that might be a workable compromise. I think the main issue here > is whether we expect that RLS quals will be something that the planner > could optimize to any meaningful extent. If they're always (in effect) > wrapped in SECURITY DEFINER functions, I think that largely blocks any > optimizations; but maybe that wouldn't matter in practice. From what I've heard from actual users with other RDBMS's who are coming to PostgreSQL- the reality is that they're going to be using a security module (eg: SELinux) whose responsibility it is to manage this whole question of "can this user see this row", meaning there's zero chance of optimization. I'd certainly like to see the ability to optimize remain in cases where the qual itself gives us a way to filter (eg: a table partitioned based on some security level, where another table maps users to levels), but that is, from a practical standpoint, not an immediate concern from real users and I don't believe our approach paints us into a corner which would prevent that. What that would require is better support for true partitioning rather than constraint exclusions. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Adam Brightwell writes: > > Through this effort, we have concluded that for RLS the case of > > invalidating a plan is only necessary when switching between a superuser > > and a non-superuser. Obviously, re-planning on every role change would be > > too costly, but this approach should help minimize that cost. As well, > > there were not any cases outside of this one that were immediately apparent > > with respect to RLS that would require re-planning on a per userid basis. > > Hm ... I'm not following why we'd need a special case for superusers and > not anyone else? Seems like any useful RLS scheme is going to require > more privilege levels than just superuser and not-superuser. Just to clarify this- the proposal allows RLS to be implemented essentially by any user-defined qual, where that qual can include the current user, the IP the user is connecting from, or more-or-less anything else, possibly even via a user-defined function or security module. It is not superuser-or-not. This discussion is about how to support users for which RLS should not be applied. I can see that being useful at a more granular level than superuser-or-not, but even at that level, RLS is still extremely useful. > Could we put the "if superuser then ok" test into the RLS condition test > and thereby not need more than one plan at all? As discussed, that unfortunately doesn't quite work. This discussion, in general, has been quite useful and I'll work on adding documentation to the wiki pages which discusses the consideration and suggestions for a GUC to disable-or-error when RLS is encountered, along with a per-role capability to bypass RLS; that is in line with the goal of avoiding adding superuser-specific capabilities. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 13 June 2014 01:13, Stephen Frost wrote: > Greg, all, > > I will reply to the emails in detail when I get a chance but am out of town > at a funeral, so it'll likely be delayed. I did want to echo my agreement > for the most part with Greg and in particular... > > On Thursday, June 12, 2014, Gregory Smith wrote: >> >> On 6/11/14, 10:26 AM, Robert Haas wrote: >>> >>> Now, as soon as we introduce the concept that selecting from a table >>> might not really mean "read from the table" but "read from the table after >>> applying this owner-specified qual", we're opening up a whole new set of >>> attack surfaces. Every pg_dump is an opportunity to hack somebody else's >>> account, or at least audit their activity. >> >> >> I'm in full agreement we should clearly communicate the issues around >> pg_dump in particular, because they can't necessarily be eliminated >> altogether without some major work that's going to take a while to finish. >> And if the work-around is some sort of GUC for killing RLS altogether, >> that's ugly but not unacceptable to me as a short-term fix. > > > A GUC which is enable / disable / error-instead may work quiet well, with > error-instead for pg_dump default if people really want it (there would have > to be a way to disable that though, imv). > > Note that enable is default in general, disable would be for superuser only > (or on start-up) to disable everything, and error-instead anyone could use > but it would error instead of implementing RLS when querying an RLS-enabled > table. > > This approach was suggested by an existing user testing out this RLS > approach, to be fair, but it looks pretty sane to me as a way to address > some of these concerns. Certainly open to other ideas and thoughts though. > Yeah, I was thinking something like this could work, but I would go further. Suppose you had separate GRANTable privileges for direct access to individual tables, bypassing RLS, e.g. GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name Combined with the GUC (direct_table_access, say) to request direct access to all tables. Then with direct_table_access = true/required, a SELECT from a table would error if the user hadn't been granted the DIRECT SELECT privilege on all the tables referenced in the query. Tools like pg_dump would require direct_table_access, but there might be other levels of access that didn't error out. I think if I were using RLS, I would definitely want/expect this level of fine-grained control over permissions on a per-table basis, rather than the superuser/non-superuser level of control, or having RLS-exempt users. Actually, given the fact that the majority of users won't be using RLS, I would be tempted to invert the above logic and have the new privilege be for LIMITED access (via RLS quals). So a user granted the normal SELECT privilege would be able to bypass RLS, but a user only granted LIMITED SELECT wouldn't. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Greg, all, I will reply to the emails in detail when I get a chance but am out of town at a funeral, so it'll likely be delayed. I did want to echo my agreement for the most part with Greg and in particular... On Thursday, June 12, 2014, Gregory Smith wrote: > On 6/11/14, 10:26 AM, Robert Haas wrote: > >> Now, as soon as we introduce the concept that selecting from a table >> might not really mean "read from the table" but "read from the table after >> applying this owner-specified qual", we're opening up a whole new set of >> attack surfaces. Every pg_dump is an opportunity to hack somebody else's >> account, or at least audit their activity. >> > > I'm in full agreement we should clearly communicate the issues around > pg_dump in particular, because they can't necessarily be eliminated > altogether without some major work that's going to take a while to finish. > And if the work-around is some sort of GUC for killing RLS altogether, > that's ugly but not unacceptable to me as a short-term fix. A GUC which is enable / disable / error-instead may work quiet well, with error-instead for pg_dump default if people really want it (there would have to be a way to disable that though, imv). Note that enable is default in general, disable would be for superuser only (or on start-up) to disable everything, and error-instead anyone could use but it would error instead of implementing RLS when querying an RLS-enabled table. This approach was suggested by an existing user testing out this RLS approach, to be fair, but it looks pretty sane to me as a way to address some of these concerns. Certainly open to other ideas and thoughts though. Thanks, Stephen
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 6/11/14, 10:26 AM, Robert Haas wrote: Now, as soon as we introduce the concept that selecting from a table might not really mean "read from the table" but "read from the table after applying this owner-specified qual", we're opening up a whole new set of attack surfaces. Every pg_dump is an opportunity to hack somebody else's account, or at least audit their activity. I'm in full agreement we should clearly communicate the issues around pg_dump in particular, because they can't necessarily be eliminated altogether without some major work that's going to take a while to finish. And if the work-around is some sort of GUC for killing RLS altogether, that's ugly but not unacceptable to me as a short-term fix. One of the difficult design requests in my inbox right now asks how pg_dump might be changed both to reduce its overlap with superuser permissions and to allow auditing of its activity. Those requests aren't going away; their incoming frequency is actually rising quite fast right now. They're both things people expect from serious SQL oriented commercial database products, and I'd like to see PostgreSQL continue to displace those as we reach feature parity in those areas. Any way you implement finer grained user permissions and auditing features will be considered a new attack vector when you use those features. The way the proposed RLS feature inserts an arbitrary function for reads has a similar new attack vector when you use that feature. I'm kind of surprised to see this turn into a hot button all of the sudden though, because my thought on all that so far has been a giant so what? This is what PostgreSQL does. You wanna write your own C code and then link the thing right into the server, so that bugs can expose data and crash the whole server? Not only can you shoot yourself in the foot that way, we supply a sample gun and bullets in contrib. How about writing arbitrary code in any one of a dozen server-side languages of wildly varying quality, then hooking that code so it runs as a trigger function whenever you change a row? PostgreSQL is *on it*; we love letting people write some random thing, and then running that random thing against your data as a side-effect of doing an operation. And if you like that...just wait until you learn about this half-assed rules feature we have too! And when the database breaks because the functions people inserted were garbage, that's their fault, not a cause for a CVE. And when someone blindly installs adminpack because it sounded like a pgAdmin requirement, lets a monitoring system run as root so it can watch pg_stat_activity, and then discovers that pair of reasonable decisions suddenly means any fool with monitoring access can call pg_file_unlink...that's their fault too. These are powerful tools with serious implications, and they're expected to be used by equally serious users. We as a development community do need to put a major amount of work into refactoring all of these security mechanisms. There should be less of these embarrassing incidents where bad software design really forced the insecure thing to happen, which I'd argue is the case for that pg_stat_activity example. And luckily so far development resources are appearing for organizations I know of working in that direction recently, as fast as the requirements are rising. I think there's a good outcome at the end of that road. But let's not act like RLS is a scary bogeyman because it introduces a new way to hack the server or get surprising side-effects. That's expected and possibly unavoidable behavior in a feature like this, and there are much worse instances of arbitrary function risk throughout the core code already. -- Greg Smith greg.sm...@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Robert Haas wrote: > Even aside from security exposures, how > does a non-superuser who runs pg_dump know whether they've got a > complete backup or a filtered dump that's missing some rows? This seems to me to be a killer objection to the feature as proposed, and points out a huge difference between column level security and the proposed implementation of row level security. (In fact it is a difference between just about any GRANTed permission and row level security.) If you try to SELECT * FROM sometable and you don't have rights to all the columns, you get an error. A dump would always either work as expected or generate an error. test=# create user bob; CREATE ROLE test=# create user bill; CREATE ROLE test=# set role bob; SET test=> create table person (person_id int not null primary key, name text not null, ssn text); CREATE TABLE test=> grant select (person_id, name) on table person to bill; GRANT test=> reset role; RESET test=# set role bill; SET test=> select person_id, name from person; person_id | name ---+-- (0 rows) test=> select * from person; ERROR: permission denied for relation person The proposed approach would leave the validity of any dump which was not run as a superuser in doubt. The last thing we need, in terms of improving security, is another thing you can't do without connecting as a superuser. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost wrote: >> Row-level security is not a chance for the system to deny access; it's >> a chance for user-defined code to take control and perform arbitrary >> operations. So the scope of what we're contemplating for row-level >> security is really far, far more invasive than what we did for >> column-level privileges. > > In this case the user-defined code needs to return a boolean. We don't > currently do anything to prevent it from having side-effects, no, but > the same is true with views which incorporate functions. I agree that > it makes a difference when compared to column-level privileges, but my > point was that we have provided easier ways to do things which were > possible using more complicated methods before. Perhaps the risk with > RLS is higher but these issues look managable to me and the level of > doubt about our ability to provide this feature in a reasonable and > principled way that our users will understand surprises me. I'm glad the issues look manageable to you, but you haven't really explained how to manage them. The way to dispel doubt is to come up with specific technical proposals that address the technical issues that have been raised. I accept that you are surprised that someone might not think we are on the right course here, but it's entirely appropriate for me to express my doubts about this or any other patch, much as many people do in regards to many patches that are posted here - generally for good and valid reasons. For my part, I'm mildly surprised that anyone thinks it's a good idea to have SELECT * FROM tab to mean different things depending on who is typing it. To me, that seems very confusing; how does an unprivileged user with no ability to assume some other role validate that the row security policy they've configured works at all and exposes precisely the intended set of rows? Even aside from security exposures, how does a non-superuser who runs pg_dump know whether they've got a complete backup or a filtered dump that's missing some rows? A filtered dump might not even be restorable if foreign keys are involved. I think those are serious issues that deserve serious thought and consideration, not just a vague assurance that the issues are probably manageable. >> Because we don't have a good design. > > We're using a design that's found in multiple other RDBMS's and used > extensively by certain industries which use those RDBMS's today. I'm > certainly open to improving what is found in other systems for PG but I > have a hard time seeing this approach as a bad design. Perhaps you're > referring to our implementation, in which case I might agree and things > like running the quals as the table owner is something which should be > considered (I don't know how the other RDBMS's operate in this regard > offhand- it'd be good to find out). I'm not referring to the proposed implementation particularly; or at least not that aspect of it. I don't think trying to run the view quals as the defining user is likely to be very appealing, because I think it's going to hurt performance, for example by preventing function inlining and requiring lots of user-ID switches. But I'm not gonna complain if someone wants to mull it over and make a proposal for how to make it work. Rather, my concern is that all we've got is what might be called the core of the feature; the actual guts of it. There are a lot of ancillary details that seem to me to be not worked out at all yet, or only half-baked. >> I'm not categorically opposed to adding more RLS features to >> PostgreSQL and never have been; in fact, I was deeply involved in the >> original design of security barrier views and committed the original >> patch to add that functionality to PostgreSQL, without which none of >> what we're talking about here would be possible. But the >> currently-proposed design is very unappealing to me, for the reasons >> that I've explained. The right answer to "this feature doesn't >> provide anything that we don't already have and will introduce major >> new security exposures that haven't been adequate thought" is >> debatable, but "well other people have this so we should too" is >> definitely not it. > > How about "it's in high demand by our user base"? In particular, it's > being asked for by a *highly* technical section of our user base who > uses these capabilities today, with this design, in those other > databases. Sure, that's a valid reason for considering any feature. But it's not an excuse to overlook whatever design problems may exist. >> Are you equally unimpressed with the idea that RLS as proposed can't >> support more than one security policy right now *at all*? Because it >> seems to me that either you think multiple RLS policies on a single >> table is important (in which case the current patch is inadequate) or >> you think it's not important (in which case we need not argue about >> whether doing it with multi
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost wrote: > > This argument could have been made for column-level privileges also, no? > > Not really. First of all, we didn't have security_barrier views at > that time, let alone security barrier views that are auto-updateable. We had security definer functions, even set-returning ones, along with rules and triggers. > That's a really important piece of technology which makes filtering > access via views feasible in ways that really were not feasible in the > past. Secondly, column-level permissions - like every other > currently-existing type of permissions - are declarative. They are an > additional opportunity for the system to say "no" to something it > otherwise would have allowed, but no user-defined code is executed. We could try to avoid calling user-defined code for RLS, but it'd add a whole lot of complexity and as far as I can see and your proposed solution isn't avoiding the user-defined code anyway, so I'm not sure why this solution should be required to meet that. > Row-level security is not a chance for the system to deny access; it's > a chance for user-defined code to take control and perform arbitrary > operations. So the scope of what we're contemplating for row-level > security is really far, far more invasive than what we did for > column-level privileges. In this case the user-defined code needs to return a boolean. We don't currently do anything to prevent it from having side-effects, no, but the same is true with views which incorporate functions. I agree that it makes a difference when compared to column-level privileges, but my point was that we have provided easier ways to do things which were possible using more complicated methods before. Perhaps the risk with RLS is higher but these issues look managable to me and the level of doubt about our ability to provide this feature in a reasonable and principled way that our users will understand surprises me. > > I agree that views, or even security-definer functions, offer a great > > deal of flexibility, and that may be necessary in some use-cases, but I > > fail to see why that means we should avoid providing the mechanics to > > achieve simple and usable RLS akin to what other major RDBMS's have. > > Because we don't have a good design. We're using a design that's found in multiple other RDBMS's and used extensively by certain industries which use those RDBMS's today. I'm certainly open to improving what is found in other systems for PG but I have a hard time seeing this approach as a bad design. Perhaps you're referring to our implementation, in which case I might agree and things like running the quals as the table owner is something which should be considered (I don't know how the other RDBMS's operate in this regard offhand- it'd be good to find out). > I'm not categorically opposed to adding more RLS features to > PostgreSQL and never have been; in fact, I was deeply involved in the > original design of security barrier views and committed the original > patch to add that functionality to PostgreSQL, without which none of > what we're talking about here would be possible. But the > currently-proposed design is very unappealing to me, for the reasons > that I've explained. The right answer to "this feature doesn't > provide anything that we don't already have and will introduce major > new security exposures that haven't been adequate thought" is > debatable, but "well other people have this so we should too" is > definitely not it. How about "it's in high demand by our user base"? In particular, it's being asked for by a *highly* technical section of our user base who uses these capabilities today, with this design, in those other databases. > Craig's patch really hasn't grappled with any of > these thorny definition and security issues; it's just about making > the basic functionality work. That's fine for a POC, but it's not > enough for a feature that the project would be committing to maintain > for the indefinite future. Improving the patch is exactly what I'd like to do, but throwing out the notion that RLS can't be allowed to execute user-defined code is cutting the legs out of the feature completely- particularly with our system where users can create all manner of objects in the system with their own code being run. > >> That's mighty useful for debugging, at least IMHO. And, if you want > >> to have several row-level security policies for different classes of > >> users, just create more than one view and grant different privileges > >> on each. > > > > I'm really not impressed with the idea that RLS should be done with > > multiple different views of the same underlying table. > > Are you equally unimpressed with the idea that RLS as proposed can't > support more than one security policy right now *at all*? Because it > seems to me that either you think multiple RLS policies on a single > ta
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Robert Haas (robertmh...@gmail.com) wrote: > >> I'm really concerned about the security implications of this patch. I > >> think we're setting ourselves up for a whole lot of hurt for somewhat > >> unclear gain. > > > I'm certainly of a different opinion and, for the most part, I feel that > > if there are security concerns then they need to be addressed- and > > better by us than by asking users to use some other mechanism to > > implement RLS. > > TBH, I found Robert's argument pretty persuasive. The idea that > "SELECT * FROM table" might invoke arbitrary processing ought to scare > anyone who's concerned about security, because that's going to completely > break any assumptions about pg_dump being safe for instance, as well as > force top-to-bottom rethinking of many other security assumptions. SELECT triggers for a wide variety of use-cases are pretty commonly asked for here and are something I'd like to see us support also. There are also quite a few ways in which a select can end up executing code. Today it requires more than 'select * from table;', but not very much.. I agree that it'd be good if we had a way to address that but I continue to view that as an independent issue. What I haven't heard any comments on, yet found interesting, was the idea of having the RLS quals run as the owner of the table. Would that address these concerns? I seem to recall wondering why we didn't do that for views in the first place, though I doubt we could change it now even if we wanted to (and I'm guessing the spec has something to say about this, though I haven't gone and looked and don't remember offhand). It's certainly rather curious that functions called under a view are run as the calling user while permissions checks on relations referred to by the view are as the view owner. Hopefully that will make the rest of this discussion less relevant, but I'll respond with my feelings anyway. > > ... That commit was > > the ground-work to allow us to finally get proper RLS and I'm very > > disappointed to hear that the mechanical pieces around making RLS easy > > for users to use (and getting that check-box taken care of in a wide > > variety of fields that we are being exposed to now, see the PGConf.NYC > > keynote speakers...) is receiving such push-back. > > If this is being sold as merely "ease of use", then it is probably going > to get rejected. In order to get some extra ease of use for the minority > of users who need RLS, you are going to significantly complicate the lives > of all Postgres users. That's not a net win in any sane calculation of > ease of use. I don't view this as being at all accurate- how is this complicating the lives of all Postgres users? If they are worried about running user defined code then they *already* have a lot to worry about. While the users of RLS might be less than 50% and therefore the minority, I expect it will have quite a bit of up-take in certain industries and I know that our lack of any RLS is currently preventing use of Postgres in some rather important cases. As for it being ease-of-use, again, there are ways in which column level privileges could have been dealt with using views, rules, security definer functions, etc, but that doesn't mean we don't want that feature. I certainly view RLS (and have for quite some time..) as a much needed capability, even if it can be done today using a bunch of user written code that must be security audited. > Maybe the right thing to think about is how we can make it easier to set > up table + view combinations according to the pattern Robert described. While this sounds interesting, I don't see adding columns or redefining them as being in the perview of RLS. The current approach of allowing a boolean expression to be defined is both extremely flexible while also being simple when the requirement is simple. Having to create, manage, update, etc, an independent object would add unnecessary complexity. Perhaps having it be a boolean expression is too much flexibility but the alternatives that I can think of aren't terribly attractive to me and the boolean expression approach is what folks coming from other RDBMS's will be familiar with and understand how to build their applications around. We may need to provide some additional pieces around this (perhaps a trigger-like function type which also gets information about the object being queried, etc) but the point is to have a straight-forward and simply reasoned about way of limiting what data is returned. > I wouldn't have a problem with some more-or-less-automated support for > doing that. (Consider SERIAL as a possible precedent here: it's basically > a table creation macro.) Perhaps there's a way to make that work, but personally it looks like a whole bunch more work and I don't see the gain. How would adding RLS to an existing table work? It's worse than the SERIAL case
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost wrote: >> In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically >> *is* row-level security: instead of applying a row-level security >> policy to a table, just create a security-barrier view over the table >> and grant access to the view. Forget that the table ever existed. >> Done. > > This argument could have been made for column-level privileges also, no? Not really. First of all, we didn't have security_barrier views at that time, let alone security barrier views that are auto-updateable. That's a really important piece of technology which makes filtering access via views feasible in ways that really were not feasible in the past. Secondly, column-level permissions - like every other currently-existing type of permissions - are declarative. They are an additional opportunity for the system to say "no" to something it otherwise would have allowed, but no user-defined code is executed. Row-level security is not a chance for the system to deny access; it's a chance for user-defined code to take control and perform arbitrary operations. So the scope of what we're contemplating for row-level security is really far, far more invasive than what we did for column-level privileges. > I agree that views, or even security-definer functions, offer a great > deal of flexibility, and that may be necessary in some use-cases, but I > fail to see why that means we should avoid providing the mechanics to > achieve simple and usable RLS akin to what other major RDBMS's have. Because we don't have a good design. I'm not categorically opposed to adding more RLS features to PostgreSQL and never have been; in fact, I was deeply involved in the original design of security barrier views and committed the original patch to add that functionality to PostgreSQL, without which none of what we're talking about here would be possible. But the currently-proposed design is very unappealing to me, for the reasons that I've explained. The right answer to "this feature doesn't provide anything that we don't already have and will introduce major new security exposures that haven't been adequate thought" is debatable, but "well other people have this so we should too" is definitely not it. Craig's patch really hasn't grappled with any of these thorny definition and security issues; it's just about making the basic functionality work. That's fine for a POC, but it's not enough for a feature that the project would be committing to maintain for the indefinite future. >> That's mighty useful for debugging, at least IMHO. And, if you want >> to have several row-level security policies for different classes of >> users, just create more than one view and grant different privileges >> on each. > > I'm really not impressed with the idea that RLS should be done with > multiple different views of the same underlying table. Are you equally unimpressed with the idea that RLS as proposed can't support more than one security policy right now *at all*? Because it seems to me that either you think multiple RLS policies on a single table is important (in which case the current patch is inadequate) or you think it's not important (in which case we need not argue about whether doing it with multiple views over the same underlying table is awkward). >> By contrast, it seems to me that every design so far proposed for >> something that is actually called row-level security - as opposed to >> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* >> row-level security, is extremely limited. Look back at all the things >> listed in the previous paragraph; can you do those things easily with >> the designs that have been proposed? As far as I can see, not really. > > I don't feel that RLS will, or even *should*, have the same level of > flexibility that you can achieve with views and/or security definer > functions. I expect that, over time, we will add more capabilities to > it, but it's never going to be able to redefine the contents of a column > as a view can, nor will it be able to add columns to a table as views > can. I don't see those as reasons against having support for RLS. What this patch is doing is basically allowing a table to really be a view over itself. If we choose to support that, I think it is absolutely inevitable that people are going to want all the same options that they would have if they really made a separate view - separate permissions, WITH CHECK OPTION, all of it. I find the contrary argument - that people will only want X amount and no more - simply not plausible. If it's valuable to have some of those capabilities in an RLS framework, somebody's going to want all of them. There's no bright line to divide the things that are valuable in that context from those that aren't. > I'm glad to hear your thoughts on the level of granularity which might > be nice to have with RLS. What would be great is to spend a bit more > time review
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Stephen Frost writes: > * Robert Haas (robertmh...@gmail.com) wrote: >> I'm really concerned about the security implications of this patch. I >> think we're setting ourselves up for a whole lot of hurt for somewhat >> unclear gain. > I'm certainly of a different opinion and, for the most part, I feel that > if there are security concerns then they need to be addressed- and > better by us than by asking users to use some other mechanism to > implement RLS. TBH, I found Robert's argument pretty persuasive. The idea that "SELECT * FROM table" might invoke arbitrary processing ought to scare anyone who's concerned about security, because that's going to completely break any assumptions about pg_dump being safe for instance, as well as force top-to-bottom rethinking of many other security assumptions. > ... That commit was > the ground-work to allow us to finally get proper RLS and I'm very > disappointed to hear that the mechanical pieces around making RLS easy > for users to use (and getting that check-box taken care of in a wide > variety of fields that we are being exposed to now, see the PGConf.NYC > keynote speakers...) is receiving such push-back. If this is being sold as merely "ease of use", then it is probably going to get rejected. In order to get some extra ease of use for the minority of users who need RLS, you are going to significantly complicate the lives of all Postgres users. That's not a net win in any sane calculation of ease of use. Maybe the right thing to think about is how we can make it easier to set up table + view combinations according to the pattern Robert described. I wouldn't have a problem with some more-or-less-automated support for doing that. (Consider SERIAL as a possible precedent here: it's basically a table creation macro.) > You're suggesting that we use views instead, which clearly could run > someone else's code. Perhaps the user will notice that they're > selecting from a view instead of a table, but I've never seen a security > design around making sure that what is being select'd from is a table > vs. a view. pg_dump is a sufficient counterexample to that statement. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: > I'm really concerned about the security implications of this patch. I > think we're setting ourselves up for a whole lot of hurt for somewhat > unclear gain. I'm certainly of a different opinion and, for the most part, I feel that if there are security concerns then they need to be addressed- and better by us than by asking users to use some other mechanism to implement RLS. > In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically > *is* row-level security: instead of applying a row-level security > policy to a table, just create a security-barrier view over the table > and grant access to the view. Forget that the table ever existed. > Done. This argument could have been made for column-level privileges also, no? Yet I don't hear any calls for that to be ripped out now that you could implement it through updatable security-barrier views. That commit was the ground-work to allow us to finally get proper RLS and I'm very disappointed to hear that the mechanical pieces around making RLS easy for users to use (and getting that check-box taken care of in a wide variety of fields that we are being exposed to now, see the PGConf.NYC keynote speakers...) is receiving such push-back. > With this approach, there's a lot of stuff that we don't have to > reinvent. We've talked a lot about whether row-level security should > only be concerned with the rows it scans, or whether it should also > restrict the new rows that can be created. You can get either > behavior by choosing whether or not to use WITH CHECK OPTION. And > then there's this question of who should be RLS-exempt; that's > basically a question of to whom you grant privileges on the underlying > table. Note that this can be very fine-grained: for example, you can > allow someone to exempt themselves for selects but not for updates by > granting them SELECT privileges but not UPDATE privileges on the > underlying table. And potentially-exempt users can choose whether > they want a particular access to actually be exempt by targeting the > view when they don't want to be exempt and the table when they do. I agree that views, or even security-definer functions, offer a great deal of flexibility, and that may be necessary in some use-cases, but I fail to see why that means we should avoid providing the mechanics to achieve simple and usable RLS akin to what other major RDBMS's have. > That's mighty useful for debugging, at least IMHO. And, if you want > to have several row-level security policies for different classes of > users, just create more than one view and grant different privileges > on each. I'm really not impressed with the idea that RLS should be done with multiple different views of the same underlying table. > By contrast, it seems to me that every design so far proposed for > something that is actually called row-level security - as opposed to > commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* > row-level security, is extremely limited. Look back at all the things > listed in the previous paragraph; can you do those things easily with > the designs that have been proposed? As far as I can see, not really. I don't feel that RLS will, or even *should*, have the same level of flexibility that you can achieve with views and/or security definer functions. I expect that, over time, we will add more capabilities to it, but it's never going to be able to redefine the contents of a column as a view can, nor will it be able to add columns to a table as views can. I don't see those as reasons against having support for RLS. > Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough > equivalent of WITH CHECK OPTION, probably because we've talked a lot > about that specific issue, but it doesn't line up exactly to what WITH > CHECK OPTION actually does. There's no independently-grantable > RLS-exemption privilege - and even when we talk about that, it's > usually some kind of global bit that applies to all tables and all > operations equally - whereas with the above approach it can be > per-table and per-operation and doesn't require superuser intervention > to flip the bit. I'm glad to hear your thoughts on the level of granularity which might be nice to have with RLS. What would be great is to spend a bit more time reviewing what other systems provide in this area and considering what makes sense for us. This will also be a feature and an area which we will be improving for a long time to come, but we do need this capability and we have to start somewhere. > There's no way for users who are RLS exempt to turn > off their exemption for testing purposes, let alone on a per-table > basis. I don't follow this argument entirely- users can't turn off the existing permissions system for testing either, unless an authorized user with the correct permissions makes the change to allow it- or the user bumps themselves up to superuser, or to a role
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Craig Ringer (cr...@2ndquadrant.com) wrote: > On 06/11/2014 07:24 AM, Tom Lane wrote: > > Is the point of that that the table owner might have put trojan-horse > > functions into the RLS qual? If so, why are we only concerned about > > defending the superuser and not other users? Seems like the right fix > > would be to insist that functions in the RLS qual run as the table owner. > > Granted, that might be painful to do. But it still seems like "we only > > need to do this for superusers" is designing with blinkers on. > > I agree, and now that the urgency of trying to deliver this for 9.4 is > over it's worth seeing if we can just run as table owner. We'll need to work out how to ensure that things like current_user() still returns the calling user in that case, otherwise it won't make any sense. In general, I agree that having the RLS quals run as the table owner is a good approach and would love to hear suggestions about how we can make that happen. > Failing that, we could take the approach a certain other RDBMS does and > make the ability to define row security quals a GRANTable right > initially held only by the superuser. I don't particularly like this idea- it's akin, to me anyway, to making the ability to control other permissions on a table (SELECT, INSERT, etc) something which a user would have to be granted- and it doesn't really address the issue. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Craig Ringer (cr...@2ndquadrant.com) wrote: > On 06/11/2014 02:19 AM, Tom Lane wrote: > > Hm ... I'm not following why we'd need a special case for superusers and > > not anyone else? Seems like any useful RLS scheme is going to require > > more privilege levels than just superuser and not-superuser. > > What it really needs is to invalidate plans when switching between > RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS > exempt" right or mode sooner rather than later, so I'm against tying > this explicitly to superuser as such. That certainly sounds reasonable to me, but the point is we're just looking to see if the current role executing the plan should or should not have RLS applied and, if it's changing, we need to re-plan. We don't need to actually track an independent plan for each and every user executing the plan, which means that the plan cache can be largely left alone. > I wouldn't be surprised to see > > SET ROW SECURITY ON|OFF > > down the track, with a right controlling whether you can or not. Or at > least, a right that directly exempts a user from row security. Agreed, but doing a re-planning in that case seems reasonable to me. I find it pretty unlikely that there will be a lot of critical path cases of the same plan flipping back and forth between a role for which RLS is applied and a role where it shouldn't be. > > Could we put the "if superuser then ok" test into the RLS condition test > > and thereby not need more than one plan at all? > > Only if we put it in another level of security barrier subquery, because > otherwise the planner might execute the other quals (including possible > user defined functions) before the superuser test. Which was the whole > reason for the superuser test in the first place. Yeah, I'm not a big fan of this and it certainly seems a simpler approach to just force a re-plan. We're talking about a query which has been prepared and then is being executed by different roles, some of which are RLS enabled and some which are RLS exempt. That just strikes me as pretty unlikely to happen and if it does become an issue, a user could work around it by having two different plans prepared and making sure that they are called from the appropriate roles to avoid the replanning. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Tue, Jun 10, 2014 at 7:18 PM, Craig Ringer wrote: > On 06/11/2014 02:19 AM, Tom Lane wrote: >> Hm ... I'm not following why we'd need a special case for superusers and >> not anyone else? Seems like any useful RLS scheme is going to require >> more privilege levels than just superuser and not-superuser. > > What it really needs is to invalidate plans when switching between > RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS > exempt" right or mode sooner rather than later, so I'm against tying > this explicitly to superuser as such. > > I wouldn't be surprised to see > > SET ROW SECURITY ON|OFF > > down the track, with a right controlling whether you can or not. Or at > least, a right that directly exempts a user from row security. I'm really concerned about the security implications of this patch. I think we're setting ourselves up for a whole lot of hurt for somewhat unclear gain. In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically *is* row-level security: instead of applying a row-level security policy to a table, just create a security-barrier view over the table and grant access to the view. Forget that the table ever existed. Done. With this approach, there's a lot of stuff that we don't have to reinvent. We've talked a lot about whether row-level security should only be concerned with the rows it scans, or whether it should also restrict the new rows that can be created. You can get either behavior by choosing whether or not to use WITH CHECK OPTION. And then there's this question of who should be RLS-exempt; that's basically a question of to whom you grant privileges on the underlying table. Note that this can be very fine-grained: for example, you can allow someone to exempt themselves for selects but not for updates by granting them SELECT privileges but not UPDATE privileges on the underlying table. And potentially-exempt users can choose whether they want a particular access to actually be exempt by targeting the view when they don't want to be exempt and the table when they do. That's mighty useful for debugging, at least IMHO. And, if you want to have several row-level security policies for different classes of users, just create more than one view and grant different privileges on each. By contrast, it seems to me that every design so far proposed for something that is actually called row-level security - as opposed to commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is* row-level security, is extremely limited. Look back at all the things listed in the previous paragraph; can you do those things easily with the designs that have been proposed? As far as I can see, not really. Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough equivalent of WITH CHECK OPTION, probably because we've talked a lot about that specific issue, but it doesn't line up exactly to what WITH CHECK OPTION actually does. There's no independently-grantable RLS-exemption privilege - and even when we talk about that, it's usually some kind of global bit that applies to all tables and all operations equally - whereas with the above approach it can be per-table and per-operation and doesn't require superuser intervention to flip the bit. There's no way for users who are RLS exempt to turn off their exemption for testing purposes, let alone on a per-table basis. There's no way to have multiple RLS policies on a single table. All of those are things that we get "for free" in the view-over-table model, and implementing formal RLS basically requires us to either invent a new RLS-specific way of doing each of those things, or suffer along with a subset of the functionality. Yuck. But what's really awful about this whole design is that it breaks the invariant that reading from a table doesn't run anybody else's code. It's already the case that users need to be awfully careful about modifying tables, because that might fire triggers that do bad things. But at least you can SELECT from a table and it will either work, or it will fail with a permission denied error. What it will not do is unexpectedly run some code that you weren't expecting it to run. You can't be so blithe about selecting from views, but reading a plain table is always OK. Now, as soon as we introduce the concept that selecting from a table might not really mean "read from the table" but "read from the table after applying this owner-specified qual", we're opening up a whole new set of attack surfaces. Every pg_dump is an opportunity to hack somebody else's account, or at least audit their activity. Protecting the superuser against everybody else is nice, but I think it's just as important to protect non-superusers against each other, and I think that's going to be hard -- because in the RLS world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's reading from the table, and maybe it's really clandestinely reading from a view over the table, and the user has no
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig Ringer writes: > On 06/11/2014 07:24 AM, Tom Lane wrote: >> Is the point of that that the table owner might have put trojan-horse >> functions into the RLS qual? If so, why are we only concerned about >> defending the superuser and not other users? Seems like the right fix >> would be to insist that functions in the RLS qual run as the table owner. >> Granted, that might be painful to do. But it still seems like "we only >> need to do this for superusers" is designing with blinkers on. > I agree, and now that the urgency of trying to deliver this for 9.4 is > over it's worth seeing if we can just run as table owner. > Failing that, we could take the approach a certain other RDBMS does and > make the ability to define row security quals a GRANTable right > initially held only by the superuser. Hmm ... that might be a workable compromise. I think the main issue here is whether we expect that RLS quals will be something that the planner could optimize to any meaningful extent. If they're always (in effect) wrapped in SECURITY DEFINER functions, I think that largely blocks any optimizations; but maybe that wouldn't matter in practice. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 06/11/2014 07:24 AM, Tom Lane wrote: > Is the point of that that the table owner might have put trojan-horse > functions into the RLS qual? If so, why are we only concerned about > defending the superuser and not other users? Seems like the right fix > would be to insist that functions in the RLS qual run as the table owner. > Granted, that might be painful to do. But it still seems like "we only > need to do this for superusers" is designing with blinkers on. I agree, and now that the urgency of trying to deliver this for 9.4 is over it's worth seeing if we can just run as table owner. Failing that, we could take the approach a certain other RDBMS does and make the ability to define row security quals a GRANTable right initially held only by the superuser. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig Ringer writes: > On 06/11/2014 02:19 AM, Tom Lane wrote: >> Could we put the "if superuser then ok" test into the RLS condition test >> and thereby not need more than one plan at all? > Only if we put it in another level of security barrier subquery, because > otherwise the planner might execute the other quals (including possible > user defined functions) before the superuser test. Which was the whole > reason for the superuser test in the first place. Is the point of that that the table owner might have put trojan-horse functions into the RLS qual? If so, why are we only concerned about defending the superuser and not other users? Seems like the right fix would be to insist that functions in the RLS qual run as the table owner. Granted, that might be painful to do. But it still seems like "we only need to do this for superusers" is designing with blinkers on. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 06/11/2014 02:19 AM, Tom Lane wrote: > Hm ... I'm not following why we'd need a special case for superusers and > not anyone else? Seems like any useful RLS scheme is going to require > more privilege levels than just superuser and not-superuser. What it really needs is to invalidate plans when switching between RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS exempt" right or mode sooner rather than later, so I'm against tying this explicitly to superuser as such. I wouldn't be surprised to see SET ROW SECURITY ON|OFF down the track, with a right controlling whether you can or not. Or at least, a right that directly exempts a user from row security. > Could we put the "if superuser then ok" test into the RLS condition test > and thereby not need more than one plan at all? Only if we put it in another level of security barrier subquery, because otherwise the planner might execute the other quals (including possible user defined functions) before the superuser test. Which was the whole reason for the superuser test in the first place. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Hey Tom, > Hm ... I'm not following why we'd need a special case for superusers and > not anyone else? Seems like any useful RLS scheme is going to require > more privilege levels than just superuser and not-superuser. > As it stands right now, superuser is the only case where RLS policies should not be applied/completely ignored. I suppose it is possible to create RLS policies that are related to other privilege levels, but those would still need to be applied despite user id, excepting superuser. I'll defer to Stephen or Craig on the usefulness of this scheme. Could we put the "if superuser then ok" test into the RLS condition test > and thereby not need more than one plan at all? > As I understand it, the application of RLS policies occurs in the rewriter. Therefore, when switching back and forth between superuser and not-superuser the query must be rewritten, which would ultimately result in the need for a new plan correct? If that is the case, then I am not sure how one plan is possible. However, again, I'll have to defer to Stephen or Craig on this one. Thanks, Adam
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Adam Brightwell writes: > Through this effort, we have concluded that for RLS the case of > invalidating a plan is only necessary when switching between a superuser > and a non-superuser. Obviously, re-planning on every role change would be > too costly, but this approach should help minimize that cost. As well, > there were not any cases outside of this one that were immediately apparent > with respect to RLS that would require re-planning on a per userid basis. Hm ... I'm not following why we'd need a special case for superusers and not anyone else? Seems like any useful RLS scheme is going to require more privilege levels than just superuser and not-superuser. Could we put the "if superuser then ok" test into the RLS condition test and thereby not need more than one plan at all? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Hi all, This is my first post to the mailing list and I am looking forward to working with everyone in the community. With that said... I'll take a look at changing the cache key to include user ID and > ripping out the plan invalidation logic from the current patch tomorrow > but I seriously doubt I'll be able to get all of that done in the next > day or two. If anyone else is able to help out, it'd certainly be > appreciated; I really think that's the main hurdle to address at this > point with this patch- without the plan invalidation complexity, the > the patch is really just building out the catalog, the SQL-level > statements for managing it, and the bit of code required to add the > conditional to statements involving RLS-enabled tables. > I have been collaborating with Stephen on addressing this particular item with RLS. As a basis, I have been working with Craig's 'rls-9.4-upd-sb-views' branch rebased against master around 9.4beta1. Through this effort, we have concluded that for RLS the case of invalidating a plan is only necessary when switching between a superuser and a non-superuser. Obviously, re-planning on every role change would be too costly, but this approach should help minimize that cost. As well, there were not any cases outside of this one that were immediately apparent with respect to RLS that would require re-planning on a per userid basis. I have tested this approach with the following patch. https://github.com/abrightwell/postgres/commit/4c959e63f7a89b24ebbd46575a31a629d24efa75 Does this sound like a sane approach? Thoughts? Recommendations? Thanks, Adam
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Craig Ringer (cr...@2ndquadrant.com) wrote: > On 04/15/2014 10:06 AM, Stephen Frost wrote: > > I've uploaded the latest patch, rebased against master, with my > > changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz > > as I don't believe it'd clear the mailing list (it's 29k). > > Does this exist in the form of an accessible git branch, too? Eh, no. > I was trying to maintain the patch as a series of distinct changes to > make it easier to see what each part is doing, and it'd be nice to > preserve that if possible. It also makes seeing what's changed a lot > easier. Yeah, I almost just posted a patch against your tree. I'll look at doing that tomorrow. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/15/2014 10:06 AM, Stephen Frost wrote: > I've uploaded the latest patch, rebased against master, with my > changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz > as I don't believe it'd clear the mailing list (it's 29k). Does this exist in the form of an accessible git branch, too? I was trying to maintain the patch as a series of distinct changes to make it easier to see what each part is doing, and it'd be nice to preserve that if possible. It also makes seeing what's changed a lot easier. - -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -BEGIN PGP SIGNATURE- Version: GnuPG v1 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTWbGNAAoJELBXNkqjr+S28W4H/R49CJfz4Y3TMbvwxhrwkjL2 WEv80qY4GDCzG5CGKROn3kT9H5xePvL9eadSjr+CPsilerHrPkHmXnU5w+K2LnKV MCL/A2969b4ng1cUK9eHEFVx9BLLQmiVI6DbJ2OA2oWUs/Y7Zne5h6q0fNnnnTSq XEU6r3tVkUp5ipbhHi+aJ+mfckirdcMR0U5X+2fgGpLZ3D+8j9azvuXvQjSOekVB 3+EVVI0UXhhvw4It4/1CjieHvScdxnsz9bOpKGiEeePUB3CGC0iPtBgIGtE0n2OK cqKryuwZ3++LZih74M8z+Rn6yao5f4ElJrO3gz5q8axKzH/bHkEYElwEUhVfbSE= =AKzL -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > I've uploaded the latest patch, rebased against master, with my changes > > to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't > > believe it'd clear the mailing list (it's 29k). > > Please actually post it, for the archives' sake. 29k is far below the > list limit. (Which I don't know exactly what it is ... but certainly > in the hundreds of KB.) Huh, thought it was more like 25k. Well, here goes then... > > I'll take a look at changing the cache key to include user ID and > > ripping out the plan invalidation logic from the current patch tomorrow > > but I seriously doubt I'll be able to get all of that done in the next > > day or two. > > TBH I think we are up against the deadline. April 15 was the agreed-to > drop dead date for pushing new features into 9.4. Yeah. :/ May be for the best anyway, this should be able to go in early in the 9.5 cycle and get more testing and refinement. Still stinks though as I feel like this patch didn't get the attention it should have due to a simple misunderstanding, but we do need to stop at some point to get a release together. Thanks, Stephen rls_ringerc_sf.patch.gz Description: Binary data signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Stephen Frost writes: > I've uploaded the latest patch, rebased against master, with my changes > to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't > believe it'd clear the mailing list (it's 29k). Please actually post it, for the archives' sake. 29k is far below the list limit. (Which I don't know exactly what it is ... but certainly in the hundreds of KB.) > I'll take a look at changing the cache key to include user ID and > ripping out the plan invalidation logic from the current patch tomorrow > but I seriously doubt I'll be able to get all of that done in the next > day or two. TBH I think we are up against the deadline. April 15 was the agreed-to drop dead date for pushing new features into 9.4. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig, Tom, all, I've been through the RLS code over the past couple of days which I pulled from Craig's repo and have a bunch of minor updates. In general, the patch seems pretty reasonable- except for the issues discussed below. Quite a bit of this patch is tied up in plan invalidation and tracking if the security quals depend on the current user, all of which seems pretty grotty and the wrong way around to me. * Tom Lane (t...@sss.pgh.pa.us) wrote: > Craig Ringer writes: > > It's only that the plan depends on the user ID. There's no point keeping > > the plan around if the user no longer exists. > > [ shrug... ] Leaving such a plan cached would be harmless, though. Agreed. > Furthermore, the user ID we'd be talking about is either the owner > of the current session, or the owner of some view or security-definer > function that the plan is already dependent on, so it's fairly hard > to credit that the plan would survive long enough for the issue to > arise. I don't entirely follow which 'issue' is being referred to here, but we need to consider that 'set role' changes should also cause a new plan. > Even if there is a scenario where invalidating by user ID is actually > useful, I think adding infrastructure to cause invalidation in such a case > is optimizing for the wrong thing. You're adding cycles to every query to > benefit a case that is going to be quite infrequent in practice. Yeah, I have a hard time seeing that there's an issue w/ keeping the cached plans around even if the session never goes back to being under the user ID for which those older plans were built. Also, wouldn't a 'RESET ALL' clear any of them anyway? > > Yes, that would be good, but is IMO more of a separate optimization. I'm > > currently using KaiGai's code to invalidate and re-plan when a user ID > > change is detected. > > I'm unlikely to accept a patch that does that; wouldn't it be catastrophic > for performance in the presence of security-definer functions? You can't > just trash the whole plan cache when a user ID switch occurs. Yeah, this doesn't seem like the right approach. Adding the user ID to the cache key definitely strikes me as the right way to fix this. I've uploaded the latest patch, rebased against master, with my changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't believe it'd clear the mailing list (it's 29k). I'll take a look at changing the cache key to include user ID and ripping out the plan invalidation logic from the current patch tomorrow but I seriously doubt I'll be able to get all of that done in the next day or two. If anyone else is able to help out, it'd certainly be appreciated; I really think that's the main hurdle to address at this point with this patch- without the plan invalidation complexity, the the patch is really just building out the catalog, the SQL-level statements for managing it, and the bit of code required to add the conditional to statements involving RLS-enabled tables. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig Ringer writes: > On 03/06/2014 02:58 AM, Tom Lane wrote: >> The PlanInvalItem could perfectly well be generated by the planner, >> no, if it has the user OID? But I'm not real sure why you need it. >> I don't see the reason for an invalidation triggered by user ID. >> What exactly about the *user*, and not something else, would trigger >> plan invalidation? > It's only that the plan depends on the user ID. There's no point keeping > the plan around if the user no longer exists. [ shrug... ] Leaving such a plan cached would be harmless, though. Furthermore, the user ID we'd be talking about is either the owner of the current session, or the owner of some view or security-definer function that the plan is already dependent on, so it's fairly hard to credit that the plan would survive long enough for the issue to arise. Even if there is a scenario where invalidating by user ID is actually useful, I think adding infrastructure to cause invalidation in such a case is optimizing for the wrong thing. You're adding cycles to every query to benefit a case that is going to be quite infrequent in practice. >> What we do need is a notion that a plan cache entry might only be >> valid for a specific calling user ID; but that's a matter for cache >> entry lookup not for subsequent invalidation. > Yes, that would be good, but is IMO more of a separate optimization. I'm > currently using KaiGai's code to invalidate and re-plan when a user ID > change is detected. I'm unlikely to accept a patch that does that; wouldn't it be catastrophic for performance in the presence of security-definer functions? You can't just trash the whole plan cache when a user ID switch occurs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 03/06/2014 02:58 AM, Tom Lane wrote: > Craig Ringer writes: >> One of the remaining issues with row security is how to pass plan >> invalidation information generated in the rewriter back into the planner. > >> With row security, it's necessary to set a field in PlannerGlobal, >> tracking the user ID of the user the query was planned for if row >> security was applied. It is also necessary to add a PlanInvalItem for >> the user ID. > > TBH I'd just add a user OID field in struct Query and not hack up a bunch > of existing function APIs. It's not much worse than the existing > constraintDeps field. If you're happy with that, I certainly won't complain. It's much simpler and less intrusive. I should be able to post an update using this later today. > The PlanInvalItem could perfectly well be generated by the planner, > no, if it has the user OID? But I'm not real sure why you need it. > I don't see the reason for an invalidation triggered by user ID. > What exactly about the *user*, and not something else, would trigger > plan invalidation? It's only that the plan depends on the user ID. There's no point keeping the plan around if the user no longer exists. You're quite right that this can be done in the planner when a dependency on the user ID is found, though. So there's no need to pass a PlanInvalItem down, which is a lot nicer. > What we do need is a notion that a plan cache entry might only be > valid for a specific calling user ID; but that's a matter for cache > entry lookup not for subsequent invalidation. Yes, that would be good, but is IMO more of a separate optimization. I'm currently using KaiGai's code to invalidate and re-plan when a user ID change is detected. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
Craig Ringer writes: > One of the remaining issues with row security is how to pass plan > invalidation information generated in the rewriter back into the planner. > With row security, it's necessary to set a field in PlannerGlobal, > tracking the user ID of the user the query was planned for if row > security was applied. It is also necessary to add a PlanInvalItem for > the user ID. TBH I'd just add a user OID field in struct Query and not hack up a bunch of existing function APIs. It's not much worse than the existing constraintDeps field. The PlanInvalItem could perfectly well be generated by the planner, no, if it has the user OID? But I'm not real sure why you need it. I don't see the reason for an invalidation triggered by user ID. What exactly about the *user*, and not something else, would trigger plan invalidation? What we do need is a notion that a plan cache entry might only be valid for a specific calling user ID; but that's a matter for cache entry lookup not for subsequent invalidation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers