Re: [HACKERS] Reworks for Access Control facilities (r2363)
KaiGai Kohei wrote: 1) creation of a database object In SELinux model, when a user tries to create a new object (not limited to database object, like a file or socket), a default security context is assigned on the new object, then SELinux checks whether the user has privileges to create a new object labeled with the security context, or not. When we create a new table, the default PG model checks ACL_CREATE privilege on the namespace which is supposed to own the new table. DefineRelation() invokes pg_namespace_aclcheck() with OID of the namespace, but we cannot see any properties of the new table from inside of pg_namespace_aclcheck(). It checks permissions on the couple of a user and a namespace. On the other hand, SE-PG model follows the above principle. When we create a new table, SE-PG compute a default security context to be assigned on, then it checks the security policy whether the user is allowed to create a new table labeled with the context, or not. It checks permissions on the couple of a user and a new table itself. I don't think I buy that argument. Can't we simply decide that in PostgreSQL, the granularity is different, and you can only create policies governing creation of objects on the basis of schema+user combination, not on the properties of the new object. AFAICS it wouldn't violate the principle of Mandatory Access Control. 2) AND-condition for all the privileges When a certain action requires multiple permissions at one time, the principle of SELinux is that all the permissions have to be checked. If one of them is not allowed, it disallows the required action. In other word, all the conditions are chained by AND. This principle enables us to analyze the data flows between users and resources with the security policy, without implementation details. If a certain permission (e.g db_table:{select}) can override any other permission (e.g db_column:{select}), it also implicitly means a possibility of infotmation leaks/manipulations, even if the security policy said this user cannot read a data from the column. On the other hand, the default PG model allows to bypass checks on certain objects. For example, column-level privileges are only checked when a user does not have enough permissions on the target table. If SELECT a,b FROM t is given, pg_attribute_aclcheck() may not invoked when user has needed privileges on the table t. Hmm, I see. Yes, it does seem like we'd need to change such permission checks to accommodate both models. 3) superuser is not an exception of access control. It is the similar issue to the 2). Yeah. The following code is a part of AlterFunctionOwner_internal(). /* Superusers can always do it */ if (!superuser()) { /* Otherwise, must be owner of the existing object */ if (!pg_proc_ownercheck(procOid, GetUserId())) aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC, NameStr(procForm-proname)); /* Must be able to become new owner */ check_is_member_of_role(GetUserId(), newOwnerId); /* New owner must have CREATE privilege on namespace */ aclresult = pg_namespace_aclcheck(procForm-pronamespace, newOwnerId, ACL_CREATE); if (aclresult != ACLCHECK_OK) aclcheck_error(aclresult, ACL_KIND_NAMESPACE, get_namespace_name(procForm-pronamespace)); } From perspective of the default PG model, this code perfectly correct. Both of pg_proc_ownercheck() and pg_namespace_aclcheck() always returns ACLCHECK_OK, so these invocations are bypassable. However, if SE-PG's hooks are deployed on pg_xxx_aclcheck() routines, it means that we cannot check correct MAC permissions when a client is allowed to apply superuser privilege. Please remind that SELinux requires AND-condition for all the privileges required to a certain action. When a root user tries to read a certain file without DAC permisions, it requires both of capability:{dac_override} and file:{read} permissions in operating system. We need to ask ourselves, is that a realistic goal, given how widespread such if (superuser()) calls are? And more imporantly, unless you sprinkle additional fine-grained permission checks to all the places that currently just check if (superuser()), it will be possible to circumvent the system with LOAD or any of the other commands that are inherently dangerous. We don't want such additional fine-grained permissions, not for now at least. Seems a lot simpler and also easier to understand if there's a single superuser privilege that trumps all other permission checks. If we look at the SE-PgSQL project on the greater scale, it also can be considered as an efforts to add MAC checks on the module which applied its own access controls, but bypassed
Re: [HACKERS] Deprecation
Greg Stark gsst...@mit.edu writes: On Fri, Oct 16, 2009 at 12:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, if the columnref looks like x.y where x happens to match some table in the database (and not in the query) that doesn't have a column y, the implicit-RTE code will have already modified the querytree before finding out that column y doesn't exist. Hm, so if you do nothing then really the only thing that doesn't work is if you have add_missing_from then plpgsql record variables wouldn't work when you tried to reference their columns? Do nothing isn't the right phrase here --- it would take a great deal of work and ugly, hard-to-maintain code to get it to work even that badly. The code paths in transformColumnRef are fairly messy already :-(. Getting rid of add_missing_from would definitely make it easier to refactor to support hooks for external variable sources. The approach I had been thinking about proposing, before David piped up with his modest proposal, was to have external variables take precedence over implicit RTEs --- ie, we'd call the hook *before* trying the add_missing_from case. But that seems pretty weird, and it'd still be messy to program. What it would mainly accomplish is to avoid the extra lock hazard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
Tom Lane wrote: Greg Stark gsst...@mit.edu writes: On Fri, Oct 16, 2009 at 12:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, if the columnref looks like x.y where x happens to match some table in the database (and not in the query) that doesn't have a column y, the implicit-RTE code will have already modified the querytree before finding out that column y doesn't exist. Hm, so if you do nothing then really the only thing that doesn't work is if you have add_missing_from then plpgsql record variables wouldn't work when you tried to reference their columns? Do nothing isn't the right phrase here --- it would take a great deal of work and ugly, hard-to-maintain code to get it to work even that badly. The code paths in transformColumnRef are fairly messy already :-(. Getting rid of add_missing_from would definitely make it easier to refactor to support hooks for external variable sources. The approach I had been thinking about proposing, before David piped up with his modest proposal, was to have external variables take precedence over implicit RTEs --- ie, we'd call the hook *before* trying the add_missing_from case. But that seems pretty weird, and it'd still be messy to program. What it would mainly accomplish is to avoid the extra lock hazard. Sounds like a good reason to remove add_missing_from in 8.5. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
Tom Lane t...@sss.pgh.pa.us writes: Do nothing isn't the right phrase here --- it would take a great deal of work and ugly, hard-to-maintain code to get it to work even that badly. The code paths in transformColumnRef are fairly messy already :-(. Getting rid of add_missing_from would definitely make it easier to refactor to support hooks for external variable sources. A little voice from the field, +1 for deprecating add_missing_from. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
2009/10/17 Dimitri Fontaine dfonta...@hi-media.com: Tom Lane t...@sss.pgh.pa.us writes: Do nothing isn't the right phrase here --- it would take a great deal of work and ugly, hard-to-maintain code to get it to work even that badly. The code paths in transformColumnRef are fairly messy already :-(. Getting rid of add_missing_from would definitely make it easier to refactor to support hooks for external variable sources. A little voice from the field, +1 for deprecating add_missing_from. +1 Pavel Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
Bruce Momjian br...@momjian.us writes: Sounds like a good reason to remove add_missing_from in 8.5. Seems like the general consensus is that it's okay to do that. I will go make it happen unless somebody squawks pretty soon... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Sounds like a good reason to remove add_missing_from in 8.5. Seems like the general consensus is that it's okay to do that. I will go make it happen unless somebody squawks pretty soon... Squawk. I am currently travelling. Please give me until early next week to research and react. thanks andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecation
Andrew Dunstan and...@dunslane.net writes: Squawk. I am currently travelling. Please give me until early next week to research and react. Okay, I'll hold off for a bit. For reference, attached is the patch I was about to apply. This doesn't do any of the refactoring I had in mind, it just removes the implicit-RTE logic. regards, tom lane Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.230 diff -c -r1.230 config.sgml *** doc/src/sgml/config.sgml 3 Oct 2009 23:10:47 - 1.230 --- doc/src/sgml/config.sgml 17 Oct 2009 20:37:00 - *** *** 4659,4695 variablelist - varlistentry id=guc-add-missing-from xreflabel=add_missing_from - termvarnameadd_missing_from/varname (typeboolean/type)/term - indextermprimaryFROM/secondarymissing// - indexterm -primaryvarnameadd_missing_from/ configuration parameter/primary - /indexterm - listitem -para - When on, tables that are referenced by a query will be - automatically added to the literalFROM/ clause if not - already present. This behavior does not comply with the SQL - standard and many people dislike it because it can mask mistakes - (such as referencing a table where you should have referenced - its alias). The default is literaloff/. This variable can be - enabled for compatibility with releases of - productnamePostgreSQL/ prior to 8.1, where this behavior was - allowed by default. -/para - -para - Note that even when this variable is enabled, a warning - message will be emitted for each implicit literalFROM/ - entry referenced by a query. Users are encouraged to update - their applications to not rely on this behavior, by adding all - tables referenced by a query to the query's literalFROM/ - clause (or its literalUSING/ clause in the case of - commandDELETE/). -/para - /listitem - /varlistentry - varlistentry id=guc-array-nulls xreflabel=array_nulls termvarnamearray_nulls/varname (typeboolean/type)/term indexterm --- 4659,4664 Index: doc/src/sgml/queries.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v retrieving revision 1.55 diff -c -r1.55 queries.sgml *** doc/src/sgml/queries.sgml 17 Jun 2009 21:58:49 - 1.55 --- doc/src/sgml/queries.sgml 17 Oct 2009 20:37:00 - *** *** 521,543 /para para ! The alias becomes the new name of the table reference for the ! current query mdash; it is no longer possible to refer to the table ! by the original name. Thus: ! programlisting ! SELECT * FROM my_table AS m WHERE my_table.a gt; 5; ! /programlisting ! is not valid according to the SQL standard. In ! productnamePostgreSQL/productname this will draw an error, assuming the ! xref linkend=guc-add-missing-from configuration variable is ! literaloff/ (as it is by default). If it is literalon/, ! an implicit table reference will be added to the ! literalFROM/literal clause, so the query is processed as if ! it were written as: programlisting ! SELECT * FROM my_table AS m, my_table AS my_table WHERE my_table.a gt; 5; /programlisting - That will result in a cross join, which is usually not what you want. /para para --- 521,533 /para para ! The alias becomes the new name of the table reference so far as the ! current query is concerned mdash; it is not allowed to refer to the ! table by the original name elsewhere in the query. Thus, this is not ! valid: programlisting ! SELECT * FROM my_table AS m WHERE my_table.a gt; 5;-- wrong /programlisting /para para Index: doc/src/sgml/ref/select.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v retrieving revision 1.125 diff -c -r1.125 select.sgml *** doc/src/sgml/ref/select.sgml 18 Sep 2009 05:00:42 - 1.125 --- doc/src/sgml/ref/select.sgml 17 Oct 2009 20:37:01 - *** *** 1451,1462 productnamePostgreSQL/productname releases prior to 8.1 would accept queries of this form, and add an implicit entry to the query's literalFROM/literal clause for each table ! referenced by the query. This is no longer the default behavior, ! because it does not comply with the SQL standard, and is ! considered by many to be error-prone. For compatibility with ! applications that rely on this behavior the xref ! linkend=guc-add-missing-from configuration variable can be ! enabled. /para /refsect2 --- 1451,1457
Re: [HACKERS] LATERAL
On Tue, Sep 22, 2009 at 11:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Currently, however, we only consider this possibility when the inner rel is NOT a joinrel. It seems like it might be possible to change this, but it doesn't look straightforward. Well, it's straightforward enough in theory, it's just the exponential growth in the number of join paths to consider that's a problem :-(. I think what we'd need is a heuristic to limit the paths considered. [ picking this up again now that CommitFest is over ] So that I don't have to keep referring to this possibility and similar circumlocution, let's define an index-accelerated path (feel free to suggest a different term) to be the generalization to joinrels of what we now call a nested inner index-scan. In other words, they can only be usefully used on the inner side of a nested loop, where they'll be rescanned with different parameters for each outer row, and they can only ever win when some part of the path involves a *partial* index scan that can used one of the passed parameters as an index qual. For a fixed a set of parameters to be pushed down from the outer side, it seems to me that we could build up a set of index-accelerated paths for a joinrel {A B} by considering the combination of (1) an index-accelerated path for A joined to an index-accelerated path for B, or (2) an index-accelerated path for A joined to the cheapest total path for B. I believe we don't need to worry about path keys because this will only ever be used on the inner side of a nested loop, so the pathkeys will be ignored anyway. In other words, the first heuristic is that you can't build up an index-accelerated path for the joinrel unless there is an index-accelerated path for a least one of its baserels. That still leaves a lot of silly paths, though. In many cases, if you're thinking about joining A to {B C} using an index-accelerated path, you'd be just as well off joining to B first and then to C. So it might be that we only need to consider index-accelerated paths when there is no legal join between the LHS and a subset of the RHS. But I'm not completely sure that I'm right about this, nor whether it's easy to check for. The other problem I see here is that the bottom-up approach that we use in general is going to be difficult to apply here, because the set of paths will vary depending on what parameters are pushed down from the outer side. I think Andrew was suggesting that we not attempt to consider this automatically, but only when the user writes the query in a way that exposes it directly via LATERAL. While that goes against my normal instincts for the planner, it isn't unreasonable as a first step. Possibly, but I'm not sure we have a good enough design sketch at this point to even know whether such a restriction would be useful. Another thought, only semi-related. One of the big use cases for LATERAL in general is to use a set-returning function in the FROM clause that uses vars from a preceding FROM item. I am idly wondering if there's a reason why ExecProject is not its own node type. It seems that we have close to the same logic scattered through a whole bunch of different node types... ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LATERAL
Robert Haas robertmh...@gmail.com writes: Another thought, only semi-related. One of the big use cases for LATERAL in general is to use a set-returning function in the FROM clause that uses vars from a preceding FROM item. I am idly wondering if there's a reason why ExecProject is not its own node type. You generally need it everywhere. Scan nodes need it because you don't want unused columns propagating upwards, and join nodes need it because the two input tuples have to be unified somehow. We do skip projection ability in a few node types, but I doubt it would be profitable to remove it from the rest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reworks for Access Control facilities (r2363)
On Sat, Oct 17, 2009 at 9:53 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: This raises an important point: We need *developer documentation* on how to write SE-Pgsql compliant permission checks. Not only for authors of 3rd party modules but for developers of PostgreSQL itself. Point 2) above needs to be emphasized, it's a big change in the way permission checks have to be programmed. One that I hadn't realized before. I haven't been paying much attention, but neither is most other developers, so we need clear documentation. This is a good point. All throughout these discussions, there has been a concern that whatever is implemented here will be unmaintainable because we don't have any committers who are familiar with the ins and outs of SE-Linux and MAC (and not too many other community members interested in the topic, either). So some developer documentation seems like it might help. On the other hand, KaiGai has made several attempts at documentation and several attempts at patches and we're not really any closer to having SE-PostgreSQL in core than we were a year ago. I think that's partly because KaiGai tried to bite off far too much initially (still?), partly because of technical problems with the patches, partly because the intersection of people who are experts in PostgreSQL and people who are experts in MAC seems to be empty, and partly because, as much as people sorta kinda like this feature, nobody other than KaiGai has really been willing to step up and pour into this project the kind of resources that it will likely require to be successful. I have to admit that I'm kind of giving up hope. We seem to be going in circles, and I don't think anything new is being said on this thread that hasn't been said before. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LATERAL
Robert Haas robertmh...@gmail.com writes: That still leaves a lot of silly paths, though. In many cases, if you're thinking about joining A to {B C} using an index-accelerated path, you'd be just as well off joining to B first and then to C. So it might be that we only need to consider index-accelerated paths when there is no legal join between the LHS and a subset of the RHS. Yeah. If there are no join order constraints, it's always possible to form a plan such that you join a given rel only once you have all the rels needed (to provide values for the inner indexscan) on the lefthand side. I think this is probably the main reason why the issue was not treated in the planner originally. So maybe the way to think about this is as a way of dealing with join order constraints without losing the benefits of inner indexscans. The other problem I see here is that the bottom-up approach that we use in general is going to be difficult to apply here, because the set of paths will vary depending on what parameters are pushed down from the outer side. Well, we deal with that already --- the set of possible inner indexscan paths already varies depending on what the LHS is. I think the point here is that we'd be considering an inner path that's against an LHS that it's not legal to join the inner rel to *directly*. Such a path would only be legal if we later join to that LHS at a higher join level. So we'd be keeping around some tentative paths that might not ever form a valid join plan. Maybe we should turn around the way that inner indexscan paths are made. Currently we form them on-the-fly while considering a valid join combination. Maybe we should build them all at the first level (driving this off the set of available join clauses for each base rel) and mark each such path as requires a join to this other set of rels to be valid. But then we'd go ahead and join such paths to *other* rels, keeping the resulting join paths still marked as requiring the same future join. Once that join actually happens, the resulting path becomes fully valid. Only a join to a proper subset of the future-join requirement would be disallowed meanwhile. I'm not even sure this would be slower or more complicated than what we do now --- if you look at the logic that caches potential inner indexscan plans, it's almost doing this already. It would result in considering more join paths, but only ones that have some plausible use. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers