Re: [HACKERS] Reworks for Access Control facilities (r2363)

2009-10-17 Thread Heikki Linnakangas
KaiGai Kohei wrote:
 1) creation of a database object
 
 In SELinux model, when a user tries to create a new object (not limited
 to database object, like a file or socket), a default security context
 is assigned on the new object, then SELinux checks whether the user has
 privileges to create a new object labeled with the security context, or not.
 
 When we create a new table, the default PG model checks ACL_CREATE privilege
 on the namespace which is supposed to own the new table. DefineRelation()
 invokes pg_namespace_aclcheck() with OID of the namespace, but we cannot
 see any properties of the new table from inside of pg_namespace_aclcheck().
 It checks permissions on the couple of a user and a namespace.
 
 On the other hand, SE-PG model follows the above principle. When we create
 a new table, SE-PG compute a default security context to be assigned on,
 then it checks the security policy whether the user is allowed to create
 a new table labeled with the context, or not.
 It checks permissions on the couple of a user and a new table itself.

I don't think I buy that argument.  Can't we simply decide that in
PostgreSQL, the granularity is different, and you can only create
policies governing creation of objects on the basis of schema+user
combination, not on the properties of the new object. AFAICS it wouldn't
violate the principle of Mandatory Access Control.

 2) AND-condition for all the privileges
 
 When a certain action requires multiple permissions at one time,
 the principle of SELinux is that all the permissions have to be checked.
 If one of them is not allowed, it disallows the required action.
 In other word, all the conditions are chained by AND.
 
 This principle enables us to analyze the data flows between users and
 resources with the security policy, without implementation details.
 If a certain permission (e.g db_table:{select}) can override any other
 permission (e.g db_column:{select}), it also implicitly means a possibility
 of infotmation leaks/manipulations, even if the security policy said this
 user cannot read a data from the column.
 
 On the other hand, the default PG model allows to bypass checks on
 certain objects. For example, column-level privileges are only checked
 when a user does not have enough permissions on the target table.
 If SELECT a,b FROM t is given, pg_attribute_aclcheck() may not invoked
 when user has needed privileges on the table t.

Hmm, I see. Yes, it does seem like we'd need to change such permission
checks to accommodate both models.

 3) superuser is not an exception of access control.
 
 It is the similar issue to the 2).

Yeah.

 The following code is a part of AlterFunctionOwner_internal().
 
 
 /* Superusers can always do it */
 if (!superuser())
 {
 /* Otherwise, must be owner of the existing object */
 if (!pg_proc_ownercheck(procOid, GetUserId()))
 aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC,
NameStr(procForm-proname));
 
 /* Must be able to become new owner */
 check_is_member_of_role(GetUserId(), newOwnerId);
 
 /* New owner must have CREATE privilege on namespace */
 aclresult = pg_namespace_aclcheck(procForm-pronamespace,
   newOwnerId,
   ACL_CREATE);
 if (aclresult != ACLCHECK_OK)
 aclcheck_error(aclresult, ACL_KIND_NAMESPACE,
get_namespace_name(procForm-pronamespace));
 }
 
 
 From perspective of the default PG model, this code perfectly correct.
 Both of pg_proc_ownercheck() and pg_namespace_aclcheck() always returns
 ACLCHECK_OK, so these invocations are bypassable.
 
 However, if SE-PG's hooks are deployed on pg_xxx_aclcheck() routines,
 it means that we cannot check correct MAC permissions when a client is
 allowed to apply superuser privilege.
 Please remind that SELinux requires AND-condition for all the privileges
 required to a certain action. When a root user tries to read a certain
 file without DAC permisions, it requires both of capability:{dac_override}
 and file:{read} permissions in operating system.

We need to ask ourselves, is that a realistic goal, given how widespread
such if (superuser()) calls are? And more imporantly, unless you
sprinkle additional fine-grained permission checks to all the places
that currently just check if (superuser()), it will be possible to
circumvent the system with LOAD or any of the other commands that are
inherently dangerous. We don't want such additional fine-grained
permissions, not for now at least.

Seems a lot simpler and also easier to understand if there's a single
superuser privilege that trumps all other permission checks.

 If we look at the SE-PgSQL project on the greater scale, it also can be
 considered as an efforts to add MAC checks on the module which applied
 its own access controls, but bypassed 

Re: [HACKERS] Deprecation

2009-10-17 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Fri, Oct 16, 2009 at 12:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, if the columnref looks like x.y where x happens to
 match some table in the database (and not in the query) that doesn't
 have a column y, the implicit-RTE code will have already modified the
 querytree before finding out that column y doesn't exist.

 Hm, so if you do nothing then really the only thing that doesn't work
 is if you have add_missing_from then plpgsql record variables wouldn't
 work when you tried to reference their columns?

Do nothing isn't the right phrase here --- it would take a great deal
of work and ugly, hard-to-maintain code to get it to work even that badly.
The code paths in transformColumnRef are fairly messy already :-(.
Getting rid of add_missing_from would definitely make it easier to
refactor to support hooks for external variable sources.

The approach I had been thinking about proposing, before David piped up
with his modest proposal, was to have external variables take precedence
over implicit RTEs --- ie, we'd call the hook *before* trying the
add_missing_from case.  But that seems pretty weird, and it'd still be
messy to program.  What it would mainly accomplish is to avoid the extra
lock hazard.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Bruce Momjian
Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  On Fri, Oct 16, 2009 at 12:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  However, if the columnref looks like x.y where x happens to
  match some table in the database (and not in the query) that doesn't
  have a column y, the implicit-RTE code will have already modified the
  querytree before finding out that column y doesn't exist.
 
  Hm, so if you do nothing then really the only thing that doesn't work
  is if you have add_missing_from then plpgsql record variables wouldn't
  work when you tried to reference their columns?
 
 Do nothing isn't the right phrase here --- it would take a great deal
 of work and ugly, hard-to-maintain code to get it to work even that badly.
 The code paths in transformColumnRef are fairly messy already :-(.
 Getting rid of add_missing_from would definitely make it easier to
 refactor to support hooks for external variable sources.
 
 The approach I had been thinking about proposing, before David piped up
 with his modest proposal, was to have external variables take precedence
 over implicit RTEs --- ie, we'd call the hook *before* trying the
 add_missing_from case.  But that seems pretty weird, and it'd still be
 messy to program.  What it would mainly accomplish is to avoid the extra
 lock hazard.

Sounds like a good reason to remove add_missing_from in 8.5.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Do nothing isn't the right phrase here --- it would take a great deal
 of work and ugly, hard-to-maintain code to get it to work even that badly.
 The code paths in transformColumnRef are fairly messy already :-(.
 Getting rid of add_missing_from would definitely make it easier to
 refactor to support hooks for external variable sources.

A little voice from the field, +1 for deprecating add_missing_from.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Pavel Stehule
2009/10/17 Dimitri Fontaine dfonta...@hi-media.com:
 Tom Lane t...@sss.pgh.pa.us writes:
 Do nothing isn't the right phrase here --- it would take a great deal
 of work and ugly, hard-to-maintain code to get it to work even that badly.
 The code paths in transformColumnRef are fairly messy already :-(.
 Getting rid of add_missing_from would definitely make it easier to
 refactor to support hooks for external variable sources.

 A little voice from the field, +1 for deprecating add_missing_from.


+1
Pavel

 Regards,
 --
 dim

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Sounds like a good reason to remove add_missing_from in 8.5.

Seems like the general consensus is that it's okay to do that.
I will go make it happen unless somebody squawks pretty soon...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Andrew Dunstan



Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:
  

Sounds like a good reason to remove add_missing_from in 8.5.



Seems like the general consensus is that it's okay to do that.
I will go make it happen unless somebody squawks pretty soon...
  



Squawk. I am currently travelling. Please give me until early next week 
to research and react.


thanks

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Squawk. I am currently travelling. Please give me until early next week 
 to research and react.

Okay, I'll hold off for a bit.  For reference, attached is the patch
I was about to apply.  This doesn't do any of the refactoring I had
in mind, it just removes the implicit-RTE logic.

regards, tom lane

Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.230
diff -c -r1.230 config.sgml
*** doc/src/sgml/config.sgml	3 Oct 2009 23:10:47 -	1.230
--- doc/src/sgml/config.sgml	17 Oct 2009 20:37:00 -
***
*** 4659,4695 
  
   variablelist
  
-  varlistentry id=guc-add-missing-from xreflabel=add_missing_from
-   termvarnameadd_missing_from/varname (typeboolean/type)/term
-   indextermprimaryFROM/secondarymissing//
-   indexterm
-primaryvarnameadd_missing_from/ configuration parameter/primary
-   /indexterm
-   listitem
-para
- When on, tables that are referenced by a query will be
- automatically added to the literalFROM/ clause if not
- already present. This behavior does not comply with the SQL
- standard and many people dislike it because it can mask mistakes
- (such as referencing a table where you should have referenced
- its alias). The default is literaloff/. This variable can be
- enabled for compatibility with releases of
- productnamePostgreSQL/ prior to 8.1, where this behavior was
- allowed by default.
-/para
- 
-para
- Note that even when this variable is enabled, a warning
- message will be emitted for each implicit literalFROM/
- entry referenced by a query. Users are encouraged to update
- their applications to not rely on this behavior, by adding all
- tables referenced by a query to the query's literalFROM/
- clause (or its literalUSING/ clause in the case of
- commandDELETE/).
-/para
-   /listitem
-  /varlistentry
- 
   varlistentry id=guc-array-nulls xreflabel=array_nulls
termvarnamearray_nulls/varname (typeboolean/type)/term
indexterm
--- 4659,4664 
Index: doc/src/sgml/queries.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v
retrieving revision 1.55
diff -c -r1.55 queries.sgml
*** doc/src/sgml/queries.sgml	17 Jun 2009 21:58:49 -	1.55
--- doc/src/sgml/queries.sgml	17 Oct 2009 20:37:00 -
***
*** 521,543 
  /para
  
  para
!  The alias becomes the new name of the table reference for the
!  current query mdash; it is no longer possible to refer to the table
!  by the original name.  Thus:
! programlisting
! SELECT * FROM my_table AS m WHERE my_table.a gt; 5;
! /programlisting
!  is not valid according to the SQL standard.  In
!  productnamePostgreSQL/productname this will draw an error, assuming the
!  xref linkend=guc-add-missing-from configuration variable is
!  literaloff/ (as it is by default).  If it is literalon/,
!  an implicit table reference will be added to the
!  literalFROM/literal clause, so the query is processed as if
!  it were written as:
  programlisting
! SELECT * FROM my_table AS m, my_table AS my_table WHERE my_table.a gt; 5;
  /programlisting
-  That will result in a cross join, which is usually not what you want.
  /para
  
  para
--- 521,533 
  /para
  
  para
!  The alias becomes the new name of the table reference so far as the
!  current query is concerned mdash; it is not allowed to refer to the
!  table by the original name elsewhere in the query.  Thus, this is not
!  valid:
  programlisting
! SELECT * FROM my_table AS m WHERE my_table.a gt; 5;-- wrong
  /programlisting
  /para
  
  para
Index: doc/src/sgml/ref/select.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v
retrieving revision 1.125
diff -c -r1.125 select.sgml
*** doc/src/sgml/ref/select.sgml	18 Sep 2009 05:00:42 -	1.125
--- doc/src/sgml/ref/select.sgml	17 Oct 2009 20:37:01 -
***
*** 1451,1462 
  productnamePostgreSQL/productname releases prior to
  8.1 would accept queries of this form, and add an implicit entry
  to the query's literalFROM/literal clause for each table
! referenced by the query. This is no longer the default behavior,
! because it does not comply with the SQL standard, and is
! considered by many to be error-prone. For compatibility with
! applications that rely on this behavior the xref
! linkend=guc-add-missing-from configuration variable can be
! enabled.
 /para
/refsect2
  
--- 1451,1457 
  

Re: [HACKERS] LATERAL

2009-10-17 Thread Robert Haas
On Tue, Sep 22, 2009 at 11:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Currently, however, we only consider this possibility when the inner
 rel is NOT a joinrel.  It seems like it might be possible to change
 this, but it doesn't look straightforward.

 Well, it's straightforward enough in theory, it's just the exponential
 growth in the number of join paths to consider that's a problem :-(.
 I think what we'd need is a heuristic to limit the paths considered.

[ picking this up again now that CommitFest is over ]

So that I don't have to keep referring to this possibility and
similar circumlocution, let's define an index-accelerated path (feel
free to suggest a different term) to be the generalization to joinrels
of what we now call a nested inner index-scan.  In other words, they
can only be usefully used on the inner side of a nested loop, where
they'll be rescanned with different parameters for each outer row, and
they can only ever win when some part of the path involves a *partial*
index scan that can used one of the passed parameters as an index
qual.

For a fixed a set of parameters to be pushed down from the outer side,
it seems to me that we could build up a set of index-accelerated paths
for a joinrel {A B} by considering the combination of (1) an
index-accelerated path for A joined to an index-accelerated path for
B, or (2) an index-accelerated path for A joined to the cheapest total
path for B.  I believe we don't need to worry about path keys because
this will only ever be used on the inner side of a nested loop, so the
pathkeys will be ignored anyway.  In other words, the first heuristic
is that you can't build up an index-accelerated path for the joinrel
unless there is an index-accelerated path for a least one of its
baserels.

That still leaves a lot of silly paths, though.  In many cases, if
you're thinking about joining A to {B C} using an index-accelerated
path, you'd be just as well off joining to B first and then to C.  So
it might be that we only need to consider index-accelerated paths when
there is no legal join between the LHS and a subset of the RHS.  But
I'm not completely sure that I'm right about this, nor whether it's
easy to check for.

The other problem I see here is that the bottom-up approach that we
use in general is going to be difficult to apply here, because the set
of paths will vary depending on what parameters are pushed down from
the outer side.

 I think Andrew was suggesting that we not attempt to consider this
 automatically, but only when the user writes the query in a way that
 exposes it directly via LATERAL.  While that goes against my normal
 instincts for the planner, it isn't unreasonable as a first step.

Possibly, but I'm not sure we have a good enough design sketch at this
point to even know whether such a restriction would be useful.

Another thought, only semi-related.  One of the big use cases for
LATERAL in general is to use a set-returning function in the FROM
clause that uses vars from a preceding FROM item.  I am idly wondering
if there's a reason why ExecProject is not its own node type.  It
seems that we have close to the same logic scattered through a whole
bunch of different node types...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL

2009-10-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Another thought, only semi-related.  One of the big use cases for
 LATERAL in general is to use a set-returning function in the FROM
 clause that uses vars from a preceding FROM item.  I am idly wondering
 if there's a reason why ExecProject is not its own node type.

You generally need it everywhere.  Scan nodes need it because you don't
want unused columns propagating upwards, and join nodes need it because
the two input tuples have to be unified somehow.  We do skip projection
ability in a few node types, but I doubt it would be profitable to
remove it from the rest.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reworks for Access Control facilities (r2363)

2009-10-17 Thread Robert Haas
On Sat, Oct 17, 2009 at 9:53 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 This raises an important point: We need *developer documentation* on how
 to write SE-Pgsql compliant permission checks. Not only for authors of
 3rd party modules but for developers of PostgreSQL itself. Point 2)
 above needs to be emphasized, it's a big change in the way permission
 checks have to be programmed. One that I hadn't realized before. I
 haven't been paying much attention, but neither is most other
 developers, so we need clear documentation.

This is a good point.  All throughout these discussions, there has
been a concern that whatever is implemented here will be
unmaintainable because we don't have any committers who are familiar
with the ins and outs of SE-Linux and MAC (and not too many other
community members interested in the topic, either).  So some developer
documentation seems like it might help.

On the other hand, KaiGai has made several attempts at documentation
and several attempts at patches and we're not really any closer to
having SE-PostgreSQL in core than we were a year ago.  I think that's
partly because KaiGai tried to bite off far too much initially
(still?), partly because of technical problems with the patches,
partly because the intersection of people who are experts in
PostgreSQL and people who are experts in MAC seems to be empty, and
partly because, as much as people sorta kinda like this feature,
nobody other than KaiGai has really been willing to step up and pour
into this project the kind of resources that it will likely require to
be successful.

I have to admit that I'm kind of giving up hope.  We seem to be going
in circles, and I don't think anything new is being said on this
thread that hasn't been said before.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL

2009-10-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 That still leaves a lot of silly paths, though.  In many cases, if
 you're thinking about joining A to {B C} using an index-accelerated
 path, you'd be just as well off joining to B first and then to C.  So
 it might be that we only need to consider index-accelerated paths when
 there is no legal join between the LHS and a subset of the RHS.

Yeah.  If there are no join order constraints, it's always possible to
form a plan such that you join a given rel only once you have all the
rels needed (to provide values for the inner indexscan) on the lefthand
side.  I think this is probably the main reason why the issue was not
treated in the planner originally.  So maybe the way to think about
this is as a way of dealing with join order constraints without losing
the benefits of inner indexscans.

 The other problem I see here is that the bottom-up approach that we
 use in general is going to be difficult to apply here, because the set
 of paths will vary depending on what parameters are pushed down from
 the outer side.

Well, we deal with that already --- the set of possible inner indexscan
paths already varies depending on what the LHS is.  I think the point
here is that we'd be considering an inner path that's against an LHS
that it's not legal to join the inner rel to *directly*.  Such a path
would only be legal if we later join to that LHS at a higher join level.
So we'd be keeping around some tentative paths that might not ever form
a valid join plan.

Maybe we should turn around the way that inner indexscan paths are
made.  Currently we form them on-the-fly while considering a valid
join combination.  Maybe we should build them all at the first level
(driving this off the set of available join clauses for each base rel)
and mark each such path as requires a join to this other set of rels
to be valid.  But then we'd go ahead and join such paths to *other*
rels, keeping the resulting join paths still marked as requiring the
same future join.  Once that join actually happens, the resulting path
becomes fully valid.  Only a join to a proper subset of the future-join
requirement would be disallowed meanwhile.

I'm not even sure this would be slower or more complicated than what we
do now --- if you look at the logic that caches potential inner
indexscan plans, it's almost doing this already.  It would result in
considering more join paths, but only ones that have some plausible use.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers