Re: [HACKERS] Improving RLS planning

2017-01-19 Thread Tom Lane
Andreas Seltenreich  writes:
> sqlsmith doesn't seem to like 215b43cdc:

> select 1 from information_schema.usage_privileges
> where information_schema._pg_keysequal(
>(select null::smallint[]),
>'{16,25,23}');

> -- TRAP: FailedAssertion("!(!and_clause((Node *) clause))", File: 
> "restrictinfo.c", Line: 81)

Thanks, I'll take a look.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2017-01-19 Thread Andreas Seltenreich
Tom Lane writes:

> Thanks for reviewing --- I'll do something with that test case and
> push it.

sqlsmith doesn't seem to like 215b43cdc:

select 1 from information_schema.usage_privileges
where information_schema._pg_keysequal(
   (select null::smallint[]),
   '{16,25,23}');

-- TRAP: FailedAssertion("!(!and_clause((Node *) clause))", File: 
"restrictinfo.c", Line: 81)

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2017-01-18 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Thanks for the review.  Attached is an updated patch that I believe
>> addresses all of the review comments so far.  The code is unchanged from
>> v2, but I improved the README, some comments, and the regression tests.

> I've reviewed your updates and they answer all of my comments and I
> appreciate the EC regression tests you added.

> I also agree with Dean's down-thread suggested regression test change.

Thanks for reviewing --- I'll do something with that test case and
push it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2017-01-17 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> Here's an updated version of the RLS planning patch that gets rid of
> >> the incorrect interaction with Query.hasRowSecurity and adjusts
> >> terminology as agreed.
> 
> > I've spent a fair bit of time going over this change to understand it,
> > how it works, and how it changes the way RLS and securiy barrier views
> > work.
> 
> Thanks for the review.  Attached is an updated patch that I believe
> addresses all of the review comments so far.  The code is unchanged from
> v2, but I improved the README, some comments, and the regression tests.

I've reviewed your updates and they answer all of my comments and I
appreciate the EC regression tests you added.

I also agree with Dean's down-thread suggested regression test change.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Improving RLS planning

2016-12-29 Thread Dean Rasheed
On 29 December 2016 at 15:55, Tom Lane  wrote:
> Dean Rasheed  writes:
>> On 28 December 2016 at 19:12, Tom Lane  wrote:
>>> [ getting back to this patch finally... ]  I made the suggested change
>>> to that test case, and what I see is a whole lot of "NOTICE: snooped value
>>> = whatever" outputs.
>>>
>>> I'd leave it as shown in the attached diff fragment, except that I'm
>>> a bit worried about possible platform dependency of the output.  The
>>> hashing occurring in the subplans shouldn't affect output order, but
>>> I'm not sure if we want a test output like this or not.  Thoughts?
>
>> How about replacing "a = 3" with "a < 7 AND a != 6". That then
>> exercises more of the possible types of behaviour for quals: The "a <
>> 7" qual is pushed down and used as an index condition. The "a != 6"
>> qual is pushed down and used as a filter, because it's cheap and
>> leakproof. The leakproof() qual isn't pushed down on cost grounds. The
>> snoop() qual isn't pushed down on security grounds. Both snoop() and
>> leakproof() are used as filters, along with "a != 6", and a SB subplan
>> qual. "a != 6" is executed first because it has a security_level of 0,
>> and is cheaper than the subplan. snoop() is executed later, despite
>> being cheaper than the other filter quals, because it has a higher
>> security_level, and leakproof() is executed last because it has the
>> same security level as snoop() but is more expensive.
>
> Will do, although I think that the next test case (the one with "a = 8")
> already shows most of those behaviors.
>

Except that it doesn't have a cheap leakproof qual like "a != 6", not
handled automatically by the index, that order_qual_clauses() can
assign security_level = 0 to, and then move to the start of the list.

I think it's probably worth having a clause like that in one of the
tests, but it could perhaps be added to the "a = 8" test, if you
wanted to drop the "a = 3" test.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-12-29 Thread Tom Lane
Dean Rasheed  writes:
> On 28 December 2016 at 19:12, Tom Lane  wrote:
>> [ getting back to this patch finally... ]  I made the suggested change
>> to that test case, and what I see is a whole lot of "NOTICE: snooped value
>> = whatever" outputs.
>> 
>> I'd leave it as shown in the attached diff fragment, except that I'm
>> a bit worried about possible platform dependency of the output.  The
>> hashing occurring in the subplans shouldn't affect output order, but
>> I'm not sure if we want a test output like this or not.  Thoughts?

> How about replacing "a = 3" with "a < 7 AND a != 6". That then
> exercises more of the possible types of behaviour for quals: The "a <
> 7" qual is pushed down and used as an index condition. The "a != 6"
> qual is pushed down and used as a filter, because it's cheap and
> leakproof. The leakproof() qual isn't pushed down on cost grounds. The
> snoop() qual isn't pushed down on security grounds. Both snoop() and
> leakproof() are used as filters, along with "a != 6", and a SB subplan
> qual. "a != 6" is executed first because it has a security_level of 0,
> and is cheaper than the subplan. snoop() is executed later, despite
> being cheaper than the other filter quals, because it has a higher
> security_level, and leakproof() is executed last because it has the
> same security level as snoop() but is more expensive.

Will do, although I think that the next test case (the one with "a = 8")
already shows most of those behaviors.

Maybe this one's just redundant and we should drop it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-12-29 Thread Dean Rasheed
On 28 December 2016 at 19:12, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>>> Hmm. I've not read any of the new code yet, but the fact that this
>>> test now reduces to a one-time filter makes it effectively useless as
>>> a test of qual evaluation order because it has deduced that it doesn't
>>> need to evaluate them. I would suggest replacing the qual with
>>> something that can't be reduced, perhaps "2*a = 6".
>
>> That's a good thought, I agree.
>
> [ getting back to this patch finally... ]  I made the suggested change
> to that test case, and what I see is a whole lot of "NOTICE: snooped value
> = whatever" outputs.
>
> I'd leave it as shown in the attached diff fragment, except that I'm
> a bit worried about possible platform dependency of the output.  The
> hashing occurring in the subplans shouldn't affect output order, but
> I'm not sure if we want a test output like this or not.  Thoughts?
>

How about replacing "a = 3" with "a < 7 AND a != 6". That then
exercises more of the possible types of behaviour for quals: The "a <
7" qual is pushed down and used as an index condition. The "a != 6"
qual is pushed down and used as a filter, because it's cheap and
leakproof. The leakproof() qual isn't pushed down on cost grounds. The
snoop() qual isn't pushed down on security grounds. Both snoop() and
leakproof() are used as filters, along with "a != 6", and a SB subplan
qual. "a != 6" is executed first because it has a security_level of 0,
and is cheaper than the subplan. snoop() is executed later, despite
being cheaper than the other filter quals, because it has a higher
security_level, and leakproof() is executed last because it has the
same security level as snoop() but is more expensive.

Thus the test is more likely to highlight any future changes to the
pushdown/ordering rules. The output is also neater, because nothing
ends up being printed by snoop(), although of course it is OK for
values greater than 5 to be printed.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Here's an updated version of the RLS planning patch that gets rid of
>> the incorrect interaction with Query.hasRowSecurity and adjusts
>> terminology as agreed.

> I've spent a fair bit of time going over this change to understand it,
> how it works, and how it changes the way RLS and securiy barrier views
> work.

Thanks for the review.  Attached is an updated patch that I believe
addresses all of the review comments so far.  The code is unchanged from
v2, but I improved the README, some comments, and the regression tests.

> Are you thinking that we will be able to remove the cut-out in
> is_simple_subquery() which currently punts whenever an RTE is marked as
> security_barrier?

Yeah, that's the long-term plan, but it's not done yet.

> Speaking of which, it seems like we should probably update the README to
> include some mention, at least, of what we're doing today when it comes
> to joins which involve security barrier entanglements.

I tweaked the new section in README to point out specifically that
security views aren't flattened.

>> ! * In addition to the quals inherited from the parent, we might 
>> have
>> ! * securityQuals associated with this particular child node.  
>> (This
>> ! * won't happen in inheritance cases, only with appendrels 
>> originating
>> ! * from UNION ALL.)

> Right, this won't happen in inheritance cases because we explicitly
> don't consider the quals of the children when querying through the
> parent, similar to how we don't consider the GRANT-based permissions on
> the child tables.  This is mentioned elsewhere but might make sense to
> also mention it here, or at least say 'see expand_inherited_rtentry()'.

Comment adjusted.

>> +/* Reject if it is potentially postponable by security considerations */

> The first comment makes a lot of sense, but the one-liner doesn't seem
> as clear, to me anyway.

Not sure how to make it better.

> The result of the above, as I understand it, is that security_level will
> either be zero, or the restrictinfo will be leakproof, no?  Meaning that
> ec_max_security will either be zero, or the functions involved will be
> leakproof, right?

Right.  We still have to check other member operators of the opfamily,
if we need to select one, but we at least know that the original clauses
are safe.

> Perhaps it's more difficult than it's worth to come up with cases that
> cover the other code paths involved, but it seems like it might be good
> to at least try to as it's likely to happen in more cases now that we're
> returning (or should be, at least) InvalidOid due to the only operators
> found being leaky ones.

To test this you need a btree opclass that contains some leakproof and
some non-leakproof operators, which is a mite unusual, but fortunately
we already have a test (equivclass.sql) that creates such a situation.
I added a test there that demonstrates the planner backing off an
equivalence-class deduction in the presence of lower-level security
quals.  We might have to tweak the test in future if we refine these
rules, but that seems fine.

> As discussed previously, this looks like a good, practical, hack, but I
> feel a little bad that we don't mention it anywhere except in this
> comment.  Is it too low-level to get a mention in the README?

Done.

I also adjusted the test that was collapsing to a dummy query, and
updated the expected results for a couple of new queries that weren't
there two months ago.

regards, tom lane



new-rls-planning-implementation-3.patch.gz
Description: new-rls-planning-implementation-3.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-12-28 Thread Tom Lane
Stephen Frost  writes:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> Hmm. I've not read any of the new code yet, but the fact that this
>> test now reduces to a one-time filter makes it effectively useless as
>> a test of qual evaluation order because it has deduced that it doesn't
>> need to evaluate them. I would suggest replacing the qual with
>> something that can't be reduced, perhaps "2*a = 6".

> That's a good thought, I agree.

[ getting back to this patch finally... ]  I made the suggested change
to that test case, and what I see is a whole lot of "NOTICE: snooped value
= whatever" outputs.  The fact that there are none in the current test
output is because in

UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;

we currently decide that the subquery can't be flattened, but we then push
down the two leakproof quals into it, so that they get evaluated ahead of
the snoop() call.  The revised code doesn't do that, allowing snoop() to
be called on rows that will fail the other two quals --- but AFAICS,
that's a feature not a bug.  There is no security-based argument why
snoop() can't go before them, and on cost grounds it should.

I'd leave it as shown in the attached diff fragment, except that I'm
a bit worried about possible platform dependency of the output.  The
hashing occurring in the subplans shouldn't affect output order, but
I'm not sure if we want a test output like this or not.  Thoughts?

regards, tom lane


*** SELECT * FROM v1 WHERE a=8;
*** 2114,2198 
  (4 rows)
  
  EXPLAIN (VERBOSE, COSTS OFF)
! UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
!  QUERY PLAN   
  
! 

!  Update on public.t1 t1_4
!Update on public.t1 t1_4
!Update on public.t11 t1
!Update on public.t12 t1
!Update on public.t111 t1
!->  Subquery Scan on t1
   Output: 100, t1.b, t1.c, t1.ctid
!  Filter: snoop(t1.a)
!  ->  LockRows
!Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c, t1_5.ctid, 
t12.ctid, t12.tableoid
!->  Nested Loop Semi Join
!  Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c, t1_5.ctid, 
t12.ctid, t12.tableoid
!  ->  Seq Scan on public.t1 t1_5
!Output: t1_5.ctid, t1_5.a, t1_5.b, t1_5.c
!Filter: ((t1_5.a > 5) AND (t1_5.a = 3) AND 
leakproof(t1_5.a))
!  ->  Append
!->  Seq Scan on public.t12
!  Output: t12.ctid, t12.tableoid, t12.a
!  Filter: (t12.a = 3)
!->  Seq Scan on public.t111
!  Output: t111.ctid, t111.tableoid, t111.a
!  Filter: (t111.a = 3)
!->  Subquery Scan on t1_1
!  Output: 100, t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
!  Filter: snoop(t1_1.a)
!  ->  LockRows
!Output: t11.ctid, t11.a, t11.b, t11.c, t11.d, t11.ctid, 
t12_1.ctid, t12_1.tableoid
!->  Nested Loop Semi Join
!  Output: t11.ctid, t11.a, t11.b, t11.c, t11.d, t11.ctid, 
t12_1.ctid, t12_1.tableoid
!  ->  Seq Scan on public.t11
!Output: t11.ctid, t11.a, t11.b, t11.c, t11.d
!Filter: ((t11.a > 5) AND (t11.a = 3) AND 
leakproof(t11.a))
!  ->  Append
!->  Seq Scan on public.t12 t12_1
!  Output: t12_1.ctid, t12_1.tableoid, t12_1.a
!  Filter: (t12_1.a = 3)
!->  Seq Scan on public.t111 t111_1
!  Output: t111_1.ctid, t111_1.tableoid, 
t111_1.a
!  Filter: (t111_1.a = 3)
!->  Subquery Scan on t1_2
!  Output: 100, t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
!  Filter: snoop(t1_2.a)
!  ->  LockRows
!Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e, 
t12_2.ctid, t12_3.ctid, t12_3.tableoid
!->  Nested Loop Semi Join
!  Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, t12_2.e, 
t12_2.ctid, t12_3.ctid, t12_3.tableoid
!  ->  Seq Scan on public.t12 t12_2
!Output: t12_2.ctid, t12_2.a, t12_2.b, t12_2.c, 
t12_2.e
!Filter: ((t12_2.a > 5) AND (t12_2.a = 3) AND 
leakproof(t12_2.a))
!  ->  Append
!->  Seq Scan on public.t12 t12_3
!  Output: t12_3.ctid, t12_3.tableoid, t12_3.a
! 

Re: [HACKERS] Improving RLS planning

2016-12-01 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> Hmm. I've not read any of the new code yet, but the fact that this
> test now reduces to a one-time filter makes it effectively useless as
> a test of qual evaluation order because it has deduced that it doesn't
> need to evaluate them. I would suggest replacing the qual with
> something that can't be reduced, perhaps "2*a = 6".

That's a good thought, I agree.

> In addition, I think that the tests on this view are probably no
> longer adequate for the purpose of validating that the qual evaluation
> order is safe. With the old implementation, the subquery scans in the
> plans made it pretty clear that it was safe, and likely to remain safe
> with variants of those queries, but that's not so obvious with the new
> plans. Maybe some additional quals could be added to the view
> definition, perhaps based on the other view columns, to verify that
> the outer leaky qual always gets evaluated after the security barrier
> quals, regardless of cost. Or perhaps that's something that's better
> proved with an all-new set of tests, but it does seem to me that the
> new implementation has a higher risk (or at least introduces different
> risks) of unsafe evaluation orders that warrant some additional
> testing.

This also sounds like a good idea to me.  I'm not sure how practical it
would be in this case, but I do think it might be a good idea to also
review the code coverage results and see if there are tests which could
improve wherever it is lacking.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Improving RLS planning

2016-12-01 Thread Dean Rasheed
On 28 November 2016 at 23:55, Stephen Frost  wrote:
>> diff --git a/src/test/regress/expected/updatable_views.out 
>> b/src/test/regress/expected/updatable_views.out
> [...]
>> --- 2104,2114 
>>
>>   EXPLAIN (VERBOSE, COSTS OFF)
>>   UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
>> ! QUERY PLAN
>> ! --
>> !  Result
>> !One-Time Filter: false
>> ! (2 rows)
>
> Perhaps Dean recalls something more specific, but reviewing this test
> and the others around it, I believe it was simply to see what happens in
> a case which doesn't pass the security barrier view constraint and to
> cover the same cases with the UPDATE as were in the SELECTs above it.  I
> don't see it being an issue that it now results in a one-time filter:
> false result.
>

Hmm. I've not read any of the new code yet, but the fact that this
test now reduces to a one-time filter makes it effectively useless as
a test of qual evaluation order because it has deduced that it doesn't
need to evaluate them. I would suggest replacing the qual with
something that can't be reduced, perhaps "2*a = 6".

In addition, I think that the tests on this view are probably no
longer adequate for the purpose of validating that the qual evaluation
order is safe. With the old implementation, the subquery scans in the
plans made it pretty clear that it was safe, and likely to remain safe
with variants of those queries, but that's not so obvious with the new
plans. Maybe some additional quals could be added to the view
definition, perhaps based on the other view columns, to verify that
the outer leaky qual always gets evaluated after the security barrier
quals, regardless of cost. Or perhaps that's something that's better
proved with an all-new set of tests, but it does seem to me that the
new implementation has a higher risk (or at least introduces different
risks) of unsafe evaluation orders that warrant some additional
testing.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-29 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Nov 28, 2016 at 6:55 PM, Stephen Frost  wrote:
> >> diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
> >> [...]
> >> + Additional rules will be needed to support safe handling of join quals
> >> + when there is a mix of security levels among join quals; for example, it
> >> + will be necessary to prevent leaky higher-security-level quals from being
> >> + evaluated at a lower join level than other quals of lower security level.
> >> + Currently there is no need to consider that since security-prioritized
> >> + quals can only be single-table restriction quals coming from RLS policies
> >> + or security-barrier views, and thus enforcement only needs to happen at
> >> + the table scan level.  With such extra rules, it should be possible to 
> >> let
> >> + security-barrier views be flattened into the parent query, allowing more
> >> + flexibility of planning while still preserving required ordering of qual
> >> + evaluation.  But that will come later.
> >
> > Are you thinking that we will be able to remove the cut-out in
> > is_simple_subquery() which currently punts whenever an RTE is marked as
> > security_barrier?  That would certainly be excellent as, currently, even
> > if everything involved is leakproof, we may end up with a poor choice of
> > plan because the join in the security barrier view must be performed
> > first.  Consider a case where we have two relativly large tables being
> > joined together in a security barrier view, but a join from outside
> > which is against a small table.  A better plan would generally be to
> > join with the smaller table first and then join the resulting small set
> > against the remaining large table.
> 
> We have to be careful that we don't introduce new security holes while
> we're improving the plans.  I guess this would be OK if the table, its
> target list, and its quals all happened to be leakproof, but otherwise
> not.  Or am I confused?

I agree that we need to be careful that we don't introduce security
holes.

Also, I do think it would be nice if we could arrange to have the same
plan in the security-barrier case as is in the no-security-barrier case
when there are no leaky functions involved.

That said, I believe this becomes a similar order-of-operations question
to make sure that values are never exposed to leaky functions until
after all necessary filtering has been performed.  In particular, when
considering joins, if all of the join operators are leakproof then we
could possibly reorder the joins however we choose, as long as anything
leaky is performed after all joins required for security correctness are
performed and all security barrier quals are applied.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Improving RLS planning

2016-11-29 Thread Tom Lane
Robert Haas  writes:
> On Mon, Nov 28, 2016 at 6:55 PM, Stephen Frost  wrote:
>> Are you thinking that we will be able to remove the cut-out in
>> is_simple_subquery() which currently punts whenever an RTE is marked as
>> security_barrier?  That would certainly be excellent as, currently, even
>> if everything involved is leakproof, we may end up with a poor choice of
>> plan because the join in the security barrier view must be performed
>> first.  Consider a case where we have two relativly large tables being
>> joined together in a security barrier view, but a join from outside
>> which is against a small table.  A better plan would generally be to
>> join with the smaller table first and then join the resulting small set
>> against the remaining large table.

> We have to be careful that we don't introduce new security holes while
> we're improving the plans.  I guess this would be OK if the table, its
> target list, and its quals all happened to be leakproof, but otherwise
> not.  Or am I confused?

The plan I have in mind --- it's not implemented in this patch --- is to
fix things so that the "lower security_level quals must be evaluated
first" rule applies to join quals.  It should then be possible to allow
flattening of security-barrier views without security holes.

One part of that would be to teach distribute_qual_to_rels about it
so that less-secure quals can't fall to a lower join level than
more-secure quals.  I think that's relatively straightforward, though
I've not tried to do it yet.

A bigger issue is that we don't have security_level attached to individual
quals at the time when view flattening gets done.  We'd need some other
way of maintaining the distinction between security quals and regular
quals between there and where RestrictInfos get built.  I don't have a good
idea about how to do that yet, but it doesn't seem insoluble.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-29 Thread Robert Haas
On Mon, Nov 28, 2016 at 6:55 PM, Stephen Frost  wrote:
>> diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
>> [...]
>> + Additional rules will be needed to support safe handling of join quals
>> + when there is a mix of security levels among join quals; for example, it
>> + will be necessary to prevent leaky higher-security-level quals from being
>> + evaluated at a lower join level than other quals of lower security level.
>> + Currently there is no need to consider that since security-prioritized
>> + quals can only be single-table restriction quals coming from RLS policies
>> + or security-barrier views, and thus enforcement only needs to happen at
>> + the table scan level.  With such extra rules, it should be possible to let
>> + security-barrier views be flattened into the parent query, allowing more
>> + flexibility of planning while still preserving required ordering of qual
>> + evaluation.  But that will come later.
>
> Are you thinking that we will be able to remove the cut-out in
> is_simple_subquery() which currently punts whenever an RTE is marked as
> security_barrier?  That would certainly be excellent as, currently, even
> if everything involved is leakproof, we may end up with a poor choice of
> plan because the join in the security barrier view must be performed
> first.  Consider a case where we have two relativly large tables being
> joined together in a security barrier view, but a join from outside
> which is against a small table.  A better plan would generally be to
> join with the smaller table first and then join the resulting small set
> against the remaining large table.

We have to be careful that we don't introduce new security holes while
we're improving the plans.  I guess this would be OK if the table, its
target list, and its quals all happened to be leakproof, but otherwise
not.  Or am I confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-28 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> >> +1 for that terminology and no renaming of fields.
> 
> > Agreed.
> 
> Here's an updated version of the RLS planning patch that gets rid of
> the incorrect interaction with Query.hasRowSecurity and adjusts
> terminology as agreed.

I've spent a fair bit of time going over this change to understand it,
how it works, and how it changes the way RLS and securiy barrier views
work.

Overall, I'm happy with how it works and don't see any serious issues
with the qual ordering or the general concept.  I did have a few
comments from my review:

> diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
> [...]
> + Additional rules will be needed to support safe handling of join quals
> + when there is a mix of security levels among join quals; for example, it
> + will be necessary to prevent leaky higher-security-level quals from being
> + evaluated at a lower join level than other quals of lower security level.
> + Currently there is no need to consider that since security-prioritized
> + quals can only be single-table restriction quals coming from RLS policies
> + or security-barrier views, and thus enforcement only needs to happen at
> + the table scan level.  With such extra rules, it should be possible to let
> + security-barrier views be flattened into the parent query, allowing more
> + flexibility of planning while still preserving required ordering of qual
> + evaluation.  But that will come later.

Are you thinking that we will be able to remove the cut-out in
is_simple_subquery() which currently punts whenever an RTE is marked as
security_barrier?  That would certainly be excellent as, currently, even
if everything involved is leakproof, we may end up with a poor choice of
plan because the join in the security barrier view must be performed
first.  Consider a case where we have two relativly large tables being
joined together in a security barrier view, but a join from outside
which is against a small table.  A better plan would generally be to
join with the smaller table first and then join the resulting small set
against the remaining large table.

Speaking of which, it seems like we should probably update the README to
include some mention, at least, of what we're doing today when it comes
to joins which involve security barrier entanglements.
  
> diff --git a/src/backend/optimizer/path/allpaths.c 
> b/src/backend/optimizer/path/allpaths.c
[...]
> ! /*
> !  * In addition to the quals inherited from the parent, we might 
> have
> !  * securityQuals associated with this particular child node.  
> (This
> !  * won't happen in inheritance cases, only with appendrels 
> originating
> !  * from UNION ALL.)  Pull them up into the baserestrictinfo for 
> the
> !  * child.  This is similar to process_security_barrier_quals() 
> for the
> !  * parent rel, except that we can't make any general deductions 
> from
> !  * such quals, since they don't hold for the whole appendrel.
> !  */

Right, this won't happen in inheritance cases because we explicitly
don't consider the quals of the children when querying through the
parent, similar to how we don't consider the GRANT-based permissions on
the child tables.  This is mentioned elsewhere but might make sense to
also mention it here, or at least say 'see expand_inherited_rtentry()'.

> *** subquery_push_qual(Query *subquery, Rang
> *** 2708,2714 
>   make_and_qual(subquery->jointree->quals, qual);
>   
>   /*
> !  * We need not change the subquery's hasAggs or hasSublinks 
> flags,
>* since we can't be pushing down any aggregates that weren't 
> there
>* before, and we don't push down subselects at all.
>*/
> --- 2748,2754 
>   make_and_qual(subquery->jointree->quals, qual);
>   
>   /*
> !  * We need not change the subquery's hasAggs or hasSubLinks 
> flags,
>* since we can't be pushing down any aggregates that weren't 
> there
>* before, and we don't push down subselects at all.
>*/

Seems like this change is unrelated to what this patch is about.  Not a
big deal, but did take me a second to realize that you were just
changing the case of the 'L' in hasSubLinks.

> +  * We also reject proposed equivalence clauses if they contain leaky 
> functions
> +  * and have security_level above zero.  The EC evaluation rules require us 
> to
> +  * apply certain tests at certain joining levels, and we can't tolerate
> +  * delaying any test on security_level grounds.  By rejecting candidate 
> clauses
> +  * that might require security delays, we ensure it's safe to apply an EC
> +  * 

Re: [HACKERS] Improving RLS planning

2016-11-13 Thread Tom Lane
Stephen Frost  writes:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> +1 for that terminology and no renaming of fields.

> Agreed.

Here's an updated version of the RLS planning patch that gets rid of
the incorrect interaction with Query.hasRowSecurity and adjusts
terminology as agreed.

regards, tom lane



new-rls-planning-implementation-2.patch.gz
Description: new-rls-planning-implementation-2.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-11 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 10 November 2016 at 17:12, Tom Lane  wrote:
> > Yeah, I think we'd be best off to avoid the bare term "security".
> > It's probably too late to change the RTE field name "securityQuals",
> > but maybe we could uniformly call those "security barrier quals" in
> > the comments.  Then the basic terminology is that we have security
> > barrier views and row-level security both implemented on top of
> > security barrier quals, and we should be careful to use the right
> > one of those three terms in comments/documentation.
> 
> +1 for that terminology and no renaming of fields.

Agreed.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Improving RLS planning

2016-11-10 Thread Dean Rasheed
On 10 November 2016 at 17:12, Tom Lane  wrote:
> Yeah, I think we'd be best off to avoid the bare term "security".
> It's probably too late to change the RTE field name "securityQuals",
> but maybe we could uniformly call those "security barrier quals" in
> the comments.  Then the basic terminology is that we have security
> barrier views and row-level security both implemented on top of
> security barrier quals, and we should be careful to use the right
> one of those three terms in comments/documentation.
>

+1 for that terminology and no renaming of fields.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-10 Thread Tom Lane
Stephen Frost  writes:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> On 8 November 2016 at 14:45, Tom Lane  wrote:
>>> ... I'm still suspicious that the three places I found may
>>> represent bugs in the management of Query.hasRowSecurity.

>> I don't believe that there are any existing bugs here, but I think 1
>> or possibly 2 of the 3 places you changed should be kept on robustness
>> grounds (see below).

> Agreed.

OK.  I'll push a small patch that adds two of those and a comment as to
why it's not appropriate in the third case.  HEAD-only should be
sufficient since we don't think this is a live bug.

>> On a related note, I think it's worth establishing a terminology
>> convention for code and comments in this whole area.

> For my 2c, 'security' is a pretty overloaded term, unfortunately.

Yeah, I think we'd be best off to avoid the bare term "security".
It's probably too late to change the RTE field name "securityQuals",
but maybe we could uniformly call those "security barrier quals" in
the comments.  Then the basic terminology is that we have security
barrier views and row-level security both implemented on top of
security barrier quals, and we should be careful to use the right
one of those three terms in comments/documentation.

Or, if you are willing to put up with renaming the field, we could
call the RTE field "barrierQuals" and then they are just "barrier
quals" for documentation purposes.  But this would be a PITA for
back-patching, so I'm not sure it's worth it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-10 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 8 November 2016 at 14:45, Tom Lane  wrote:
> > OK.  In that case I'll need to adjust the patch so that the planner keeps
> > its own flag about whether the query contains any securityQuals; that's
> > easy enough.  But I'm still suspicious that the three places I found may
> > represent bugs in the management of Query.hasRowSecurity.
> 
> I don't believe that there are any existing bugs here, but I think 1
> or possibly 2 of the 3 places you changed should be kept on robustness
> grounds (see below).

Agreed.

> On a related note, I think it's worth establishing a terminology
> convention for code and comments in this whole area. In the existing
> code, or in my head at least, there's a convention that a term that
> contains the words "row" or "policy" together with "security" refers
> specifically to RLS, not to SB views. For example:

Agreed, at least for 'row security'.  I tend to view 'policy' as just
about sufficient to stand on its own, in an 'object' type of context
(vs. something like a 'policy decision').  There aren't many other
mentions of policy in src/backend either, the notable one I found
quickly being 'LockWaitPolicy'.  That strikes me as pretty distinctive
from RLS-related policies though.

> * Row-level security or just row security for the name of the feature itself
> * row_security -- the configuration setting
> * get_row_security_policies()
> * Query.hasRowSecurity
> * rowsecurity.c
> 
> On the other hand, terms that contain just the word "security" without
> the words "row" or "policy" have a broader scope and may refer to
> either RLS or SB views. For example:

For my 2c, 'security' is a pretty overloaded term, unfortunately.  We
also have things like fmgr_security_definer(), fmgr_info_cxt_security(),
the security label system, etc, so I don't know that 'security' can
really stand on its own, except perhaps within a specific context, like
"within the rewriter and planner/optimizer, 'security' generally is
going to be talking about security barriers, be they for RLS or security
barrier views."  Even that is likely a bit of a stretch though.  I tend
think we should move in more of a 'Security Barrier'/'SecBarrier' or
similar direction.  Anyone working with the code associated with this
should understand that RLS is built on top of the security barrier
system.

I'm not sure we need to get particularly wrapped up in this, however, or
go making changes just for the sake of making them.

> It's a pretty fine distinction, and I don't know how others have been
> thinking about this, but I think that it's helpful to make the
> distinction, and there are at least a couple of places in the patch
> that use RLS-specific terminology for what could also be a SB view.

I agree that we shouldn't be using RLS-specific terminology for
components which are actually used by RLS and SB views.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Improving RLS planning

2016-11-10 Thread Dean Rasheed
On 8 November 2016 at 16:46, Robert Haas  wrote:
> On Thu, Nov 3, 2016 at 6:23 PM, Tom Lane  wrote:
>>> I think that ordering might be sub-optimal if you had a mix of
>>> leakproof quals and security quals and the cost of some security quals
>>> were significantly higher than the cost of some other quals. Perhaps
>>> all leakproof quals should be assigned security_level 0, to allow them
>>> to be checked earlier if they have a lower cost (whether or not they
>>> are security quals), and only leaky quals would have a security_level
>>> greater than zero. Rule 1 would then not need to check whether the
>>> qual was leakproof, and you probably wouldn't need the separate
>>> "leakproof" bool field on RestrictInfo.
>>
>> Hm, but it would also force leakproof quals to be evaluated in front
>> of potentially-cheaper leaky quals, whether or not that's semantically
>> necessary.
>>

True. That's also what currently happens with RLS and SB views because
leakproof quals are pushed down into subqueries without considering
their cost. It would be nice to do better than that.


>> I experimented with ignoring security_level altogether for leakproof
>> quals, but I couldn't make it work properly, because that didn't lead to
>> a comparison rule that satisfies transitivity.  For instance, consider
>> three quals:
>> A: cost 1, security_level 1, leaky
>> B: cost 2, security_level 1, leakproof
>> C: cost 3, security_level 0, leakproof
>> A should sort before B, since same security_level and lower cost;
>> B should sort before C, since lower cost and leakproof;
>> but A must sort after C, since higher security_level and leaky.
>
> Yeah, this is pretty thorny.  IIUC, all leaky quals of a given
> security level must be evaluated before any quals of the next higher
> security level, or we have a security problem.  Beyond that, we'd
> *prefer* to evaluate cheaper quals first (though perhaps we ought to
> be also thinking about how selective they are) but that's "just" a
> matter of how good the query plan is.  So in this example, security
> dictates that C must precede A, but that's it.  We can pick between
> C-A-B, C-B-A, and B-C-A based on cost.  C-B-A is clearly inferior to
> either of the other two, but it's less obvious whether C-A-B or B-C-A
> is better.  If you expect each predicate to have a selectivity of 50%,
> then C-A-B costs 3+(0.5*1)+(0.25*2) = 4 while B-C-A costs
> 2+(0.5*3)+(0.25*1) = 3.75, so B-C-A is better.  But now make the cost
> of B and C 18 and 20 while keeping the cost of A at 1.  Now C-A-B
> costs 20+(0.5*1)+(0.25*18) = 25 while B-C-A costs 18+(0.5*20)+(0.25*1)
> = 28.25, so now C-A-B is better.
>
> So I think any attempt to come up with a transitive comparison rule is
> doomed.  We could do something like: sort by cost then security level;
> afterwards, allow leakproof qual to migrate forward as many position
> as is possible without passing a qual that is either higher-cost or
> (non-leakproof and lower security level).  So in the above example we
> would start by sorting the like C-A-B and then check whether B can
> move forward; it can't, so we're done.  If all operators were
> leakproof, this would essentially turn into an insertion-sort that
> orders them strictly by cost, whereas if they're all leaky, it orders
> strictly by security level and then by cost.  With a mix of leaky and
> non-leaky operators you get something in the middle.
>
> I'm not sure that this is practically better than the hack you
> proposed, but I wanted to take the time to comment on the theory here,
> as I see it anyway.
>

Yes, I think you're right. It doesn't look possible to invent a
transitive comparison rule.

I thought perhaps the rule could be to only "push down" a leakproof
qual (change it to a lower security_level) if there are more expensive
quals at the lower level, but as you point out, this doesn't guarantee
cheaper execution.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-10 Thread Dean Rasheed
On 8 November 2016 at 14:45, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> * Since the planner is now depending on Query.hasRowSecurity to be set
>>> whenever there are any securityQuals, I put in an Assert about that,
>>> and promptly found three places in prepjointree.c and the rewriter where
>>> we'd been failing to set it.  I have not looked to see if these represent
>>> live bugs in existing releases, but they might.  Or am I misunderstanding
>>> what the flag is supposed to mean?
>
>> They're independent, actually.  securityQuals can be set via either
>> security barrier view or from RLS, while hasRowSecurity is specifically
>> for the RLS case.  The reason for the distinction is that changing your
>> role isn't going to impact security barrier views at all, while it could
>> impact what RLS policies are used.  See extract_query_dependencies().
>

Right. securityQuals was added for updatable SB views, which pre-dates RLS.


> OK.  In that case I'll need to adjust the patch so that the planner keeps
> its own flag about whether the query contains any securityQuals; that's
> easy enough.  But I'm still suspicious that the three places I found may
> represent bugs in the management of Query.hasRowSecurity.
>

I don't believe that there are any existing bugs here, but I think 1
or possibly 2 of the 3 places you changed should be kept on robustness
grounds (see below).

Query.hasRowSecurity is only used for invalidation of cached plans,
when the current role or environment changes, causing a change to the
set of policies that need to be applied, and thus requiring that the
query be re-planned. This happens in extract_query_dependencies(),
which walks the query tree and will find any Query.hasRowSecurity
flags and set PlannerGlobal.dependsOnRole, which is sufficient for its
intended purpose.

Regarding the 3 places you mention...

1). rewriteRuleAction() doesn't need to set Query.hasRowSecurity
because it's called during "Step 1" of the rewriter, where non-SELECT
rules are expanded, but RLS expansion doesn't happen until later, at
the end of "Step 2", after SELECT rules are expanded. That said, you
could argue that copying the flag in rewriteRuleAction() makes the
code more bulletproof, even though it is expected to always be false
at that point.

2). pull_up_simple_subquery() technically doesn't need to set
Query.hasRowSecurity because nothing in the planner refers to it, and
plancache.c will have already recorded the fact that the original
query had RLS. However, this seems like a bug waiting to happen, and
this really ought to be copying the flag in case we later add code
that does look at the flag later in the planning process.

3). rewriteTargetView() should not set the flag because the flag is
only for RLS, not for SB views, and we don't want cached plans for SB
views to be invalidated.


On a related note, I think it's worth establishing a terminology
convention for code and comments in this whole area. In the existing
code, or in my head at least, there's a convention that a term that
contains the words "row" or "policy" together with "security" refers
specifically to RLS, not to SB views. For example:

* Row-level security or just row security for the name of the feature itself
* row_security -- the configuration setting
* get_row_security_policies()
* Query.hasRowSecurity
* rowsecurity.c

On the other hand, terms that contain just the word "security" without
the words "row" or "policy" have a broader scope and may refer to
either RLS or SB views. For example:

* RangeTblEntry.security_barrier
* RangeTblEntry.securityQuals
* expand_security_quals()
* prepsecurity.c
* The new security_level field

It's a pretty fine distinction, and I don't know how others have been
thinking about this, but I think that it's helpful to make the
distinction, and there are at least a couple of places in the patch
that use RLS-specific terminology for what could also be a SB view.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-08 Thread Robert Haas
On Thu, Nov 3, 2016 at 6:23 PM, Tom Lane  wrote:
>> I think that ordering might be sub-optimal if you had a mix of
>> leakproof quals and security quals and the cost of some security quals
>> were significantly higher than the cost of some other quals. Perhaps
>> all leakproof quals should be assigned security_level 0, to allow them
>> to be checked earlier if they have a lower cost (whether or not they
>> are security quals), and only leaky quals would have a security_level
>> greater than zero. Rule 1 would then not need to check whether the
>> qual was leakproof, and you probably wouldn't need the separate
>> "leakproof" bool field on RestrictInfo.
>
> Hm, but it would also force leakproof quals to be evaluated in front
> of potentially-cheaper leaky quals, whether or not that's semantically
> necessary.
>
> I experimented with ignoring security_level altogether for leakproof
> quals, but I couldn't make it work properly, because that didn't lead to
> a comparison rule that satisfies transitivity.  For instance, consider
> three quals:
> A: cost 1, security_level 1, leaky
> B: cost 2, security_level 1, leakproof
> C: cost 3, security_level 0, leakproof
> A should sort before B, since same security_level and lower cost;
> B should sort before C, since lower cost and leakproof;
> but A must sort after C, since higher security_level and leaky.

Yeah, this is pretty thorny.  IIUC, all leaky quals of a given
security level must be evaluated before any quals of the next higher
security level, or we have a security problem.  Beyond that, we'd
*prefer* to evaluate cheaper quals first (though perhaps we ought to
be also thinking about how selective they are) but that's "just" a
matter of how good the query plan is.  So in this example, security
dictates that C must precede A, but that's it.  We can pick between
C-A-B, C-B-A, and B-C-A based on cost.  C-B-A is clearly inferior to
either of the other two, but it's less obvious whether C-A-B or B-C-A
is better.  If you expect each predicate to have a selectivity of 50%,
then C-A-B costs 3+(0.5*1)+(0.25*2) = 4 while B-C-A costs
2+(0.5*3)+(0.25*1) = 3.75, so B-C-A is better.  But now make the cost
of B and C 18 and 20 while keeping the cost of A at 1.  Now C-A-B
costs 20+(0.5*1)+(0.25*18) = 25 while B-C-A costs 18+(0.5*20)+(0.25*1)
= 28.25, so now C-A-B is better.

So I think any attempt to come up with a transitive comparison rule is
doomed.  We could do something like: sort by cost then security level;
afterwards, allow leakproof qual to migrate forward as many position
as is possible without passing a qual that is either higher-cost or
(non-leakproof and lower security level).  So in the above example we
would start by sorting the like C-A-B and then check whether B can
move forward; it can't, so we're done.  If all operators were
leakproof, this would essentially turn into an insertion-sort that
orders them strictly by cost, whereas if they're all leaky, it orders
strictly by security level and then by cost.  With a mix of leaky and
non-leaky operators you get something in the middle.

I'm not sure that this is practically better than the hack you
proposed, but I wanted to take the time to comment on the theory here,
as I see it anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-08 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> * Since the planner is now depending on Query.hasRowSecurity to be set
>> whenever there are any securityQuals, I put in an Assert about that,
>> and promptly found three places in prepjointree.c and the rewriter where
>> we'd been failing to set it.  I have not looked to see if these represent
>> live bugs in existing releases, but they might.  Or am I misunderstanding
>> what the flag is supposed to mean?

> They're independent, actually.  securityQuals can be set via either
> security barrier view or from RLS, while hasRowSecurity is specifically
> for the RLS case.  The reason for the distinction is that changing your
> role isn't going to impact security barrier views at all, while it could
> impact what RLS policies are used.  See extract_query_dependencies().

OK.  In that case I'll need to adjust the patch so that the planner keeps
its own flag about whether the query contains any securityQuals; that's
easy enough.  But I'm still suspicious that the three places I found may
represent bugs in the management of Query.hasRowSecurity.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-11-08 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Dean Rasheed  writes:
> > On 25 October 2016 at 22:58, Tom Lane  wrote:
> >> The alternative I'm now thinking about pursuing is to get rid of the
> >> conversion of RLS quals to subqueries.  Instead, we can label individual
> >> qual clauses with security precedence markings.
> 
> > +1 for this approach. It looks like it could potentially be much
> > simpler. There's some ugly code in the inheritance planner (and
> > probably one or two other places) that it might be possible to chop
> > out, which would probably also speed up planning times.
> 
> Here's a draft patch for this.  I've only addressed the RLS use-case
> so far, so this doesn't get into managing the order of application of
> join quals, only restriction quals.

Thanks much for working on this.

Just a few relatively quick comments:

> >> 2. In order_qual_clauses, sort first by security_level and second by cost.
> 
> > I think that ordering might be sub-optimal if you had a mix of
> > leakproof quals and security quals and the cost of some security quals
> > were significantly higher than the cost of some other quals. Perhaps
> > all leakproof quals should be assigned security_level 0, to allow them
> > to be checked earlier if they have a lower cost (whether or not they
> > are security quals), and only leaky quals would have a security_level
> > greater than zero. Rule 1 would then not need to check whether the
> > qual was leakproof, and you probably wouldn't need the separate
> > "leakproof" bool field on RestrictInfo.
> 
> Hm, but it would also force leakproof quals to be evaluated in front
> of potentially-cheaper leaky quals, whether or not that's semantically
> necessary.

I agree that this is a concern.

> I experimented with ignoring security_level altogether for leakproof
> quals, but I couldn't make it work properly, because that didn't lead to
> a comparison rule that satisfies transitivity.  For instance, consider
> three quals:
>   A: cost 1, security_level 1, leaky
>   B: cost 2, security_level 1, leakproof
>   C: cost 3, security_level 0, leakproof
> A should sort before B, since same security_level and lower cost;
> B should sort before C, since lower cost and leakproof;
> but A must sort after C, since higher security_level and leaky.
> 
> So what I ended up doing was using your idea of forcing the security
> level to 0 for leakproof quals, but only if they have cost below a
> threshold (which I set at 10X cpu_operator_cost, which should take in
> most built-in functions).  That at least limits the possible damage
> from forcing early evaluation of a leakproof qual.  There may be
> some better way to do it, though, so I didn't go so far as to remove
> the separate leakproof flag.

I'm not a huge fan of these kinds of magic values, particularly when
they can't be independently managed, but, from a practical perspective,
I have a hard time seeing a real problem with this approach.

> Some other notes:
> 
> * This creates a requirement on scan-planning code (and someday on
> join-planning code) to be sure it doesn't create violations of the qual
> ordering rule.  Currently only indxpath.c and tidpath.c have to worry
> about that AFAICS.  FDWs would need to worry about it too, except that
> we don't currently allow RLS to be enabled on foreign tables.  I'm a
> little concerned about whether FDWs could create security holes by
> not accounting for this, but it's moot for now.  Custom scan providers
> will need to pay attention as well.

I'd certainly like to support RLS on FDWs (and views...), but we need to
hash out what the exact semantics of that will be and this looks like
something we should be able to address when we look at adding that
support.

> * prepsecurity.c is now dead code and should be removed, but I did not
> include that in this patch, since it would just bloat the patch.

Sure.

> * Accounting for the removal of prepsecurity.c, this is actually a net
> savings of about 300 lines of code.  So I feel pretty good about that.
> It also gets rid of some really messy kluges, particularly the behavior
> of generating new subquery RTEs as late as halfway through
> grouping_planner.  I find it astonishing that that worked at all.

I'm a bit surprised to hear that as we've been doing that for quite a
while, though looking back on it, I can see why you bring it up.

> * Since the planner is now depending on Query.hasRowSecurity to be set
> whenever there are any securityQuals, I put in an Assert about that,
> and promptly found three places in prepjointree.c and the rewriter where
> we'd been failing to set it.  I have not looked to see if these represent
> live bugs in existing releases, but they might.  Or am I misunderstanding
> what the flag is supposed to mean?

They're independent, actually.  securityQuals can be set via either
security barrier view or from RLS, while hasRowSecurity is specifically
for the RLS 

Re: [HACKERS] Improving RLS planning

2016-11-03 Thread Tom Lane
Dean Rasheed  writes:
> On 25 October 2016 at 22:58, Tom Lane  wrote:
>> The alternative I'm now thinking about pursuing is to get rid of the
>> conversion of RLS quals to subqueries.  Instead, we can label individual
>> qual clauses with security precedence markings.

> +1 for this approach. It looks like it could potentially be much
> simpler. There's some ugly code in the inheritance planner (and
> probably one or two other places) that it might be possible to chop
> out, which would probably also speed up planning times.

Here's a draft patch for this.  I've only addressed the RLS use-case
so far, so this doesn't get into managing the order of application of
join quals, only restriction quals.

>> 2. In order_qual_clauses, sort first by security_level and second by cost.

> I think that ordering might be sub-optimal if you had a mix of
> leakproof quals and security quals and the cost of some security quals
> were significantly higher than the cost of some other quals. Perhaps
> all leakproof quals should be assigned security_level 0, to allow them
> to be checked earlier if they have a lower cost (whether or not they
> are security quals), and only leaky quals would have a security_level
> greater than zero. Rule 1 would then not need to check whether the
> qual was leakproof, and you probably wouldn't need the separate
> "leakproof" bool field on RestrictInfo.

Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.

I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity.  For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.

So what I ended up doing was using your idea of forcing the security
level to 0 for leakproof quals, but only if they have cost below a
threshold (which I set at 10X cpu_operator_cost, which should take in
most built-in functions).  That at least limits the possible damage
from forcing early evaluation of a leakproof qual.  There may be
some better way to do it, though, so I didn't go so far as to remove
the separate leakproof flag.

Some other notes:

* This creates a requirement on scan-planning code (and someday on
join-planning code) to be sure it doesn't create violations of the qual
ordering rule.  Currently only indxpath.c and tidpath.c have to worry
about that AFAICS.  FDWs would need to worry about it too, except that
we don't currently allow RLS to be enabled on foreign tables.  I'm a
little concerned about whether FDWs could create security holes by
not accounting for this, but it's moot for now.  Custom scan providers
will need to pay attention as well.

* prepsecurity.c is now dead code and should be removed, but I did not
include that in this patch, since it would just bloat the patch.

* Accounting for the removal of prepsecurity.c, this is actually a net
savings of about 300 lines of code.  So I feel pretty good about that.
It also gets rid of some really messy kluges, particularly the behavior
of generating new subquery RTEs as late as halfway through
grouping_planner.  I find it astonishing that that worked at all.

* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it.  I have not looked to see if these represent
live bugs in existing releases, but they might.  Or am I misunderstanding
what the flag is supposed to mean?

* Aside from plan changes, there's one actual behavioral change in the
regression test results, but I think it's okay because it's a question
of whether to put a non-RLS qual before or after a leaky qual.  That's
not something we are promising anything about.

* There's one test query in updatable_views.sql where the plan collapses
to a dummy (Result with constant false qual) because the planner is now
able to see that the qual conditions are mutually contradictory.  Maybe
that query needs adjustment; I'm not sure what it's intending to test
exactly.

regards, tom lane

PS: I've been slacking on the commitfest because I wanted to push this
to a reasonably finished state before I set it aside to do CF work.
I'm not expecting it to be reviewed in this fest.



new-rls-planning-implementation-1.patch.gz
Description: new-rls-planning-implementation-1.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] Improving RLS planning

2016-10-27 Thread Dean Rasheed
On 25 October 2016 at 22:58, Tom Lane  wrote:
> The alternative I'm now thinking about pursuing is to get rid of the
> conversion of RLS quals to subqueries.  Instead, we can label individual
> qual clauses with security precedence markings.  Concretely, suppose we
> add an "int security_level" field to struct RestrictInfo.  The semantics
> of this would be that a qual with a lower security_level value must be
> evaluated before a qual with a higher security_level value, unless the
> latter qual is leakproof.  (It would likely also behoove us to add a
> "leakproof" bool field to struct RestrictInfo, to avoid duplicate
> leakproof-ness checks on quals.  But that's just an optimization.)
>

+1 for this approach. It looks like it could potentially be much
simpler. There's some ugly code in the inheritance planner (and
probably one or two other places) that it might be possible to chop
out, which would probably also speed up planning times.


> In the initial implementation, quals coming from a RangeTblEntry's
> securityQuals field would have security_level 0, quals coming from
> anywhere else would have security_level 1; except that if we know
> there are no security quals anywhere (ie not Query->hasRowSecurity),
> we could give all quals security_level 0.  (I think this exception
> may be worth making because there's no need to test leakproofness
> for a qual with security level 0; it could never be a candidate
> for security delay anyway.)
>

Note that the securityQuals list represents nested levels of security
barrier (e.g., nested SB views), so you'd have to actually assign
security_level 0 to the first security qual, security_level 1 to the
second security qual, and so on. Then the quals coming from anywhere
else would have to have a security_level one greater than the maximum
of all the other security levels.


> Having done that much, I think all we need in order to get rid of
> RLS subqueries, and just stick RLS quals into their relation's
> baserestrictinfo list, are two rules:
>
> 1. When selecting potential indexquals, a RestrictInfo can be considered
> for indexqual use only if it is leakproof or has security_level <= the
> minimum among the table's baserestrictinfo clauses.
>
> 2. In order_qual_clauses, sort first by security_level and second by cost.
>

I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-10-26 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 26, 2016 at 1:20 PM, Tom Lane  wrote:
>> Right, so quals from above the SB view would have to not be allowed to
>> drop below the join level (but they could fall *to* the join level,
>> where they'd be applied after the join's own quals).  I mentioned that
>> in the part of the message you cut.  I don't have a detailed design yet
>> but it seems possible, and I expect it to be a lot simpler than the Rube
>> Goldberg design we've got for SB views now.

> OK; it wasn't clear to me that you had considered that case.  I'm not
> convinced that what you end up with is going to be simpler than what
> we have now, but if it is, great.

Well, we already have mechanisms for controlling how far down the join
tree upper quals can fall; outer joins in particular require that.  So
I'm thinking that it shouldn't take a lot of additional code for
distribute_qual_to_rels to handle this too.  Admittedly, the amount of
boilerplate elsewhere, if it turns out we need a new jointree nodetype
to control this, is not negligible.  But I'm thinking it'll be a lot more
straightforward.  There's weird warts for security quals all over the
planner right now, and there are still some things about them that I think
work only by accident.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-10-26 Thread Robert Haas
On Wed, Oct 26, 2016 at 1:20 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> This might work for RLS policies, if they can only reference a single
>> table, but I can't see how it's going to work for security barrier
>> views.  For example, consider CREATE VIEW v WITH (security_barrier) AS
>> SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
>> leak(somefield).  somefield is necessarily coming from either x or y,
>> and you can't let it be passed to leak() except for rows where the
>> join qual has been satisfied.
>
> Right, so quals from above the SB view would have to not be allowed to
> drop below the join level (but they could fall *to* the join level,
> where they'd be applied after the join's own quals).  I mentioned that
> in the part of the message you cut.  I don't have a detailed design yet
> but it seems possible, and I expect it to be a lot simpler than the Rube
> Goldberg design we've got for SB views now.

OK; it wasn't clear to me that you had considered that case.  I'm not
convinced that what you end up with is going to be simpler than what
we have now, but if it is, great.  One of the reasons I did the
initial security_barrier stuff this way was to avoid inventing a lot
of new stuff.  Subqueries already acted as optimization fences and
that was what we needed for this, so it made sense to me to build on
top of that.  Now, if we do build stuff specifically for this purpose,
it can probably be smarter than what we have today, and that is fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-10-26 Thread Tom Lane
Robert Haas  writes:
> This might work for RLS policies, if they can only reference a single
> table, but I can't see how it's going to work for security barrier
> views.  For example, consider CREATE VIEW v WITH (security_barrier) AS
> SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
> leak(somefield).  somefield is necessarily coming from either x or y,
> and you can't let it be passed to leak() except for rows where the
> join qual has been satisfied.

Right, so quals from above the SB view would have to not be allowed to
drop below the join level (but they could fall *to* the join level,
where they'd be applied after the join's own quals).  I mentioned that
in the part of the message you cut.  I don't have a detailed design yet
but it seems possible, and I expect it to be a lot simpler than the Rube
Goldberg design we've got for SB views now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-10-26 Thread Robert Haas
On Tue, Oct 25, 2016 at 5:58 PM, Tom Lane  wrote:
> The alternative I'm now thinking about pursuing is to get rid of the
> conversion of RLS quals to subqueries.  Instead, we can label individual
> qual clauses with security precedence markings.  Concretely, suppose we
> add an "int security_level" field to struct RestrictInfo.  The semantics
> of this would be that a qual with a lower security_level value must be
> evaluated before a qual with a higher security_level value, unless the
> latter qual is leakproof.  (It would likely also behoove us to add a
> "leakproof" bool field to struct RestrictInfo, to avoid duplicate
> leakproof-ness checks on quals.  But that's just an optimization.)
>
> In the initial implementation, quals coming from a RangeTblEntry's
> securityQuals field would have security_level 0, quals coming from
> anywhere else would have security_level 1; except that if we know
> there are no security quals anywhere (ie not Query->hasRowSecurity),
> we could give all quals security_level 0.  (I think this exception
> may be worth making because there's no need to test leakproofness
> for a qual with security level 0; it could never be a candidate
> for security delay anyway.)
>
> Having done that much, I think all we need in order to get rid of
> RLS subqueries, and just stick RLS quals into their relation's
> baserestrictinfo list, are two rules:
>
> 1. When selecting potential indexquals, a RestrictInfo can be considered
> for indexqual use only if it is leakproof or has security_level <= the
> minimum among the table's baserestrictinfo clauses.
>
> 2. In order_qual_clauses, sort first by security_level and second by cost.

This might work for RLS policies, if they can only reference a single
table, but I can't see how it's going to work for security barrier
views.  For example, consider CREATE VIEW v WITH (security_barrier) AS
SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
leak(somefield).  somefield is necessarily coming from either x or y,
and you can't let it be passed to leak() except for rows where the
join qual has been satisfied.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2016-10-25 Thread David Rowley
On 26 October 2016 at 10:58, Tom Lane  wrote:
> The alternative I'm now thinking about pursuing is to get rid of the
> conversion of RLS quals to subqueries.  Instead, we can label individual
> qual clauses with security precedence markings.  Concretely, suppose we
> add an "int security_level" field to struct RestrictInfo.  The semantics
> of this would be that a qual with a lower security_level value must be
> evaluated before a qual with a higher security_level value, unless the
> latter qual is leakproof.  (It would likely also behoove us to add a
> "leakproof" bool field to struct RestrictInfo, to avoid duplicate
> leakproof-ness checks on quals.  But that's just an optimization.)

I wonder if there will be a need for more security_levels in the
future, otherwise perhaps a more generic field can be added, like "int
evaulation_flags".
In [1], there's still things to work out with it, but I mentioned that
I'd like to improve equivalence classes to handle more than just
simple equality, which seems to require "optional" RestrictInfos. It
would be nice if we could store all this in one field as a set of
bits.

[1] 
https://www.postgresql.org/message-id/cakjs1f9fpdlkm6%3dsuzagwuch3otbspk6k0yt8-a1hgjfapl...@mail.gmail.com
-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers