Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-09 Thread KaiGai Kohei
(2010/06/08 11:15), Robert Haas wrote:
 2010/6/7 KaiGai Koheikai...@ak.jp.nec.com:
 Our headache is on functions categorized to middle-threat. It enables to
 leak the given arguments using error messages. Here are several ideas,
 but they have good and bad points.
 
 I think we are altogether off in the weeds here.  We ought to start
 with an implementation that pushes nothing down, and then try to
 figure out how much we can relax that without too much compromising
 security.
 

The attached patch tries to prevent pushing down anything into subqueries
from outside of them.

The distribute_qual_to_rels() tries to distribute the given qualifier
into a certain scanning-plan based on the dependency of qualifier.

E.g) SELECT * FROM (SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE 
f_policy(t1.a)) WHERE f_user(t2.x);

In this case, f_user() function depends on only t2 table, so it is
reasonable to attach on the scanning plan of t2 from perspective of
performance.

However, f_user() may have a side-effect which writes arguments into
somewhere. If here is such a possibility, f_user() should not be called
before the joined tuples being filtered by f_policy().

In the case when we can ensure all functions within the qualifier are
enough trustable, we don't need to prevent them to be pushed down.
But the algorithm to determine it is under discussion. So, right now,
we prevent all the possible pushing down.

Example.1)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
SELECT * FROM v1 WHERE f_malicious(b);

 * without this patch
postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
---
 Hash Join  (cost=639.01..667.29 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   -  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=637.30..637.30 rows=137 width=36)
 -  Seq Scan on t1  (cost=0.00..637.30 rows=137 width=36)
   Filter: (f_policy(a) AND f_malicious(b))
(6 rows)

 * with this patch
postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(b);
QUERY PLAN
---
 Hash Join  (cost=334.93..468.44 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   Join Filter: f_malicious(t1.b)
   -  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=329.80..329.80 rows=410 width=36)
 -  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
   Filter: f_policy(a)
(7 rows)

It prevents to push down f_malicious() inside of the join loop.


Example.2)  CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON a = x WHERE 
f_policy(a);
SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE f_malicious(b);

  * without this patch
postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
  QUERY PLAN

---
 Hash Join  (cost=669.01..697.29 rows=137 width=108)
   Hash Cond: (t3.s = t1.a)
   -  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=667.29..667.29 rows=137 width=72)
 -  Hash Join  (cost=639.01..667.29 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   -  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=637.30..637.30 rows=137 width=36)
 -  Seq Scan on t1  (cost=0.00..637.30 rows=137 
width=36)
   Filter: (f_policy(a) AND f_malicious(b))
(10 rows)

  * with this patch
postgres=# EXPLAIN SELECT * FROM v1 JOIN t3 ON v1.a=t3.s WHERE 
f_malicious(b);
  QUERY PLAN

---
 Hash Join  (cost=470.15..498.43 rows=137 width=108)
   Hash Cond: (t3.s = t1.a)
   -  Seq Scan on t3  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=468.44..468.44 rows=137 width=72)
 -  Hash Join  (cost=334.93..468.44 rows=137 width=72)
   Hash Cond: (t2.x = t1.a)
   Join Filter: f_malicious(t1.b)
   -  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   -  Hash  (cost=329.80..329.80 rows=410 width=36)
 -  Seq Scan on t1  (cost=0.00..329.80 rows=410 
width=36)
   Filter: f_policy(a)
(11 rows)

It also prevents f_malisious() to be pushed down into the join loop within view,
but we can push it down into same level of the query.


Please note that it specially handles equality operator at the bottom half of
the distribute_qual_to_rels(), so this patch does not care about these cases.
However, I'm not in hustle to 

Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas

On 07/06/10 06:06, Stephen Frost wrote:

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much.  The amount of
data one could acquire that way is pretty limited.


It's not limited. It allows you to read all contents of the underlying 
table or tables. I don't see much point doing anything at all if we 
don't plug that.


There's many side channels like exposing row counts in EXPLAIN and 
statistics and timing attacks, that are not as critical, because they 
don't let expose all data, and the attacker can't accurately choose what 
data is exposed. Those are not as important.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 15:48), Heikki Linnakangas wrote:
 On 07/06/10 06:06, Stephen Frost wrote:
 Also, perhaps I'm not being paranoid enough, but all this concern over
 error cases really doesn't really worry me that much. The amount of
 data one could acquire that way is pretty limited.
 
 It's not limited. It allows you to read all contents of the underlying 
 table or tables. I don't see much point doing anything at all if we 
 don't plug that.
 
IIRC, Oracle pays attention not to expose function's arguments via
error messages due to the security reasons, although it focuses on
that does not provide a hint to attacker who tries SQL injection.

It seems to me it is a matter of degree.

If we allows to execute lowrite() inside of the join loop, it can
be abused to move massive invisible information into visible large
objects within a little time. So, its priority is relatively higher
than error messages.

However, we cannot move massive information via error messages in
a little time, although it may expose whole of the arguments.
So, its threat is relatively smaller.

 There's many side channels like exposing row counts in EXPLAIN and 
 statistics and timing attacks, that are not as critical, because they 
 don't let expose all data, and the attacker can't accurately choose what 
 data is exposed. Those are not as important.
 
It also means; because they can provide much smaller bandwidth to leak
invisible information than error messages, these are not as important.
Is it right?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas
On 07/06/10 10:30, KaiGai Kohei wrote:
 (2010/06/07 15:48), Heikki Linnakangas wrote:
 There's many side channels like exposing row counts in EXPLAIN and
 statistics and timing attacks, that are not as critical, because they
 don't let expose all data, and the attacker can't accurately choose what
 data is exposed. Those are not as important.

 It also means; because they can provide much smaller bandwidth to leak
 invisible information than error messages, these are not as important.
 Is it right?

The big difference is what information can be obtained, not how fast it
can be obtained.

Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
Heikki,

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 The big difference is what information can be obtained, not how fast it
 can be obtained.

Actually, I disagree.  Time required to acquire the data does matter.

 Imagine a table that holds username/passwords for users. Each user is
 allowed to see his own row, including password, but not anyone else's.
 EXPLAIN side-channel might give pretty accurate information of how many
 rows there is in the table, and via clever EXPLAIN+statistics probing
 you might be able to find out what the top-10 passwords are, for
 example. But if you wanted to know what your neighbor's password is, the
 side-channels would not help you much, but an error message would reveal
 it easily.

Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Heikki Linnakangas

On 07/06/10 14:06, Stephen Frost wrote:

* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:

The big difference is what information can be obtained, not how fast it
can be obtained.


Actually, I disagree.  Time required to acquire the data does matter.


Depends on the magnitude, of course. If it takes 1 year per row, that's 
probably acceptable. If it takes 1 second, that's extremely slow 
compared to normal queries, but most likely still disastreous from a 
security point of view.



Imagine a table that holds username/passwords for users. Each user is
allowed to see his own row, including password, but not anyone else's.
EXPLAIN side-channel might give pretty accurate information of how many
rows there is in the table, and via clever EXPLAIN+statistics probing
you might be able to find out what the top-10 passwords are, for
example. But if you wanted to know what your neighbor's password is, the
side-channels would not help you much, but an error message would reveal
it easily.


Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case?  That strikes me as
difficult, but perhaps I'm not thinking creatively enough.


WHERE should do it:

SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
password::integer = 1234;

ERROR:  invalid input syntax for integer: neighborssecretpassword

Assuming that username = 'neighbor' is evaluated before the cast.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND  
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.

Fair enough, so we can't allow built-ins either, except perhaps in very
specific/limited situations.  Still, if we track that the above WHERE
and password::integer calls *should* be run as role X, while the view
should run as role Y, maybe we can at least identify the case where
we've ended up in a situation where we are going to expose unintended
data.  I don't know enough about the optimizer or the planner to have
any clue how we might teach them to actually avoid doing such, though I
certainly believe it could end up being a disaster on performance based
on comments from others who know better. :)

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 20:53), Heikki Linnakangas wrote:
 On 07/06/10 14:06, Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 The big difference is what information can be obtained, not how fast it
 can be obtained.

 Actually, I disagree. Time required to acquire the data does matter.
 
 Depends on the magnitude, of course. If it takes 1 year per row, that's 
 probably acceptable. If it takes 1 second, that's extremely slow 
 compared to normal queries, but most likely still disastreous from a 
 security point of view.
 
FYI, the classic also mentioned about bandwidth of covert channel,
although it was already obsoleted.

See the page.80 of:
  http://csrc.nist.gov/publications/history/dod85.pdf

It said 1bit/sec are acceptable on DoD in 25 years ago.

 Imagine a table that holds username/passwords for users. Each user is
 allowed to see his own row, including password, but not anyone else's.
 EXPLAIN side-channel might give pretty accurate information of how many
 rows there is in the table, and via clever EXPLAIN+statistics probing
 you might be able to find out what the top-10 passwords are, for
 example. But if you wanted to know what your neighbor's password is, the
 side-channels would not help you much, but an error message would reveal
 it easily.

 Using only built-ins, could you elaborate on how one could pick exactly
 what row was revealed using an error case? That strikes me as
 difficult, but perhaps I'm not thinking creatively enough.
 
 WHERE should do it:
 
 SELECT * FROM secrets_view WHERE username = 'neighbor' AND 
 password::integer = 1234;
 ERROR: invalid input syntax for integer: neighborssecretpassword
 
 Assuming that username = 'neighbor' is evaluated before the cast.
 

In this case, is it unnecessary to expose the given argument in
the error message (from security perspective), isn't it?
Because it is basically matter of the integer input handler,
it seems to me what we should fix up is int4in(), not optimizer.

Perhaps, we should categorize the issued functionalities base on
the level of its threat when abused.

* High-threat
Functions have side-effect that allows to move the given arguments
into another tables or other high-bandwidth chennel.
E.g) lowrite(), pg_write_file()

 - It should be fixed soon.

* Middle-threat
Functions have side-effect that allows to move the given arguments
using error messages or other low-bandwidth channel.
E.g) int4in()

 - It should be fixed in long term.

* Row-threat
Functions can imply existence of invisible tuples, but it does not
expose the value itself.
E.g) EXPLAIN statement, PK/FK constraints

 - It should not be fixed in PostgreSQL.

Now we allow all of them.
But isn't it valuable to fix the high-threat first?
Then, we can revise error messages in built-in functions, I think.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/07 21:56), Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.
 
 Fair enough, so we can't allow built-ins either, except perhaps in very
 specific/limited situations.  Still, if we track that the above WHERE
 and password::integer calls *should* be run as role X, while the view
 should run as role Y, maybe we can at least identify the case where
 we've ended up in a situation where we are going to expose unintended
 data.  I don't know enough about the optimizer or the planner to have
 any clue how we might teach them to actually avoid doing such, though I
 certainly believe it could end up being a disaster on performance based
 on comments from others who know better. :)
 

My opinion is that it is a matter in individual functions, not optimizer.
Basically, built-in functions *should* be trusted, because our security
mechanism is not designed to prevent anything from malicious internal
binary modules.

Historically, we have not known the risk to leak invisible information
using error messages for a long time, so most of internal functions have
not been designed not to return users unnecessary information.
If so, it needs to revise error messages, but it will not complete with
a single commit.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Tom Lane
KaiGai Kohei kai...@ak.jp.nec.com writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?

Yes, if all you care about is security and not usability, that looks
like a great solution.  We're *not* doing it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 9:23), KaiGai Kohei wrote:
 (2010/06/07 21:56), Stephen Frost wrote:
 * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
 WHERE should do it:

 SELECT * FROM secrets_view WHERE username = 'neighbor' AND
 password::integer = 1234;
 ERROR:  invalid input syntax for integer: neighborssecretpassword

 Assuming that username = 'neighbor' is evaluated before the cast.

 Fair enough, so we can't allow built-ins either, except perhaps in very
 specific/limited situations.  Still, if we track that the above WHERE
 and password::integer calls *should* be run as role X, while the view
 should run as role Y, maybe we can at least identify the case where
 we've ended up in a situation where we are going to expose unintended
 data.  I don't know enough about the optimizer or the planner to have
 any clue how we might teach them to actually avoid doing such, though I
 certainly believe it could end up being a disaster on performance based
 on comments from others who know better. :)

 
 My opinion is that it is a matter in individual functions, not optimizer.
 Basically, built-in functions *should* be trusted, because our security
 mechanism is not designed to prevent anything from malicious internal
 binary modules.
 
Sorry, it does not mean *all* the built-in functions could be trusted.
Some of built-in ones cannot be trusted from definitions, such as
lowrite().

Perhaps, it eventually needs a flag in the pg_proc to mark a function
being either trusted or untrusted. Then, planner may be able to check
the flag to decide whether is can be pushed down, or not.

If so, we can mark int4in() as trusted, when we revise the issue of
error message. I think the idea makes this problem more simple.

Thanks,

 Historically, we have not known the risk to leak invisible information
 using error messages for a long time, so most of internal functions have
 not been designed not to return users unnecessary information.
 If so, it needs to revise error messages, but it will not complete with
 a single commit.
 
 Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Greg Stark
On Tue, Jun 8, 2010 at 1:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 KaiGai Kohei kai...@ak.jp.nec.com writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?

 Yes, if all you care about is security and not usability, that looks
 like a great solution.  We're *not* doing it.

It's possible to take a more nuanced angle on this approach. You could
imagine a flag indicating whether the call is within a SECURE VIEW
which if enabled caused error messages to elide data taken from
arguments. In order for this not to destroy the code legibility I
think you would need to design some new %-escapes for error messages
to indicate such arguments or some similar trick. You wouldn't want to
have to put extra conditionals everywhere. And I'm not sure where to
conveniently put such a flag so it would have the right semantics.

It would still be a lot of work and a big patch, but mostly mechanical
and it wouldn't impact usability for any case where it wasn't
necessary. It would still have the problem described earlier that we
would keep finding new omissions for years. I can't see us
implementing variable taint tracking in C to do it automatically
though. Perhaps we could get it from one of the static analysis tools.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 9:46), Tom Lane wrote:
 KaiGai Koheikai...@ak.jp.nec.com  writes:
 In this case, is it unnecessary to expose the given argument in
 the error message (from security perspective), isn't it?
 
 Yes, if all you care about is security and not usability, that looks
 like a great solution.  We're *not* doing it.
 
Sorry, are you saying we should not revise error messages because
of usability??

If so, and if we decide the middle-threat also should be fixed,
it is necessary to distinguish functions trusted and untrusted,
even if a function is built-in.
Perhaps, pg_proc takes a new flag to represent it.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 Perhaps, pg_proc takes a new flag to represent it.

Without an actual well-formed approach for dealing with this which
requires it, it's far too soon to be asking for changes in the catalog.
Please stop suggesting individual maybe-this-would-help changes.  We
need an actual solution which addresses all, or at least most, of the
concerns described, followed by a patch which implements the mechanics
of the solution.  If the solution calls for an additional pg_proc field,
then the patch should include it.

Not sure if this would translate well, but asking for new flags to be
added to pg_proc right now is putting the cart before the horse.  We
don't even know which functions we might mark as trusted or not yet, nor
is it even clear that adding such a flag would actually help.  Adding a
flag to pg_proc is certainly nothing like a solution to this problem by
itself.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 10:17), Stephen Frost wrote:
 KaiGai,
 
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 Perhaps, pg_proc takes a new flag to represent it.
 
 Without an actual well-formed approach for dealing with this which
 requires it, it's far too soon to be asking for changes in the catalog.
 Please stop suggesting individual maybe-this-would-help changes.  We
 need an actual solution which addresses all, or at least most, of the
 concerns described, followed by a patch which implements the mechanics
 of the solution.  If the solution calls for an additional pg_proc field,
 then the patch should include it.
 
OK, it was too implementation-specific.

Please return to the categorization with 3-level that I mentioned at
the previous message.

I guess nobody opposes to disable pushing down on functions categorized
to high-threat, such as lowrite() and so on.
It actually can move given arguments to other tables, files, ...

And, I guess nobody wants to tackle an issue categorized to low-threat,
such as EXPLAIN, PK/FK constraints and so on.
It can imply existence of invisible objects, but no leaks of actual value.


Our headache is on functions categorized to middle-threat. It enables to
leak the given arguments using error messages. Here are several ideas,
but they have good and bad points.

At first, it is necessary whether we should fix up the threat in this
level, or not. It seems to me majority of the -hackers want to fix both
of high and middle level.

If we should fix up the issue, I think we have only two options semantically.

(A) It prevents leaky functions to be pushed down, so no invisible
information will be provided. But it makes performance regression.

(B) It prevents leaky functions to raise an error, although we allow
it to be pushed down. But is needs large scale of code changes.

Of course, it has trade-off. As TCSEC mentioned, we are facing with the
large potential cost of reducing the bandwidths of such covert channel

 Not sure if this would translate well, but asking for new flags to be
 added to pg_proc right now is putting the cart before the horse.  We
 don't even know which functions we might mark as trusted or not yet, nor
 is it even clear that adding such a flag would actually help.  Adding a
 flag to pg_proc is certainly nothing like a solution to this problem by
 itself.

For built-in functions, the code should be reviewed to ensure it does not
expose the given argument using error messages.
Then, we can mark it as trusted.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Robert Haas
2010/6/7 KaiGai Kohei kai...@ak.jp.nec.com:
 Our headache is on functions categorized to middle-threat. It enables to
 leak the given arguments using error messages. Here are several ideas,
 but they have good and bad points.

I think we are altogether off in the weeds here.  We ought to start
with an implementation that pushes nothing down, and then try to
figure out how much we can relax that without too much compromising
security.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 2010/6/7 KaiGai Kohei kai...@ak.jp.nec.com:
  Our headache is on functions categorized to middle-threat. It enables to
  leak the given arguments using error messages. Here are several ideas,
  but they have good and bad points.
 
 I think we are altogether off in the weeds here.  We ought to start
 with an implementation that pushes nothing down, and then try to
 figure out how much we can relax that without too much compromising
 security.

I agree with this- and it's more-or-less what I was trying to propose in
my previous comments.  I'm not even sure we need to focus on not pushing
anything down at this point- I'd start with trying to get enough
information passed around/through the system to even *identify* the case
where there's a problem in the first place..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 11:15), Robert Haas wrote:
 2010/6/7 KaiGai Koheikai...@ak.jp.nec.com:
 Our headache is on functions categorized to middle-threat. It enables to
 leak the given arguments using error messages. Here are several ideas,
 but they have good and bad points.
 
 I think we are altogether off in the weeds here.  We ought to start
 with an implementation that pushes nothing down, and then try to
 figure out how much we can relax that without too much compromising
 security.
 
It seems to me fair enough.
I think we can adjust what functions are harmless, and whats are not later.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread Stephen Frost
For the sake of clarity..

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 OK, it was too implementation-specific.

No, that wasn't the problem.  There isn't an actual implementation yet
for it to be too-specific on.  The problem is that proposing a change to
the catalog without figuring out what it'd actually be used for in an
overall solution is a waste of time. 

 Please return to the categorization with 3-level that I mentioned at
 the previous message.

As Robert said, we're off in the weeds here.  I'm not convinced that
we've got 3 levels, for starters.  It could well be fewer, or more.
Let's stop making assumptions about what's OK and what's not OK.

 For built-in functions, the code should be reviewed to ensure it does not
 expose the given argument using error messages.
 Then, we can mark it as trusted.

One thing that I think *is* clear- removing useful information from
error messages is *not* going to be an acceptable solution.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-07 Thread KaiGai Kohei
(2010/06/08 11:28), Stephen Frost wrote:
 For the sake of clarity..
 
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 OK, it was too implementation-specific.
 
 No, that wasn't the problem.  There isn't an actual implementation yet
 for it to be too-specific on.  The problem is that proposing a change to
 the catalog without figuring out what it'd actually be used for in an
 overall solution is a waste of time.
 
Indeed,

 Please return to the categorization with 3-level that I mentioned at
 the previous message.
 
 As Robert said, we're off in the weeds here.  I'm not convinced that
 we've got 3 levels, for starters.  It could well be fewer, or more.
 Let's stop making assumptions about what's OK and what's not OK.
 
Indeed, we may find out the 4th category in the future.

 For built-in functions, the code should be reviewed to ensure it does not
 expose the given argument using error messages.
 Then, we can mark it as trusted.
 
 One thing that I think *is* clear- removing useful information from
 error messages is *not* going to be an acceptable solution.
 
Even if it is conditional, like as Greg Stark suggested?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread Robert Haas
On Fri, Jun 4, 2010 at 4:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 04/06/10 22:33, Tom Lane wrote:
 A counterexample: suppose we had a form of type text that carried a
 collation specifier internally, and the comparison routine threw an
 error if asked to compare values with incompatible specifiers.  An index
 built on a column of all the same collation would work fine.  A query
 that tried to compare against a constant of a different collation would
 throw an error.

 I can't take that example seriously. First of all, tacking a collation
 specifier to text values would be an awful hack.

 Really?  I thought that was under serious discussion.  But whether it
 applies to text or not is insignificant; I believe there are cases just
 like this in existence today for some datatypes (think postgis).

 The real point is that the comparison constant is under the control of
 the attacker, and it's not part of the index.  Therefore it didn't
 throw an error during index construction proves nothing whatever.

 ... Secondly, it would be a
 bad idea to define the b-tree comparison operators to throw an error;

 You're still being far too trusting, by imagining that only *designed*
 error conditions matter here.  Think about overflows, out-of-memory,
 (perhaps intentionally) corrupted data, etc etc.

btree comparison operators should handle overflow and corrupted data
without blowing up.  Maybe out-of-memory is worth worrying about, but
I think that is a mighty thin excuse for abandoning this feature
altogether.  You'd have to contrive a situation where the system was
just about out of memory and something about the value being compared
against resulted in the comparison blowing up or not.  I think that's
likely to be rather hard in practice, and in any case it's a covert
channel attack, which I think everyone already agrees is beyond the
scope of what we can protect against.  You can probably learn more
information more quickly about the unseen data by fidding with
EXPLAIN, analyzing query execution times, etc.  As long as we are
preventing the actual contents of the unseen tuples from being
revealed, I feel reasonably good about it.

 I think the only real fix would be something like what Marc suggested:
 if there's a security view involved in the query, we simply don't give
 the client the real error message.  Of course, now our security
 feature is utterly disastrous on usability as well as performance
 counts ...

Not pushing anything into the view is an equally real fix, although
I'll be pretty depressed if that's where we end up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Another idea I had was... would it be safe to trust functions defined
 by the same user who owns the view?  If he's granted access to the
 view and the function to some other user, presumably he doesn't mind
 them being used together?  Or is that too optimistic?

This was more-or-less what I'd been kind of kicking around in my head.
Forget about functions that are defined in the view itself.  Any other
functions, etc, which are attached to the view by the calling user would
be suspect, etc.  Perhaps with the exception of some built-ins that
we've marked as safe in some way.

My first thought was to track the run this as X information on every
RTE (more-or-less, relations, function calls, etc) and then at least be
able to, hopefully, *detect* situations that might be a problem- eg:
running a function which has run as Q against a relation that was
accessed as run as R when a filter run as R happens later.  This is
all far too hand-wavey, I'm sure, but at least if we could detect it
then we might be able to find a way to deal with it.

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much.  The amount of
data one could acquire that way is pretty limited.  It'd be great if we
could deal with that case too, but maybe we could worry about the bigger
issue (at least, as I see it) first.

Just my 2c.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread KaiGai Kohei
(2010/06/07 10:38), Robert Haas wrote:
 On Fri, Jun 4, 2010 at 4:12 PM, Tom Lanet...@sss.pgh.pa.us  wrote:
 Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:
 On 04/06/10 22:33, Tom Lane wrote:
 A counterexample: suppose we had a form of type text that carried a
 collation specifier internally, and the comparison routine threw an
 error if asked to compare values with incompatible specifiers.  An index
 built on a column of all the same collation would work fine.  A query
 that tried to compare against a constant of a different collation would
 throw an error.

 I can't take that example seriously. First of all, tacking a collation
 specifier to text values would be an awful hack.

 Really?  I thought that was under serious discussion.  But whether it
 applies to text or not is insignificant; I believe there are cases just
 like this in existence today for some datatypes (think postgis).

 The real point is that the comparison constant is under the control of
 the attacker, and it's not part of the index.  Therefore it didn't
 throw an error during index construction proves nothing whatever.

 ... Secondly, it would be a
 bad idea to define the b-tree comparison operators to throw an error;

 You're still being far too trusting, by imagining that only *designed*
 error conditions matter here.  Think about overflows, out-of-memory,
 (perhaps intentionally) corrupted data, etc etc.
 
 btree comparison operators should handle overflow and corrupted data
 without blowing up.  Maybe out-of-memory is worth worrying about, but
 I think that is a mighty thin excuse for abandoning this feature
 altogether.  You'd have to contrive a situation where the system was
 just about out of memory and something about the value being compared
 against resulted in the comparison blowing up or not.  I think that's
 likely to be rather hard in practice, and in any case it's a covert
 channel attack, which I think everyone already agrees is beyond the
 scope of what we can protect against.  You can probably learn more
 information more quickly about the unseen data by fidding with
 EXPLAIN, analyzing query execution times, etc.  As long as we are
 preventing the actual contents of the unseen tuples from being
 revealed, I feel reasonably good about it.
 
It seems to me a good point. In general, any software have
possibility to leak information via cover-channels, and it is
nearly impossible to fix all the channels.
However, from a security perspective, covert channels with low
bandwidths represent a lower threat than those with high bandwidths.
So, evaluation criteria stands on the viewpoint that it does not
necessary to eliminate all the covert channels more than necessity.

Even if we can estimate the contents of invisible tuples from error
messages or EXPLAIN output, its bandwidths are quite limited.

But, lowrite() might write out the given argument somewhere, if it
is invoked prior to security policy functions.
In this case, the contents of invisible tuples will be moved to
another large object unexpectedly with unignorable speed.
Such a direct data flow channel should be fixed at first, because of
its large bandwidth.

 I think the only real fix would be something like what Marc suggested:
 if there's a security view involved in the query, we simply don't give
 the client the real error message.  Of course, now our security
 feature is utterly disastrous on usability as well as performance
 counts ...
 
 Not pushing anything into the view is an equally real fix, although
 I'll be pretty depressed if that's where we end up.
 
I could not find out any certified commercial RDBMS with functionality
to tackle covert channel. It includes Oracle's row-level security stuff.
So, I don't think it is our goal to prevent anything into views.

An idea is that we focus on the direct data flow which allows to move
information to be invisible into another tables, files and so on.
In this stage, I don't think the priority of error messages are high.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-06 Thread KaiGai Kohei
(2010/06/07 12:06), Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Another idea I had was... would it be safe to trust functions defined
 by the same user who owns the view?  If he's granted access to the
 view and the function to some other user, presumably he doesn't mind
 them being used together?  Or is that too optimistic?
 
 This was more-or-less what I'd been kind of kicking around in my head.
 Forget about functions that are defined in the view itself.  Any other
 functions, etc, which are attached to the view by the calling user would
 be suspect, etc.  Perhaps with the exception of some built-ins that
 we've marked as safe in some way.
 
 My first thought was to track the run this as X information on every
 RTE (more-or-less, relations, function calls, etc) and then at least be
 able to, hopefully, *detect* situations that might be a problem- eg:
 running a function which has run as Q against a relation that was
 accessed as run as R when a filter run as R happens later.  This is
 all far too hand-wavey, I'm sure, but at least if we could detect it
 then we might be able to find a way to deal with it.
 
It is a similar idea to what I tried to implement at the conceptual
patch. It checks privileges of calling user on the underlaying tables
to be referenced using views. If violated, it prevents to push down
the user given FuncExpr into join loops of views. Otherwise, it does
not prevent anything, because the user originally has privileges to
reference them.
Are you suggesting the idea (with adjustments such as safe functions)?

 Also, perhaps I'm not being paranoid enough, but all this concern over
 error cases really doesn't really worry me that much.  The amount of
 data one could acquire that way is pretty limited.  It'd be great if we
 could deal with that case too, but maybe we could worry about the bigger
 issue (at least, as I see it) first.
 
As I also mentioned at previous message. It seems to me a good
point to focus on bandwidth of the covert channel.

The error message indeed can deliver information to be invisible,
but I don't think its priority is higher than functions that can
be abused to direct data flow channel, such as lowrite().

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 The proposal some time back in this thread was to trust all built-in
 functions and no others.  That's a bit simplistic, no doubt, but it
 seems to me to largely solve the performance problem and to do so with
 minimal effort.  When and if you get to a solution that's committable
 with respect to everything else, it might be time to think about
 more flexible answers to that particular point.

What about trusting all internal and C language function instead? My
understanding is that internal covers built-in functions, and as you
need to be a superuser to CREATE a C language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread KaiGai Kohei

(2010/06/04 18:26), Dimitri Fontaine wrote:

Tom Lanet...@sss.pgh.pa.us  writes:

The proposal some time back in this thread was to trust all built-in
functions and no others.  That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort.  When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.


What about trusting all internal and C language function instead? My
understanding is that internal covers built-in functions, and as you
need to be a superuser to CREATE a C language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?


If we trust all the C language functions, it also means DBA can never
install any binary functions having side-effect (e.g, pg_file_write() in
the contrib/adminpack ) without security risks.

If we need an intelligence to identify what functions are trusted and
what ones are untrusted, it will eventually need a hint to mark a certain
function as trusted, won't it?

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 07:57, Tom Lane wrote:

KaiGai Koheikai...@ak.jp.nec.com  writes:

(2010/06/04 11:55), Robert Haas wrote:

A (very) important part of this problem is determining which quals are
safe to push down.


At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(


The proposal some time back in this thread was to trust all built-in
functions and no others.


I thought I debunked that idea already 
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not 
all built-in functions are safe. Consider casting integer to text, for 
example. Seems innocent at first glance, but it's not; if the input is 
not a valid integer, it throws an error which contains the input string, 
revealing it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 04/06/10 07:57, Tom Lane wrote:
 The proposal some time back in this thread was to trust all built-in
 functions and no others.

 I thought I debunked that idea already 
 (http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not 
 all built-in functions are safe. Consider casting integer to text, for 
 example. Seems innocent at first glance, but it's not; if the input is 
 not a valid integer, it throws an error which contains the input string, 
 revealing it.

Hmm ... that's a mighty interesting example, because it shows that any
well-meaning change in error handling might render seemingly-unrelated
functions unsafe.  And we're certainly not going to make error
messages stop showing relevant information just because of this.

Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Marc Munro
On Fri, 2010-06-04 at 10:33 -0400, Tom Lane wrote:
 Hmm ... that's a mighty interesting example, because it shows that any
 well-meaning change in error handling might render seemingly-unrelated
 functions unsafe.  And we're certainly not going to make error
 messages stop showing relevant information just because of this.

Although that looks like a show-stopper, I think it can be worked
around.  Errors in operations on security views could simply be caught
and conditionally rewritten.  The original error could still appear in
the logs but the full details need not be reported to unprivileged
users.

If that can be done, then we would still need to be able to identify
trusted functions and views used for security purposes, and ensure that
(please excuse my terminology if it is incorrect) untrusted quals do not
get pushed down inside secured views.  If all of that can be done along
with the error trapping, then we may have a solution.

My big concern is still about performance, particularly when joining
between multiple security views and other objects.  I don't expect to
get security for free but I don't want to see unnecessary barriers to
optimisation.

__
Marc


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 17:33, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

On 04/06/10 07:57, Tom Lane wrote:

The proposal some time back in this thread was to trust all built-in
functions and no others.



I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example.


(I meant text to integer, of course)


Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?


Let's consider b-tree operators for an index on the secure table, for 
starters. Surely a b-tree index comparison operator can't throw an error 
on any value that's in the table already, you would've gotten an error 
trying to insert that.


Now, is it safe to expand that thinking to b-tree operators in general, 
even if there's no such index on the table? I'm not sure. But indexable 
operations are what we care about the most; the order of executing those 
determines if you can use an index scan or not.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Robert Haas
On Fri, Jun 4, 2010 at 1:46 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 04/06/10 17:33, Tom Lane wrote:

 Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

 On 04/06/10 07:57, Tom Lane wrote:

 The proposal some time back in this thread was to trust all built-in
 functions and no others.

 I thought I debunked that idea already
 (http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
 all built-in functions are safe. Consider casting integer to text, for
 example.

 (I meant text to integer, of course)

 Maybe the entire idea is unworkable.  I certainly don't find any comfort
 in your proposal in the above-referenced message to trust index
 operators; where is it written that those don't throw errors?

 Let's consider b-tree operators for an index on the secure table, for
 starters. Surely a b-tree index comparison operator can't throw an error on
 any value that's in the table already, you would've gotten an error trying
 to insert that.

 Now, is it safe to expand that thinking to b-tree operators in general, even
 if there's no such index on the table? I'm not sure. But indexable
 operations are what we care about the most; the order of executing those
 determines if you can use an index scan or not.

Another idea I had was... would it be safe to trust functions defined
by the same user who owns the view?  If he's granted access to the
view and the function to some other user, presumably he doesn't mind
them being used together?  Or is that too optimistic?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 04/06/10 17:33, Tom Lane wrote:
 Maybe the entire idea is unworkable.  I certainly don't find any comfort
 in your proposal in the above-referenced message to trust index
 operators; where is it written that those don't throw errors?

 Let's consider b-tree operators for an index on the secure table, for 
 starters. Surely a b-tree index comparison operator can't throw an error 
 on any value that's in the table already, you would've gotten an error 
 trying to insert that.

Man, are *you* trusting.

A counterexample: suppose we had a form of type text that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers.  An index
built on a column of all the same collation would work fine.  A query
that tried to compare against a constant of a different collation would
throw an error.

 I'm not sure. But indexable 
 operations are what we care about the most; the order of executing those 
 determines if you can use an index scan or not.

Personally, I care just as much about hash and merge join operators...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Heikki Linnakangas

On 04/06/10 22:33, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

On 04/06/10 17:33, Tom Lane wrote:

Maybe the entire idea is unworkable.  I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?



Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error
on any value that's in the table already, you would've gotten an error
trying to insert that.


Man, are *you* trusting.

A counterexample: suppose we had a form of type text that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers.  An index
built on a column of all the same collation would work fine.  A query
that tried to compare against a constant of a different collation would
throw an error.


I can't take that example seriously. First of all, tacking a collation 
specifier to text values would be an awful hack. Secondly, it would be a 
bad idea to define the b-tree comparison operators to throw an error; it 
would be a lot more useful to impose an arbitrary order on the 
collations, so that all values with collation A are considered smaller 
than values with collation B. We do that for types like box; smaller or 
greater than don't make much sense for boxes, but we implement them in a 
pretty arbitrary way anyway to make it possible to build a b-tree index 
on them, and for the planner to use merge joins on them, and implement 
DISTINCT using sort etc.



I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.


Personally, I care just as much about hash and merge join operators...


Hash seems safe too. Don't merge joins just use the default b-tree operator?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-04 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 04/06/10 22:33, Tom Lane wrote:
 A counterexample: suppose we had a form of type text that carried a
 collation specifier internally, and the comparison routine threw an
 error if asked to compare values with incompatible specifiers.  An index
 built on a column of all the same collation would work fine.  A query
 that tried to compare against a constant of a different collation would
 throw an error.

 I can't take that example seriously. First of all, tacking a collation 
 specifier to text values would be an awful hack.

Really?  I thought that was under serious discussion.  But whether it
applies to text or not is insignificant; I believe there are cases just
like this in existence today for some datatypes (think postgis).

The real point is that the comparison constant is under the control of
the attacker, and it's not part of the index.  Therefore it didn't
throw an error during index construction proves nothing whatever.

 ... Secondly, it would be a 
 bad idea to define the b-tree comparison operators to throw an error;

You're still being far too trusting, by imagining that only *designed*
error conditions matter here.  Think about overflows, out-of-memory,
(perhaps intentionally) corrupted data, etc etc.

I think the only real fix would be something like what Marc suggested:
if there's a security view involved in the query, we simply don't give
the client the real error message.  Of course, now our security
feature is utterly disastrous on usability as well as performance
counts ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
I fixed up the subject.

(2010/06/04 2:23), Marc Munro wrote:
 On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
 wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]
 
 I may be missing something here but this seems a bit too simplistic and,
 I think, fails to deal with an important use case.
 
 Security views, that do anything useful at all, tend to introduce
 performance issues.  I am concerned that placing a conceptual barrier
 between the secured and unsecured parts of queries is going to
 unnecessarily restrict what the optimiser can do.
 
 For example consider that we have three secured views, each of the form:
 
create view s_x as select * from x where i_can_see(x.key);
 
 and consider the query:
 
select stuff from s_x
  inner join s_y on s_y.key = s_x.key
  inner join s_z on s_z.key = s_x.key
where fn(s_x.a) = 3;
 
 The optimiser ought to be able to spot the fact that i_can_see() need
 only be called once for each joined result.  By placing a barrier (if I
 understand your proposal correctly) between the outermost joins and the
 inner views, doesn't this optimisation become impossible?
 
It seems to me incorrect.

If i_can_see() can filter a part of tuples within table: x, the optimizer
prefers to call i_can_see() on scanning each tables rather than result of
join, because it effectively reduce the size of scan.

In fact, the existing optimizer make the following plan:

  postgres=# CREATE FUNCTION i_can_see(int) RETURNS bool IMMUTABLE
  AS 'begin return $1 % 1 = 1; end;' LANGUAGE 'plpgsql';
  CREATE FUNCTION
  postgres=# CREATE VIEW v1 as select * from t1 where i_can_see(a);
  CREATE VIEW
  postgres=# CREATE VIEW v2 as select * from t2 where i_can_see(x);
  CREATE VIEW
  postgres=# CREATE VIEW v3 as select * from t3 where i_can_see(s);
  CREATE VIEW

  postgres=# EXPLAIN SELECT * FROM (v1 JOIN v2 ON v1.a=v2.x) JOIN v3 on 
v1.a=v3.s;
QUERY PLAN
  
---
   Nested Loop  (cost=0.00..860.47 rows=410 width=108)
 -  Nested Loop  (cost=0.00..595.13 rows=410 width=72)
   -  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
 Filter: i_can_see(a)
   -  Index Scan using t2_pkey on t2  (cost=0.00..0.63 rows=1 width=36)
 Index Cond: (t2.x = t1.a)
 Filter: i_can_see(t2.x)
 -  Index Scan using t3_pkey on t3  (cost=0.00..0.63 rows=1 width=36)
   Index Cond: (t3.s = t1.a)
   Filter: i_can_see(t3.s)
  (10 rows)

Sorry, I may miss something your point.

 I think a simpler solution may be possible here.  If you can tag the
 function i_can_see() as a security function, at least in the context of
 its use in the security views, and then create the rule that security
 functions are always considered to be lower cost than user-defined
 non-security functions, don't we achieve the result of preventing the
 insecure function from seeing rows that it shouldn't?
 
Sorry, it seems to me you mixed up different issues.

My patch tries to tackle the problem that optimizer distributes outermost
functions into inside of the view when the function takes arguments only
from a side of join, even if security views.
This issue is independent from cost of functions.

On the other hand, when a scan plan has multiple qualifiers to filter
fetched tuples, we have to pay attention on the order to be called.
If outermost functions are called prior to view's policy functions,
it may cause unexpected leaks.

In my opinion, we should consider the nestlevel of the function where
it is originally put. But it is not concluded yet, and the patch does
not tackle anything about this issue.

 I guess my concern is that a query may be constructed a=out of secured
 and unsecured parts and the optimiser should be free to group all of the
 secured parts together before considering the unsecured parts.
 
Yes, it is an opinion, but it may cause unnecessary performance regression
when user given condition is obviously harmless.

E.g, If table: t has primary key t.a, and a security view is defined as:

  CREATE VIEW v AS SELECT * FROM t WHERE f_policy(t.b);

When user gives the following query:

  SELECT * FROM v WHERE a = 100;

If we strictly separate secured and unsecure part prior to optimizer,
it will break a chance to scan the table: t with index.

I also think security is not free, so it may take a certain level of
performance degradation. But it should not bring down more than necessity.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
(2010/06/04 11:55), Robert Haas wrote:
 On Thu, Jun 3, 2010 at 1:23 PM, Marc Munrom...@bloodnok.com  wrote:
 On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
 wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]

 I may be missing something here but this seems a bit too simplistic and,
 I think, fails to deal with an important use case.
 
 If anything, you're putting it mildly.  This is quite a bit too
 simplistic and fails to deal with several important issues, at least
 some of which have already been mentioned on this thread.
 
Hmm. Was it too simplified even if just a proof of concept?

 The optimiser ought to be able to spot the fact that i_can_see() need
 only be called once for each joined result.  By placing a barrier (if I
 understand your proposal correctly) between the outermost joins and the
 inner views, doesn't this optimisation become impossible?

 I think a simpler solution may be possible here.  If you can tag the
 function i_can_see() as a security function, at least in the context of
 its use in the security views, and then create the rule that security
 functions are always considered to be lower cost than user-defined
 non-security functions, don't we achieve the result of preventing the
 insecure function from seeing rows that it shouldn't?
 
 So, yes and no.  You DO need a security barrier between the view and
 the rest of the query, but if a function can be trusted not to do evil
 things, then it should be allowed to be pushed down.  What we need to
 prevent is the pushdown of untrusted functions (or operators).  A
 (very) important part of this problem is determining which quals are
 safe to push down.
 
At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

An idea is to add TRUSTED (or other) keyword for CREATE FUNCTION to mark
this function is harmless to execute, but only DBA can define functions
with the option.
(Perhaps, most of built-in functions should be marked as trusted?)

If we should identify a function as either trusted or untrusted, is
there any other ideas?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread Tom Lane
KaiGai Kohei kai...@ak.jp.nec.com writes:
 (2010/06/04 11:55), Robert Haas wrote:
 A (very) important part of this problem is determining which quals are
 safe to push down.
 
 At least, I don't have an idea to distinguish trusted functions from
 others without any additional hints, because we support variable kind
 of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others.  That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort.  When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
(2010/06/04 13:57), Tom Lane wrote:
 KaiGai Koheikai...@ak.jp.nec.com  writes:
 (2010/06/04 11:55), Robert Haas wrote:
 A (very) important part of this problem is determining which quals are
 safe to push down.

 At least, I don't have an idea to distinguish trusted functions from
 others without any additional hints, because we support variable kind
 of PL languages. :(
 
 The proposal some time back in this thread was to trust all built-in
 functions and no others.  That's a bit simplistic, no doubt, but it
 seems to me to largely solve the performance problem and to do so with
 minimal effort.  When and if you get to a solution that's committable
 with respect to everything else, it might be time to think about
 more flexible answers to that particular point.
 
Although I've not check all the functions within pg_proc.h, it seems to
me reasonable assumptions, except for a small number of exception which
have side-effects, such as lo_write().

Maybe, we have to shut out a small number of exceptions.
However, at least, it seems to me reasonable assumption in this stage.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers