The attached patches are cut-off version based on the latest Robert's updates. The "v8.regtest" adds regression test cases on variable leaky-view scenarios with/without security-barrier property.
The "v8.option-1" add checks around restriction_selectivity, and prevent to invoke estimator function if Var node touches relation with security- barrier attribute. The "v8.option-2" add checks around examine_simple_variable, and prevent to reference statistical data, if Var node tries to reference relation with security-barrier attribute. (And, it shall be marked as "leakproof") I initially thought restriction_selectivity called by clause_selectivity is the best point to add checks, however, I reconsidered it might not be the origin of this problem. As long as user-defined functions acquires control on selectivity estimation of operators, same problems can be re-produced; if someone tries to reference unrelated data within estimator. This scenario is normally prevented, because only superuser can define a function that can bypass permission checks to reference internal data structures; using "untrusted" procedural-language. If my conclusion is right, what we should fix up is built-in estimators side, and we should enforce estimator function being "leakproof", even though we still allow unprivileged users to define operators. Thanks, 2011/12/11 Kohei KaiGai <kai...@kaigai.gr.jp>: > 2011/12/10 Kohei KaiGai <kai...@kaigai.gr.jp>: >> 2011/12/9 Robert Haas <robertmh...@gmail.com>: >>> I feel like there must be some logic in the planner somewhere that is >>> "looking through" the subquery RTE and figuring out that safe_foo.a is >>> really the same variable as foo.a, and which therefore feels entitled >>> to apply !!'s selectivity estimator to foo.a's statistics. If that's >>> the case, it might be possible to handicap that logic so that when the >>> security_barrier flag is set, it doesn't do that, and instead treats >>> safe_foo.a as a black box. That would, obviously, degrade the quality >>> of complex plans involving security views, but I think we should focus >>> on getting something that meets our security goals first and then try >>> to improve performance later. >>> >> I tried to investigate the code around size-estimation, and it seems to >> me here is two candidates to put this type of checks. >> >> The one is examine_simple_variable() that is used to pull a datum >> from statistic information, but it locates on the code path restriction >> estimator of operators; so user controlable, although it requires least >> code changes just after if (rte->rtekind == RTE_SUBQUERY). >> The other is clause_selectivity(). Its code path is not user controlable, >> so we can apply necessary checks to prevent qualifier that reference >> variable come from sub-query with security-barrier. >> >> In my sense, clause_selectivity() is better place to apply this type of >> checks. But, on the other hand, it provides get_relation_stats_hook >> to allow extensions to control references to statistic data. >> So, I wonder whether the PG core assumes this routine covers >> all the code path here? >> > The attached patch adds checks around invocation of selectivity > estimator functions, and it changes behavior of the estimator, if the > supplied operator tries to touch variables come from security-barrier > relations. > Then, it fixes the problem you mentioned. > > postgres=# explain select * from safe_foo where a !! 0; > QUERY PLAN > ------------------------------------------------------------- > Subquery Scan on safe_foo (cost=0.00..2.70 rows=3 width=4) > Filter: (safe_foo.a !! 0) > -> Seq Scan on foo (cost=0.00..1.14 rows=6 width=4) > Filter: (a > 5) > (4 rows) > > However, I'm still a bit skeptical on my patch, because it still allows > to invoke estimator function when operator's argument does not > touch values originated from security-barrier relation. > In the case when oprrest or oprjoin are implemented without our > regular convention (E.g, it anyway reference whole of statistical data), > it will break this solution. > > Of course, it is an assumption that we cannot prevent any attack > using binary modules, so we need to say "use it your own risk" if > people tries to use extensional modules. And, we also need to > keep the built-in code secure. > > Some of built-in estimator functions (such as eqsel) provides > a feature that invokes operator function with arguments originated > from pg_statistics table. It didn't harmless, however, we got understand > that this logic can be used to break row-level security. > So, I begin to consider the routines to be revised are some of built-in > functions to be used for estimator functions; such as eqsel and ... > These function eventually reference statistical data at examine_variable. > > It might be a better approach to add checks whether invocation of > the supplied operator possibly leaks contents to be invisible. > It seems to me the Idea of the attached patch depends on something > internal stuff of existing built-in estimator functions... > > Thanks, > -- > KaiGai Kohei <kai...@kaigai.gr.jp> -- KaiGai Kohei <kai...@kaigai.gr.jp>
Description: Binary data
Description: Binary data
Description: Binary data
-- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers