Thanks for the detailed response, super helpful in understanding what's happening, in particular understanding the risk of not marking functions as leakproof. I'll take a look at the underlying code to understand what's involved in getting a function to be leakproof.
That said, it does seem like it should be possible and reasonable to specify that a user should have access to the table stats so that the query planner works as expected. Maybe it comes down to the fact that RLS is still a work in progress, and I shouldn't be relying on it unless I'm really certain it supports the functionality I need. I've updated word_similarity_op(text,text) to be leakproof, and pg_proc agrees it is. I'm assuming word_similarity_op() is equivalent to <%, though I haven't found explicit confirmation. However, using word_similarity() instead of <% on a 100k row table, without any RLS involved, doesn't make use of the index, while using <% does. Obviously, adding the RLS doesn't make that any better. Any idea what might be the cause? On Tue, Aug 13, 2019 at 5:39 PM Stephen Frost <sfr...@snowman.net> wrote: > Greetings, > > * Derek Hans (derek.h...@gmail.com) wrote: > > Unfortunately only "alter function" supports "leakproof" - "alter > operator" > > does not. Is there a function-equivalent for marking operators as > > leakproof? Is there any documentation for which operators/functions are > > leakproof? > > Tom's query downthread provides the complete list. > > Note that the list is not completely static- it's entirely possible that > additional functions can be made leak-proof, what's needed is a careful > review of the function code to ensure that it can't leak information > about the data (or, if it does today, a patch which removes that). If > you have an interest in that then I'd encourage you to dig into the code > and look for possible leaks (Tom's already hinted in the direction you'd > want to go in) and then propose a patch to address those cases and to > mark the function(s) as leakproof. > > > In my particular case, RLS is still useful even if operators are leaky > as I > > control the application code and therefore can ensure leaky errors are > > handled. If it's possible to disable all checking for "leakproof", that > > would work for me. > > There isn't a way to disable the leakproof-checking system. Certainly > in the general case that wouldn't be acceptable and I'm not entirely > convinced by your argument that such an option should exist, though you > could go through and set all of the functions to be leakproof if you > really wish to. > > > > If that's not possible, it sounds like it > > > > effectively blocks the use of GIN/GIST indexes when RLS is in use. > > > > > > There's a whole lot of daylight between "it doesn't pick an indexscan > in > > > this one example" and "it effectively blocks the use of GIN/GIST". > > > > True indeed :). Would you have a working example of using a GIN/GIST > index > > with RLS? All the attempts I've made have ended in seq scans. In > practice, > > I'm looking to implement fuzzy search using trigrams, so % and %> > operators > > are what matter to me. ~~ also happens to fail. Should I expect to be > able > > to use any of these with RLS, large amounts of data and reasonable > > performance? > > Functions that aren't marked leakproof aren't going to be able to be > pushed down. > > > Your description of leakproof (and the documentation I've found) makes it > > sound like I'm not just hitting an isolated problem, but a general > problem > > with RLS that represents a substantial limitation and is likely worth > > documenting. > > There's some documentation regarding leakproof functions here: > > https://www.postgresql.org/docs/current/ddl-rowsecurity.html > > and here: > > https://www.postgresql.org/docs/11/sql-createfunction.html > > Of course, patches are welcome to improve on our documentation. > > One thing that it sounds like you're not quite appreciating is that in > the general case, verifying that a function is leakproof isn't optional. > Without such a check, any user could create a function and then get PG > to push that function down below the RLS checks and therefore gain > access to the data that they aren't supposed to be able to see. > > All that said, there's quite a few functions that *are* marked as > leakproof already and they're quite handy and work well with RLS > already, as I expect you'll see when you go querying pg_proc. > > Thanks, > > Stephen > -- *Derek* +1 (415) 754-0519 | derek.h...@gmail.com | Skype: derek.hans