Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Dean Rasheed (dean.a.rash...@gmail.com) wrote: On 26 February 2015 at 09:50, Dean Rasheed dean.a.rash...@gmail.com wrote: On 26 February 2015 at 05:43, Stephen Frost sfr...@snowman.net wrote: I wonder if there are some documentation updates which need to be done for this also? I'm planning to look as I vauguely recall mentioning the ordering of operations somewhere along the way. I couldn't find any mention of the timing of the check in the existing documentation, although it does vaguely imply that the check is done before inserting any new data. There is an existing paragraph describing the timing of USING conditions, so I added a new paragraph immediately after that to explain when CHECK expressions are enforced, since that seemed like the best place for it. Ah, ok, I must have been thinking about the USING discussion. Thanks for adding the paragraph about the CHECK timing. I also addressed the bitrot from the column-priv leak patch. Would be great to have you take a look at the latest and let me know if you have any further comments or suggestions. I looked it over again, and I'm happy with these updates, except there was a missing break in the switch statement in ExecWithCheckOptions(). Oh, thanks for catching that! Here's an updated patch. Excellent, will review. Thanks again! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 09:50, Dean Rasheed dean.a.rash...@gmail.com wrote: On 26 February 2015 at 05:43, Stephen Frost sfr...@snowman.net wrote: I wonder if there are some documentation updates which need to be done for this also? I'm planning to look as I vauguely recall mentioning the ordering of operations somewhere along the way. I couldn't find any mention of the timing of the check in the existing documentation, although it does vaguely imply that the check is done before inserting any new data. There is an existing paragraph describing the timing of USING conditions, so I added a new paragraph immediately after that to explain when CHECK expressions are enforced, since that seemed like the best place for it. I also addressed the bitrot from the column-priv leak patch. Would be great to have you take a look at the latest and let me know if you have any further comments or suggestions. I looked it over again, and I'm happy with these updates, except there was a missing break in the switch statement in ExecWithCheckOptions(). Here's an updated patch. Regards, Dean diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml new file mode 100644 index 868a6c1..49eaadc *** a/doc/src/sgml/ref/create_policy.sgml --- b/doc/src/sgml/ref/create_policy.sgml *** CREATE POLICY replaceable class=parame *** 61,66 --- 61,74 /para para + For INSERT and UPDATE queries, WITH CHECK expressions are enforced after + BEFORE triggers are fired, and before any data modifications are made. + Thus a BEFORE ROW trigger may modify the data to be inserted, affecting + the result of the security policy check. WITH CHECK expressions are + enforced before any other constraints. + /para + + para Policy names are per-table, therefore one policy name can be used for many different tables and have a definition for each table which is appropriate to that table. diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c new file mode 100644 index 33b172b..8dfb952 *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** ExecConstraints(ResultRelInfo *resultRel *** 1667,1675 /* * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs */ void ! ExecWithCheckOptions(ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate) { Relation rel = resultRelInfo-ri_RelationDesc; --- 1667,1676 /* * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs + * of the specified kind. */ void ! ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate) { Relation rel = resultRelInfo-ri_RelationDesc; *** ExecWithCheckOptions(ResultRelInfo *resu *** 1694,1699 --- 1695,1703 WithCheckOption *wco = (WithCheckOption *) lfirst(l1); ExprState *wcoExpr = (ExprState *) lfirst(l2); + if (wco-kind != kind) + continue; + /* * WITH CHECK OPTION checks are intended to ensure that the new tuple * is visible (in the case of a view) or that it passes the *** ExecWithCheckOptions(ResultRelInfo *resu *** 1715,1726 modifiedCols, 64); ! ereport(ERROR, ! (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), ! errmsg(new row violates WITH CHECK OPTION for \%s\, ! wco-viewname), ! val_desc ? errdetail(Failing row contains %s., val_desc) : ! 0)); } } } --- 1719,1747 modifiedCols, 64); ! switch (wco-kind) ! { ! case WCO_VIEW_CHECK: ! ereport(ERROR, ! (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), ! errmsg(new row violates WITH CHECK OPTION for \%s\, ! wco-relname), ! val_desc ? errdetail(Failing row contains %s., ! val_desc) : 0)); ! break; ! case WCO_RLS_INSERT_CHECK: ! case WCO_RLS_UPDATE_CHECK: ! ereport(ERROR, ! (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), ! errmsg(new row violates row level security policy for \%s\, ! wco-relname), ! val_desc ? errdetail(Failing row contains %s., ! val_desc) : 0)); ! break; ! default: ! elog(ERROR, unrecognized WCO kind: %u, wco-kind); ! break; ! } } } } diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c new file mode 100644 index f96fb24..be879a4 *** a/src/backend/executor/nodeModifyTable.c --- b/src/backend/executor/nodeModifyTable.c *** ExecInsert(TupleTableSlot *slot, *** 253,258 --- 253,265 tuple-t_tableOid = RelationGetRelid(resultRelationDesc); /* + * Check any RLS INSERT WITH CHECK policies + */ + if (resultRelInfo-ri_WithCheckOptions != NIL) + ExecWithCheckOptions(WCO_RLS_INSERT_CHECK, +
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 09:27, Dean Rasheed dean.a.rash...@gmail.com wrote: I think this should probably be committed as 2 separate patches... On closer inspection, the 2 parts are interrelated since the new self-join test that tests the inheritance planner changes also requires the rewriter changes, otherwise it falls over with an infinite recursion error in the rewriter. I suppose the rewriter changes could be committed first with the self-join without inheritance test to test the fix to the infinite recursion problem, but I think it's probably easier to just treat this as a single set of related bug fixes. Attached is an updated patch. There are no more code changes, but I've added a few more regression tests. Regards, Dean diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c new file mode 100644 index b02a107..e4e4d80 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *** inheritance_planner(PlannerInfo *root) *** 791,796 --- 791,797 { Query *parse = root-parse; int parentRTindex = parse-resultRelation; + Bitmapset *resultRTindexes = NULL; int nominalRelation = -1; List *final_rtable = NIL; int save_rel_array_size = 0; *** inheritance_planner(PlannerInfo *root) *** 816,822 --- 817,837 * (1) would result in a rangetable of length O(N^2) for N targets, with * at least O(N^3) work expended here; and (2) would greatly complicate * management of the rowMarks list. + * + * Note that any RTEs with security barrier quals will be turned into + * subqueries during planning, and so we must create copies of them too, + * except where they are target relations, which will each only be used + * in a single plan. */ + resultRTindexes = bms_add_member(resultRTindexes, parentRTindex); + foreach(lc, root-append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); + if (appinfo-parent_relid == parentRTindex) + resultRTindexes = bms_add_member(resultRTindexes, + appinfo-child_relid); + } + foreach(lc, root-append_rel_list) { AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); *** inheritance_planner(PlannerInfo *root) *** 887,907 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! if (rte-rtekind == RTE_SUBQUERY) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, so we can save a few cycles by applying ! * ChangeVarNodes before we append the RTE to the ! * rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } --- 902,930 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! /* ! * Copy subquery RTEs and RTEs with security barrier quals ! * that will be turned into subqueries, except those that are ! * target relations. ! */ ! if (rte-rtekind == RTE_SUBQUERY || ! (rte-securityQuals != NIL ! !bms_is_member(rti, resultRTindexes))) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, except in the security barrier quals, so we can ! * save a few cycles by applying ChangeVarNodes before we ! * append the RTE to the rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); + ChangeVarNodes((Node *) rte-securityQuals, rti, newrti, 0); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } *** preprocess_rowmarks(PlannerInfo *root) *** 2271,2278 newrc = makeNode(PlanRowMark); newrc-rti = newrc-prti = i; newrc-rowmarkId = ++(root-glob-lastRowMarkId); ! /* real tables support REFERENCE, anything else needs COPY */ if (rte-rtekind == RTE_RELATION rte-relkind != RELKIND_FOREIGN_TABLE) newrc-markType = ROW_MARK_REFERENCE; else --- 2294,2306 newrc = makeNode(PlanRowMark); newrc-rti = newrc-prti = i; newrc-rowmarkId = ++(root-glob-lastRowMarkId); ! /* ! * Real tables support REFERENCE, anything else needs COPY. Since ! * RTEs with security barrier quals will become subqueries, they also ! * need COPY. ! */ if (rte-rtekind == RTE_RELATION + rte-securityQuals == NIL rte-relkind != RELKIND_FOREIGN_TABLE) newrc-markType =
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 05:41, Stephen Frost sfr...@snowman.net wrote: Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Here's an updated patch with a new test for this bug. I've been developing the fixes for these RLS issues as one big patch, but I suppose it would be easy to split up, if that's preferred. Thanks for working on all of this! I've brought this up to date with master and addressed the little bit of bitrot. I'm still reviewing it but I'm generally happy with the approach and would certainly welcome any additional thoughts from you or feedback from others. Thanks for doing that. I had been meaning to update it myself, but kept getting distracted by my day job. I think this should probably be committed as 2 separate patches, because there are really 2 separate issues being fixed here, and the commit comment only mentions the second one: 1). The planner changes and the first new regression test (update with from clause self join). This is a fix to the inheritance planner code, which wasn't properly handling the case of a query containing both target and non-target relations with RLS and inheritance. 2). The rewriter changes and the other regression tests (S.b. view on top of Row-level security). This is more than just a code tidy-up -- there's a real bug there. If you run those tests against HEAD, the update against the s.b. view fails to apply the RLS policy of the underlying table because the old recursion detection code thinks it has already handled that RTE. I should get more time to look at this over the weekend. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 05:43, Stephen Frost sfr...@snowman.net wrote: Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Attached is a patch to make RLS checks run before attempting to insert/update any data rather than afterwards. Excellent, this I really like and it's a pretty straight-forward change. I wonder if there are some documentation updates which need to be done for this also? I'm planning to look as I vauguely recall mentioning the ordering of operations somewhere along the way. I also addressed the bitrot from the column-priv leak patch. Would be great to have you take a look at the latest and let me know if you have any further comments or suggestions. I'm definitely looking forward to getting these changes in soon. Thanks for the update. I don't recall any mention of ordering in the docs, but if there isn't anything there it makes sense to add something. I haven't thought about this patch in a while, but I still like the look of it. It's definitely neater to do these checks earlier and it's good to be able to get RLS-specific errors too. I'll take a look at the latest updates. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Here's an updated patch with a new test for this bug. I've been developing the fixes for these RLS issues as one big patch, but I suppose it would be easy to split up, if that's preferred. Thanks for working on all of this! I've brought this up to date with master and addressed the little bit of bitrot. I'm still reviewing it but I'm generally happy with the approach and would certainly welcome any additional thoughts from you or feedback from others. Thanks! Stephen From aaef38f8ac98ba65f6c7b90c61d95c20afe271d0 Mon Sep 17 00:00:00 2001 From: Stephen Frost sfr...@snowman.net Date: Thu, 26 Feb 2015 00:21:20 -0500 Subject: [PATCH] Handle RLS independently in fireRIRrules() The original approach to handling RLS and security policies in fireRIRrules() had them included in the main loop through the rtable. This turned out to be less than ideal, however, as the RLS policies need their own infinite recursion detection apart from with check options and security quals from security barrier views. This is also simpler and ensures we only expand RLS policies once for each RTE. In passing, rename prepend_row_security_policies to get_row_security_policies and have it explicitly return (through return arguments) the security quals, with check options, hasRowSecurity and hasSubLinks results. Dean Rasheed --- src/backend/optimizer/plan/planner.c | 38 ++- src/backend/rewrite/rewriteHandler.c | 127 ++ src/backend/rewrite/rowsecurity.c | 89 src/include/rewrite/rowsecurity.h | 5 +- src/test/regress/expected/rowsecurity.out | 170 ++ src/test/regress/sql/rowsecurity.sql | 42 6 files changed, 373 insertions(+), 98 deletions(-) diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index b02a107..e4e4d80 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -791,6 +791,7 @@ inheritance_planner(PlannerInfo *root) { Query *parse = root-parse; int parentRTindex = parse-resultRelation; + Bitmapset *resultRTindexes = NULL; int nominalRelation = -1; List *final_rtable = NIL; int save_rel_array_size = 0; @@ -816,7 +817,21 @@ inheritance_planner(PlannerInfo *root) * (1) would result in a rangetable of length O(N^2) for N targets, with * at least O(N^3) work expended here; and (2) would greatly complicate * management of the rowMarks list. + * + * Note that any RTEs with security barrier quals will be turned into + * subqueries during planning, and so we must create copies of them too, + * except where they are target relations, which will each only be used + * in a single plan. */ + resultRTindexes = bms_add_member(resultRTindexes, parentRTindex); + foreach(lc, root-append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); + if (appinfo-parent_relid == parentRTindex) + resultRTindexes = bms_add_member(resultRTindexes, + appinfo-child_relid); + } + foreach(lc, root-append_rel_list) { AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); @@ -887,21 +902,29 @@ inheritance_planner(PlannerInfo *root) { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); -if (rte-rtekind == RTE_SUBQUERY) +/* + * Copy subquery RTEs and RTEs with security barrier quals + * that will be turned into subqueries, except those that are + * target relations. + */ +if (rte-rtekind == RTE_SUBQUERY || + (rte-securityQuals != NIL + !bms_is_member(rti, resultRTindexes))) { Index newrti; /* * The RTE can't contain any references to its own RT - * index, so we can save a few cycles by applying - * ChangeVarNodes before we append the RTE to the - * rangetable. + * index, except in the security barrier quals, so we can + * save a few cycles by applying ChangeVarNodes before we + * append the RTE to the rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); + ChangeVarNodes((Node *) rte-securityQuals, rti, newrti, 0); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } @@ -2271,8 +2294,13 @@ preprocess_rowmarks(PlannerInfo *root) newrc = makeNode(PlanRowMark); newrc-rti = newrc-prti = i; newrc-rowmarkId = ++(root-glob-lastRowMarkId); - /* real tables support REFERENCE, anything else needs COPY */ + /* + * Real tables support REFERENCE, anything else needs COPY. Since + * RTEs with security barrier quals will become subqueries, they also + * need COPY. + */ if (rte-rtekind == RTE_RELATION +
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Attached is a patch to make RLS checks run before attempting to insert/update any data rather than afterwards. Excellent, this I really like and it's a pretty straight-forward change. I wonder if there are some documentation updates which need to be done for this also? I'm planning to look as I vauguely recall mentioning the ordering of operations somewhere along the way. I also addressed the bitrot from the column-priv leak patch. Would be great to have you take a look at the latest and let me know if you have any further comments or suggestions. I'm definitely looking forward to getting these changes in soon. Thanks! Stephen From 9fb54d9119d83b00cc7839efff573d313c6fa8e5 Mon Sep 17 00:00:00 2001 From: Stephen Frost sfr...@snowman.net Date: Wed, 25 Feb 2015 23:58:22 -0500 Subject: [PATCH] Run RLS checks before attempting to insert/update The initial approach with RLS leveraged the existing WITH CHECK OPTION capability, which meant that the tuple was essentially fully inserted and only after that was done was it checked for policies. WITH CHECK OPTION is required to do that per the SQL spec, but RLS would ideally be done earlier. This changes the RLS policies to be done prior to attempting to insert or update the tuple, which means that an RLS violation will be thrown (if the policy is violated) instead of, say, a primary key constraint violation. This approach still leverages a great deal of the WITH CHECK OPTION infrastructure, but modifies it to understand when it's operating with RLS policies or WITH CHECK options, which allows it to produce better error messages. Dean Rasheed --- src/backend/executor/execMain.c | 34 --- src/backend/executor/nodeModifyTable.c| 32 + src/backend/nodes/copyfuncs.c | 3 ++- src/backend/nodes/equalfuncs.c| 3 ++- src/backend/nodes/outfuncs.c | 3 ++- src/backend/nodes/readfuncs.c | 3 ++- src/backend/rewrite/rewriteHandler.c | 3 ++- src/backend/rewrite/rowsecurity.c | 8 ++-- src/include/executor/executor.h | 2 +- src/include/nodes/execnodes.h | 2 +- src/include/nodes/parsenodes.h| 15 +++--- src/test/regress/expected/rowsecurity.out | 9 ++-- src/test/regress/sql/rowsecurity.sql | 4 13 files changed, 91 insertions(+), 30 deletions(-) diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 33b172b..5c0ae39 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1667,9 +1667,10 @@ ExecConstraints(ResultRelInfo *resultRelInfo, /* * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs + * of the specified kind. */ void -ExecWithCheckOptions(ResultRelInfo *resultRelInfo, +ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate) { Relation rel = resultRelInfo-ri_RelationDesc; @@ -1694,6 +1695,9 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo, WithCheckOption *wco = (WithCheckOption *) lfirst(l1); ExprState *wcoExpr = (ExprState *) lfirst(l2); + if (wco-kind != kind) + continue; + /* * WITH CHECK OPTION checks are intended to ensure that the new tuple * is visible (in the case of a view) or that it passes the @@ -1715,12 +1719,28 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo, modifiedCols, 64); - ereport(ERROR, - (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), - errmsg(new row violates WITH CHECK OPTION for \%s\, - wco-viewname), - val_desc ? errdetail(Failing row contains %s., val_desc) : - 0)); + switch (wco-kind) + { +case WCO_VIEW_CHECK: + ereport(ERROR, + (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), + errmsg(new row violates WITH CHECK OPTION for \%s\, + wco-relname), + val_desc ? errdetail(Failing row contains %s., + val_desc) : 0)); +case WCO_RLS_INSERT_CHECK: +case WCO_RLS_UPDATE_CHECK: + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg(new row violates row level security policy for \%s\, +wco-relname), + val_desc ? errdetail(Failing row contains %s., + val_desc) : 0)); + break; +default: + elog(ERROR, unrecognized WCO kind: %u, wco-kind); + break; + } } } } diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index f96fb24..be879a4 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -253,6 +253,13 @@ ExecInsert(TupleTableSlot *slot, tuple-t_tableOid = RelationGetRelid(resultRelationDesc); /* + * Check any RLS INSERT WITH CHECK policies + */ + if (resultRelInfo-ri_WithCheckOptions != NIL) +
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 10 January 2015 at 15:12, Stephen Frost sfr...@snowman.net wrote: * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Currently we're applying RLS CHECKs after the INSERT or UPDATE, like WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs on views have to be applied after the INSERT/UPDATE on the base relation, but we're free to do something different for RLS CHECKs if that makes more sense. If we want RLS to be more like column-level privilege checking, then it does make sense to do the checks sooner, so perhaps we should be checking the RLS policies before the INSERT/UPDATE, like CHECK constraints. Were you thinking about working up a patch for such a change? If not, I'll see about finding time to do it, unless someone else wants to volunteer. :) Attached is a patch to make RLS checks run before attempting to insert/update any data rather than afterwards. In the end I decided not to create a new structure for RLS checks because most of the code that handles them treats them the same as WCOs. Instead, I just added a new 'kind' enum field to the existing structure and renamed/reworded things a bit. The patch also changes the error message for a RLS check violation, to make the cause of the error clearer. One thing I'm not sure about is what sqlstate code to use for this error, but I don't think that using WITH_CHECK_OPTION_VIOLATION is appropriate, because that seems to be specifically intended for views. Regards, Dean diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c new file mode 100644 index 5b70cc9..6762b63 *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** ExecConstraints(ResultRelInfo *resultRel *** 1638,1646 /* * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs */ void ! ExecWithCheckOptions(ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate) { ExprContext *econtext; --- 1638,1647 /* * ExecWithCheckOptions -- check that tuple satisfies any WITH CHECK OPTIONs + * of the specified kind. */ void ! ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate) { ExprContext *econtext; *** ExecWithCheckOptions(ResultRelInfo *resu *** 1663,1668 --- 1664,1672 WithCheckOption *wco = (WithCheckOption *) lfirst(l1); ExprState *wcoExpr = (ExprState *) lfirst(l2); + if (wco-kind != kind) + continue; + /* * WITH CHECK OPTION checks are intended to ensure that the new tuple * is visible (in the case of a view) or that it passes the *** ExecWithCheckOptions(ResultRelInfo *resu *** 1673,1686 * case (the opposite of what we do above for CHECK constraints). */ if (!ExecQual((List *) wcoExpr, econtext, false)) ! ereport(ERROR, ! (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), ! errmsg(new row violates WITH CHECK OPTION for \%s\, ! wco-viewname), ! errdetail(Failing row contains %s., ! ExecBuildSlotValueDescription(slot, ! RelationGetDescr(resultRelInfo-ri_RelationDesc), ! 64; } } --- 1677,1711 * case (the opposite of what we do above for CHECK constraints). */ if (!ExecQual((List *) wcoExpr, econtext, false)) ! { ! switch (wco-kind) ! { ! case WCO_VIEW_CHECK: ! ereport(ERROR, ! (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION), ! errmsg(new row violates WITH CHECK OPTION for \%s\, ! wco-relname), ! errdetail(Failing row contains %s., ! ExecBuildSlotValueDescription(slot, ! RelationGetDescr(resultRelInfo-ri_RelationDesc), ! 64; ! break; ! case WCO_RLS_INSERT_CHECK: ! case WCO_RLS_UPDATE_CHECK: ! ereport(ERROR, ! (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), ! errmsg(new row violates row level security policy for \%s\, ! wco-relname), ! errdetail(Failing row contains %s., ! ExecBuildSlotValueDescription(slot, ! RelationGetDescr(resultRelInfo-ri_RelationDesc), ! 64; ! break; ! default: ! elog(ERROR, unrecognized WCO kind: %u, wco-kind); ! break; ! } ! } } } diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c new file mode 100644 index f96fb24..be879a4 *** a/src/backend/executor/nodeModifyTable.c --- b/src/backend/executor/nodeModifyTable.c *** ExecInsert(TupleTableSlot *slot, *** 253,258 --- 253,265 tuple-t_tableOid = RelationGetRelid(resultRelationDesc); /* + * Check any RLS INSERT WITH CHECK policies + */ + if (resultRelInfo-ri_WithCheckOptions != NIL) + ExecWithCheckOptions(WCO_RLS_INSERT_CHECK, + resultRelInfo, slot, estate); + + /* * Check the constraints of the tuple */ if
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Dean Rasheed (dean.a.rash...@gmail.com) wrote: Turns out it wasn't as simple as that. prepend_row_security_policies() really could get called multiple times for the same RTE, because the call to query_tree_walker() at the end of fireRIRrules() would descend into the just-added quals again. The simplest fix seems to be to process RLS in a separate loop at the end, so that it can have it's own infinite recursion detection, which is different from that needed for pre-existing security quals and with check options from security barrier views. This approach simplifies things a bit, and ensures that we only try to expand RLS once for each RTE. Right, I specifically recall having prepend_row_security_policies() getting called multiple times for the same RTE. I like this approach of using a separate loop though and it strikes me that it lends more weight to the argument that we're better off with these as independent considerations. Also, I'm thinking that it would be better to refactor things a bit and have prepend_row_security_policies() just return the new securityQuals and withCheckOptions to add. Then fireRIRrules() would only have to recurse into the new quals being added, not the already-processed quals. Hmm, good point. Turns out that refactoring actually became necessary in order to fix this bug, but I think it makes things cleaner and more efficient. Sounds good, I'll take a look. Here's an updated patch with a new test for this bug. I've been developing the fixes for these RLS issues as one big patch, but I suppose it would be easy to split up, if that's preferred. I'm alright with it as-is for now. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 14 January 2015 at 08:43, Dean Rasheed dean.a.rash...@gmail.com wrote: On 12 January 2015 at 14:24, Stephen Frost sfr...@snowman.net wrote: Interesting, thanks for the work! I had been suspicious that there was an issue with the recursion handling. So continuing to review the RLS code, I spotted the following in prepend_row_security_policies(): /* * We may end up getting called multiple times for the same RTE, so check * to make sure we aren't doing double-work. */ if (rte-securityQuals != NIL) return false; which looked suspicious. I assume that's just a hang-up from an earlier attempt to prevent infinite recursion in RLS expansion, but actually it's wrong because in the case of an UPDATE to a security barrier view on top of a table with RLS enabled, the view's securityQuals will get added to the RTE first, and that shouldn't prevent the underlying table's RLS securityQuals from being added. AFAICT, it should be safe to just remove the above code. I can't see how prepend_row_security_policies() could end up getting called more than once for the same RTE. Turns out it wasn't as simple as that. prepend_row_security_policies() really could get called multiple times for the same RTE, because the call to query_tree_walker() at the end of fireRIRrules() would descend into the just-added quals again. The simplest fix seems to be to process RLS in a separate loop at the end, so that it can have it's own infinite recursion detection, which is different from that needed for pre-existing security quals and with check options from security barrier views. This approach simplifies things a bit, and ensures that we only try to expand RLS once for each RTE. Also, I'm thinking that it would be better to refactor things a bit and have prepend_row_security_policies() just return the new securityQuals and withCheckOptions to add. Then fireRIRrules() would only have to recurse into the new quals being added, not the already-processed quals. Turns out that refactoring actually became necessary in order to fix this bug, but I think it makes things cleaner and more efficient. Here's an updated patch with a new test for this bug. I've been developing the fixes for these RLS issues as one big patch, but I suppose it would be easy to split up, if that's preferred. Regards, Dean diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c new file mode 100644 index 9cbbcfb..34ddf41 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *** inheritance_planner(PlannerInfo *root) *** 790,795 --- 790,796 { Query *parse = root-parse; int parentRTindex = parse-resultRelation; + Bitmapset *resultRTindexes = NULL; List *final_rtable = NIL; int save_rel_array_size = 0; RelOptInfo **save_rel_array = NULL; *** inheritance_planner(PlannerInfo *root) *** 814,820 --- 815,835 * (1) would result in a rangetable of length O(N^2) for N targets, with * at least O(N^3) work expended here; and (2) would greatly complicate * management of the rowMarks list. + * + * Note that any RTEs with security barrier quals will be turned into + * subqueries during planning, and so we must create copies of them too, + * except where they are target relations, which will each only be used + * in a single plan. */ + resultRTindexes = bms_add_member(resultRTindexes, parentRTindex); + foreach(lc, root-append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); + if (appinfo-parent_relid == parentRTindex) + resultRTindexes = bms_add_member(resultRTindexes, + appinfo-child_relid); + } + foreach(lc, root-append_rel_list) { AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); *** inheritance_planner(PlannerInfo *root) *** 885,905 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! if (rte-rtekind == RTE_SUBQUERY) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, so we can save a few cycles by applying ! * ChangeVarNodes before we append the RTE to the ! * rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } --- 900,928 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! /* ! * Copy subquery RTEs and RTEs with security barrier quals ! * that will be turned into subqueries, except those that are ! * target relations. ! */ ! if (rte-rtekind == RTE_SUBQUERY || ! (rte-securityQuals !=
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 12 January 2015 at 14:24, Stephen Frost sfr...@snowman.net wrote: Interesting, thanks for the work! I had been suspicious that there was an issue with the recursion handling. So continuing to review the RLS code, I spotted the following in prepend_row_security_policies(): /* * We may end up getting called multiple times for the same RTE, so check * to make sure we aren't doing double-work. */ if (rte-securityQuals != NIL) return false; which looked suspicious. I assume that's just a hang-up from an earlier attempt to prevent infinite recursion in RLS expansion, but actually it's wrong because in the case of an UPDATE to a security barrier view on top of a table with RLS enabled, the view's securityQuals will get added to the RTE first, and that shouldn't prevent the underlying table's RLS securityQuals from being added. AFAICT, it should be safe to just remove the above code. I can't see how prepend_row_security_policies() could end up getting called more than once for the same RTE. Also, I'm thinking that it would be better to refactor things a bit and have prepend_row_security_policies() just return the new securityQuals and withCheckOptions to add. Then fireRIRrules() would only have to recurse into the new quals being added, not the already-processed quals. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: On 10 January 2015 at 21:38, Dean Rasheed dean.a.rash...@gmail.com wrote: OK, I'll take a look at it. I started reading the existing code that expands RLS policies and spotted a couple of bugs that should be fixed first:- 1). In prepend_row_security_policies(), defaultDeny should be initialised to false at the top. 2). In fireRIRrules(), activeRIRs isn't being reset properly after recursing, which will cause a self-join to fail. So as I starting looking into these bugs, which looked fairly trivial, the test case I came up with revealed another more subtle bug with RLS, using a more obscure query -- given an update of the form UPDATE t1 ... FROM t2 ..., if t1 is part of an inheritance hierarchy and both t1 and t2 have RLS enabled, the inheritance planner was doing the wrong thing. There is code in there to copy subquery RTEs into each child plan, which needed to do the same for non-target RTEs with security barrier quals, which haven't yet been turned into subqueries. Similarly the rowmark preprocessing code needed to be taught about (non-target) RTEs with security barrier quals to handle this kind of UPDATE with a FROM clause. Interesting, thanks for the work! I had been suspicious that there was an issue with the recursion handling. The attached patch fixes those issues. Excellent. Will take a look. This bug can't happen with updatable security barrier views, since non-target security barrier views are expanded in the rewriter, so technically this doesn't need bach-patching to 9.4. However, I think the planner changes should probably be back-patched anyway, to keep the code in the 2 branches in sync, and more maintainable. Also it feels like the planner ought to be able to handle any legal query thrown at it, even if it looks like the 9.4 rewriter couldn't generate such a query. I'm not anxious to back-patch if there's no issue in existing branches, but I understand your point about making sure it can handle legal queries even if we don't believe current 9.4 can't generate them. We don't tend to back-patch just to keep things in sync as that could possibly have unintended side-effects. Looking at the regression tests a bit though, I notice that this removes the lower-level LockRows.. There had been much discussion about that last spring which I believe you were a part of..? I specifically recall discussing it with Craig, at least. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 10 January 2015 at 21:38, Dean Rasheed dean.a.rash...@gmail.com wrote: OK, I'll take a look at it. I started reading the existing code that expands RLS policies and spotted a couple of bugs that should be fixed first:- 1). In prepend_row_security_policies(), defaultDeny should be initialised to false at the top. 2). In fireRIRrules(), activeRIRs isn't being reset properly after recursing, which will cause a self-join to fail. So as I starting looking into these bugs, which looked fairly trivial, the test case I came up with revealed another more subtle bug with RLS, using a more obscure query -- given an update of the form UPDATE t1 ... FROM t2 ..., if t1 is part of an inheritance hierarchy and both t1 and t2 have RLS enabled, the inheritance planner was doing the wrong thing. There is code in there to copy subquery RTEs into each child plan, which needed to do the same for non-target RTEs with security barrier quals, which haven't yet been turned into subqueries. Similarly the rowmark preprocessing code needed to be taught about (non-target) RTEs with security barrier quals to handle this kind of UPDATE with a FROM clause. The attached patch fixes those issues. This bug can't happen with updatable security barrier views, since non-target security barrier views are expanded in the rewriter, so technically this doesn't need bach-patching to 9.4. However, I think the planner changes should probably be back-patched anyway, to keep the code in the 2 branches in sync, and more maintainable. Also it feels like the planner ought to be able to handle any legal query thrown at it, even if it looks like the 9.4 rewriter couldn't generate such a query. Regards, Dean diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c new file mode 100644 index 9cbbcfb..119a016 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *** inheritance_planner(PlannerInfo *root) *** 790,795 --- 790,796 { Query *parse = root-parse; int parentRTindex = parse-resultRelation; + Bitmapset *resultRTindexes = NULL; List *final_rtable = NIL; int save_rel_array_size = 0; RelOptInfo **save_rel_array = NULL; *** inheritance_planner(PlannerInfo *root) *** 814,820 --- 815,835 * (1) would result in a rangetable of length O(N^2) for N targets, with * at least O(N^3) work expended here; and (2) would greatly complicate * management of the rowMarks list. + * + * Note that any RTEs with security barrier quals will be turned into + * subqueries during planning, and so we must create copies of them too, + * except where they are target relations, which will each only be used + * in a single plan. */ + resultRTindexes = bms_add_member(resultRTindexes, parentRTindex); + foreach(lc, root-append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); + if (appinfo-parent_relid == parentRTindex) + resultRTindexes = bms_add_member(resultRTindexes, + appinfo-child_relid); + } + foreach(lc, root-append_rel_list) { AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); *** inheritance_planner(PlannerInfo *root) *** 885,905 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! if (rte-rtekind == RTE_SUBQUERY) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, so we can save a few cycles by applying ! * ChangeVarNodes before we append the RTE to the ! * rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } --- 900,928 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! /* ! * Copy subquery RTEs and RTEs with security barrier quals ! * that will be turned into subqueries, except those that are ! * target relations. ! */ ! if (rte-rtekind == RTE_SUBQUERY || ! (rte-securityQuals != NIL ! !bms_is_member(rti, resultRTindexes))) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, except in the security barrier quals, so we can ! * save a few cycles by applying ChangeVarNodes before we ! * append the RTE to the rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); + ChangeVarNodes((Node *)
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 12 January 2015 at 14:24, Stephen Frost sfr...@snowman.net wrote: Looking at the regression tests a bit though, I notice that this removes the lower-level LockRows.. There had been much discussion about that last spring which I believe you were a part of..? I specifically recall discussing it with Craig, at least. Ah, yes you're right. Looking back over that discussion it shouldn't be removing those lower-level LockRows. I was a bit aggressive with my change to the rowmark preprocessing -- the first loop applies to requested locks, like SELECT .. FOR UPDATE, so it shouldn't be messing with that, presecurity.c handles that fine. It's only the second loop that needs to be taught about RTEs with security quals that will become subqueries. Here's an updated patch, that passes with the original regression test results. Regards, Dean diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c new file mode 100644 index 9cbbcfb..34ddf41 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *** inheritance_planner(PlannerInfo *root) *** 790,795 --- 790,796 { Query *parse = root-parse; int parentRTindex = parse-resultRelation; + Bitmapset *resultRTindexes = NULL; List *final_rtable = NIL; int save_rel_array_size = 0; RelOptInfo **save_rel_array = NULL; *** inheritance_planner(PlannerInfo *root) *** 814,820 --- 815,835 * (1) would result in a rangetable of length O(N^2) for N targets, with * at least O(N^3) work expended here; and (2) would greatly complicate * management of the rowMarks list. + * + * Note that any RTEs with security barrier quals will be turned into + * subqueries during planning, and so we must create copies of them too, + * except where they are target relations, which will each only be used + * in a single plan. */ + resultRTindexes = bms_add_member(resultRTindexes, parentRTindex); + foreach(lc, root-append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); + if (appinfo-parent_relid == parentRTindex) + resultRTindexes = bms_add_member(resultRTindexes, + appinfo-child_relid); + } + foreach(lc, root-append_rel_list) { AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc); *** inheritance_planner(PlannerInfo *root) *** 885,905 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! if (rte-rtekind == RTE_SUBQUERY) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, so we can save a few cycles by applying ! * ChangeVarNodes before we append the RTE to the ! * rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } --- 900,928 { RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr); ! /* ! * Copy subquery RTEs and RTEs with security barrier quals ! * that will be turned into subqueries, except those that are ! * target relations. ! */ ! if (rte-rtekind == RTE_SUBQUERY || ! (rte-securityQuals != NIL ! !bms_is_member(rti, resultRTindexes))) { Index newrti; /* * The RTE can't contain any references to its own RT ! * index, except in the security barrier quals, so we can ! * save a few cycles by applying ChangeVarNodes before we ! * append the RTE to the rangetable. */ newrti = list_length(subroot.parse-rtable) + 1; ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0); ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0); ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0); rte = copyObject(rte); + ChangeVarNodes((Node *) rte-securityQuals, rti, newrti, 0); subroot.parse-rtable = lappend(subroot.parse-rtable, rte); } *** preprocess_rowmarks(PlannerInfo *root) *** 2254,2261 newrc = makeNode(PlanRowMark); newrc-rti = newrc-prti = i; newrc-rowmarkId = ++(root-glob-lastRowMarkId); ! /* real tables support REFERENCE, anything else needs COPY */ if (rte-rtekind == RTE_RELATION rte-relkind != RELKIND_FOREIGN_TABLE) newrc-markType = ROW_MARK_REFERENCE; else --- 2277,2289 newrc = makeNode(PlanRowMark); newrc-rti = newrc-prti = i; newrc-rowmarkId = ++(root-glob-lastRowMarkId); ! /* ! * Real tables support REFERENCE, anything else needs COPY. Since ! * RTEs with security barrier quals will become subqueries, they also ! * need COPY. ! */ if (rte-rtekind ==
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: On 9 January 2015 at 20:26, Stephen Frost sfr...@snowman.net wrote: Where this leaves me, at least, is feeling like we should always apply the INSERT WITH CHECK policy, then if there is a conflict, check the UPDATE USING policy and throw an error if the row isn't visible but otherwise perform the UPDATE and then check the UPDATE WITH CHECK policy. Yeah, I can see the logic in that, but I'm starting to question the final then in the above, i.e., the order in which the checks are done. Currently we're applying RLS CHECKs after the INSERT or UPDATE, like WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs on views have to be applied after the INSERT/UPDATE on the base relation, but we're free to do something different for RLS CHECKs if that makes more sense. If we want RLS to be more like column-level privilege checking, then it does make sense to do the checks sooner, so perhaps we should be checking the RLS policies before the INSERT/UPDATE, like CHECK constraints. I've been wondering about the placement of the checks also. I keep meaning to go back to the SQL spec and review the WITH CHECK requirement as it feels a bit odd to me, but then again, it is the spec. As you say, we're not obligated to follow it for RLS in any case. Doing that, it makes sense to first check INSERT CHECK policies, potentially throwing an error before any PK conflict is detected, so an error would result even in the INSERT .. ON CONFLICT IGNORE case. That way INSERT CHECK policies would always be applied regardless of the path taken, as you suggest. Right. I would still say that the RLS UPDATE USING/CHECK policies should only be applied if the UPDATE path is taken (before the UPDATE is applied), and they need to be applied to the (old/new) update tuples only. It's a bit like there is a WHERE record already exists clause attached to the UPDATE. The RLS UPDATE policies should only be applied to the rows affected by the UPDATE. Agreed. I don't see how we could possibly do otherwise. The only tuple we have at the outset is the to-be-INSERT'd tuple. We won't find the conflicting tuple until we try and even then that conflicting tuple might be different from the to-be-INSERT'd tuple, so we can't check that until we've discovered the conflict exists. Further, we won't have the tuple to check for the UPDATE's CHECK policy until all of the expressions (if any) are run on the UPDATE side. I see your point that this runs counter to the 'mod_count' example use-case and could cause problems for users using RLS with such a strategy. For my part, I expect users of RLS to expect errors in such a case rather than allowing it, but it's certainly a judgement call. Actually I don't think this is a problem for the mod_count example. This isn't the same as AND'ing together the INSERT and UPDATE quals, because each set of policies is being applied to a different tuple. There are 3 tuples in play here:- 1). The insert tuple, which has INSERT CHECK polcies applied to it 2). The existing (old) update tuple, which has UPDATE USING policies applied to it 3). The (new) update tuple, which has UPDATE CHECK policies applied to it Right. NOTE: If we do change RLS CHECKs to be executed prior to modifying any data, that's potentially a change that could be made independently of the UPSERT patch. We should also probably then stop referring to them as WITH CHECK OPTIONs in the docs and in error messages because they're not the same thing, and the code might be neater if they had their own data-structure rather than overloading WithCheckOption. I agree that's an independent change. I'm not sure if it would end up being neater or not offhand but I can see some advantages to breaking it up from WithCheck as it might be simpler for users to understand (particularly those users who are not already familiar with WithCheck). Were you thinking about working up a patch for such a change? If not, I'll see about finding time to do it, unless someone else wants to volunteer. :) Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 9 January 2015 at 20:26, Stephen Frost sfr...@snowman.net wrote: Where this leaves me, at least, is feeling like we should always apply the INSERT WITH CHECK policy, then if there is a conflict, check the UPDATE USING policy and throw an error if the row isn't visible but otherwise perform the UPDATE and then check the UPDATE WITH CHECK policy. Yeah, I can see the logic in that, but I'm starting to question the final then in the above, i.e., the order in which the checks are done. Currently we're applying RLS CHECKs after the INSERT or UPDATE, like WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs on views have to be applied after the INSERT/UPDATE on the base relation, but we're free to do something different for RLS CHECKs if that makes more sense. If we want RLS to be more like column-level privilege checking, then it does make sense to do the checks sooner, so perhaps we should be checking the RLS policies before the INSERT/UPDATE, like CHECK constraints. Doing that, it makes sense to first check INSERT CHECK policies, potentially throwing an error before any PK conflict is detected, so an error would result even in the INSERT .. ON CONFLICT IGNORE case. That way INSERT CHECK policies would always be applied regardless of the path taken, as you suggest. I would still say that the RLS UPDATE USING/CHECK policies should only be applied if the UPDATE path is taken (before the UPDATE is applied), and they need to be applied to the (old/new) update tuples only. It's a bit like there is a WHERE record already exists clause attached to the UPDATE. The RLS UPDATE policies should only be applied to the rows affected by the UPDATE. I see your point that this runs counter to the 'mod_count' example use-case and could cause problems for users using RLS with such a strategy. For my part, I expect users of RLS to expect errors in such a case rather than allowing it, but it's certainly a judgement call. Actually I don't think this is a problem for the mod_count example. This isn't the same as AND'ing together the INSERT and UPDATE quals, because each set of policies is being applied to a different tuple. There are 3 tuples in play here:- 1). The insert tuple, which has INSERT CHECK polcies applied to it 2). The existing (old) update tuple, which has UPDATE USING policies applied to it 3). The (new) update tuple, which has UPDATE CHECK policies applied to it NOTE: If we do change RLS CHECKs to be executed prior to modifying any data, that's potentially a change that could be made independently of the UPSERT patch. We should also probably then stop referring to them as WITH CHECK OPTIONs in the docs and in error messages because they're not the same thing, and the code might be neater if they had their own data-structure rather than overloading WithCheckOption. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 10 January 2015 at 15:12, Stephen Frost sfr...@snowman.net wrote: NOTE: If we do change RLS CHECKs to be executed prior to modifying any data, that's potentially a change that could be made independently of the UPSERT patch. We should also probably then stop referring to them as WITH CHECK OPTIONs in the docs and in error messages because they're not the same thing, and the code might be neater if they had their own data-structure rather than overloading WithCheckOption. I agree that's an independent change. I'm not sure if it would end up being neater or not offhand but I can see some advantages to breaking it up from WithCheck as it might be simpler for users to understand (particularly those users who are not already familiar with WithCheck). A new RlsCheck struct wouldn't need the cascade option that WithCheckOption has, and it's not nice the way viewname is being abused. For UPSERT it will need a field indicating the command type, as Peter pointed out, so I think it's different enough to warrant it's own struct, even if we weren't changing the firing time. Were you thinking about working up a patch for such a change? OK, I'll take a look at it. I started reading the existing code that expands RLS policies and spotted a couple of bugs that should be fixed first:- 1). In prepend_row_security_policies(), defaultDeny should be initialised to false at the top. 2). In fireRIRrules(), activeRIRs isn't being reset properly after recursing, which will cause a self-join to fail. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 9 January 2015 at 00:49, Stephen Frost sfr...@snowman.net wrote: Peter, * Peter Geoghegan (p...@heroku.com) wrote: For column level privileges, you wouldn't expect to only get an error about not having the relevant update permissions at runtime, when the update path happens to be taken. And so it is for RLS. Right, that's the precedent we should be considering. Column-level privileges is a great example- you need both insert and update privileges for the columns involved for the command to succeed. It shouldn't depend on which path actually ends up being taken. It's not the same as column-level privileges though, because RLS policies depend on the new/existing data, not just the table metadata. So for example an UPDATE ... WHERE false can fail column-level privilege checks even though it isn't updating anything, but it cannot fail a RLS policy check. I was trying to think up an example where you might actually have different INSERT and UPDATE policies, and the best I can think of is some sort of mod_count column where you have an INSERT CHECK (mod_count = 0) and an UPDATE CHECK (mod_count 0). In that case, checking both policies would make an UPSERT impossible, whereas if you think of it as doing either an INSERT or an UPDATE, as the syntax suggests, it becomes possible. (I'm assuming here from your description that you intend for the policy expressions to be AND'ed together, so it might end up being an AND of 2 OR expressions.) Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I was trying to think up an example where you might actually have different INSERT and UPDATE policies, and the best I can think of is some sort of mod_count column where you have an INSERT CHECK (mod_count = 0) and an UPDATE CHECK (mod_count 0). In that case, checking both policies would make an UPSERT impossible, whereas if you think of it as doing either an INSERT or an UPDATE, as the syntax suggests, it becomes possible. Why does this user want to do this upsert? If they're upserting, then the inserted row could only reasonably have a value of (mod_count = 0). If updating, then they must have a constant value for the update path (a value that's greater than 0, naturally - say 2), which doesn't make any sense in the context of an upsert's auxiliary update - what happened to the 0 value? Sorry, but I don't think your example makes sense - I can't see what would motivate anyone to write a query like that with those RLS policies in place. It sounds like you're talking about an insert and a separate update that may or may not affect the same row, and not an upsert. Then those policies make sense, but in practice they render the upsert you describe contradictory. FWIW, I'm not suggesting that there couldn't possibly be a use case that doesn't do well with this convention where we enforce RLS deepening on the path taken. The cases are just very marginal, as I think your difficulty in coming up with a convincing counter-argument shows. I happen to think what Stephen and I favor (bunching together USING() barrier quals and check options from INSERT and UPDATE policies) is likely to be the best alternative available on balance. More generally, you could point out that I'm actually testing different tuples at different points in query processing under that regime (e.g. the post-insert tuple, or the before-update conflicting, existing tuple from the target, or the post update tuple) and so things could fail when the update path is taken despite the fact that they didn't fail when the insert path was taken. That's technically true, of course, but with idiomatic usage it isn't true, and that's what I care about. Does anyone have another counter-example of a practical upsert statement that cannot be used with certain RLS policies due to the fact that we chose to bunch together INSERT and UPDATE RLS policies? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: Whoa, hang on. I think you're being a bit quick to dismiss that example. Why shouldn't I want an upsert where the majority of the table columns follow the usual make it so pattern of an upsert, but there is also this kind of audit column to be maintained? Then I would write something like INSERT INTO tbl (some values, 0) ON CONFLICT UPDATE SET same values, mod_count=mod_count+1; The root of the problem is the way that you're proposing to combine the RLS policies (using AND), which runs contrary to the way RLS policies are usually combined (using OR), which is why this kind of example fails -- RLS policies in general aren't intended to all be true simultaneously. In case I wasn't clear, I'm proposing that we AND together the already OR'd together UPDATE and INSERT quals. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Peter Geoghegan (p...@heroku.com) wrote: On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost sfr...@snowman.net wrote: Therefore, I'm not sure that I see the point in checking the INSERT tuple against the UPDATE policy. I guess it wouldn't be hard to modify the struct WithCheckOption to store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions), usable only in this context. That way, when ExecWithCheckOptions() is called from the INSERT, we can tell it to only enforce INSERT-applicable policy cmds (in particular, not UPDATE-only-applicable policy cmds that happen to end up associated with the same ResultRelation). Whereas, at the end of ExecUpdate(), that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE path is taken), INSERT-based policies can still be enforced against the final tuple (as can USING() quals that would ordinarily not be checked at that point either). A given ResultRelation's policies wouldn't necessarily always need to be enforced within ExecWithCheckOptions(), which is a change. Does that make sense to you? That sounds reasonable on first blush. I'll note that I've not looked deeply at the UPSERT patch, but if the above means that the INSERT policy is always checked against the INSERT tuple and the UPDATE policy is always checked against the tuple-resulting-from-an-UPDATE, and that tuples which are not visible due to the UPDATE policy throw an error in that case, then it's working as I'd expect. This does mean that the UPDATE policy won't be checked when the INSERT path is used but that seems appropriate to me, as we can't check a policy against a tuple that never exists (the result of the update applied to an existing tuple). Note that I've already taught ExecWithCheckOptions() to not leak the existing, target tuple when the UPDATE path is taken (the tuple that can be referenced in the UPDATE using the TARGET.* alias), while still performing enforcement against it. I can teach it this too. Ok. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Dean, Peter, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: On 9 January 2015 at 08:49, Peter Geoghegan p...@heroku.com wrote: On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I was trying to think up an example where you might actually have different INSERT and UPDATE policies, and the best I can think of is some sort of mod_count column where you have an INSERT CHECK (mod_count = 0) and an UPDATE CHECK (mod_count 0). In that case, checking both policies would make an UPSERT impossible, whereas if you think of it as doing either an INSERT or an UPDATE, as the syntax suggests, it becomes possible. Why does this user want to do this upsert? If they're upserting, then the inserted row could only reasonably have a value of (mod_count = 0). If updating, then they must have a constant value for the update path (a value that's greater than 0, naturally - say 2), which doesn't make any sense in the context of an upsert's auxiliary update - what happened to the 0 value? Sorry, but I don't think your example makes sense - I can't see what would motivate anyone to write a query like that with those RLS policies in place. It sounds like you're talking about an insert and a separate update that may or may not affect the same row, and not an upsert. Then those policies make sense, but in practice they render the upsert you describe contradictory. Whoa, hang on. I think you're being a bit quick to dismiss that example. Why shouldn't I want an upsert where the majority of the table columns follow the usual make it so pattern of an upsert, but there is also this kind of audit column to be maintained? Then I would write something like INSERT INTO tbl (some values, 0) ON CONFLICT UPDATE SET same values, mod_count=mod_count+1; That's a good point- it doesn't make sense to compare the INSERT values against the UPDATE policy as the result of the UPDATE could be a completely different tuple, one we can't know the contents of until we actually find a conflicting tuple. We can't simply combine the policies and run them at the same time, but I'm not sure that then means we should let an INSERT .. ON CONFLICT UPDATE pass in values to the INSERT which are not permitted by any INSERT policy on the relation. Further, forgetting about RLS, what happens if a user doesn't have INSERT rights on the mod_count column at all? We still allow the above to execute if the row already exists? That doesn't strike me as a good idea. In this case, I do think it makes sense to consider RLS and our regular permissions system as they are both 'OR' cases. I might have UPDATE rights for a particular column through multiple different roles for a given relation, but if none of them give me INSERT rights for that column, then I'd expect to get an error if I try to do an INSERT into that column. The root of the problem is the way that you're proposing to combine the RLS policies (using AND), which runs contrary to the way RLS policies are usually combined (using OR), which is why this kind of example fails -- RLS policies in general aren't intended to all be true simultaneously. I agree that RLS policies, in general, are not intended to all be true simultaneously. I don't think it then follows that an INSERT .. ON CONFLICT UPDATE should be allowed if, for example, there are no INSERT policies on an RLS-enabled relation, because the call happens to end up doing an UPDATE. To try and outline another possible use-case, consider that you have tuples coming in from two independent sets of users, orange and blue. All users are allowed to INSERT, but the 'color' column must equal the user's. For UPDATEs, the blue users can see both orange and blue records while orange users can only see orange records. Rows resulting from an UPDATE can be orange or blue for blue users, but must be orange for orange users. Further, in this environment, attempts by orange users to insert blue records are flagged to be audited for review because orange users should never have access to nor be handling blue data. With the approach you're outlining, a command like: INSERT INTO data VALUES (... , 'blue') ON CONFLICT UPDATE set ... = ...; run by an orange user wouldn't error if it happened to result in an UPDATE (presuming, of course, that the UPDATE didn't try to change the existing color), but from a security perspective, it's clearly an error for an orange user to be attempting to insert blue data. I don't see this as impacting the more general case of how RLS policies are applied. An individual blue user might also be a member of some orange-related role and it wouldn't make any sense for that user to then be only able to see orange records (were we to AND RLS policies together). Where this leaves me, at least, is feeling like we should always apply the INSERT WITH CHECK policy, then if there is a conflict, check the UPDATE USING policy and throw an error if the row
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Peter, * Peter Geoghegan (p...@heroku.com) wrote: On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: Whoa, hang on. I think you're being a bit quick to dismiss that example. Why shouldn't I want an upsert where the majority of the table columns follow the usual make it so pattern of an upsert, but there is also this kind of audit column to be maintained? Then I would write something like INSERT INTO tbl (some values, 0) ON CONFLICT UPDATE SET same values, mod_count=mod_count+1; The root of the problem is the way that you're proposing to combine the RLS policies (using AND), which runs contrary to the way RLS policies are usually combined (using OR), which is why this kind of example fails -- RLS policies in general aren't intended to all be true simultaneously. In case I wasn't clear, I'm proposing that we AND together the already OR'd together UPDATE and INSERT quals. I'm not sure how that would work exactly though, since the tuple the UPDATE results in might be different from what the INSERT has, as Dean pointed out. The INSERT tuple might even pass the UPDATE policy where the resulting tuple from the actual UPDATE part doesn't. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Fri, Jan 9, 2015 at 12:53 PM, Stephen Frost sfr...@snowman.net wrote: I'm not sure how that would work exactly though, since the tuple the UPDATE results in might be different from what the INSERT has, as Dean pointed out. The INSERT tuple might even pass the UPDATE policy where the resulting tuple from the actual UPDATE part doesn't. I'm not suggesting that Dean's example is totally without merit (his revised explanation made it clearer what he meant). I just think, on balance, that enforcing all INSERT + UPDATE policies at various points is the best behavior. Part of the reason for that is that if you ask people how they think it works, you'll get as many answers as the number of people you ask. My proposal (which may or may not be the same as yours, but if not is only slightly more restrictive) is unlikely to affect many users negatively, and is relatively easy to explain and reason about. There are workarounds for Dean's case, too. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost sfr...@snowman.net wrote: To flip it around a bit, I don't think we can avoid checking the *resulting* tuple from the UPDATE against the UPDATE policy. We can avoid it - by not updating. What I'm suggesting is that an enforcement occurs that is more or less equivalent to the enforcement that occurs when the UPDATE path is taken, without it actually being taken. I accept that that isn't perfect, but it has its advantages. Therefore, I'm not sure that I see the point in checking the INSERT tuple against the UPDATE policy. I guess it wouldn't be hard to modify the struct WithCheckOption to store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions), usable only in this context. That way, when ExecWithCheckOptions() is called from the INSERT, we can tell it to only enforce INSERT-applicable policy cmds (in particular, not UPDATE-only-applicable policy cmds that happen to end up associated with the same ResultRelation). Whereas, at the end of ExecUpdate(), that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE path is taken), INSERT-based policies can still be enforced against the final tuple (as can USING() quals that would ordinarily not be checked at that point either). A given ResultRelation's policies wouldn't necessarily always need to be enforced within ExecWithCheckOptions(), which is a change. Does that make sense to you? Note that I've already taught ExecWithCheckOptions() to not leak the existing, target tuple when the UPDATE path is taken (the tuple that can be referenced in the UPDATE using the TARGET.* alias), while still performing enforcement against it. I can teach it this too. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost sfr...@snowman.net wrote: Where this leaves me, at least, is feeling like we should always apply the INSERT WITH CHECK policy, then if there is a conflict, check the UPDATE USING policy and throw an error if the row isn't visible but otherwise perform the UPDATE and then check the UPDATE WITH CHECK policy. I see your point that this runs counter to the 'mod_count' example use-case and could cause problems for users using RLS with such a strategy. For my part, I expect users of RLS to expect errors in such a case rather than allowing it, but it's certainly a judgement call. I mostly agree, but I don't know that I fully agree. Specifically, I also think we should check the update policy even when we only insert, on the theory that if we did go to update, the would-be inserted row would be a proxy for what we'd check then (the existing, target relation's tuple). What do you think of that? I certainly agree that the correct behavior here is at least a bit subjective. We cannot exactly generalize from other areas of the code, nor can we look for a precedent set by other systems (AFAICT there is none). The only reasonable way that I can see to support both sides would be to allow UPSERT to be a top-level policy definition in its own right and let users specify exactly what is allowed in the UPSERT case (possibly requiring three different expressions to represent the initial INSERT, what the UPDATE can see, and what the UPDATE results in..). I tend to doubt it would be worth it unless we end up supporting UPSERT-specific triggers and permissions.. Agreed. That would technically make everyone happy, in that it defers to the user, but is unlikely to be worth it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Peter Geoghegan (p...@heroku.com) wrote: On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost sfr...@snowman.net wrote: Where this leaves me, at least, is feeling like we should always apply the INSERT WITH CHECK policy, then if there is a conflict, check the UPDATE USING policy and throw an error if the row isn't visible but otherwise perform the UPDATE and then check the UPDATE WITH CHECK policy. I see your point that this runs counter to the 'mod_count' example use-case and could cause problems for users using RLS with such a strategy. For my part, I expect users of RLS to expect errors in such a case rather than allowing it, but it's certainly a judgement call. I mostly agree, but I don't know that I fully agree. Specifically, I also think we should check the update policy even when we only insert, on the theory that if we did go to update, the would-be inserted row would be a proxy for what we'd check then (the existing, target relation's tuple). What do you think of that? To flip it around a bit, I don't think we can avoid checking the *resulting* tuple from the UPDATE against the UPDATE policy. Therefore, I'm not sure that I see the point in checking the INSERT tuple against the UPDATE policy. I also don't like the idea that a completely legitimate command (one which allows the to-be-inserted row via the INSERT policy AND allows the tuple-post-update via the UPDATE policy) to throw an error because the to-be-inserted row violated the UPDATE policy. That doesn't make sense to me. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 9 January 2015 at 08:49, Peter Geoghegan p...@heroku.com wrote: On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I was trying to think up an example where you might actually have different INSERT and UPDATE policies, and the best I can think of is some sort of mod_count column where you have an INSERT CHECK (mod_count = 0) and an UPDATE CHECK (mod_count 0). In that case, checking both policies would make an UPSERT impossible, whereas if you think of it as doing either an INSERT or an UPDATE, as the syntax suggests, it becomes possible. Why does this user want to do this upsert? If they're upserting, then the inserted row could only reasonably have a value of (mod_count = 0). If updating, then they must have a constant value for the update path (a value that's greater than 0, naturally - say 2), which doesn't make any sense in the context of an upsert's auxiliary update - what happened to the 0 value? Sorry, but I don't think your example makes sense - I can't see what would motivate anyone to write a query like that with those RLS policies in place. It sounds like you're talking about an insert and a separate update that may or may not affect the same row, and not an upsert. Then those policies make sense, but in practice they render the upsert you describe contradictory. Whoa, hang on. I think you're being a bit quick to dismiss that example. Why shouldn't I want an upsert where the majority of the table columns follow the usual make it so pattern of an upsert, but there is also this kind of audit column to be maintained? Then I would write something like INSERT INTO tbl (some values, 0) ON CONFLICT UPDATE SET same values, mod_count=mod_count+1; The root of the problem is the way that you're proposing to combine the RLS policies (using AND), which runs contrary to the way RLS policies are usually combined (using OR), which is why this kind of example fails -- RLS policies in general aren't intended to all be true simultaneously. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Peter, * Peter Geoghegan (p...@heroku.com) wrote: For column level privileges, you wouldn't expect to only get an error about not having the relevant update permissions at runtime, when the update path happens to be taken. And so it is for RLS. Right, that's the precedent we should be considering. Column-level privileges is a great example- you need both insert and update privileges for the columns involved for the command to succeed. It shouldn't depend on which path actually ends up being taken. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 6 January 2015 at 20:44, Peter Geoghegan p...@heroku.com wrote: Another issue that I see is that there is only one resultRelation between the INSERT and UPDATE. That means that without some additional special measure, both UPDATE and INSERT WITH CHECK ( ... ) options are applied regardless of whether the INSERT path was taken, or the UPDATE path. Maybe that's actually sensible (note that even this doesn't happen right now, since the auxiliary UPDATE is currently minimally processed by the rewriter). What do people think about that? It seems like it might be less surprising to have that fail earlier when there are separate policies for INSERT and UPDATE -- for UPSERT, all policies are applied regardless of which path is taken. I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. As to whether the UPDATE USING policy should cause an error to be thrown if it is not satisfied, my inclination would be to not error, and use the command tag to report that no rows were updated, since that is what would happen with a regular UPDATE. So overall INSERT .. ON CONFLICT UPDATE would be consistent with either an INSERT or an UPDATE, depending on whether the row existed beforehand, which is easier to explain than having some special UPSERT semantics. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Wed, Jan 7, 2015 at 12:01 PM, Stephen Frost sfr...@snowman.net wrote: Other databases have this capability and have triggers and at least one ends up firing both INSERT and UPDATE triggers, with many complaints from users about how that ends up making the performance suck. Perhaps we could use that as a fallback but support the explicit single trigger option too.. Just some thoughts, apologies if it's already been convered in depth previously. I would like to expose whether or not statement-level triggers are being called in the context of an INSERT ... ON CONFLICT UPDATE, FWIW. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Wed, Jan 7, 2015 at 12:26 PM, Stephen Frost sfr...@snowman.net wrote: I agree with that up to a point- this case feels more like a POLA violation though rather than a run-of-the-mill you need to test all that. I'm a bit worried this might lead to cases where users can't use UPSERT because they don't know which set of policies will end up getting applied, although the above approach is strictly a subset of the approach I'm advocating, but at least with the 'throw it all together' case, you know exactly what you're getting with UPSERT. I think that it makes sense to bunch the policies together, because INSERT ... ON CONFLICT UPDATE is mostly an INSERT - informally speaking, the UPDATE part is just enough to resolve a conflict - this is ultimately just a new clause of INSERT. Besides, I think that making the permissions as restrictive as possible is the least surprising behavior. It's clearly the most secure behavior. I think that we collectively lean towards actively throwing an error as the preferred behavior - I think that's good, because the intent of the user to UPDATE some particular tuple that they're not allowed to UPDATE is clear and unambiguous. As long as we're doing that, clearly writing a query that will fail based on the wrong path being taken is wrong-headed. Now, you can construct an UPSERT case that a bunch together policies approach doesn't serve well. I could be mistaken, but I think that case is likely to be more than a little contrived. We're now going to be testing the TARGET.* tuple before going to update, having failed to insert (and after row locking the TARGET.* tuple) - the target tuple is really how the update finds the existing row to update, so it should be tested, without necessarily being blamed directly (certainly, we cannot leak the TARGET.* tuple values, so maybe we should try blaming the EXCLUDED.* tuple first for error messaging reasons). If it fails because of the insert check after row locking but before update, well, it was liable to anyway, when the insert path was taken. Conversely, if it failed after actually inserting where that would not happen with a vanilla INSERT (due to a bunched-together update USING() security barrier qual or explicit user-supplied CHECK OPTION), well, you'd have gotten the same thing if you took the UPDATE path anyway. That just leaves the regular ExecUpdate() call of ExecWithCheckOptions(). When that is called with bunched together policies, you're now testing the final tuple. So it might be a bit confusing that the policy of final tuples originating from INSERTs is also enforced here, for the final tuple of an UPDATE. But in order for even that to be the case, you'd have to have a final tuple that was substantially different from the one originally proposed for insertion that made the difference between a qual passing and not passing. How likely is that in realistic use cases? And besides, isn't the policy for INSERTs still quite relevant even here? We're talking about a final tuple that originated from an INSERT statement, after all. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Robert Haas (robertmh...@gmail.com) wrote: On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. I agree. I can certainly understand the appeal of this approach, but I don't think it makes sense. Consider what happens later on down the road, after the code has been written and deployed using INSERT .. ON CONFLICT UPDATE where 99% of the time only one path or the other is taken. Then the other path is taken and suddenly the exact same command and row ends up returning errors. Additional testing should have been done to check if that happens, of course, but I really don't like the idea that the exact same command, with the exact same policies, would succeed or fail, due to policies, based on the data in the database. That's not to say that, generally, speaking, the data in the database shouldn't impact policy failures, but in this case, the policies might not even reference any other tables in the database. As to whether the UPDATE USING policy should cause an error to be thrown if it is not satisfied, my inclination would be to not error, and use the command tag to report that no rows were updated, since that is what would happen with a regular UPDATE. My inclination would be to error, but I'd be OK with your proposal. I'm pretty strongly in favor of erroring. :) So overall INSERT .. ON CONFLICT UPDATE would be consistent with either an INSERT or an UPDATE, depending on whether the row existed beforehand, which is easier to explain than having some special UPSERT semantics. Yeah. We won't escape the question so easily where triggers are concerned, but at least for RLS it seems like it should be possible to avoid confusing, one-off semantics. Triggers are another question.. One approach that I don't recall seeing (and I'm not convinced that it's a good idea either, but thought it deserved mention) is the idea of having an UPSERT-specific set of policies (and perhaps an UPSERT-specific trigger...?). That gets pretty grotty when you consider existing systems which have INSERT and UPDATE triggers already, but UPSERT is a pretty big deal and a big change and for users who are just starting to use it, requiring that they write triggers and policies specifically to address that case might be acceptable. That doesn't address the cases where users have direct SQL access and might be able to then bypass existing triggers, but we might be able to work out an answer to that case.. Other databases have this capability and have triggers and at least one ends up firing both INSERT and UPDATE triggers, with many complaints from users about how that ends up making the performance suck. Perhaps we could use that as a fallback but support the explicit single trigger option too.. Just some thoughts, apologies if it's already been convered in depth previously. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Robert Haas (robertmh...@gmail.com) wrote: On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. I agree. I can certainly understand the appeal of this approach, but I don't think it makes sense. Consider what happens later on down the road, after the code has been written and deployed using INSERT .. ON CONFLICT UPDATE where 99% of the time only one path or the other is taken. Then the other path is taken and suddenly the exact same command and row ends up returning errors. I'd say: that's life. If you don't test your policies, they might not work. I agree with that up to a point- this case feels more like a POLA violation though rather than a run-of-the-mill you need to test all that. I'm a bit worried this might lead to cases where users can't use UPSERT because they don't know which set of policies will end up getting applied, although the above approach is strictly a subset of the approach I'm advocating, but at least with the 'throw it all together' case, you know exactly what you're getting with UPSERT. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. I agree. I can certainly understand the appeal of this approach, but I don't think it makes sense. Consider what happens later on down the road, after the code has been written and deployed using INSERT .. ON CONFLICT UPDATE where 99% of the time only one path or the other is taken. Then the other path is taken and suddenly the exact same command and row ends up returning errors. I'd say: that's life. If you don't test your policies, they might not work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
* Dean Rasheed (dean.a.rash...@gmail.com) wrote: On 6 January 2015 at 20:44, Peter Geoghegan p...@heroku.com wrote: Another issue that I see is that there is only one resultRelation between the INSERT and UPDATE. That means that without some additional special measure, both UPDATE and INSERT WITH CHECK ( ... ) options are applied regardless of whether the INSERT path was taken, or the UPDATE path. Maybe that's actually sensible (note that even this doesn't happen right now, since the auxiliary UPDATE is currently minimally processed by the rewriter). What do people think about that? It seems like it might be less surprising to have that fail earlier when there are separate policies for INSERT and UPDATE -- for UPSERT, all policies are applied regardless of which path is taken. I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. You're making a distinction where I'm not convinced there should be one. While I understand that, internally, there are two paths (INSERT vs. UPDATE), it's still only one command for the user and their goal is simply update if this exists, insert if it doesn't. I also don't like the idea that the policy to be applied depends on if it ends up being an INSERT or an UPDATE. I liked Peter's suggestion that the row must pass both WITH CHECK conditions- at least that's consistent and understandable. As to whether the UPDATE USING policy should cause an error to be thrown if it is not satisfied, my inclination would be to not error, and use the command tag to report that no rows were updated, since that is what would happen with a regular UPDATE. Yes, for a regular UPDATE that's what would happen- but this isn't a regular UPDATE, it's an UPSERT. Consider how users handle this now- they have routines which continually try to do one or the other until either the INSERT or the UPDATE is successful. I've never seen such a function, properly written, that says try to INSERT, if that fails, try to UPDATE and if that doesn't update anything, then simply exit. What is the use-case for that? So overall INSERT .. ON CONFLICT UPDATE would be consistent with either an INSERT or an UPDATE, depending on whether the row existed beforehand, which is easier to explain than having some special UPSERT semantics. Any semantics which result in a no-op would be surprising for users of UPSERT. I like the idea of failing earlier- if you try to INSERT .. ON CONFLICT UPDATE a row which you're not allowed to INSERT, erroring strikes me as the right result. If you try to INSERT .. ON CONFLICT UPDATE a row which you're not allowed to UPDATE, I'd also think an error would be the right answer, even if you don't end up doing an actual UPDATE- you ran a command asking for an UPDATE to happen under a certain condition (the row already exists) and the policy states that you're not allowed to do such an UPDATE. As for the UPDATE's USING clause, if you're not allowed to view the records to be updated (which evidently exists because the INSERT failed), I'd handle that the same way we handle the case that a SELECT policy might not allow a user to see rows that they can attempt to INSERT- they'll get an error back in that case too. In the end, these are edge cases as policies are generally self consistent and so I think we have some leeway, but I don't think we have the right to turn an INSERT .. ON CONFLICT UPDATE into a no-op. That would be like allowing an INSERT to be a no-op. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: I think the policies applied should depend on the path taken, so if it does an INSERT, then only the INSERT CHECK policy should be applied (after the insert), but if it ends up doing an UPDATE, I would expect the UPDATE USING policy to be applied (before the update) and the UPDATE CHECK policy to be applied (after the update). I would not expect the INSERT CHECK policy to be applied on the UPDATE path. I agree. As to whether the UPDATE USING policy should cause an error to be thrown if it is not satisfied, my inclination would be to not error, and use the command tag to report that no rows were updated, since that is what would happen with a regular UPDATE. My inclination would be to error, but I'd be OK with your proposal. So overall INSERT .. ON CONFLICT UPDATE would be consistent with either an INSERT or an UPDATE, depending on whether the row existed beforehand, which is easier to explain than having some special UPSERT semantics. Yeah. We won't escape the question so easily where triggers are concerned, but at least for RLS it seems like it should be possible to avoid confusing, one-off semantics. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On Tue, Jan 6, 2015 at 9:39 AM, Robert Haas robertmh...@gmail.com wrote: I think the INSERT .. ON CONFLICT UPDATE shouldn't be able to attempt an update unless the UPDATE policies of the table are such that a regular UPDATE would find the affected row. The post-image of the row needs to satisfy any UPDATE CHECK OPTION. If the INSERT fails due to a conflict with an unseen row, and the UPDATE can't find that row either due to RLS, then it should probably error out; the alternative is to silently do nothing, but that feels wrong. I can certainly see the argument for that behavior. I'm inclined to agree that this is better. With th existing implementation, UPDATE check options are effective at preventing updates. However, the implementation is not effective at preventing row locking from finding a row that the user would not otherwise be able to find (with a conventional UPDATE). So I guess what I have to do now is figure out a way of also having the ... FOR UPDATE USING ( ) qual be enforced after row locking in respect of the row locked (locked, but not yet used to generate a post-image) in the target table. If it isn't satisfied, throw an error that doesn't leak anything about the target row, rather than silently not affecting the row. So, as you say, a divergence from what regular RLS UPDATEs do that happens to make more sense here. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
Another issue that I see is that there is only one resultRelation between the INSERT and UPDATE. That means that without some additional special measure, both UPDATE and INSERT WITH CHECK ( ... ) options are applied regardless of whether the INSERT path was taken, or the UPDATE path. Maybe that's actually sensible (note that even this doesn't happen right now, since the auxiliary UPDATE is currently minimally processed by the rewriter). What do people think about that? It seems like it might be less surprising to have that fail earlier when there are separate policies for INSERT and UPDATE -- for UPSERT, all policies are applied regardless of which path is taken. The task of making the security qual enforced like a check option before we go to update a locked row (at the start of the UPDATE path, for any UPDATE policy with a USING security qual) looks complicated. I'd appreciate a little help on the implementation details. Apparently Oracle does not offer fine-grained access control with MERGE, which I believe means they just don't support this kind of thing at all. Obviously I'd rather avoid that here, but the correct semantics are not obvious. ON CONFLICT UPDATE could almost justify making CREATE POLICY FOR INSERT accept a USING expression, since that's really where the row to UPDATE comes from. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
The patch that implements INSERT ... ON CONFLICT UPDATE has support and tests for per-column privileges (which are not relevant to the IGNORE variant, AFAICT). However, RLS support is another thing entirely. It has not been properly thought out, and unlike per-column privileges requires careful consideration, as the correct behavior isn't obvious. I've documented the current problems with RLS here: https://wiki.postgresql.org/wiki/UPSERT#RLS It's not clear whether or not the auxiliary UPDATE within an INSERT... ON CONFLICT UPDATE statement should have security quals appended. Stephen seemed to think that that might not be the best solution [1]. I am not sure. I'd like to learn what other people think. What is the best way of integrating RLS with ON CONFLICT UPDATE? What behavior is most consistent with the guarantees of RLS? In particular, should the implementation append security quals to the auxiliary UPDATE, or fail sooner? [1] http://www.postgresql.org/message-id/20141121205926.gk28...@tamriel.snowman.net -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers