Clarifying docs on nuance of select and update policies

2022-09-16 Thread Chandler Gonzales
Hi y'all, I've got a proposed clarification to the documentation on the
nuances of RLS behavior for update policies, and maybe a (humble) request
for a change in behavior to make it more intuitive. I am starting with
pgsql-docs since I think the documentation change is a good starting point.

We use RLS policies to hide "soft deleted" objects from certain DB roles.
We recently tried to add the ability to let a user "soft delete" a row.
Ideally, we can write an RLS policy that allows a user to "soft delete" a
row, but then hides the row from that same user once it is soft deleted.

Here's the setup on a fresh Postgres db for my example. I'm executing these
queries as the database owner:
```
create role some_user_type;
create table foo(id int primary key, soft_deleted_at timestamptz,
some_other_field text);
alter table foo enable row level security;
grant select on table foo to some_user_type;
grant update(soft_deleted_at) on table foo to some_user_type;
insert into foo(id) values (1);
insert into foo(id, soft_deleted_at) values (2, now());
```

The behavior I'm trying to encode in RLS is that users with the role
some_user_type can see all rows where soft_deleted_at is null, can update
rows where BOTH the original soft_deleted_at is null AND the updated row
has a non-null soft_deleted_at. Basically, the only thing this user can do
to this row is to soft delete it.

We'll use a restrictive policy to get better error messages when we do an
update later on:
```
create policy pol_1 on foo for select to some_user_type using (true);
create policy pol_1_res on foo as restrictive for select to some_user_type
using (soft_deleted_at is null);
```

And just to verify it's working:
```
chandler@localhost:chandler> begin; set local role some_user_type; select *
from foo; rollback;
BEGIN
SET
╒╤═╤══╕
│ id │ soft_deleted_at │ some_other_field │
╞╪═╪══╡
│ 1  │ ¤   │ ¤│
╘╧═╧══╛
SELECT 1
ROLLBACK
```

Now the important bit, the update policy:
```
create policy pol_2 on foo for update to some_user_type using
(soft_deleted_at is null) with check (soft_deleted_at is not null);
```

If we update a row without a where clause that touches the row, the update
succeeds:
```
chandler@localhost:chandler> begin; set local role some_user_type; update
foo set soft_deleted_at = now() where true; rollback;
BEGIN
SET
UPDATE 1
ROLLBACK
```

But if we update a row with a where clause that uses the existing row, the
update fails:
```
chandler@localhost:chandler> begin; set local role some_user_type; update
foo set soft_deleted_at = now() where id = 1; rollback;
BEGIN
SET
new row violates row-level security policy "pol_1_res" for table "foo"
```

This was very unintuitive to me. My understanding is that when USING and
WITH CHECK are both used for an update policy, the USING is tested against
the original row, and the WITH CHECK clause against the new row.  This is
stated in the [documentation](
https://www.postgresql.org/docs/current/sql-createpolicy.html):

> The USING expression determines which records the UPDATE command will see
to operate against, while the WITH CHECK expression defines which modified
rows are allowed to be stored back into the relation

So why is it that the addition of where id = 1 violates the select policy?
Later in the same doc:

> Typically an UPDATE command also needs to read data from columns in the
relation being updated (e.g., in a WHERE clause or a RETURNING clause, or
in an expression on the right hand side of the SET clause). In this case,
SELECT rights are also required on the relation being updated, and the
appropriate SELECT or ALL policies will be applied in addition to the
UPDATE policies. Thus the user must have access to the row(s) being updated
through a SELECT or ALL policy in addition to being granted permission to
update the row(s) via an UPDATE or ALL policy.

If you read this documentation perfectly literally, it does describe the
behavior shown my examples above, but it is a bit ambiguous. I think a more
clear explanation of this behavior would be the following:

> Typically an UPDATE command also needs to read data from columns in the
relation being updated (e.g., in a WHERE clause or a RETURNING clause, or
in an expression on the right hand side of the SET clause). In this case,
SELECT rights are also required on the relation being updated, and the
appropriate SELECT or ALL policies will be applied in addition to the
UPDATE policies. Thus the user must have access to the row(s) operated
against and the rows being stored back into the relation via a SELECT or
ALL policy in addition to being granted permission to update the row(s) via
an UPDATE or ALL policy.

However, it seems like there is an opportunity to change the behavior to be
more intuitive, and to support the use case I am attempting. It seems like
use of a WHERE clause that uses the existing row should 

JSON/JSONB documentation, aggregate functions

2022-09-16 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/functions-json.html
Description:

Regarding the Json documentation (9.16.1: processing and creating JSON
data), I would like to suggest that the reference to
json_agg/jsonb_agg/json_object_agg/jsonb_object_agg (section 9.21) be made
more prominent. When looking through the list of json[b] processing
functions, it's really easy to miss the footnote mentioning the aggregate
functions. The list of functions in table 9.47 includes a lot of functions
that are *really*similar in purpose to the json aggregator functions, and
it's easy to assume that anything related to building JSON[B] objects *must*
be listed there. I just spent hours searching for a way to turn one column
of keys and one column of values into a JSONB object (without coercing all
numeric values to json strings), and I can't imagine I'm the only one.

Regards,
Alex Denman