On 12/17/25 13:37, Tom Lane wrote:
Joe Conway <[email protected]> writes:
Possibly try using/abusing RLS?
Cute idea, but I think it doesn't reliably address the problem of
wanting to identify the specific rows that were read. In your toy
example it'd work, because the generated plan is
regression=> explain verbose select * from t1 where c1=42;
QUERY PLAN
------------------------------------------------------------
Seq Scan on public.t1 (cost=0.00..343.38 rows=2 width=36)
Output: c1, c2
Filter: ((t1.c1 = 42) AND audit(t1.c1))
(3 rows)
so the WHERE clause is applied before the RLS filter. But in any
slightly-more-complicated situation, like a non-leakproof WHERE
condition, the order would be reversed so the log would overstate
which rows were read.
Sure, of course we have had requests for a leakproofness check bypass[1]
for some use cases, and this could be one more such case.
If the application's behavior is simple and well-defined, this might
be good enough, of course.
FWIW when I read the original email in the thread I got the impression
that the application behavior was pretty simple WRT this table. But of
course I could easily be wrong...
I thought of a way that could possibly do this reliably, but it's
vastly more work than the use-case seems worth:
1. Convert the SELECTs into SELECT FOR UPDATE (you could do this
without changing the application, by interposing a view). SELECT
FOR SHARE might be good enough, not sure.
2. Write a logical replication output plugin that parses the WAL log
well enough to identify the tuple locks taken by FOR UPDATE.
Yeah this seems like a pretty heavy lift.
This should work to log only the rows actually read, because FOR
UPDATE is postponed to the top of the query plan, unlike RLS.
[1]
https://www.postgresql.org/message-id/flat/CAMxA3rsGQh9waorObOZyqrFqZ5uQ0b5D7SL6X6nh2kLhX%3D90vg%40mail.gmail.com#4a03eafc8c9660177874e11811c8f410
--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com