Re: copying json data and backslashes
> From: p...@cmicdo.com
> Sent: 22 November 2022 15:30
> To: Alastair McKinley; pgsql-general@lists.postgresql.org; Erik Wienhold
> Subject: Re: copying json data and backslashes
>
> On Tuesday, November 22, 2022 at 10:16:11 AM EST, Erik Wienhold wrote:
>
> > On 22/11/2022 15:23 CET Alastair McKinley wrote:
> > >
> > > Hi all,
> > >
> > > I have come across this apparently common issue COPY-ing json and
> > > wondering if there is potentially a better solution.
> > >
> > > I am copying data into a jsonb column originating from a 3rd party
> > > API. The data may have literal \r, \t, \n and also double backslashes.
> > >
> > > I discovered that I can cast this data to a jsonb value directly but
> > > I can't COPY the data without pre-processing.
> > >
> > > Is there any other solution with COPY that doesn't require manual
> > > implementation of search/replace to handle these edge cases?
> > > Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent.
> >
> > COPY handles special backslash sequences[1]. The \r in your sample
> > JSON, although properly escaped according to JSON, is replaced with an
> > actual carriage return by COPY before casting to jsonb. The error
> > results from JSON prohibiting unescaped control characters in
> > strings[2].
> >
> > You must double escape to pass those characters through COPY.
> >
> > See how COPY outputs backslash sequences:
> >
> >     -- Actual carriage return:
> >     copy (select e'\r') to stdout;
> >     \r
> >
> >     -- Backslash sequence for carriage return:
> >     copy (select '\r') to stdout;
> >     \\r
>
> I have been able to get around this problem by using the following method:
>
>     \copy footable from 'input.json' (format csv, escape '^B', delimiter '^C', quote '^E')
>
> where the control characters are the actual control char, not the
> caret-letter, and it requires no escaping escapes. I realize this won't
> work for all situations.
> PJ

Hi PJ,

Thanks for the suggestion, this is interesting to me to try but I am not quite sure how this works. As far as I understand, escape/quote/delimiter have to be a single character, and CTRL-C etc. are multiple characters. What way do you input each of the escape/quote/delimiter characters?

Best regards,
Alastair

> > [1] https://www.postgresql.org/docs/current/sql-copy.html#id-1.9.3.55.9.2
> > [2] https://www.json.org/json-en.html
> >
> > --
> > Erik
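To answer the question above: PJ's `^B`/`^C`/`^E` are single control characters (bytes 0x02, 0x03, 0x05), not two-character caret sequences, so each satisfies COPY's one-character requirement for escape/delimiter/quote. A hedged Python sketch of writing a file in that dialect (the row content is illustrative; "footable" from the thread is assumed to hold one jsonb column):

```python
import csv
import io

# '^B', '^C' and '^E' in the \copy command stand for the literal control
# characters 0x02, 0x03 and 0x05 -- one character each, as COPY requires.
# Since the delimiter/quote/escape bytes never occur in JSON text, the
# payload passes through byte-for-byte with no backslash doubling.
def write_control_char_csv(rows):
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter="\x03",    # ^C: field delimiter
        quotechar="\x05",    # ^E: quote character
        escapechar="\x02",   # ^B: escape character
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()

# The two-byte JSON escape sequence \r survives unchanged:
line = write_control_char_csv([[r'{"test" : "\r data"}']])
```

In psql itself, the matching option values can be entered as literal control characters (for example via Ctrl-V followed by the control key in a readline terminal), since each option accepts exactly one character.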
copying json data and backslashes
Hi all,

I have come across this apparently common issue COPY-ing json and wondering if there is potentially a better solution.

I am copying data into a jsonb column originating from a 3rd party API. The data may have literal \r, \t, \n and also double backslashes.

I discovered that I can cast this data to a jsonb value directly but I can't COPY the data without pre-processing.

The example below illustrates my issue (only with \r, but the problem extends to other \X combinations).

    do $$
    lines=[r'{"test" : "\r this data has a carriage return"}']
    with open("/tmp/test1.json","w") as f:
        for line in lines:
            f.write(line.strip() + "\n")
    $$ language plpython3u;

    create temp table testing (data jsonb);

    -- this works
    insert into testing (data)
    select l::jsonb
    from pg_read_file('/tmp/test1.json') f,
         lateral regexp_split_to_table(f,'\n') l
    where l <> '';

    -- fails
    copy testing (data) from '/tmp/test1.json';

    -- works
    copy testing (data) from program $c$ sed -e 's/\\r/u000a/g' /tmp/test1.json $c$;

Is there any other solution with COPY that doesn't require manual implementation of search/replace to handle these edge cases? Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent.

Best regards,

Alastair
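For COPY's text format, the "double escape" approach described in the reply thread can be done as a small pre-processing step. A minimal Python sketch (one possible workaround, not a COPY option):

```python
# Double every backslash so COPY's own text-format de-escaping restores
# the JSON escape sequences (\r, \t, \n, \\, ...) intact before the cast
# to jsonb.
def escape_for_copy_text(line: str) -> str:
    # COPY text format consumes one level of backslashes, so the two-byte
    # sequence \r in the file must arrive as \\r, and \\ as \\\\.
    return line.replace("\\", "\\\\")

sample = r'{"test" : "\r this data has a carriage return"}'
escaped = escape_for_copy_text(sample)
```

Running the input file through this before `copy testing (data) from ...` would let the text format round-trip the JSON escapes.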
Re: psql 15beta1 does not print notices on the console until transaction completes
> From: Adrian Klaver
> Sent: 29 May 2022 21:47
> To: Alastair McKinley; pgsql-general@lists.postgresql.org
> Subject: Re: psql 15beta1 does not print notices on the console until transaction completes
>
> On 5/29/22 13:11, Alastair McKinley wrote:
> > Hi all,
> >
> > I notice this change in behaviour with psql in 15beta1 when testing an
> > existing codebase.
> >
> > I didn't see any mention of this change in the release notes and it
> > surprised me.
> >
> > Using this test function:
> >
> >     create or replace function test_notice() returns void as
> >     $$
> >     begin
> >         raise notice 'hello';
> >         perform pg_sleep(10);
> >     end;
> >     $$ language plpgsql;
> >
> > In psql 15beta1, the "hello" message only appears on the console when
> > the transaction completes.
>
> I am not seeing that.
>
> Can you provide more information about how you are running test_notice()?

Hi Adrian,

I am running the function "select test_notice();" from the psql console with psql/server versions 15beta1. In psql 15beta1, the notice appears only after the function returns/transaction completes.

If I execute the same function from psql 14.3 and server 15beta1, the notice appears on the psql console immediately as expected, before the pg_sleep() completes. It's reproducible for me, I just double checked it.

I think Tom was able to reproduce this by the sounds of his response?

Best regards,
Alastair

> > In psql 14.3, it appears immediately as I would have expected.
> >
> > Is there a way to change psql behaviour to display notices immediately
> > as in versions < 15?
> >
> > Best regards,
> >
> > Alastair
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
psql 15beta1 does not print notices on the console until transaction completes
Hi all,

I notice this change in behaviour with psql in 15beta1 when testing an existing codebase. I didn't see any mention of this change in the release notes and it surprised me.

Using this test function:

    create or replace function test_notice() returns void as
    $$
    begin
        raise notice 'hello';
        perform pg_sleep(10);
    end;
    $$ language plpgsql;

In psql 15beta1, the "hello" message only appears on the console when the transaction completes. In psql 14.3, it appears immediately as I would have expected.

Is there a way to change psql behaviour to display notices immediately as in versions < 15?

Best regards,

Alastair
Re: Function definition regression in 15beta1 when specific parameter name (string) is used
> From: Tom Lane
> Sent: 29 May 2022 18:43
> To: Alastair McKinley
> Cc: Andrew Dunstan; pgsql-general@lists.postgresql.org
> Subject: Re: Function definition regression in 15beta1 when specific parameter name (string) is used
>
> Alastair McKinley writes:
> > The following function definition fails in 15beta1 (ok in 14.3):
> >
> >     create or replace function regexp_match_test(string text,pattern text) returns text[] as
> >     $$
> >         select regexp_match(string,pattern);
> >     $$ language sql;
>
> Commit 1a36bc9db seems to have defined STRING as a type_func_name_keyword,
> which strikes me as a pretty horrible trampling on user namespace. That
> means you can't have tables or columns named "string" anymore either, and
> I'll bet money the latter restriction is going to bite a lot of people.

Yes, I would agree. Could this potentially break a lot of upgrades? I checked the release notes and CTRL-F'd for "string" to check in case it had become reserved or become an alias for text, but there is nothing in the release notes at the minute.

> In a quick experiment here, I don't see any bison complaints if I
> back it down to unreserved_keyword, so this seems easily fixable.
> I wonder though if we don't need more review of patches that add
> partially- or fully-reserved keywords.
>
> regards, tom lane
Function definition regression in 15beta1 when specific parameter name (string) is used
Hi all,

I was testing an existing codebase with 15beta1 and ran into this issue. I reduced the test case to an example which works in 14.3 and fails in 15beta1.

The following function definition fails in 15beta1 (ok in 14.3):

    create or replace function regexp_match_test(string text,pattern text) returns text[] as
    $$
        select regexp_match(string,pattern);
    $$ language sql;

The error message is:

    ERROR:  syntax error at or near ","
    LINE 3:     select regexp_match(string,pattern);
                                          ^

Changing the first parameter name from string to anything else (e.g. strin or string1) resolves the issue.

The issue also occurs with the "string" parameter name if it is used in a plpgsql function like this:

    create or replace function regexp_match_test(string text,pattern text) returns text[] as
    $$
    begin
        return (select regexp_match(string,pattern));
    end;
    $$ language plpgsql;

Best regards,

Alastair
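A rough way to audit an existing codebase for the affected name before upgrading is a whole-word scan. This Python helper is a hypothetical sketch, not part of the thread; a real audit would need a SQL-aware parser, since the word can legitimately appear inside string literals and comments:

```python
import re

# Flag candidate uses of a newly reserved keyword (such as "string" in
# 15beta1) in SQL source, returning (line number, matched text) pairs for
# manual review.  Purely lexical: false positives are expected.
def find_keyword_collisions(sql: str, keyword: str = "string"):
    pattern = re.compile(rf"(?i)\b{re.escape(keyword)}\b")
    return [
        (sql.count("\n", 0, m.start()) + 1, m.group())
        for m in pattern.finditer(sql)
    ]

src = """create or replace function regexp_match_test(string text, pattern text)
returns text[] as $$
select regexp_match(string, pattern);
$$ language sql;"""
hits = find_keyword_collisions(src)
# hits == [(1, 'string'), (3, 'string')]
```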
Performance degradation with non-null proconfig
Hi all,

I recently came across a subtle performance issue when working with some compiled UDFs to optimise a workload. These UDFs accidentally had "set search_path = 'public'" in their definition.

When profiling with perf, I noticed a large amount of elapsed time spent in the function

    void AtEOXact_GUC(bool isCommit, int nestLevel)

Reading the comments, it is called when exiting a function with a custom proconfig. Removing the proconfig in my case gave me a very large (3x) speedup.

Below is a contrived test case that illustrates this issue (tested on 13.0, but also seen in 12.4).

    create table test as
        select r as row_id, array_agg(random()::float4) as data
        from generate_series(1,10) r, generate_series(1,20) e
        group by r;

    create or replace function array_sum(float4[]) returns float4 as
    $$
        select sum(e) from unnest($1) e;
    $$ language sql immutable parallel safe;

    create or replace function array_sum_public_search_path(float4[]) returns float4 as
    $$
        select sum(e) from unnest($1) e;
    $$ language sql immutable parallel safe set search_path = 'public';

    \timing on
    \o /dev/null

    select format($q$ explain (analyze,verbose,buffers) select array_sum(data) from test $q$) from generate_series(1,10);
    \gexec

    select format($q$ explain (analyze,verbose,buffers) select array_sum_public_search_path(data) from test $q$) from generate_series(1,10);
    \gexec

Test output:

    postgres=# select format($q$ explain (analyze,verbose,buffers) select array_sum(data) from test $q$) from generate_series(1,10);
    Time: 0.940 ms
    postgres=# \gexec
    Time: 745.988 ms
    Time: 677.056 ms
    Time: 653.709 ms
    Time: 651.033 ms
    Time: 650.063 ms
    Time: 647.741 ms
    Time: 650.328 ms
    Time: 651.954 ms
    Time: 655.384 ms
    Time: 650.988 ms
    Time: 0.976 ms
    postgres=# select format($q$ explain (analyze,verbose,buffers) select array_sum_public_search_path(data) from test $q$) from generate_series(1,10);
    Time: 0.774 ms
    postgres=# \gexec
    Time: 871.628 ms
    Time: 853.298 ms
    Time: 856.798 ms
    Time: 857.794 ms
    Time: 861.836 ms
    Time: 858.291 ms
    Time: 861.763 ms
    Time: 850.221 ms
    Time: 851.470 ms
    Time: 858.875 ms
    Time: 1.514 ms
    postgres=#

I didn't see this discussed anywhere else; it might be worth adding a note to the documentation if it can't easily be addressed.

Best regards,

Alastair
ts_debug() style functions for jsonpath debugging
Hello everyone,

I am working with jsonpaths heavily and was wondering if there is any method for debugging a jsonpath expression, similar to ts_debug() for text search.

Essentially what I would like to do is debug the splitting of a path into tokens or logical components using the same parser as postgres itself.

Is there any method currently available that could be exposed at the SQL layer to inspect jsonpath expressions?

Best regards,

Alastair
Re: Index selection issues with RLS using expressions
Hi Tom,

This is the solution I went with.

    create policy X on tableX for select to new_role using (
        has_table_read_permission(tableX.column)
    );

This covers all usage of the table, and then for APIs that utilise leaky operators:

    create function with_leaky_operator(args) returns setof tableX as
    $$
        select * from tableX
        where column @@ $1
          and has_table_read_permission(tableX.column);
    $$ language sql security definer;

Best regards,
Alastair

From: Alastair McKinley
Sent: 31 March 2020 22:09
To: Tom Lane
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Index selection issues with RLS using expressions

Hi Tom,

Thanks for looking at this! It seems like there are quite a few performance gotchas around leaky operators and RLS, this is my second encounter with this issue in the last few weeks.

What would you recommend as a reasonable workaround? I have a large table with a gin index that I would like to use RLS on and use the @@ text search operator. My initial thought is to use a security definer set-returning function that implements the RLS policy explicitly. Would a security barrier view also potentially work?

Best regards and thanks again,
Alastair

From: Tom Lane
Sent: 31 March 2020 20:18
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Index selection issues with RLS using expressions

Alastair McKinley writes:
> I am running in to an issue with RLS and index selection in my queries. I
> created a toy example to try to illustrate the issue below. Postgres version
> is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.
> Is there some subtle reason as to why the role "new_user" cannot seem to
> generate a query plan that uses the gin index?

The && operator is not marked leakproof, so it can't be applied till after the RLS filter, making an indexscan with it impossible when RLS is active.

Perhaps arrayoverlap() itself could be proven leakproof, but the underlying type-specific equality operator might or might not be. We don't have enough infrastructure to handle indirect leakproofness requirements like that, so you lose :-(

regards, tom lane
Re: Index selection issues with RLS using expressions
Hi Tom,

Thanks for looking at this! It seems like there are quite a few performance gotchas around leaky operators and RLS, this is my second encounter with this issue in the last few weeks.

What would you recommend as a reasonable workaround? I have a large table with a gin index that I would like to use RLS on and use the @@ text search operator. My initial thought is to use a security definer set-returning function that implements the RLS policy explicitly. Would a security barrier view also potentially work?

Best regards and thanks again,
Alastair

From: Tom Lane
Sent: 31 March 2020 20:18
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Index selection issues with RLS using expressions

Alastair McKinley writes:
> I am running in to an issue with RLS and index selection in my queries. I
> created a toy example to try to illustrate the issue below. Postgres version
> is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.
> Is there some subtle reason as to why the role "new_user" cannot seem to
> generate a query plan that uses the gin index?

The && operator is not marked leakproof, so it can't be applied till after the RLS filter, making an indexscan with it impossible when RLS is active.

Perhaps arrayoverlap() itself could be proven leakproof, but the underlying type-specific equality operator might or might not be. We don't have enough infrastructure to handle indirect leakproofness requirements like that, so you lose :-(

regards, tom lane
Index selection issues with RLS using expressions
Hi all,

I am running in to an issue with RLS and index selection in my queries. I created a toy example to try to illustrate the issue below.

Postgres version is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.

Is there some subtle reason as to why the role "new_user" cannot seem to generate a query plan that uses the gin index?

Best regards,

Alastair

-- Example below

    create table test as
        select array_agg(v order by v) a, (random()*4)::bigint as n
        from (
            select (random()*250)::bigint as v, (random()*10)::bigint as g
            from generate_series(1,100)
        ) s
        group by g;

    create index on test using gin(a);

    create or replace function has_permission(n bigint) returns boolean as
    $$
        select n in (1,2);
    $$ language sql stable leakproof;

    alter table test enable row level security;

    create role new_user;
    grant select on test to new_user;
    grant execute on function has_permission(bigint) to new_user;

    create policy new_user_select on test for select to new_user using (
        has_permission(test.n)
    );

    set role new_user;
    explain select count(*) from test where a && array[100::bigint];
    --  Aggregate  (cost=3233.94..3233.95 rows=1 width=8)
    --    ->  Seq Scan on test  (cost=0.00..3228.93 rows=2005 width=0)
    --          Filter: ((n = ANY ('{1,2}'::bigint[])) AND (a && '{100}'::bigint[]))

    set role postgres;
    explain select count(*) from test where a && array[100::bigint];
    --  Aggregate  (cost=1833.21..1833.22 rows=1 width=8)
    --    ->  Bitmap Heap Scan on test  (cost=43.41..1823.07 rows=4053 width=0)
    --          Recheck Cond: (a && '{100}'::bigint[])
    --          ->  Bitmap Index Scan on test_a_idx  (cost=0.00..42.40 rows=4053 width=0)
    --                Index Cond: (a && '{100}'::bigint[])

    -- even with the has_permission() function the postgres user gets a bitmap index scan
    explain select count(*) from test where a && array[100::bigint] and has_permission(test.n);
    --  Aggregate  (cost=1837.71..1837.72 rows=1 width=8)
    --    ->  Bitmap Heap Scan on test  (cost=42.90..1832.69 rows=2005 width=0)
    --          Recheck Cond: (a && '{100}'::bigint[])
    --          Filter: (n = ANY ('{1,2}'::bigint[]))
    --          ->  Bitmap Index Scan on test_a_idx  (cost=0.00..42.40 rows=4053 width=0)
    --                Index Cond: (a && '{100}'::bigint[])
Re: Explain says 8 workers planned, only 1 executed
Hi Jeremy,

Thanks for solving the mystery. I think this might be a missing point in section 15.2 in the docs. I wonder will this ever be improved, or should I just write to temporary tables instead of return query?

Best regards,
Alastair

From: Jeremy Smith
Sent: 21 March 2020 20:50
To: Alastair McKinley
Cc: Adrian Klaver; pgsql-general@lists.postgresql.org
Subject: Re: Explain says 8 workers planned, only 1 executed

Unfortunately, return query will never use parallel workers. See:
https://stackoverflow.com/q/58079898/895640 and
https://www.postgresql.org/message-id/16040-eaacad11fecfb...@postgresql.org

On Sat, Mar 21, 2020 at 1:59 PM Alastair McKinley <a.mckin...@analyticsengines.com> wrote:

Hi Adrian,

Thanks for getting back to me. Postgres version is:

    PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

    create function f() returns setof my_type as
    $$
    declare
        q text;
        output text;
    begin
        select generate_query1() into q;
        -- the query q executes in parallel with 8 workers if executed standalone
        for output in execute('explain ' || q) loop
            raise notice '%',output;  -- this plan says 8 workers will be launched
        end loop;
        return query execute q;  -- this launches one worker

        select generate_query2() into q;
        for output in execute('explain ' || q) loop
            raise notice '%',output;  -- this plan says 8 workers will be launched
        end loop;
        return query execute q;  -- this also launches one worker
    end;
    $$ language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions? Does the outer function need to be parallel safe? Might a stored proc work better?

Best regards,
Alastair

From: Adrian Klaver <adrian.kla...@aklaver.com>
Sent: 21 March 2020 17:38
To: Alastair McKinley <a.mckin...@analyticsengines.com>; pgsql-general@lists.postgresql.org
Subject: Re: Explain says 8 workers planned, only 1 executed

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel, which works as expected executed directly from the psql client.
> The query is then embedded in a plpgsql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see if any of the conditions apply:
https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

> I am checking the explain output (using execute explain $query) just
> before executing inside my function and the plan is identical to what
> I would expect, planning 8 workers. However, this query actually only
> uses 1 worker and takes many times longer than when run directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful, during parallel execution I notice lots of
> "postgres: parallel worker" processes in top, and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" process.
>
> Best regards,
>
> Alastair

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Explain says 8 workers planned, only 1 executed
Hi Adrian,

Thanks for getting back to me. Postgres version is:

    PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit

I simplified it to the following structure:

    create function f() returns setof my_type as
    $$
    declare
        q text;
        output text;
    begin
        select generate_query1() into q;
        -- the query q executes in parallel with 8 workers if executed standalone
        for output in execute('explain ' || q) loop
            raise notice '%',output;  -- this plan says 8 workers will be launched
        end loop;
        return query execute q;  -- this launches one worker

        select generate_query2() into q;
        for output in execute('explain ' || q) loop
            raise notice '%',output;  -- this plan says 8 workers will be launched
        end loop;
        return query execute q;  -- this also launches one worker
    end;
    $$ language plpgsql;

Should this work in principle or am I missing something subtle about parallel dynamic queries in plpgsql functions? Does the outer function need to be parallel safe? Might a stored proc work better?

Best regards,
Alastair

From: Adrian Klaver
Sent: 21 March 2020 17:38
To: Alastair McKinley; pgsql-general@lists.postgresql.org
Subject: Re: Explain says 8 workers planned, only 1 executed

On 3/21/20 10:25 AM, Alastair McKinley wrote:
> Hi all,
>
> I have a long running query that I have tweaked along with config (e.g.
> min_parallel_table_scan_size) to execute nicely and very fast in
> parallel, which works as expected executed directly from the psql client.
> The query is then embedded in a plpgsql function like "return query select
> * from function_that_executes_the_query()".

Postgres version?

What is happening in function_that_executes_the_query()?

You might want to take a look at below to see if any of the conditions apply:
https://www.postgresql.org/docs/12/when-can-parallel-query-be-used.html

> I am checking the explain output (using execute explain $query) just
> before executing inside my function and the plan is identical to what
> I would expect, planning 8 workers. However, this query actually only
> uses 1 worker and takes many times longer than when run directly on the
> psql command line with the same server configuration parameters.
>
> Why would the explain output be different from the executed plan? Is
> this a limitation of plpgsql functions? Is there any way to debug this
> further?
>
> If it is meaningful, during parallel execution I notice lots of
> "postgres: parallel worker" processes in top, and when executing from my
> function just a single "postgres: $user $db $host($pid) SELECT" process.
>
> Best regards,
>
> Alastair

--
Adrian Klaver
adrian.kla...@aklaver.com
Explain says 8 workers planned, only 1 executed
Hi all,

I have a long running query that I have tweaked along with config (e.g. min_parallel_table_scan_size) to execute nicely and very fast in parallel, which works as expected when executed directly from the psql client. The query is then embedded in a plpgsql function like "return query select * from function_that_executes_the_query()".

I am checking the explain output (using execute explain $query) just before executing inside my function and the plan is identical to what I would expect, planning 8 workers. However, this query actually only uses 1 worker and takes many times longer than when run directly on the psql command line with the same server configuration parameters.

Why would the explain output be different from the executed plan? Is this a limitation of plpgsql functions? Is there any way to debug this further?

If it is meaningful, during parallel execution I notice lots of "postgres: parallel worker" processes in top, and when executing from my function just a single "postgres: $user $db $host($pid) SELECT" process.

Best regards,

Alastair
Re: Poor plan choice with partial unique indexes on jsonb column and simple RLS policy (with test script)
Hi Tom,

Thanks once again for your time looking at this. I have a resolution but didn't exactly get to the bottom of what was going on.

Forcing the function used in the index to be leakproof did not work. I guessed, but am not certain, that this is because either to_jsonb() or the jsonb_eq operator are not leakproof as well?

During my testing of a solution (which basically was not to use jsonb for this) I saw this message while using RLS in an unrelated query:

    DEBUG:  not using statistics because function "enum_eq" is not leak-proof

I did not see a message like this using my jsonb indexes, even though it seems like a related issue. Is there another effect potentially going on here or incomplete debugging messages?

Best regards,
Alastair

From: Tom Lane
Sent: 04 March 2020 04:22
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Poor plan choice with partial unique indexes on jsonb column and simple RLS policy (with test script)

Alastair McKinley writes:
> Thank you for having a look at this. In the interim I discovered that I
> could trigger the issue by creating a security barrier view, whereas a
> regular view worked fine, so I think that also points to your conclusion
> about leakyness?
> I attempted to workaround the issue with a leakproof function, so far with no
> success.
> ...
> Is this an approach that could fundamentally work?

Forcing the expression to be considered leakproof should work. I'm not sure that your partial index is OK for the purpose of collecting stats, though -- does it help if you make a non-partial index on that function expression? Otherwise, it's possible that I guessed wrong about which part of the WHERE clause is problematic. You could try doing EXPLAINs with different portions of the WHERE to see how the rowcount estimate changes.

BTW, just marking something "leakproof" when it isn't really so is possibly a security problem. You should think twice about what threat model you're hoping RLS will protect against.
regards, tom lane
Re: Poor plan choice with partial unique indexes on jsonb column and simple RLS policy (with test script)
Hi Tom,

Thank you for having a look at this. In the interim I discovered that I could trigger the issue by creating a security barrier view, whereas a regular view worked fine, so I think that also points to your conclusion about leakyness?

I attempted to workaround the issue with a leakproof function, so far with no success. I tried a leakproof function as below:

    create or replace function jsonb_select(obj jsonb, keys text[]) returns jsonb as
    $$
        select jsonb_agg(value order by key)
        from jsonb_each(obj)
        where keys @> ARRAY[key];
    $$ language sql immutable strict leakproof;

And created the expression indexes:

    create unique index i_10 on testing ((jsonb_select(data,'{value}'))) where type_id = 10 and latest is true;

But my query still produces a bad plan (bitmap heap scan) with rls or a security barrier view enabled:

    explain (analyze) select * from testing
    where (jsonb_select(data,'{value}')) = to_jsonb(10)
      and type_id = 10
      and latest is true;

Is this an approach that could fundamentally work?

Best regards,

Alastair

From: Tom Lane
Sent: 04 March 2020 00:04
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Poor plan choice with partial unique indexes on jsonb column and simple RLS policy (with test script)

Alastair McKinley writes:
> I have recently encountered a strange poor query plan choice after
> implementing RLS.
> My table has a number of partial indexes on a jsonb column and the query went
> from low number of milliseconds to several seconds as the planner chose a
> different index.
> Simply stated, in the jsonb column case, "using ( (select true) )" instead of
> "using (true)" produces a bad plan, illustrated below:

If the planner isn't sure you have access to all rows in the table, that disables some of its ability to estimate where-clause selectivity. In particular it can't run "leaky" where-clauses against all values in the table's statistics entries to see how many pass, because a nefarious user could use that to glean info about what's in the table.

Eyeing your test query, it looks like the issue is that jsonb "->" isn't leakproof, so that clause falls back to a default selectivity estimate, and you get a bad plan as a result.

regards, tom lane
Poor plan choice with partial unique indexes on jsonb column and simple RLS policy (with test script)
Hi all,

I have recently encountered a strange poor query plan choice after implementing RLS. My table has a number of partial indexes on a jsonb column and the query went from a low number of milliseconds to several seconds as the planner chose a different index.

Simply stated, in the jsonb column case, "using ( (select true) )" instead of "using (true)" produces a bad plan, illustrated below:

    postgres=# create policy testing_s on testing for select to testing_user using (
    postgres(#     true
    postgres(# );
    postgres=# set role testing_user;
    SET
    postgres=> explain (analyze) select * from testing where data->'value' = to_jsonb(10) and type_id = 10 and latest is true;
                                                      QUERY PLAN
    ---------------------------------------------------------------------------------------------------------------
     Index Scan using i_10 on testing  (cost=0.15..8.17 rows=1 width=49) (actual time=0.007..0.008 rows=1 loops=1)
       Index Cond: ((data -> 'value'::text) = to_jsonb(10))
     Planning Time: 0.221 ms
     Execution Time: 0.017 ms
    (4 rows)

    postgres=# alter policy testing_s on testing to testing_user using (
    postgres(#     (select true)
    postgres(# );
    postgres=> explain (analyze) select * from testing where data->'value' = to_jsonb(10) and type_id = 10 and latest is true;
                                                      QUERY PLAN
    ---------------------------------------------------------------------------------------------------------------
     Bitmap Heap Scan on testing  (cost=9.16..17582.89 rows=1 width=49) (actual time=0.088..0.877 rows=1 loops=1)
       Recheck Cond: ((type_id = 10) AND (latest IS TRUE))
       Filter: ($0 AND ((data -> 'value'::text) = to_jsonb(10)))
       Rows Removed by Filter: 199
       Heap Blocks: exact=185
       InitPlan 1 (returns $0)
         ->  Result  (cost=0.00..0.01 rows=1 width=1) (actual time=0.000..0.000 rows=1 loops=1)
       ->  Bitmap Index Scan on i_10  (cost=0.00..9.14 rows=7500 width=0) (actual time=0.012..0.012 rows=200 loops=1)
     Planning Time: 0.306 ms
     Execution Time: 0.894 ms
    (10 rows)

Tested on PostgreSQL 11.1 (Debian 11.1-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit.

My two full test cases are linked below; the first works as expected, the second produces a bad plan:

https://gist.github.com/a-mckinley/94a4ada1e40bf79e134a90349cd2a380
https://gist.github.com/a-mckinley/d98fec0fb48a1b8eea3adc526981fb5b

This problem seems to make row level security unusable for me. Am I missing something in regards to RLS and indexes on jsonb?

Alastair
websearch_to_tsquery() and handling of ampersand characters inside double quotes
Hi all,

I have recently discovered an unexpected difference in behaviour using websearch_to_tsquery() with quoted strings containing ampersands.

These two queries are equivalent without surrounding double quotes:

    select websearch_to_tsquery('something and another') = websearch_to_tsquery('something & another');
     ?column?
    ----------
     t
    (1 row)

    select websearch_to_tsquery('something and another');
     websearch_to_tsquery
    ----------------------
     'someth' & 'anoth'
    (1 row)

With surrounding double quotes they produce subtly different queries, with different positional information:

    select websearch_to_tsquery('"something and another"');
     websearch_to_tsquery
    ----------------------
     'someth' <2> 'anoth'
    (1 row)

    select websearch_to_tsquery('"something & another"');
     websearch_to_tsquery
    ----------------------
     'someth' <-> 'anoth'
    (1 row)

    select websearch_to_tsquery('"something and another"') = websearch_to_tsquery('"something & another"');
     ?column?
    ----------
     f
    (1 row)

I imagine the difference is due to the tsvector type recording different positional information for the underlying strings:

    select to_tsvector('something & another');
         to_tsvector
    ----------------------
     'anoth':2 'someth':1
    (1 row)

    select to_tsvector('something and another');
         to_tsvector
    ----------------------
     'anoth':3 'someth':1
    (1 row)

This leads to quite different search results, and my current workaround is to suggest to users to do both searches with an OR. Is this the right solution?

Best regards,

Alastair
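The OR workaround described above can be automated on the client side. In this sketch the table name `docs`, the column `tsv`, and the simple word-boundary rewrite of "and" to "&" are all illustrative assumptions (`||` is the tsquery OR operator; parameters are passed separately in the usual driver style):

```python
import re

# Produce both tokenizations of the user's query: as typed, and with the
# word "and" rewritten to "&".
def websearch_variants(query: str):
    variant = re.sub(r"\band\b", "&", query)
    return (query, variant) if variant != query else (query,)

# Build one SQL statement that ORs the tsqueries for all variants.
# "docs" and "tsv" are hypothetical names.
def build_search(query: str):
    variants = websearch_variants(query)
    placeholders = " || ".join("websearch_to_tsquery(%s)" for _ in variants)
    sql = f"select * from docs where tsv @@ ({placeholders})"
    return sql, list(variants)

sql, params = build_search('"something and another"')
```

Executed through a driver such as psycopg2, both phrase forms would then match.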
Re: websearch_to_tsquery() and apostrophe inside double quotes
Hi Tom,

Thank you for looking at this. You are right, I couldn't find anything in the docs that would explain this. I can't think of any rationale for producing a query like this, so it does look like a bug.

Best regards,
Alastair

From: Tom Lane
Sent: 10 October 2019 14:35
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org; teo...@sigaev.ru
Subject: Re: websearch_to_tsquery() and apostrophe inside double quotes

Alastair McKinley writes:
> I am a little confused about what is being generated by
> websearch_to_tsquery() in the case of an apostrophe inside double quotes.
> ...
> select websearch_to_tsquery('"peter o''toole"');
>      websearch_to_tsquery
> -------------------------------
>  'peter' <-> ( 'o' & 'tool' )
> (1 row)
> I am not quite sure what text this will actually match?

I believe it's impossible for that to match anything :-(. It would require 'o' and 'tool' to match the same lexeme (one immediately after a 'peter'), which of course is impossible.

The underlying tsvector type seems to treat the apostrophe the same as whitespace; it separates 'o' and 'toole' into distinct words:

# select to_tsvector('peter o''toole');
         to_tsvector
-----------------------------
 'o':2 'peter':1 'tool':3
(1 row)

So it seems to me that this is a bug: websearch_to_tsquery should also treat "'" like whitespace. There's certainly not anything in its documentation that suggests it should treat "'" specially. If it didn't, you'd get

# select websearch_to_tsquery('"peter o toole"');
    websearch_to_tsquery
-----------------------------
 'peter' <-> 'o' <-> 'tool'
(1 row)

which would match this tsvector.

regards, tom lane
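Until the parser is fixed, a hedged workaround sketch: pre-process the search string so apostrophes are treated like whitespace before websearch_to_tsquery() sees them, mirroring what to_tsvector already does. The wrapper function name here is hypothetical, not part of PostgreSQL.

```sql
-- hypothetical wrapper: replace apostrophes with spaces so quoted
-- phrases tokenize consistently with to_tsvector
create or replace function websearch_noapos(q text) returns tsquery as
$$
    select websearch_to_tsquery(replace(q, '''', ' '));
$$ language sql immutable;

-- select to_tsvector('peter o''toole') @@ websearch_noapos('"peter o''toole"');
```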
websearch_to_tsquery() and apostrophe inside double quotes
Hi all,

I am a little confused about what is being generated by websearch_to_tsquery() in the case of an apostrophe inside double quotes. Here is an example of searching for a name containing an apostrophe.

The following works as expected:

select to_tsvector('peter o''toole') @@ websearch_to_tsquery('peter o''toole');
 ?column?
----------
 t
(1 row)

When the name is in double quotes, the search fails:

select to_tsvector('peter o''toole') @@ websearch_to_tsquery('"peter o''toole"');
 ?column?
----------
 f
(1 row)

In the first case, websearch_to_tsquery() returns:

select websearch_to_tsquery('peter o''toole');
  websearch_to_tsquery
--------------------------
 'peter' & 'o' & 'tool'
(1 row)

which makes sense to me. In the second case, websearch_to_tsquery() returns something that I can't quite understand:

select websearch_to_tsquery('"peter o''toole"');
     websearch_to_tsquery
-------------------------------
 'peter' <-> ( 'o' & 'tool' )
(1 row)

I am not quite sure what text this will actually match?

Best regards,
Alastair
Possible bug: SQL function parameter in window frame definition
Hi all,

I noticed this strange behaviour whilst trying to write a function for Postgres 11.5 (PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit) and reduced it to this minimal example. Using a function parameter in the window frame definition seems to be the cause of the error.

create or replace function f(group_size bigint) returns setof int[] as
$$
    select array_agg(s) over w
    from generate_series(1,10) s
    window w as (order by s rows between current row and group_size following)
$$ language sql immutable;

Calling the function without a column list succeeds:

postgres=# select f(3);
     f
------------
 {1,2,3,4}
 {2,3,4,5}
 {3,4,5,6}
 {4,5,6,7}
 {5,6,7,8}
 {6,7,8,9}
 {7,8,9,10}
 {8,9,10}
 {9,10}
 {10}
(10 rows)

Calling the function with select * fails:

postgres=# select * from f(3);
ERROR:  42704: no value found for parameter 1
LOCATION:  ExecEvalParamExtern, execExprInterp.c:2296

Using a plpgsql function with a stringified query works, which is my current workaround:

create or replace function f1(group_size bigint) returns setof int[] as
$$
begin
    return query execute format($q$
        select array_agg(s) over w as t
        from generate_series(1,10) s
        window w as (order by s rows between current row and %1$s following)
    $q$, group_size);
end;
$$ language plpgsql immutable;

This appears to be a bug to me. If it is confirmed that this is not some expected behaviour unknown to me, I will report it.

Alastair
Re: Strange performance degradation in sql function (PG11.1)
Hi Andrew,

Thanks for your in-depth response. I found that adding the stable qualifier didn't solve the issue, unfortunately.

I actually encountered the same issue (or at least an extremely similar one) today and made a bit more progress on defining it. I have a carefully indexed and optimised query that runs in about 2 seconds with this structure and signature:

create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
    with a_few_ctes ()
    select * from big_table where col1 = param1 and col2 = param2;
$$ language sql stable;

This function works as expected when executed from a psql client. I am calling this function via another framework (PostgREST) that executes the function using the following pattern:

with args as (
    select json_to_record($1) as (param1 int, param2 int)
),
output as (
    select * from example_function(
        param1 := (select param1 from args),
        param2 := (select param2 from args)
    )
)
select * from output;

Running this query with the args coming from the CTE resulted in my query running for many tens of minutes before I gave up. Changing the underlying function to use plpgsql fixes the issue and the query runs as expected. Both versions work as expected when called directly, but the SQL version does not when called with the args coming from the CTE as shown above. The new function signature is:

create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
begin
    return query
    with a_few_ctes ()
    select * from big_table where col1 = param1 and col2 = param2;
end;
$$ language plpgsql stable;

I haven't been able to check the bad query plan with auto_explain as the query doesn't seem to finish.

So to summarise: changing a stable SQL function to a stable plpgsql function, when called with function args from a CTE, fixes a huge performance issue of uncertain origin. I hope someone can offer some suggestions as this has really confused me.
Best regards,
Alastair

From: Andrew Gierth
Sent: 19 May 2019 03:48
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Strange performance degradation in sql function (PG11.1)

>>>>> "Alastair" == Alastair McKinley writes:

 Alastair> Hi all,

 Alastair> I recently experienced a performance degradation in an
 Alastair> operational system that I can't explain. I had a function
 Alastair> wrapper for an aggregate query that was performing well using
 Alastair> the expected indexes with the approximate structure as shown
 Alastair> below.

 Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
 Alastair> $$
 Alastair>     select * from big_table where col1 = param1 and col2 = param2;
 Alastair> $$ language sql;

This function isn't inlinable due to missing a STABLE qualifier; that's a pretty big issue. Without inlining, the function will be run only with generic plans, which means that the decision about index usage will be made without knowledge of the parameter values.

Was your actual function inlinable? See https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

You can get the query plan of a non-inlined function using the auto_explain module (with its log_nested_statements option). The query plan of non-inlined function calls is not otherwise shown by EXPLAIN.

 Alastair> After creating two new indexes on this table to support a
 Alastair> different use case during a migration, this unchanged
 Alastair> function reduced in performance by several orders of
 Alastair> magnitude. Running the query inside the function manually on
 Alastair> the console however worked as expected and the query plan did
 Alastair> not appear to have changed.

But when you run it manually, you'll get a custom plan, based on the parameter values.

 Alastair> On a hunch I changed the structure of the function to the
 Alastair> structure below and immediately the query performance
 Alastair> returned to the expected baseline.

 Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
 Alastair> $$
 Alastair> BEGIN
 Alastair>     return query execute format($query$
 Alastair>         select * from big_table where col1 = %1$L and col2 = %2$L
 Alastair>     $query$, param1, param2);
 Alastair> END;
 Alastair> $$ language plpgsql;

Using EXECUTE in plpgsql will get you a custom plan every time (though you really should have used EXECUTE USING rather than interpolating the parameters into the query string). I suggest looking into the inlining question first.

--
Andrew (irc:RhodiumToad)
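A sketch of the EXECUTE USING variant Andrew suggests, keeping the example_function/big_table/custom_type names from the thread (all placeholders from the original post); parameters are passed as $1/$2 rather than interpolated into the query string:

```sql
create or replace function example_function(param1 int, param2 int)
returns setof custom_type as
$$
begin
    -- pass parameters with USING: no quoting or injection concerns,
    -- and EXECUTE still builds a custom plan for each call
    return query execute
        'select * from big_table where col1 = $1 and col2 = $2'
        using param1, param2;
end;
$$ language plpgsql stable;
```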
Strange performance degradation in sql function (PG11.1)
Hi all,

I recently experienced a performance degradation in an operational system that I can't explain. I had a function wrapper for an aggregate query that was performing well using the expected indexes, with the approximate structure as shown below.

create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
    select * from big_table where col1 = param1 and col2 = param2;
$$ language sql;

After creating two new indexes on this table to support a different use case during a migration, this unchanged function reduced in performance by several orders of magnitude. Running the query inside the function manually on the console however worked as expected, and the query plan did not appear to have changed.

On a hunch I changed the structure of the function to the structure below, and immediately the query performance returned to the expected baseline.

create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
BEGIN
    return query execute format($query$
        select * from big_table where col1 = %1$L and col2 = %2$L
    $query$, param1, param2);
END;
$$ language plpgsql;

The source data itself did not change during the time when I noticed this issue. Can anyone explain or guess what could have caused this degradation? The only other maintenance that I attempted was 'DISCARD PLANS;', which did not help.

Best regards,
Alastair
Re: Planner can't seem to use partial function indexes with parameter from join
Hi David,

Thanks for having a look at this. You are right about the normalisation of record; unfortunately there are other constraints within the application that suit the denormalised approach for this table very well. Although my test case is quite simple compared to the real application (the prefix index is just for illustration of the function index), I have considered the record_prefix column idea you mentioned and will try that.

It's good to clarify that runtime index selection won't work here (I had hoped it might be possible); I will almost certainly resort to the dynamically generated query approach in the interim.

Best regards,
Alastair

From: David Rowley
Sent: Sunday, 14 April, 14:57
Subject: Re: Planner can't seem to use partial function indexes with parameter from join
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org

On Sun, 14 Apr 2019 at 21:55, Alastair McKinley wrote:
> I have reduced my scenario to a minimal test case with inline comments to
> illustrate the issue here
> https://gist.github.com/a-mckinley/1b0e95142789cbc09121b71a83d03f45
>
> Is there something that I am missing to allow the planner to use the
> underlying indexes? Or is the scenario too complex and should I stick with
> dynamic sql?

I'd say the biggest part of the problem is that "record" is not in first normal form. However, it's worse than that, as you're having to perform a join to determine how to fetch the value you want.
If "record" was designed to have a "record_prefix" column and then store the remainder of the record_text over in the column by that name, then with an index on record (record_prefix, type_id) you could just do:

explain analyze
select r.type_id, count(*)
from record r
inner join (values (1,'aa'),(2,'aab')) v(type_id, record_prefix)
    on r.type_id = v.type_id and r.record_prefix = v.record_prefix
group by r.type_id;

If you're lucky, and there's a good chance you would be, then you'd get a parameterised nested loop join.

As for why the partial indexes cannot be used: partial indexes can only be used when the planner is able to match the index up to quals that will be evaluated in the table level quals (i.e. not join quals). There's no such thing as dynamic index selection at execution time. The closest thing we have to that is run-time partition pruning in PG11, but that can't help you either, since the partition key cannot contain values from other tables. You'd still need to normalise the record table. With that and a partitioned table, there might be further advantages from partition-wise aggregation, but that might not buy you much more than just normalising the table.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
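For concreteness, a minimal sketch of the normalised layout David describes; the column types are guesses, since the real definitions live in the gist rather than this thread:

```sql
-- hypothetical normalised layout: split the matchable prefix out of
-- the record text so a plain btree index can serve the join
create table record (
    type_id       int  not null,
    record_prefix text not null,  -- leading part used for matching
    record_text   text            -- remainder of the original text
);

-- composite index to support the parameterised nested loop join
create index record_prefix_type_idx on record (record_prefix, type_id);
```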
Planner can't seem to use partial function indexes with parameter from join
Hello all,

I am having quite an interesting problem trying to get the planner to use my indexes as intended in my setup. I am using partial functional indexes that have a different function parameter based on the record type. A dynamically generated query using unions that is more explicit about index usage is significantly faster (10-50x) than my preferred plain sql approach using joins.

I have reduced my scenario to a minimal test case with inline comments to illustrate the issue here:

https://gist.github.com/a-mckinley/1b0e95142789cbc09121b71a83d03f45

Is there something that I am missing to allow the planner to use the underlying indexes? Or is the scenario too complex and should I stick with dynamic sql?

Best regards,
Alastair
Index/trigger implementation for accessing latest records
Hi,

I have a table that stores a location identifier per person which will be appended to many times. However, for many queries in this system we only need to know the most recent location per person, which is limited to about 1000 records.

Is the following trigger/index strategy a reasonable and safe approach to fast access to the latest location records per person?

1. A boolean column (latest_record default true) to identify the latest record per person
2. A before insert trigger that updates all other records for that person to latest_record = false
3. A partial index on the latest_record column where latest_record is true

Aside from performance, is it safe to update other records in the table from the insert trigger in this way?

A minimal example is shown below:

create table location_records
(
    id bigserial,
    person_id bigint,
    location_id bigint,
    latest_record boolean not null default true
);

create function latest_record_update() returns trigger as
$$
BEGIN
    update location_records
    set latest_record = false
    where person_id = new.person_id
      and latest_record is true
      and id != new.id;
    return new;
END;
$$ language plpgsql;

create trigger latest_record_trigger
    before insert on location_records
    for each row execute procedure latest_record_update();

create index latest_record_index on location_records(latest_record)
    where latest_record is true;

insert into location_records(person_id,location_id) values (1,1);
insert into location_records(person_id,location_id) values (1,2);
insert into location_records(person_id,location_id) values (1,3);
insert into location_records(person_id,location_id) values (2,3);
insert into location_records(person_id,location_id) values (2,4);

select * from location_records;

Best regards,
Alastair
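One possible safety net on top of the scheme above (an editorial suggestion, not from the original post): a unique partial index makes the one-latest-row-per-person invariant self-enforcing at the database level, and it doubles as a fast lookup index for "latest location by person" queries. The index name is hypothetical.

```sql
-- enforce at most one latest_record = true row per person;
-- also serves lookups of the latest row by person_id
create unique index one_latest_per_person
    on location_records (person_id)
    where latest_record is true;
```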