Re: doc: pg_prewarm add configuration example
On 22/06/29 03:57 PM, Jacob Champion wrote: > On 6/18/22 01:55, Dong Wook Lee wrote: > > Hi hackers, > > > > I thought it would be nice to have a configuration example for the > > pg_prewarm extension. > > Therefore, I have written an example of a basic configuration. > > [offlist] > > Hi Dong Wook, I saw a commitfest entry registered for some of your other > patches, but not for this one. Quick reminder to register it in the > Documentation section if that was your intent. Thank you very much for letting me know. The patch is so trivial that I debated whether to register it in the commitfest, but I think it would be better to do so. > > Thanks, > --Jacob > > [NOTE] This is part of a mass communication prior to the July > commitfest, which begins in two days. If I've made a mistake -- for > example, if the patch has already been registered, withdrawn, or > committed -- just let me know, and sorry for the noise.
Re: First draft of the PG 15 release notes
On Tue, Jun 28, 2022 at 04:35:45PM -0400, Bruce Momjian wrote: > On Mon, Jun 27, 2022 at 11:37:19PM -0700, Noah Misch wrote: > > On Tue, May 10, 2022 at 11:44:15AM -0400, Bruce Momjian wrote: > > > I have completed the first draft of the PG 15 release notes > > > > > > > > > > > > > > > > > Remove PUBLIC creation permission on the > > linkend="ddl-schemas-public">public schema > > > (Noah Misch) > > > > > > > > > > > > This is a change in the default for newly-created databases in > > > existing clusters and for new clusters; USAGE > > > > If you dump/reload an unmodified v14 template1 (as pg_dumpall and pg_upgrade > > do), your v15 template1 will have a v14 ACL on its public schema. At that > > point, the fate of "newly-created databases in existing clusters" depends on > > whether you clone template1 or template0. Does any of that detail belong > > here, or does the existing text suffice? > > I think it is very confusing to have template0 have one value and > template1 have a different one, but as I understand it template0 will > only be used for pg_dump comparison, and that will keep template1 with > the same permissions, so I guess it is okay. It's an emergent property of two decisions. In the interest of backward compatibility, I decided to have v15 pg_dump emit GRANT for the public schema even when the source is an unmodified v14- database. When that combines with the ancient decision that a pg_dumpall or pg_upgrade covers template1 but not template0, one gets the above consequences. I don't see a way to improve on this outcome. > > > permissions on the public schema has not > > > been changed. Databases restored from previous Postgres releases > > > will be restored with their current permissions. Users wishing > > > to have the old permissions on new objects will need to grant > > > > The phrase "old permissions on new objects" doesn't sound right to me, but > > I'm > > not sure why. 
I think you're aiming for the fact that this is just a > > default; > > one can still change the ACL to anything, including to the old default. If > > these notes are going to mention the old default like they do so far, I > > think > > they should also urge readers to understand > > https://www.postgresql.org/docs/devel/ddl-schemas.html#DDL-SCHEMAS-PATTERNS > > before returning to the old default. What do you think? > > Agreed, the new text is: > > Users wishing to have the former permissions will need to grant > CREATE permission for PUBLIC on > the public schema; this change can be made on > template1 to cause all new databases to have these > permissions. What do you think about the "should also urge readers ..." part of my message?
Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.)
On Mon, Jun 27, 2022 at 07:32:08PM +0900, Michael Paquier wrote: > On Mon, Jun 27, 2022 at 12:04:57AM -0700, Noah Misch wrote: > > One can adapt the test to the server behavior by having the test wait for > > the archiver to start, as attached. This is sufficient to make check-world pass > > with the above sleep in place. I think we should also modify the PostgresNode > > archive_command to log a message. That lack of logging was an obstacle > > upthread (as seen in commit 3279cef) and again here. > ? qq{copy "%p" "$path%f"} > - : qq{cp "%p" "$path/%f"}; > + : qq{echo >&2 "ARCHIVE_COMMAND %p"; cp "%p" "$path/%f"}; > > This is a bit inelegant. Perhaps it would be better through a perl > wrapper like cp_history_files? I see it the other way. Replacing a 49-character compound command with a wrapper script would gain no particular advantage, and it would give readers of the test code one more file to open and understand.
Re: [PATCH] minor reloption regression tests improvement
In a message of Thursday, 30 June 2022 06:47:48 MSK, Nikolay Shaplov wrote: > Hi! I certainly feel this patch is important. I have a bigger patch, > https://commitfest.postgresql.org/38/3536/, and this test makes sense as > part of the larger work of options refactoring. > > I was also strongly advised to commit things chopped into smaller parts, > when possible. This test can be committed separately, so that is what I am > doing. Let me explain again why this test is important, so potential reviewers can easily find this information. Tests are for developers: you change the code and see when something no longer works the way it did before. When you change the code, you should preserve both documented and undocumented behaviour, because users' code can rely on either, intentionally or accidentally. So known undocumented behaviour should be tested just like documented behaviour. If you, as a developer, want to change undocumented behaviour, it is better to do so consciously; this test ensures that. I personally hit this problem while working on the reloptions code: an older version of my new patch changed this behaviour, and I did not even notice. So I decided to write this test to ensure that nobody else runs into the same problem. And as I said before, I was strongly advised to commit in small parts when possible; this test can be committed separately. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su signature.asc Description: This is a digitally signed message part.
Add index item for MERGE.
Hi, I found that there is no index item for the MERGE command in the docs. Attached is a patch that adds the indexterm for MERGE to merge.sgml. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml index cbb5b38a3e..7f73f3d89c 100644 --- a/doc/src/sgml/ref/merge.sgml +++ b/doc/src/sgml/ref/merge.sgml @@ -4,6 +4,9 @@ PostgreSQL documentation --> + + MERGE + MERGE
Re: tuplesort Generation memory contexts don't play nicely with index builds
On Wed, 29 Jun 2022 at 12:59, David Rowley wrote: > I noticed while doing some memory context related work that, since we > now use generation.c memory contexts for tuplesorts (40af10b57), > tuplesort_putindextuplevalues() causes memory "leaks" in the > generation context due to index_form_tuple() being called while we're > switched into the state->tuplecontext. I've attached a draft patch which changes things so that we don't use generation contexts for sorts being done for index builds. David diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c index 31554fd867..adc82028ba 100644 --- a/src/backend/utils/sort/tuplesort.c +++ b/src/backend/utils/sort/tuplesort.c @@ -250,6 +250,8 @@ struct Tuplesortstate boolbounded;/* did caller specify a maximum number of * tuples to return? */ boolboundUsed; /* true if we made use of a bounded heap */ + boolgenTupleCxt;/* true if we should use a generation.c memory +* context for tuple storage */ int bound; /* if bounded, the maximum number of tuples */ booltuples; /* Can SortTuple.tuple ever be set? 
*/ int64 availMem; /* remaining memory available, in bytes */ @@ -618,7 +620,7 @@ struct Sharedsort static Tuplesortstate *tuplesort_begin_common(int workMem, SortCoordinate coordinate, - int sortopt); + int sortopt, bool genTupleCxt); static void tuplesort_begin_batch(Tuplesortstate *state); static void puttuple_common(Tuplesortstate *state, SortTuple *tuple); static bool consider_abort_common(Tuplesortstate *state); @@ -842,7 +844,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state) */ static Tuplesortstate * -tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt) +tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt, + bool genTupleCxt) { Tuplesortstate *state; MemoryContext maincontext; @@ -907,6 +910,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt) state->memtupsize = INITIAL_MEMTUPSIZE; state->memtuples = NULL; + state->genTupleCxt = genTupleCxt; + /* * After all of the other non-parallel-related state, we setup all of the * state needed for each batch. @@ -966,21 +971,16 @@ tuplesort_begin_batch(Tuplesortstate *state) * eases memory management. Resetting at key points reduces * fragmentation. Note that the memtuples array of SortTuples is allocated * in the parent context, not this context, because there is no need to -* free memtuples early. For bounded sorts, tuples may be pfreed in any -* order, so we use a regular aset.c context so that it can make use of -* free'd memory. When the sort is not bounded, we make use of a -* generation.c context as this keeps allocations more compact with less -* wastage. Allocations are also slightly more CPU efficient. +* free memtuples early. 
*/ - if (state->sortopt & TUPLESORT_ALLOWBOUNDED) + if (state->genTupleCxt) + state->tuplecontext = GenerationContextCreate(state->sortcontext, + "Caller tuples", + ALLOCSET_DEFAULT_SIZES); + else state->tuplecontext = AllocSetContextCreate(state->sortcontext, "Caller tuples", ALLOCSET_DEFAULT_SIZES); - else - state->tuplecontext = GenerationContextCreate(state->sortcontext, - "Caller tuples", - ALLOCSET_DEFAULT_SIZES); - state->status = TSS_INITIAL; state->bounded = false; @@ -1033,11 +1033,21 @@ tuplesort_begin_heap(TupleDesc tupDesc, bool *nullsFirstFlags,
Re: [PATCH] minor reloption regression tests improvement
In a message of Thursday, 30 June 2022 00:38:48 MSK, Jacob Champion wrote: > This patch is hanging open in the March commitfest. There was a bit of > back-and-forth on whether it should be rejected, but no clear statement on > the issue, so I'm going to mark it Returned with Feedback. If you still > feel strongly about this patch, please feel free to re-register it in the > July commitfest. Hi! I certainly feel this patch is important. I have a bigger patch, https://commitfest.postgresql.org/38/3536/, and this test makes sense as part of the larger work of options refactoring. I was also strongly advised to commit things chopped into smaller parts, when possible. This test can be committed separately, so that is what I am doing. -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su signature.asc Description: This is a digitally signed message part.
RE: Handle infinite recursion in logical replication setup
On Tue, Jun 28, 2022 2:18 PM vignesh C wrote: > > Thanks for the comments, the attached v25 patch has the changes for the > same. > Thanks for updating the patch. Here are some comments. 0002 patch: == 1. +# Test the CREATE SUBSCRIPTION 'origin' parameter and its interaction with +# 'copy_data' parameter. It seems we should move "and its interaction with 'copy_data' parameter" to the 0003 patch. 0003 patch == 1. When using ALTER SUBSCRIPTION ... REFRESH, the subscription will throw an error if any table is subscribed on the publisher, even if the table was already subscribed before the refresh (in which case the refresh won't do the initial copy). It looks like previously subscribed tables don't need this check. Would it be better to only check the tables which need to do the initial copy? 2. + errmsg("table:%s.%s might have replicated data in the publisher", + nspname, relname), I think the table name needs to be enclosed in double quotes, to be consistent with other messages. Regards, Shi yu
Backup command and functions can cause assertion failure and segmentation fault
Hi, I found that an assertion failure and a segmentation fault can happen by running pg_backup_start(), pg_backup_stop() and the BASE_BACKUP replication command, in v15 or before. Here is the procedure to reproduce the assertion failure.

1. Connect to the server as a REPLICATION user who is granted EXECUTE to run pg_backup_start() and pg_backup_stop().

$ psql
=# CREATE ROLE foo REPLICATION LOGIN;
=# GRANT EXECUTE ON FUNCTION pg_backup_start TO foo;
=# GRANT EXECUTE ON FUNCTION pg_backup_stop TO foo;
=# \q
$ psql "replication=database user=foo dbname=postgres"

2. Run pg_backup_start() and pg_backup_stop().

=> SELECT pg_backup_start('test', true);
=> SELECT pg_backup_stop();

3. Run the BASE_BACKUP replication command with a small MAX_RATE so that it takes a long time to finish.

=> BASE_BACKUP (CHECKPOINT 'fast', MAX_RATE 32);

4. Terminate the replication connection while it's running BASE_BACKUP.

$ psql
=# SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE backend_type = 'walsender';

This procedure can cause the following assertion failure.
TRAP: FailedAssertion("XLogCtl->Insert.runningBackups > 0", File: "xlog.c", Line: 8779, PID: 69434)
 0  postgres       0x00010ab2ff7f ExceptionalCondition + 223
 1  postgres       0x00010a455126 do_pg_abort_backup + 102
 2  postgres       0x00010a8e13aa shmem_exit + 218
 3  postgres       0x00010a8e11ed proc_exit_prepare + 125
 4  postgres       0x00010a8e10f3 proc_exit + 19
 5  postgres       0x00010ab3171c errfinish + 1100
 6  postgres       0x00010a91fa80 ProcessInterrupts + 1376
 7  postgres       0x00010a886907 throttle + 359
 8  postgres       0x00010a88675d bbsink_throttle_archive_contents + 29
 9  postgres       0x00010a885aca bbsink_archive_contents + 154
10  postgres       0x00010a885a2a bbsink_forward_archive_contents + 218
11  postgres       0x00010a884a99 bbsink_progress_archive_contents + 89
12  postgres       0x00010a881aba bbsink_archive_contents + 154
13  postgres       0x00010a881598 sendFile + 1816
14  postgres       0x00010a8806c5 sendDir + 3573
15  postgres       0x00010a8805d9 sendDir + 3337
16  postgres       0x00010a87e262 perform_base_backup + 1250
17  postgres       0x00010a87c734 SendBaseBackup + 500
18  postgres       0x00010a89a7f8 exec_replication_command + 1144
19  postgres       0x00010a92319a PostgresMain + 2154
20  postgres       0x00010a82b702 BackendRun + 50
21  postgres       0x00010a82acfc BackendStartup + 524
22  postgres       0x00010a829b2c ServerLoop + 716
23  postgres       0x00010a827416 PostmasterMain + 6470
24  postgres       0x00010a703e19 main + 809
25  libdyld.dylib  0x7fff2072ff3d start + 1

Here is the procedure to reproduce the segmentation fault.

1. Connect to the server as the REPLICATION user who is granted EXECUTE to run pg_backup_stop().

$ psql
=# CREATE ROLE foo REPLICATION LOGIN;
=# GRANT EXECUTE ON FUNCTION pg_backup_stop TO foo;
=# \q
$ psql "replication=database user=foo dbname=postgres"

2. Run the BASE_BACKUP replication command with a small MAX_RATE so that it can take a long time to finish.

=> BASE_BACKUP (CHECKPOINT 'fast', MAX_RATE 32);

3. Press Ctrl-C to cancel BASE_BACKUP while it's running.

4. Run pg_backup_stop().

=> SELECT pg_backup_stop();

This procedure can cause the following segmentation fault.
LOG: server process (PID 69449) was terminated by signal 11: Segmentation fault: 11 DETAIL: Failed process was running: SELECT pg_backup_stop(); The root cause of these failures seems to be that the sessionBackupState flag is not reset to SESSION_BACKUP_NONE even when BASE_BACKUP is aborted. So the attached patch changes the do_pg_abort_backup callback so that it resets sessionBackupState. I confirmed that, with the patch, the assertion failure and segmentation fault no longer happen. But this change has one issue: if BASE_BACKUP is run while a backup is already in progress in the session via pg_backup_start(), and that session is terminated, the change causes XLogCtl->Insert.runningBackups to be decremented incorrectly. That is, XLogCtl->Insert.runningBackups is incremented by two, by pg_backup_start() and BASE_BACKUP, but it's decremented only by one by the termination of the session. To address this issue, I think that we should disallow
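The failure mode described in this report can be illustrated with a toy Python model. This is a deliberate simplification for illustration, not the server's actual code paths; the names merely mirror the ones in the message (runningBackups, sessionBackupState):

```python
SESSION_BACKUP_NONE = "none"
SESSION_BACKUP_RUNNING = "running"

class BackupState:
    """Toy model of the shared XLogCtl->Insert.runningBackups counter
    plus the per-session sessionBackupState flag."""
    def __init__(self):
        self.running_backups = 0                 # shared counter
        self.session_state = SESSION_BACKUP_NONE # per-session flag

    def backup_start(self):
        # pg_backup_start() or BASE_BACKUP beginning
        self.running_backups += 1
        self.session_state = SESSION_BACKUP_RUNNING

    def base_backup_aborted(self):
        # Pre-fix behaviour: the abort path decrements the shared
        # counter but never resets the session flag.
        self.running_backups -= 1

    def backup_stop(self):
        # Because the stale flag still says a backup is running, a later
        # pg_backup_stop() proceeds and decrements the counter again,
        # which is what trips Assert(runningBackups > 0) in the server.
        assert self.session_state == SESSION_BACKUP_RUNNING
        self.running_backups -= 1
        self.session_state = SESSION_BACKUP_NONE

s = BackupState()
s.backup_start()          # BASE_BACKUP begins
s.base_backup_aborted()   # cancel/termination leaves the flag stale
s.backup_stop()           # the stale flag lets this run anyway
print(s.running_backups)  # -1: the counter the assertion guards is corrupt
```

Resetting the session flag in the abort callback (as the attached patch does) prevents the second decrement, at the cost of the double-increment issue described at the end of the message.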
Re: making relfilenodes 56 bits
On Thu, Jun 30, 2022 at 12:41 AM Simon Riggs wrote: > The reason to mention this now is that it would give more space than > 56bit limit being suggested here. Isn't 2^56 enough, though? Remembering that cluster time runs out when we've generated 2^64 bytes of WAL, if you want to run out of 56 bit relfile numbers before the end of time you'll need to find a way to allocate them in less than 2^8 bytes of WAL. That's technically possible, since SMgr CREATE records are only 42 bytes long, so you could craft some C code to do nothing but create (and leak) relfilenodes, but real usage is always accompanied by catalogue insertions to connect the new relfilenode to a database object, without which they are utterly useless. So in real life, it takes many hundreds or typically thousands of bytes, much more than 256.
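The arithmetic here is easy to check directly. A back-of-the-envelope Python sketch follows; the 42-byte SMgr CREATE record size comes from the message above, while the 2000-byte "realistic" figure is only an illustrative assumption standing in for "hundreds or typically thousands of bytes":

```python
# WAL addressing runs out after 2^64 bytes; relfile numbers after 2^56.
WAL_BYTE_LIMIT = 2 ** 64
RELFILENUMBER_LIMIT = 2 ** 56

# Average WAL bytes per allocated relfilenumber needed to exhaust the
# 56-bit number space before the WAL byte space runs out.
break_even = WAL_BYTE_LIMIT // RELFILENUMBER_LIMIT
print(break_even)  # 256

# An SMgr CREATE record alone is about 42 bytes, so a pathological
# create-and-leak workload could stay under the 256-byte budget...
SMGR_CREATE_RECORD = 42
print(SMGR_CREATE_RECORD < break_even)  # True

# ...but real usage adds catalogue insertions costing hundreds to
# thousands of WAL bytes per relation, far above the budget.
TYPICAL_BYTES_PER_RELATION = 2000  # illustrative assumption
print(TYPICAL_BYTES_PER_RELATION > break_even)  # True
```

So only a contrived workload can run out of 56-bit relfile numbers before the cluster runs out of WAL address space, which is the point of the message.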
Re: Add test of pg_prewarm extenion
On Wed, Jun 29, 2022 at 02:38:12PM +0900, Dong Wook Lee wrote: > Hi hackers, > I wrote a test for pg_prewarm extension. and I wrote it with the aim of > improving test coverage, and feedback is always welcome. The test fails when USE_PREFETCH isn't defined. http://cfbot.cputube.org/dongwook-lee.html You can accommodate that by adding an "alternate" output file, named like pg_prewarm_0.out BTW, you can test your patches the same as cfbot does (before mailing the list) on 4 OSes by pushing a branch to a github account. See ./src/tools/ci/README -- Justin
Re: Inconvenience of pg_read_binary_file()
At Tue, 07 Jun 2022 17:29:31 +0900 (JST), Kyotaro Horiguchi wrote in > pg_read_file(text, bool) makes sense to me, but it doesn't seem able to > share the C function with the other variations. > pg_read_binary_file() needs to accept some out-of-range value for > offset or length to signal that offset and length are not specified. In this version all the polymorphic variations share the same body function. I was tempted to add a tail-reading feature, but that would be a separate feature. > (function comments need to be edited and docs are needed) - Simplified the implementation (by complexifying argument handling..). - REVOKEd EXECUTE from the new functions. - Edited the signature of the two functions. > - pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok > boolean ]] ) → text > + pg_read_file ( filename text [, offset bigint, length bigint ] [, > missing_ok boolean ] ) → text And registered this in the next CF. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 5d344fb56cded75e22cb4e56bcf1c10a31f5d4bb Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 30 Jun 2022 10:30:35 +0900 Subject: [PATCH v3] Add an argument variation to pg_read_file Give the functions pg_read_file and pg_read_binary_file the argument variation f(filename, missing_ok), so that the functions can read a whole file while tolerating a missing file. 
--- doc/src/sgml/func.sgml | 4 +-- src/backend/catalog/system_functions.sql | 4 +++ src/backend/utils/adt/genfile.c | 33 +++- src/include/catalog/pg_proc.dat | 6 + 4 files changed, 39 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7b652460a1..34d76b74e4 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -28593,7 +28593,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size pg_read_file -pg_read_file ( filename text , offset bigint, length bigint , missing_ok boolean ) +pg_read_file ( filename text , offset bigint, length bigint , missing_ok boolean ) text @@ -28618,7 +28618,7 @@ SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size pg_read_binary_file -pg_read_binary_file ( filename text , offset bigint, length bigint , missing_ok boolean ) +pg_read_binary_file ( filename text , offset bigint, length bigint , missing_ok boolean ) bytea diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql index 73da687d5d..30a048f6b0 100644 --- a/src/backend/catalog/system_functions.sql +++ b/src/backend/catalog/system_functions.sql @@ -659,12 +659,16 @@ REVOKE EXECUTE ON FUNCTION pg_ls_tmpdir(oid) FROM public; REVOKE EXECUTE ON FUNCTION pg_read_file(text) FROM public; +REVOKE EXECUTE ON FUNCTION pg_read_file(text,boolean) FROM public; + REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint) FROM public; REVOKE EXECUTE ON FUNCTION pg_read_file(text,bigint,bigint,boolean) FROM public; REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text) FROM public; +REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,boolean) FROM public; + REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint) FROM public; REVOKE EXECUTE ON FUNCTION pg_read_binary_file(text,bigint,bigint,boolean) FROM public; diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index 2bf5219256..1a09431a79 100644 --- a/src/backend/utils/adt/genfile.c 
+++ b/src/backend/utils/adt/genfile.c @@ -278,6 +278,9 @@ pg_read_file(PG_FUNCTION_ARGS) * * No superuser check done here- instead privileges are handled by the * GRANT system. + * + * The two-argument variation of this function supposes the second argument as + * to be a Boolean. */ Datum pg_read_file_v2(PG_FUNCTION_ARGS) @@ -290,7 +293,9 @@ pg_read_file_v2(PG_FUNCTION_ARGS) text *result; /* handle optional arguments */ - if (PG_NARGS() >= 3) + if (PG_NARGS() == 2) + missing_ok = PG_GETARG_BOOL(1); + else if (PG_NARGS() >= 3) { seek_offset = PG_GETARG_INT64(1); bytes_to_read = PG_GETARG_INT64(2); @@ -299,9 +304,10 @@ pg_read_file_v2(PG_FUNCTION_ARGS) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("requested length cannot be negative"))); + + if (PG_NARGS() >= 4) + missing_ok = PG_GETARG_BOOL(3); } - if (PG_NARGS() >= 4) - missing_ok = PG_GETARG_BOOL(3); filename = convert_and_check_filename(filename_t); @@ -326,6 +332,8 @@ pg_read_binary_file(PG_FUNCTION_ARGS) bytea *result; /* handle optional arguments */ + if (PG_NARGS() == 2) + missing_ok = PG_GETARG_BOOL(1); if (PG_NARGS() >= 3) { seek_offset = PG_GETARG_INT64(1); @@ -335,9 +343,10 @@ pg_read_binary_file(PG_FUNCTION_ARGS) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
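The argument dispatch the patch implements can be summarized with a small Python sketch. This is an illustration of the intended call forms (mirroring the PG_NARGS() branches in the diff), not the C code itself:

```python
def read_file_args(args):
    """Sketch of the dispatch for pg_read_file()/pg_read_binary_file():
    f(name), f(name, missing_ok), f(name, offset, length),
    and f(name, offset, length, missing_ok)."""
    seek_offset, bytes_to_read, missing_ok = 0, -1, False
    nargs = len(args)
    if nargs == 2:
        # The new variation: the second argument is missing_ok.
        missing_ok = args[1]
    elif nargs >= 3:
        seek_offset, bytes_to_read = args[1], args[2]
        if bytes_to_read < 0:
            raise ValueError("requested length cannot be negative")
        if nargs >= 4:
            missing_ok = args[3]
    return seek_offset, bytes_to_read, missing_ok

print(read_file_args(["f"]))                # (0, -1, False)
print(read_file_args(["f", True]))          # (0, -1, True)
print(read_file_args(["f", 10, 20]))        # (10, 20, False)
print(read_file_args(["f", 10, 20, True]))  # (10, 20, True)
```

The key point of the patch is the first branch: a two-argument call now reads the whole file while treating the second argument as missing_ok.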
Add a test for "cannot truncate foreign table"
Hello, While checking the regression tests of TRUNCATE on foreign tables for another patch [1], I found that there is no test for foreign tables that don't support TRUNCATE. When a foreign table has a handler but doesn't support TRUNCATE, an error "cannot truncate foreign table xxx" occurs. So, what about adding a test for this message output? We can add this test for file_fdw because it is one such foreign data wrapper. I attached a patch. [1] https://postgr.es/m/20220527172543.0a2fdb469cf048b81c096...@sraoss.co.jp -- Yugo NAGATA diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out index 0029f36b35..261af1a8b5 100644 --- a/contrib/file_fdw/expected/file_fdw.out +++ b/contrib/file_fdw/expected/file_fdw.out @@ -246,6 +246,8 @@ UPDATE agg_csv SET a = 1; ERROR: cannot update foreign table "agg_csv" DELETE FROM agg_csv WHERE a = 100; ERROR: cannot delete from foreign table "agg_csv" +TRUNCATE agg_csv; +ERROR: cannot truncate foreign table "agg_csv" -- but this should be allowed SELECT * FROM agg_csv FOR UPDATE; a |b diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql index 563d824ccc..46670397ca 100644 --- a/contrib/file_fdw/sql/file_fdw.sql +++ b/contrib/file_fdw/sql/file_fdw.sql @@ -166,6 +166,7 @@ SELECT tableoid::regclass, b FROM agg_csv; INSERT INTO agg_csv VALUES(1,2.0); UPDATE agg_csv SET a = 1; DELETE FROM agg_csv WHERE a = 100; +TRUNCATE agg_csv; -- but this should be allowed SELECT * FROM agg_csv FOR UPDATE;
Re: Testing autovacuum wraparound (including failsafe)
Hi, On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada wrote: > > On Fri, Jun 11, 2021 at 10:19 AM Andres Freund wrote: > > > > Hi, > > > > On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote: > > > Cool. Thank you for working on that! > > > Could you please share a WIP patch for the $subj? I'd be happy to help > > > with > > > it. > > > > I've attached the current WIP state, which hasn't evolved much since > > this message... I put the test in > > src/backend/access/heap/t/001_emergency_vacuum.pl > > but I'm not sure that's the best place. But I didn't think > > src/test/recovery is great either. > > > > Thank you for sharing the WIP patch. > > Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time > for zeroing out all pages), how about using single-user mode instead > of preparing the transaction? That is, after pg_resetwal we check the > ages of datfrozenxid by executing a query in single-user mode. That > way, we don’t need to worry about autovacuum concurrently running > while checking the ages of frozenxids. I’ve attached a PoC patch that > does the scenario like: > > 1. start cluster with autovacuum=off and create tables with a few data > and make garbage on them > 2. stop cluster and do pg_resetwal > 3. start cluster in single-user mode > 4. check age(datfrozenxid) > 5. stop cluster > 6. start cluster and wait for autovacuums to increase template0, > template1, and postgres datfrozenxids The above steps are wrong. I think we can expose a function in an extension used only by this test in order to set nextXid to a future value with zeroing out clog/subtrans pages. We don't need to fill all clog/subtrans pages between oldestActiveXID and nextXid. I've attached a PoC patch for adding this regression test and am going to register it to the next CF. BTW, while testing the emergency situation, I found there is a race condition where anti-wraparound vacuum isn't invoked with the settings autovacuum = off, autovacuum_max_workers = 1. 
An autovacuum worker sends a signal to the postmaster after advancing datfrozenxid in SetTransactionIdLimit(). But with these settings, if the autovacuum launcher attempts to launch a worker before the autovacuum worker that has signaled the postmaster finishes, the launcher exits without launching a worker because there are no free workers. A new launcher won't be launched until a new XID is generated (and only when the new XID % 65536 == 0). Although autovacuum_max_workers = 1 is not mandatory for this test, it makes it easier to verify the order of operations. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v1-0001-Add-regression-tests-for-emergency-vacuums.patch Description: Binary data
Re: Emit extra debug message when executing extension script.
On Wed, Jun 29, 2022 at 9:26 AM Peter Eisentraut wrote: > On 28.06.22 21:10, Jeff Davis wrote: > > + ereport(DEBUG1, errmsg("executing extension script: %s", filename)); > > This should either be elog or use errmsg_internal. Why? -- Robert Haas EDB: http://www.enterprisedb.com
Re: enable/disable broken for statement triggers on partitioned tables
On Tue, May 24, 2022 at 3:11 PM Amit Langote wrote: > Simon reported $subject off-list. > > For triggers on partitioned tables, various enable/disable trigger > variants recurse to also process partitions' triggers by way of > ATSimpleRecursion() done in the "prep" phase. While that way of > recursing is fine for row-level triggers which are cloned to > partitions, it isn't for statement-level triggers which are not > cloned, so you get an unexpected error as follows: > > create table p (a int primary key) partition by list (a); > create table p1 partition of p for values in (1); > create function trigfun () returns trigger language plpgsql as $$ > begin raise notice 'insert on p'; end; $$; > create trigger trig before insert on p for statement execute function > trigfun(); > alter table p disable trigger trig; > ERROR: trigger "trig" for table "p1" does not exist > > The problem is that ATPrepCmd() is too soon to perform the recursion > in this case as it's not known at that stage if the trigger being > enabled/disabled is row-level or statement level, so it's better to > perform it during ATExecCmd(). Actually, that is how it used to be > done before bbb927b4db9b changed things to use ATSimpleRecursion() to > fix a related problem, which was that the ONLY specification was > ignored by the earlier implementation. The information of whether > ONLY is specified in a given command is only directly available in the > "prep" phase and must be remembered somehow if the recursion must be > handled in the "exec" phase. The way that's typically done that I see > in tablecmds.c is to have ATPrepCmd() change the AlterTableCmd.subtype > to a recursive variant of a given sub-command. For example, > AT_ValidateConstraint by AT_ValidateConstraintRecurse if ONLY is not > specified. > > So, I think we should do something like the attached. 
A lot of > boilerplate is needed given that the various enable/disable trigger > variants are represented as separate sub-commands (AlterTableCmd > subtypes), which can perhaps be avoided by inventing a > EnableDisableTrigStmt sub-command node that stores (only?) the recurse > flag. Added to the next CF: https://commitfest.postgresql.org/38/3728/ -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
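The prep/exec split described in this thread can be sketched as a toy model. This is purely illustrative; the subtype names are borrowed from tablecmds.c, and the real logic is considerably more involved:

```python
AT_EnableTrig = "AT_EnableTrig"

def at_prep_cmd(subtype, only_specified):
    # "prep" phase: the ONLY flag is only directly available here, so
    # remember it by switching to the recursive subtype variant instead
    # of recursing immediately (the trigger level is not yet known).
    return subtype if only_specified else subtype + "Recurse"

def at_exec_cmd(subtype, trigger_is_row_level, partitions):
    # "exec" phase: the trigger's level is known here, so recursion can
    # be limited to row-level triggers, the ones cloned to partitions.
    if subtype.endswith("Recurse") and trigger_is_row_level:
        return partitions  # also process the partitions' cloned triggers
    return []              # statement-level: nothing to recurse into

parts = ["p1", "p2"]
# No ONLY, row-level trigger: recurse to the partitions.
print(at_exec_cmd(at_prep_cmd(AT_EnableTrig, only_specified=False), True, parts))
# No ONLY, statement-level trigger: nothing cloned, so no recursion
# (and no spurious "trigger does not exist" error on the partition).
print(at_exec_cmd(at_prep_cmd(AT_EnableTrig, only_specified=False), False, parts))
# ONLY specified: never recurse, regardless of trigger level.
print(at_exec_cmd(at_prep_cmd(AT_EnableTrig, only_specified=True), True, parts))
```

The bug report boils down to the second case: recursing during prep, before the trigger level is known, wrongly treated statement-level triggers like the first case.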
Re: PostgreSQL 15 beta 2 release announcement draft
On 6/29/22 3:07 PM, Erik Rijkers wrote: Op 29-06-2022 om 02:04 schreef Jonathan S. Katz: Hi, Attached is a draft of the release announcement for PostgreSQL 15 Beta 2. Please provide feedback on technical accuracy and if there are glaring omissions. Hardly 'glaring' but still: 'Multiples fixes' should be 'Multiple fixes' I think this could make or break the announcement :) Thanks for the catch -- fixed in the local copy. I'll read through again for other typos prior to release. Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: minor change for create_list_bounds()
On Tue, Mar 08, 2022 at 11:05:10AM -0800, Zhihong Yu wrote: > I was looking at commit db632fbca and noticed that, > in create_list_bounds(), if index is added to boundinfo->interleaved_parts > in the first if statement, there is no need to perform the second check > involving call to partition_bound_accepts_nulls(). Given this change probably doesn't meaningfully impact performance or code clarity, I'm personally -1 for this patch. Is there another motivation that I am missing? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: replacing role-level NOINHERIT with a grant-level option
On Wed, Jun 22, 2022 at 04:30:36PM -0400, Robert Haas wrote: > On Thu, Jun 2, 2022 at 1:17 PM Tom Lane wrote: >> Point 2 would cause every existing pg_dumpall script to fail, which >> seems like kind of a large gotcha. Less unpleasant alternatives >> could include >> >> * Continue to accept the syntax, but ignore it, maybe with a WARNING >> for the alternative that doesn't correspond to the new behavior. >> >> * Keep pg_authid.rolinherit, and have it act as supplying the default >> behavior for subsequent GRANTs to that role. > > Here's a rather small patch that does it the first way. I've been thinking about whether we ought to WARNING or ERROR with the deprecated syntax. WARNING has the advantage of not breaking existing scripts, but users might not catch that NOINHERIT roles will effectively become INHERIT roles, which IMO is just a different type of breakage. However, I don't think I can defend ERROR-ing and breaking all existing pg_dumpall scripts completely, so perhaps the best we can do is emit a WARNING until we feel comfortable removing the option completely 5-10 years from now. I'm guessing we'll also need a new pg_dumpall option for generating pre-v16 style role commands versus the v16+ style ones. When run on v16 and later, you'd have to require the latter option, as you won't always be able to convert grant-level inheritance options into role-level options. However, you can always do the reverse. I'm thinking that by default, pg_dumpall would output the style of commands for the current server version. pg_upgrade would make use of this option when upgrading from > I suppose if we did it the second way, we could make the syntax GRANT > granted_role TO recipient_role WITH INHERIT { TRUE | FALSE | DEFAULT }, and DEFAULT would copy the current value of the rolinherit > property, so that changing the rolinherit property later would not > affect previous grants.
> The reverse is also possible: with the same syntax, the rolinherit
> column could be changed from bool to "char", storing t/f/d, and 'd'
> could mean the value of the rolinherit property at time of use; thus,
> changing rolinherit would affect previous grants performed using WITH
> INHERIT DEFAULT but not those that specified WITH INHERIT TRUE or WITH
> INHERIT FALSE.

Yeah, something like this might be a nice way to sidestep the issue. I was imagining something more like your second option, but instead of continuing to allow grant-level options to take effect when rolinherit was true or false, I was thinking we would ignore them or even disallow them. By disallowing grant-level options when a role-level option was set, we might be able to avoid the confusion about what takes effect when. That being said, the syntax for this sort of thing might not be the cleanest.

-- Nathan Bossart Amazon Web Services: https://aws.amazon.com
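Robert's t/f/d idea quoted above can be sketched concretely. The function below is purely illustrative (the names are mine, not from any posted patch): 't' and 'f' freeze the inheritance behavior at GRANT time, while 'd' defers to whatever the role-level rolinherit flag says at time of use.

```python
# Sketch of the three-valued grant proposal discussed above: each grant
# stores 't' (always inherit), 'f' (never inherit), or 'd' (defer to the
# role-level rolinherit flag at time of use). Illustrative only.

def grant_inherits(grant_option: str, rolinherit: bool) -> bool:
    """Resolve whether a membership grant confers inherited privileges."""
    if grant_option == 't':
        return True          # WITH INHERIT TRUE: fixed at GRANT time
    if grant_option == 'f':
        return False         # WITH INHERIT FALSE: fixed at GRANT time
    assert grant_option == 'd'
    return rolinherit        # WITH INHERIT DEFAULT: tracks the role property
```

The point of the 'd' state is visible here: flipping rolinherit later changes the behavior of WITH INHERIT DEFAULT grants but leaves explicit TRUE/FALSE grants alone.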
Re: making relfilenodes 56 bits
On Wed, 29 Jun 2022 at 14:41, Simon Riggs wrote:
> On Tue, 28 Jun 2022 at 19:18, Matthias van de Meent wrote:
> > I will be the first to admit that it is quite unlikely to be common
> > practise, but this workload increases the number of dbOid+spcOid
> > combinations to 100s (even while using only a single tablespace),
>
> Which should still fit nicely in 32bits then. Why does that present a
> problem to this idea?

It doesn't, or at least not the bitspace part. I think it is indeed quite unlikely anyone will try to build as many tablespaces as the 100 million tables project, which utilized 1000 tablespaces to get around file system limitations [0]. The potential problem is 'where to store such mapping efficiently'. Especially considering that this mapping might (and likely: will) change across restarts and when database churn (create + drop database) happens in e.g. testing workloads.

> The reason to mention this now is that it would give more space than the
> 56bit limit being suggested here. I am not opposed to the current patch,
> just finding ways to remove some objections mentioned by others, if
> those became blockers.
>
> > which in my opinion requires some more thought than just handwaving it
> > into an smgr array and/or checkpoint records.
>
> The idea is that we would store the mapping as an array, with the value
> in the RelFileNode as the offset in the array. The array would be mostly
> static, so would cache nicely.

That part is not quite clear to me. Any cluster may have anywhere between 3 and hundreds or thousands of entries in that mapping. Do you suggest to dynamically grow that (presumably shared, considering the addressing is shared) array, or have a runtime parameter limiting the amount of those entries (similar to max_connections)?

> For convenience, I imagine that the mapping could be included in WAL in
> or near the checkpoint record, to ensure that the mapping was available
> in all backups.
Why would we need this mapping in backups, considering that it seems to be transient state that is lost on restart? Won't we still use full dbOid and spcOid in anything we communicate or store on disk (file names, WAL, pg_class rows, etc.), or did I misunderstand your proposal? Kind regards, Matthias van de Meent [0] https://www.pgcon.org/2013/schedule/attachments/283_Billion_Tables_Project-PgCon2013.pdf
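To make the array idea under discussion concrete, here is a toy sketch (Python, purely illustrative, not PostgreSQL code) of a mapping that assigns each distinct (dbOid, spcOid) pair a small, stable offset; the offset is what a RelFileNode-sized field would carry. The growth policy and crash-safety questions raised above are deliberately left open here.

```python
# Toy sketch (not PostgreSQL code) of the mapping Simon describes: each
# distinct (dbOid, spcOid) combination gets a small array offset, and the
# offset is what the RelFileNode would carry. How to size, share, and
# persist this array is exactly the open question in the thread.

class TablespaceDbMap:
    def __init__(self):
        self._pairs = []    # offset -> (db_oid, spc_oid); append-only, mostly static
        self._index = {}    # (db_oid, spc_oid) -> offset

    def lookup_or_assign(self, db_oid, spc_oid):
        """Return the existing offset for the pair, assigning a new one if unseen."""
        key = (db_oid, spc_oid)
        if key not in self._index:
            self._index[key] = len(self._pairs)
            self._pairs.append(key)
        return self._index[key]

    def resolve(self, offset):
        """Map an offset back to its (db_oid, spc_oid) pair."""
        return self._pairs[offset]
```

Even with hundreds of databases sharing one tablespace, the table stays small; the hard part, as noted above, is where such transient state lives across restarts and database churn.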
Strange failures on chipmunk
Hi, chipmunk (an armv6l-powered original Raspberry Pi model 1?) has failed in a couple of weird ways recently on 14 and master. On 14 I see what appears to be a corrupted log file name: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk=2022-06-16%2006%3A48%3A07 cp: cannot stat \342\200\230/home/pgbfarm/buildroot/REL_14_STABLE/pgsql.build/src/test/recovery/tmp_check/t_002_archiving_primary_data/archives/00010003\342\200\231: No such file or directory On master, you can ignore this failure, because it was addressed by 93759c66: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk=2022-05-11%2015%3A26%3A01 Then there's this one-off, that smells like WAL corruption: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk=2022-06-13%2015%3A12%3A44 2022-06-13 23:02:06.988 EEST [30121:5] LOG: incorrect resource manager data checksum in record at 0/79B4FE0 Hmmm. I suppose it's remotely possible that Linux/armv6l ext4 suffers from concurrency bugs like Linux/sparc. In that particular kernel bug's case it's zeroes, so I guess it'd be easier to speculate about if the log message included the checksum when it fails like that...
Re: [PATCH] minor reloption regression tests improvement
This patch is hanging open in the March commitfest. There was a bit of back-and-forth on whether it should be rejected, but no clear statement on the issue, so I'm going to mark it Returned with Feedback. If you still feel strongly about this patch, please feel free to re-register it in the July commitfest. Thanks, --Jacob
Re: [PoC/RFC] Multiple passwords, interval expirations
Greetings, On Wed, Jun 29, 2022 at 17:22 Jacob Champion wrote: > On 4/8/22 10:04, Joshua Brindle wrote: > > It's unclear if I will be able to continue working on this featureset, > > this email address will be inactive after today. > > I'm assuming the answer to this was "no". Is there any interest out > there to pick this up for the July CF? Short answer to that is yes, I’m interested in continuing this (though I would certainly welcome it if there are others who are also interested; I may be able to bring someone else in to help work on it too, but that might be more in the August/September time frame). Thanks, Stephen
Re: [PoC/RFC] Multiple passwords, interval expirations
On 4/8/22 10:04, Joshua Brindle wrote: > It's unclear if I will be able to continue working on this featureset, > this email address will be inactive after today. I'm assuming the answer to this was "no". Is there any interest out there to pick this up for the July CF? --Jacob
Re: Can we do something to help stop users mistakenly using force_parallel_mode?
On Wed, 29 Jun 2022 at 18:57, Laurenz Albe wrote: > Perhaps some stronger wording in the documentation would be beneficial. > I have little sympathy with people who set unusual parameters without > even glancing at the documentation. My thoughts are that the documentation is ok as is. I have a feeling the misusages come from stumbling upon a GUC that has a name which seems to indicate the GUC does exactly what they want. David
Re: Can we do something to help stop users mistakenly using force_parallel_mode?
On Thu, 30 Jun 2022 at 00:31, Justin Pryzby wrote: > Since the user in this recent thread is running v13.7, I'm *guessing* that > if that had been backpatched, they wouldn't have made this mistake. I wasn't aware of that change. Thanks for highlighting it. Maybe it's worth seeing if fewer mistakes are made now that we've changed the GUC into a developer option. David
Re: Stats collector's idx_blks_hit value is highly misleading in practice
Hello, I would like to get some feedback on this task. > pg_statio_*_tables.idx_blks_hit are highly misleading in practice > because they fail to take account of the difference between internal > pages and leaf pages in B-Tree indexes. I see it is still the case, so the issue is relevant, isn't it? > The main challenge would be > passing information about what page we're dealing with (internal/leaf) > to the place actually calling pgstat_count_buffer_(read|hit). That > happens in ReadBufferExtended, which just has no idea what page it's > dealing with. Not sure how to do that cleanly ... I do not immediately see a way to pass the information in a completely clean manner. Either (1) ReadBufferExtended needs to know the type of an index page (leaf/internal) or (2) the caller of ReadBufferExtended that can check the page type needs to learn if there was a hit and call pgstat_count_buffer_(read|hit) accordingly. In either case, the necessary code changes seem quite invasive to me. I have attached a code snippet to illustrate the second idea.
Regards, Sergey

diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 20adb602a4..d8c22be949 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -31,6 +31,7 @@
 #include "miscadmin.h"
 #include "storage/indexfsm.h"
 #include "storage/lmgr.h"
+#include "pgstat.h"
 #include "storage/predicate.h"
 #include "storage/procarray.h"
 #include "utils/memdebug.h"
@@ -870,14 +871,23 @@ _bt_log_reuse_page(Relation rel, BlockNumber blkno, FullTransactionId safexid)
 Buffer
 _bt_getbuf(Relation rel, BlockNumber blkno, int access)
 {
-	Buffer		buf;
+	Buffer		buf;
+	Page		page;
+	BTPageOpaque opaque;
 
 	if (blkno != P_NEW)
 	{
+		bool		hit;
+
 		/* Read an existing block of the relation */
-		buf = ReadBuffer(rel, blkno);
+		buf = ReadIndexBuffer(rel, blkno, &hit);
 		_bt_lockbuf(rel, buf, access);
 		_bt_checkpage(rel, buf);
+
+		page = BufferGetPage(buf);
+		opaque = BTPageGetOpaque(page);
+		if (hit && P_ISLEAF(opaque))
+			pgstat_count_buffer_hit(rel);
 	}
 	else
 	{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index ae13011d27..ab9522fd50 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -704,6 +704,16 @@ ReadBuffer(Relation reln, BlockNumber blockNum)
 	return ReadBufferExtended(reln, MAIN_FORKNUM, blockNum, RBM_NORMAL, NULL);
 }
 
+/*
+ * ReadIndexBuffer -- a shorthand for ReadIndexBufferExtended, for reading from main
+ * fork with RBM_NORMAL mode and default strategy.
+ */
+Buffer
+ReadIndexBuffer(Relation reln, BlockNumber blockNum, bool *hit)
+{
+	return ReadIndexBufferExtended(reln, MAIN_FORKNUM, blockNum, RBM_NORMAL, NULL, hit);
+}
+
 /*
  * ReadBufferExtended -- returns a buffer containing the requested
  *		block of the requested relation.  If the blknum
@@ -774,6 +784,34 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 	return buf;
 }
 
+/*
+ * Proof-of-concept.
+ *
+ * Same as ReadBufferExtended but returns the value of "hit" to improve
+ * statistics collection for indexes.
+ */
+Buffer
+ReadIndexBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
+						ReadBufferMode mode, BufferAccessStrategy strategy, bool *hit)
+{
+	Buffer		buf;
+
+	/*
+	 * Reject attempts to read non-local temporary relations; we would be
+	 * likely to get wrong data since we have no visibility into the owning
+	 * session's local buffers.
+	 */
+	if (RELATION_IS_OTHER_TEMP(reln))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions")));
+
+	buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
+							forkNum, blockNum, mode, strategy, hit);
+
+	return buf;
+}
+
 /*
  * ReadBufferWithoutRelcache -- like ReadBufferExtended, but doesn't require
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 58391406f6..d4483c1666 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -179,9 +179,13 @@ extern PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum,
 extern bool ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum,
 							 BlockNumber blockNum, Buffer recent_buffer);
 extern Buffer ReadBuffer(Relation reln, BlockNumber blockNum);
+extern Buffer ReadIndexBuffer(Relation reln, BlockNumber blockNum, bool *hit);
 extern Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum,
 								 BlockNumber blockNum, ReadBufferMode mode,
 								 BufferAccessStrategy strategy);
+extern Buffer ReadIndexBufferExtended(Relation reln, ForkNumber forkNum,
+									  BlockNumber blockNum, ReadBufferMode mode,
+									  BufferAccessStrategy strategy, bool *hit);
 extern Buffer ReadBufferWithoutRelcache(RelFileNode rnode, ForkNumber forkNum,
 										BlockNumber blockNum, ReadBufferMode mode,
 										BufferAccessStrategy
Re: TAP output format in pg_regress
Attached is a new version of this patch, which completes the TAP output format option such that all codepaths emitting output are TAP compliant. The verbose option is fixed to not output extraneous newlines which the previous PoC did. The output is made to conform to the original TAP spec, since v13/14 TAP parsers seem less common than those that can handle the original spec. Support for the new format additions should be quite simple to add should we want that. Running pg_regress --verbose should give the current format output. I did end up combining TAP and --verbose into a single patch, as the TAP format sort of depends on the verbose flag, since TAP has no verbose mode. I can split it into two separate patches should a reviewer prefer that. -- Daniel Gustafsson https://vmware.com/ v5-0001-pg_regress-TAP-output-format.patch Description: Binary data
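For readers unfamiliar with the format the patch targets: original-spec TAP is just a plan line followed by numbered ok/not ok lines. The little generator below illustrates that shape only; it is not pg_regress's actual code.

```python
# Minimal emitter for original-spec TAP (plan line plus "ok"/"not ok"
# lines), roughly the format discussed above. Illustration only, not
# pg_regress's implementation.

def tap_report(results):
    """results: list of (test_name, passed) pairs -> list of TAP lines."""
    lines = [f"1..{len(results)}"]             # the plan: how many tests follow
    for i, (name, passed) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"  # original-spec status words
        lines.append(f"{status} {i} - {name}")
    return lines
```

A harness that understands this spec (e.g. prove) only needs the plan and the status lines, which is why anything else pg_regress prints has to go to diagnostics or be suppressed.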
Re: PostgreSQL 15 beta 2 release announcement draft
On 29-06-2022 at 02:04, Jonathan S. Katz wrote: Hi, Attached is a draft of the release announcement for PostgreSQL 15 Beta 2. Please provide feedback on technical accuracy and if there are glaring omissions. Hardly 'glaring' but still: 'Multiples fixes' should be 'Multiple fixes' Please provide any feedback prior to 2022-06-22 0:00 AoE. Thanks, Jonathan
Re: pg_upgrade (12->14) fails on aggregate
> On 29 Jun 2022, at 23:07, Justin Pryzby wrote: > > On Wed, Jun 29, 2022 at 10:58:44PM +0500, Andrey Borodin wrote: >>> On 28 Jun 2022, at 04:30, Justin Pryzby wrote: >>> >>> Nope, it's as I said: this would break pg_upgrade from older versions. >> >> As far as I understand 9.5 is not supported. Probably, it makes sense to >> keep pg_upgrade running against 9.5 clusters, but I'm not sure if we do this >> routinely. > > As of last year, there's a reasonably clear policy for support of old > versions: > > https://www.postgresql.org/docs/devel/pgupgrade.html > |pg_upgrade supports upgrades from 9.2.X and later to the current major > release of PostgreSQL, including snapshot and beta releases. This makes sense, thank you for clarification. The patch is marked WiP, what is in progress as of now? Best regards, Andrey Borodin.
Re: pg_upgrade (12->14) fails on aggregate
On Wed, Jun 29, 2022 at 10:58:44PM +0500, Andrey Borodin wrote: > > On 28 Jun 2022, at 04:30, Justin Pryzby wrote: > > > > Nope, it's as I said: this would break pg_upgrade from older versions. > > As far as I understand 9.5 is not supported. Probably, it makes sense to keep > pg_upgrade running against 9.5 clusters, but I'm not sure if we do this > routinely. As of last year, there's a reasonably clear policy for support of old versions: https://www.postgresql.org/docs/devel/pgupgrade.html |pg_upgrade supports upgrades from 9.2.X and later to the current major release of PostgreSQL, including snapshot and beta releases. See: e469f0aaf3c586c8390bd65923f97d4b1683cd9f -- Justin
Re: pg_upgrade (12->14) fails on aggregate
> On 28 Jun 2022, at 04:30, Justin Pryzby wrote: > > Nope, it's as I said: this would break pg_upgrade from older versions. As far as I understand 9.5 is not supported. Probably, it makes sense to keep pg_upgrade running against 9.5 clusters, but I'm not sure if we do this routinely. Besides this the patch seems to be RfC. Best regards, Andrey Borodin.
Re: PostgreSQL 15 beta 2 release announcement draft
On 6/29/22 9:30 AM, Tom Lane wrote: "Jonathan S. Katz" writes: On 6/29/22 2:55 AM, Pantelis Theodosiou wrote: Is the major version upgrade still needed if they are upgrading from 15 Beta 1? No, but it would be required if you are upgrading from a different version. The language attempts to be a "catch all" to account for the different cases. Actually, I think you do need the hard-way upgrade, because there was a catversion bump since beta1. Oh -- I didn't see that when I scanned the commit logs, but good to know. Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: JSON/SQL: jsonpath: incomprehensible error message
On 2022-06-29 We 10:58, Alexander Korotkov wrote: > On Wed, Jun 29, 2022 at 4:28 PM Erik Rijkers wrote: >> On 29-06-2022 at 15:00, Amit Kapila wrote: >>> On Mon, Jun 27, 2022 at 8:46 PM Andrew Dunstan wrote: On 2022-06-26 Su 11:44, Erik Rijkers wrote: > JSON/SQL jsonpath > > For example, a jsonpath string with deliberate typo 'like_regexp' > (instead of 'like_regex'): > > select js > from (values (jsonb '{}')) as f(js) > where js @? '$ ? (@ like_regexp "^xxx")'; > > ERROR: syntax error, unexpected IDENT_P at or near " " of jsonpath input > LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ > li... > > Both 'IDENT_P' and 'at or near " "' seem pretty useless. > >>> removing this. One thing that is not clear to me is why OP sees an >>> acceptable message (ERROR: syntax error, unexpected invalid token at >>> or near "=" of jsonpath input) for a similar query in 14? >> To mention that was perhaps unwise of me because the IDENT_P (or more >> generally, *_P) messages can be provoked on 14 too. I just thought >> 'invalid token' might be a better message because 'token' gives a more >> direct association with 'errors during parsing' which I assume is the >> case here. >> >> IDENT_P or ANY_P convey exactly nothing. > +1 > I agree, but I don't think "invalid token" is all that much better. I think the right fix is just to get rid of the parser setting that causes production of these additions to the error message, and make it just like all the other bison parsers we have. Then the problem just disappears. It's a very slight change of behaviour, but I agree with Amit that we can backpatch it. I will do so shortly unless there's an objection. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: JSON/SQL: jsonpath: incomprehensible error message
On Wed, Jun 29, 2022 at 4:28 PM Erik Rijkers wrote: > On 29-06-2022 at 15:00, Amit Kapila wrote: > > On Mon, Jun 27, 2022 at 8:46 PM Andrew Dunstan wrote: > >> > >> On 2022-06-26 Su 11:44, Erik Rijkers wrote: > >>> JSON/SQL jsonpath > >>> > >>> For example, a jsonpath string with deliberate typo 'like_regexp' > >>> (instead of 'like_regex'): > >>> > >>> select js > >>> from (values (jsonb '{}')) as f(js) > >>> where js @? '$ ? (@ like_regexp "^xxx")'; > >>> > >>> ERROR: syntax error, unexpected IDENT_P at or near " " of jsonpath input > >>> LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ > >>> li... > >>> > >>> Both 'IDENT_P' and 'at or near " "' seem pretty useless. > >>> > > > removing this. One thing that is not clear to me is why OP sees an > > acceptable message (ERROR: syntax error, unexpected invalid token at > > or near "=" of jsonpath input) for a similar query in 14? > > To mention that was perhaps unwise of me because the IDENT_P (or more > generally, *_P) messages can be provoked on 14 too. I just thought > 'invalid token' might be a better message because 'token' gives a more > direct association with 'errors during parsing' which I assume is the > case here. > > IDENT_P or ANY_P convey exactly nothing. +1 -- Regards, Alexander Korotkov
Re: Export log_line_prefix(); useful for emit_log_hook.
On 2022-Jun-28, Jeff Davis wrote: > Patch attached. Some kinds of emit log hooks might find it useful to > also compute the log_line_prefix. Hmm, maybe your hypothetical hook would prefer to use a different setting for log line prefix than Log_line_prefix, so it would make sense to pass the format string as a parameter to the function instead of relying on the GUC global. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Java is clearly an example of money oriented programming" (A. Stepanov)
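To illustrate Álvaro's suggestion of passing the format string as a parameter rather than reading the GUC global, here is a toy expander for a few of log_line_prefix's %-escapes (%p process ID, %u user name, %d database name, %% literal percent). This is an illustration in Python, not the backend code or the proposed export.

```python
# Toy model of what a parameterized log_line_prefix() computes: the format
# string comes in as an argument, so a hook could supply its own instead of
# depending on the Log_line_prefix GUC. Only a handful of the real escapes
# (%p, %u, %d, %%) are modeled here; illustration only.

def expand_prefix(fmt: str, pid: int, user: str, db: str) -> str:
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] == '%' and i + 1 < len(fmt):
            c = fmt[i + 1]
            # unknown escapes pass through unchanged, like unrecognized
            # escapes do in the real prefix handling
            out.append({'p': str(pid), 'u': user, 'd': db, '%': '%'}.get(c, '%' + c))
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return ''.join(out)
```

The design point is simply that the caller, not a global, chooses the format, which is what makes the function reusable from an emit_log_hook.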
Re: Allowing REINDEX to have an optional name
On Wed, 29 Jun 2022 at 05:35, Michael Paquier wrote:
> On Tue, Jun 28, 2022 at 11:02:25AM +0100, Simon Riggs wrote:
> > Attached patch is tested, documented and imho ready to be committed,
> > so I will mark it so in CFapp.

Thanks for the review Michael.

> The behavior introduced by this patch should be reflected in
> reindexdb. See in particular reindex_one_database(), where a
> REINDEX_SYSTEM is enforced first on the catalogs for the
> non-concurrent mode when running the reindex on a database.

Originally, I was trying to avoid changing prior behavior, but now that we have agreed to do so, this makes sense. That section of code has been removed, tests updated. No changes to docs seem to be required.

> +-- unqualified reindex database
> +-- if you want to test REINDEX DATABASE, uncomment the following line,
> +-- but note that this adds about 0.5s to the regression tests and the
> +-- results are volatile when run in parallel to other tasks. Note also
> +-- that REINDEX SYSTEM is specifically not tested because it can deadlock.
> +-- REINDEX (VERBOSE) DATABASE;
>
> No need to add that IMHO.

That was more a comment to the reviewer, but I think something should be said for later developers.

> REINDEX [ ( option [, ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
> +REINDEX [ ( option [, ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ name ]
>
> Shouldn't you remove DATABASE and SYSTEM from the first line, keeping
> only INDEX, TABLE and SCHEMA? The second line, with its optional
> "name" would cover the DATABASE and SYSTEM cases at 100%.

I agree that your proposal is clearer. Done.
> -    if (strcmp(objectName, get_database_name(objectOid)) != 0)
> +    if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0)
>          ereport(ERROR,
>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                   errmsg("can only reindex the currently open database")));
>      if (!pg_database_ownercheck(objectOid, GetUserId()))
>          aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE,
> -                       objectName);
> +                       get_database_name(objectOid));
>
> This could call get_database_name() just once.

It could, but I couldn't see any benefit in changing that for the code under discussion. If calling get_database_name() multiple times is an issue, I've added a cache for that - another patch attached, if you think it's worth it.

> +     * You might think it would be good to include catalogs,
> +     * but doing that can deadlock, so isn't much use in real world,
> +     * nor can we safely test that it even works.
>
> Not sure what you mean here exactly.

REINDEX SYSTEM can deadlock, which is why we are avoiding it. This was a comment to later developers as to why things are done that way. Feel free to update the wording or location, but something should be mentioned to avoid later discussion.

Thanks for the review, new version attached.

-- Simon Riggs http://www.EnterpriseDB.com/

reindex_not_require_database_name.v5.patch Description: Binary data
cache_get_database_name.v1.patch Description: Binary data
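For clarity, the synopsis change discussed above means the database name becomes optional and defaults to the current database. Illustrative examples based on the patch as discussed (not committed syntax):

```sql
-- Name omitted: reindex the currently open database
REINDEX DATABASE;
REINDEX (VERBOSE) DATABASE;

-- Name given: still only accepted when it names the currently open database
REINDEX DATABASE mydb;
```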
Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
> Would you mind trying the second attached to abtain detailed log on > your testing environment? With the patch, the modified TAP test yields > the log lines like below. Thanks for this. I will apply this to the testing environment and will share the output. Regards, Sami Imseih Amazon Web Services (AWS)
Re: Proposal: allow database-specific role memberships
Kenaniah Cerny wrote: > Attached is a newly-rebased patch -- would love to get a review from someone > whenever possible. I've picked this patch for a review. The patch currently does not apply to the master branch, so I could only read the diff. Following are my comments: * I think that roles_is_member_of() deserves a comment explaining why the code that you moved into append_role_memberships() needs to be called twice, i.e. once for global memberships and once for the database-specific ones. I think the reason is that if, for example, role "A" is a database-specific member of role "B" and "B" is a "global" member of role "C", then "A" should not be considered a member of "C", unless "A" is granted "C" explicitly. Is this behavior intended? Note that in this example, the "C" members are a superset of "B" members, and thus "C" should have weaker permissions on database objects than "B". What's then the reason to not consider "A" a member of "C"? If "C" gives its members some permissions of "B" (e.g. "pg_write_all_data"), then I think the roles hierarchy is poorly designed. A counter-example might help me to understand. * Why do you think that "unsafe_tests" is the appropriate name for the directory that contains regression tests? I can spend more time on the review if the patch gets rebased. -- Antonin Houska Web: https://www.cybertec-postgresql.com
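To restate the review's example concretely, a naive transitive closure over grants would behave as below; whether the A-to-B-to-C chain should count is exactly the question raised above, so treat this as a way to state the example, not as the patch's semantics.

```python
# Toy model of the example in the review: grants are (member, role, db)
# tuples, where db=None means a global (cluster-wide) membership grant.
# This computes the naive transitive closure of memberships valid in a
# given database. The patch under review apparently does NOT follow the
# A -> B -> C chain, which is the behavior being questioned; this sketch
# just makes the example checkable.

def memberships(grants, role, current_db):
    result, todo = set(), [role]
    while todo:
        r = todo.pop()
        for member, granted, db in grants:
            if member == r and db in (None, current_db) and granted not in result:
                result.add(granted)
                todo.append(granted)
    return result

grants = [("A", "B", "db1"),   # A is a database-specific member of B in db1
          ("B", "C", None)]    # B is a global member of C
```

Under the naive closure, "A" reaches "C" in db1 but reaches nothing in any other database; the review asks why the patch stops the chain at "B".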
Re: [PoC] Let libpq reject unexpected authentication requests
On 27.06.22 23:40, Jacob Champion wrote: -HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections +HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslcertmode, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, require_auth, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections It's not strictly related to your patch, but maybe this hint has outlived its usefulness? I mean, we don't list all available tables when you try to reference a table that doesn't exist. And unordered on top of that.
Re: PostgreSQL 15 beta 2 release announcement draft
"Jonathan S. Katz" writes: > On 6/29/22 2:55 AM, Pantelis Theodosiou wrote: >> Is the major version upgrade still needed if they are upgrading from 15 Beta >> 1? > No, but it would be required if you a upgrading from a different > version. The language attempts to be a "catch all" to account for the > different cases. Actually, I think you do need the hard-way upgrade, because there was a catversion bump since beta1. regards, tom lane
Re: JSON/SQL: jsonpath: incomprehensible error message
On 29-06-2022 at 15:00, Amit Kapila wrote: On Mon, Jun 27, 2022 at 8:46 PM Andrew Dunstan wrote: On 2022-06-26 Su 11:44, Erik Rijkers wrote: JSON/SQL jsonpath For example, a jsonpath string with deliberate typo 'like_regexp' (instead of 'like_regex'): select js from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ like_regexp "^xxx")'; ERROR: syntax error, unexpected IDENT_P at or near " " of jsonpath input LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ li... Both 'IDENT_P' and 'at or near " "' seem pretty useless. removing this. One thing that is not clear to me is why OP sees an acceptable message (ERROR: syntax error, unexpected invalid token at or near "=" of jsonpath input) for a similar query in 14? To mention that was perhaps unwise of me because the IDENT_P (or more generally, *_P) messages can be provoked on 14 too. I just thought 'invalid token' might be a better message because 'token' gives a more direct association with 'errors during parsing', which I assume is the case here. IDENT_P or ANY_P convey exactly nothing. Erik
Re: Emit extra debug message when executing extension script.
On 28.06.22 21:10, Jeff Davis wrote: + ereport(DEBUG1, errmsg("executing extension script: %s", filename)); This should either be elog or use errmsg_internal.
Re: JSON/SQL: jsonpath: incomprehensible error message
On Mon, Jun 27, 2022 at 8:46 PM Andrew Dunstan wrote: > > On 2022-06-26 Su 11:44, Erik Rijkers wrote: > > JSON/SQL jsonpath > > > > For example, a jsonpath string with deliberate typo 'like_regexp' > > (instead of 'like_regex'): > > > > select js > > from (values (jsonb '{}')) as f(js) > > where js @? '$ ? (@ like_regexp "^xxx")'; > > > > ERROR: syntax error, unexpected IDENT_P at or near " " of jsonpath input > > LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@ > > li... > > ^ > > > > Both 'IDENT_P' and 'at or near " "' seem pretty useless. > > > > Perhaps some improvement can be thought of? > > > > Similar messages in release 14 seem to use 'invalid token', which is > > better: > > > > select js > > from (values (jsonb '{"a":"b"}')) as f(js) > > where js @? '$ ? (@.a .= "b")'; > > ERROR: syntax error, unexpected invalid token at or near "=" of > > jsonpath input > > > > > > Yeah :-( > > This apparently goes back to the original jsonpath commit 72b6460336e. > There are similar error messages in the back branch regression tests: > > andrew@ub20:pgl $ grep -r IDENT_P pg_*/src/test/regress/expected/ > pg_12/src/test/regress/expected/jsonpath.out:ERROR: syntax error, unexpected > IDENT_P at end of jsonpath input > pg_13/src/test/regress/expected/jsonpath.out:ERROR: syntax error, unexpected > IDENT_P at end of jsonpath input > pg_14/src/test/regress/expected/jsonpath.out:ERROR: syntax error, unexpected > IDENT_P at end of jsonpath input > > For some reason the parser contains a '%error-verbose' directive, unlike > all our other bison parsers. Removing that fixes it, as in this patch. > I'm a bit inclined to say we should backpatch the removal of the > directive, > I guess it is okay to backpatch unless we think some user will be dependent on such a message or there could be other side effects of removing this. 
One thing that is not clear to me is why OP sees an acceptable message (ERROR: syntax error, unexpected invalid token at or near "=" of jsonpath input) for a similar query in 14? -- With Regards, Amit Kapila.
Re: PostgreSQL 15 beta 2 release announcement draft
On 6/29/22 2:12 AM, Erik Rijkers wrote: On 29-06-2022 at 02:04, Jonathan S. Katz wrote: Hi, 'not advise you to run PostgreSQL 15 Beta 1' should be 'not advise you to run PostgreSQL 15 Beta 2' Thanks; I adjusted the copy. Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: PostgreSQL 15 beta 2 release announcement draft
On 6/29/22 2:55 AM, Pantelis Theodosiou wrote: Upgrading to PostgreSQL 15 Beta 2 - To upgrade to PostgreSQL 15 Beta 2 from an earlier version of PostgreSQL, you will need to use a strategy similar to upgrading between major versions of PostgreSQL (e.g. `pg_upgrade` or `pg_dump` / `pg_restore`). For more information, please visit the documentation section on [upgrading](https://www.postgresql.org/docs/15/static/upgrading.html). Is the major version upgrade still needed if they are upgrading from 15 Beta 1? No, but it would be required if you are upgrading from a different version. The language attempts to be a "catch all" to account for the different cases. Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal
> Justin Pryzby writes:
> > I have nothing new to add, but wanted to point out this is still an issue.
> > This is on the v13 Opened Items list - for lack of anywhere else to put them, I
> > also added two other, unresolved issues.

Two minor things to add:
1. This issue is still reproducible on 15Beta2 (c1d033fcb5) - Backtrace here [2]
2. There was a mention that amcheck could throw up errors, but despite
quickly stopping the workload, I didn't find anything interesting [1].

(gdb) frame 3
#3  0x55bf9fe496c8 in comparetup_index_btree (a=0x7f0a1a50ba80, b=0x7f0a1a50b9d8, state=0x55bfa19ce960) at tuplesort.c:4454
4454            Assert(false);
(gdb) info locals
sortKey = 0x55bfa19cef10
tuple1 = 0x7f0a1a563ee0
tuple2 = 0x7f0a1a5642a0
keysz = 2
tupDes = 0x7f0a1a9e66a8
equal_hasnull = false
nkey = 3
compare = 0
datum1 = 2085305
datum2 = 2085305
isnull1 = false
isnull2 = false
__func__ = "comparetup_index_btree"
(gdb) p *tuple1
$5 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 205}, ip_posid = 3}, t_info = 16}
(gdb) p *tuple2
$9 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 205}, ip_posid = 3}, t_info = 16}
(gdb) p *sortKey
$7 = {ssup_cxt = 0x40, ssup_collation = 0, ssup_reverse = false, ssup_nulls_first = false,
  ssup_attno = 0, ssup_extra = 0x0, comparator = 0x7f7f7f7f7f7f7f7f, abbreviate = 127,
  abbrev_converter = 0x7f7f7f7f7f7f7f7f, abbrev_abort = 0x7f7f7f7f7f7f7f7f,
  abbrev_full_comparator = 0x7f7f7f7f7f7f7f7f}
(gdb) p *tupDes
$8 = {natts = 2, tdtypeid = 2249, tdtypmod = -1, tdrefcount = 1, constr = 0x0, attrs = 0x7f0a1a9e66c0}

Reference:
1)
postgres=# select bt_index_parent_check('pg_class_tblspc_relfilenode_index', true, true);
 bt_index_parent_check
---

(1 row)

postgres=# select bt_index_check('pg_class_tblspc_relfilenode_index', true);
 bt_index_check

(1 row)

2) Backtrace -
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7f0a25692859 in __GI_abort () at abort.c:79
#2  0x55bf9fdf1718 in ExceptionalCondition (conditionName=0x55bfa0036a0b "false",
    errorType=0x55bfa0035d5e "FailedAssertion", fileName=0x55bfa0035dbd "tuplesort.c", lineNumber=4454) at assert.c:69
#3  0x55bf9fe496c8 in comparetup_index_btree (a=0x7f0a1a50ba80, b=0x7f0a1a50b9d8, state=0x55bfa19ce960) at tuplesort.c:4454
#4  0x55bf9fe40901 in qsort_tuple (data=0x7f0a1a50b8e8, n=13, compare=0x55bf9fe484ab , arg=0x55bfa19ce960) at ../../../../src/include/lib/sort_template.h:341
#5  0x55bf9fe40b2f in qsort_tuple (data=0x7f0a1a50b3a8, n=61, compare=0x55bf9fe484ab , arg=0x55bfa19ce960) at ../../../../src/include/lib/sort_template.h:378
#6  0x55bf9fe40b9f in qsort_tuple (data=0x7f0a1a509e78, n=343, compare=0x55bf9fe484ab , arg=0x55bfa19ce960) at ../../../../src/include/lib/sort_template.h:392
#7  0x55bf9fe40b2f in qsort_tuple (data=0x7f0a1a509e78, n=833, compare=0x55bf9fe484ab , arg=0x55bfa19ce960) at ../../../../src/include/lib/sort_template.h:378
#8  0x55bf9fe40b9f in qsort_tuple (data=0x7f0a1a4d9050, n=2118, compare=0x55bf9fe484ab , arg=0x55bfa19ce960) at ../../../../src/include/lib/sort_template.h:392
#9  0x55bf9fe46df8 in tuplesort_sort_memtuples (state=0x55bfa19ce960) at tuplesort.c:3698
#10 0x55bf9fe44043 in tuplesort_performsort (state=0x55bfa19ce960) at tuplesort.c:2217
#11 0x55bf9f783a85 in _bt_leafbuild (btspool=0x55bfa1913318, btspool2=0x0) at nbtsort.c:559
#12 0x55bf9f7835a6 in btbuild (heap=0x7f0a1a9df940, index=0x7f0a1a9e2898, indexInfo=0x55bfa19bc740) at nbtsort.c:336
#13 0x55bf9f81c8cc in index_build (heapRelation=0x7f0a1a9df940, indexRelation=0x7f0a1a9e2898, indexInfo=0x55bfa19bc740, isreindex=true, parallel=true) at index.c:3018
#14 0x55bf9f81dbe6 in reindex_index (indexId=3455, skip_constraint_checks=false, persistence=112 'p', params=0x7ffcfa60a524) at index.c:3718
#15 0x55bf9f925148 in ReindexIndex (indexRelation=0x55bfa18f09a0, params=0x7ffcfa60a598, isTopLevel=true) at indexcmds.c:2727
#16 0x55bf9f924f67 in ExecReindex (pstate=0x55bfa1913070, stmt=0x55bfa18f09f8, isTopLevel=true) at indexcmds.c:2651
#17 0x55bf9fc3397f in standard_ProcessUtility (pstmt=0x55bfa18f0d48, queryString=0x55bfa18eff30 "REINDEX INDEX pg_class_tblspc_relfilenode_index;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at utility.c:960
#18 0x7f0a251d6887 in pgss_ProcessUtility (pstmt=0x55bfa18f0d48, queryString=0x55bfa18eff30 "REINDEX INDEX pg_class_tblspc_relfilenode_index;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at pg_stat_statements.c:1143
#19 0x55bf9fc32d34 in ProcessUtility (pstmt=0x55bfa18f0d48, queryString=0x55bfa18eff30 "REINDEX INDEX pg_class_tblspc_relfilenode_index;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at utility.c:526
#20
Re: making relfilenodes 56 bits
On Tue, 28 Jun 2022 at 19:18, Matthias van de Meent wrote:
> I will be the first to admit that it is quite unlikely to be common
> practise, but this workload increases the number of dbOid+spcOid
> combinations to 100s (even while using only a single tablespace),

Which should still fit nicely in 32 bits then. Why does that present a
problem for this idea? The reason to mention this now is that it would give
more space than the 56-bit limit being suggested here. I am not opposed to
the current patch, just finding ways to remove some objections mentioned by
others, if those became blockers.

> which in my opinion requires some more thought than just handwaving it
> into an smgr array and/or checkpoint records.

The idea is that we would store the mapping as an array, with the value in
the RelFileNode as the offset in the array. The array would be mostly
static, so would cache nicely.

For convenience, I imagine that the mapping could be included in WAL in or
near the checkpoint record, to ensure that the mapping was available in all
backups.

--
Simon Riggs
http://www.EnterpriseDB.com/
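For illustration, the mapping idea described above can be modeled in a few lines. This is a toy Python sketch, not PostgreSQL code, and all names in it are invented: each distinct (dbOid, spcOid) pair is stored once in a mostly-static array, and a relation identifier needs to carry only the small array offset, so even hundreds of combinations take very little space.

```python
# Toy model (not PostgreSQL code) of the mapping idea discussed above:
# keep each distinct (dbOid, spcOid) pair once in a mostly-static array,
# and let the per-relation identifier store only the array offset.

class LocatorMap:
    def __init__(self):
        self._pairs = []   # offset -> (db_oid, spc_oid)
        self._index = {}   # (db_oid, spc_oid) -> offset

    def offset_for(self, db_oid, spc_oid):
        """Return the existing offset for a pair, adding it if new."""
        key = (db_oid, spc_oid)
        if key not in self._index:
            self._index[key] = len(self._pairs)
            self._pairs.append(key)
        return self._index[key]

    def resolve(self, offset):
        """Map an offset back to its (db_oid, spc_oid) pair."""
        return self._pairs[offset]

m = LocatorMap()
a = m.offset_for(16384, 1663)   # first pair gets offset 0
b = m.offset_for(16385, 1663)   # second distinct pair gets offset 1
c = m.offset_for(16384, 1663)   # repeat lookup reuses offset 0
print(a, b, c, m.resolve(1))    # 0 1 0 (16385, 1663)
```

The array itself could then be made crash-safe by logging it in or near the checkpoint record, as suggested above.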
Re: Can we do something to help stop users mistakenly using force_parallel_mode?
On Wed, Jun 29, 2022 at 03:23:27PM +1200, David Rowley wrote: > Over on [1] I noticed that the user had set force_parallel_mode to > "on" in the hope that would trick the planner into making their query > run more quickly. Of course, that's not what they want since that GUC > is only there to inject some parallel nodes into the plan in order to > verify the tuple communication works. > > I get the idea that Robert might have copped some flak about this at > some point, given that he wrote the blog post at [2]. > > The user would have realised this if they'd read the documentation > about the GUC. However, I imagine they only went as far as finding a > GUC with a name which appears to be exactly what they need. I mean, > what else could force_parallel_mode possibly do? Note that it was already changed to be a developer GUC https://www.postgresql.org/message-id/20210404012546.GK6592%40telsasoft.com And I asked if that re-classification should be backpatched: > It's to their benefit and ours if they don't do that on v10-13 for the next 5 > years, not just v14-17. Since the user in this recent thread is running v13.7, I'm *guessing* that if that had been backpatched, they wouldn't have made this mistake. > I wonder if \dconfig *parallel* is going to make force_parallel_mode > even easier to find once PG15 is out. Maybe. Another consequence is that if someone *does* set f_p_m, it may be a bit easier and more likely for a local admin to discover it (before mailing the pgsql lists). -- Justin
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages
So I wrote some more test scenarios for this, and as I wrote in some other thread, I realized that there are more problems than just some NOTICE trouble. For instance, if you send a query, then read the result but not the terminating NULL then send another query, everything gets confused; the next thing you'll read is the result for the second query, without having read the NULL that terminates the results of the first query. Any application that expects the usual flow of results will be confused. Kyotaro's patch to add PIPELINE_IDLE fixes this bit too, as far as I can tell. However, another problem case, not fixed by PIPELINE_IDLE, occurs if you exit pipeline mode after PQsendQuery() and then immediately use PQexec(). The CloseComplete will be received at the wrong time, and a notice is emitted nevertheless. I spent a lot of time trying to understand how to fix this last bit, and the solution I came up with is that PQsendQuery() must add a second entry to the command queue after the PGQUERY_EXTENDED one, to match the CloseComplete message being expected; with that entry in the queue, PQgetResult will really only get to IDLE mode after the Close has been seen, which is what we want. I named it PGQUERY_CLOSE. Sadly, some hacks are needed to make this work fully: 1. the client is never expecting that PQgetResult() would return anything for the CloseComplete, so something needs to consume the CloseComplete silently (including the queue entry for it) when it is received; I chose to do this directly in pqParseInput3. I tried to make PQgetResult itself do it, but it became a pile of hacks until I was no longer sure what was going on. Putting it in fe-protocol3.c ends up a lot cleaner. However, we still need PQgetResult to invoke parseInput() at the point where Close is expected. 2. if an error occurs while executing the query, the CloseComplete will of course never arrive, so we need to erase it from the queue silently if we're returning an error. 
I toyed with the idea of having parseInput() produce a PGresult that encodes the Close message, and have PQgetResult consume and discard that, then read some further message to have something to return. But it seemed inefficient and equally ugly, and I'm not sure that the flow control is any simpler.

I think another possibility would be to make PQexitPipelineMode responsible for /something/, but I'm not sure what that would be. Checking the queue and seeing if the next message is CloseComplete, then eating that message before exiting pipeline mode? That seems way too magical. I didn't attempt this.

I attach a patch series that implements the proposed fix (again for REL_14_STABLE) in steps, to make it easy to read. I intend to squash the whole lot into a single commit before pushing. But while writing this email it occurred to me that I need to add at least one more test, to receive a WARNING while waiting for CloseComplete. AFAICT it should work, but better make sure.

I produced the pipeline_idle.trace file by running the test in the fully fixed tree, then used it to verify that all tests fail in different ways when run without the fixes. The first fix with PIPELINE_IDLE fixes some of these failures, and the PGQUERY_CLOSE one fixes the remaining one. Reading the trace file, it looks correct to me.

--
Álvaro Herrera
Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."
(Tom Lane)

>From 64fc6f56f88cf3d5e6c3eaada32887939ad3b49f Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Wed, 15 Jun 2022 19:42:23 +0200
Subject: [PATCH v7 1/6] Use Test::Differences if available

---
 src/test/modules/libpq_pipeline/README             |  3 +++
 .../modules/libpq_pipeline/t/001_libpq_pipeline.pl | 13 ++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/test/modules/libpq_pipeline/README b/src/test/modules/libpq_pipeline/README
index d8174dd579..59c6ea8109 100644
--- a/src/test/modules/libpq_pipeline/README
+++ b/src/test/modules/libpq_pipeline/README
@@ -1 +1,4 @@
 Test programs and libraries for libpq
+
+If you have Test::Differences installed, any differences in the trace files
+are displayed in a format that's easier to read than the standard format.
diff --git a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
index d8d496c995..49eec8a63a 100644
--- a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
+++ b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
@@ -9,6 +9,17 @@
 use PostgresNode;
 use TestLib;
 use Test::More;
+
+# Use Test::Differences if installed, and select unified diff output.
+# No decent way to select a context line count with this;
+# we could use a sub ref to allow that.
+BEGIN
+{
+	if (!eval q{ use Test::Differences; unified_diff(); 1 })
+	{
+		*eq_or_diff = \
+	}
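The command-queue mechanics described in this thread can be sketched as a toy state machine. This is a Python model for illustration only, not libpq code; the message names are heavily simplified, and the real CloseComplete/error handling lives in pqParseInput3 and PQgetResult. The key ideas it models: PQsendQuery() enqueues an extra PGQUERY_CLOSE entry after the PGQUERY_EXTENDED one, the CloseComplete message is consumed silently so the client never sees it, and on error the pending CLOSE entry is erased because CloseComplete will never arrive.

```python
# Toy model (not libpq code) of the PGQUERY_CLOSE queue entry idea:
# one queue entry per expected protocol exchange, with CloseComplete
# eaten silently and the CLOSE entry dropped on error.

from collections import deque

EXTENDED, CLOSE = "PGQUERY_EXTENDED", "PGQUERY_CLOSE"

class PipelineQueue:
    def __init__(self):
        self.queue = deque()

    def send_query(self):
        self.queue.append(EXTENDED)
        self.queue.append(CLOSE)      # matches the expected CloseComplete

    def on_message(self, msg):
        """Return something to hand to the client, or None if consumed silently."""
        if msg == "CloseComplete":
            assert self.queue[0] == CLOSE
            self.queue.popleft()       # eat it; the client never sees it
            return None
        if msg == "ErrorResponse":
            self.queue.popleft()       # the EXTENDED entry failed...
            if self.queue and self.queue[0] == CLOSE:
                self.queue.popleft()   # ...so CloseComplete will never arrive
            return "error"
        if msg == "CommandComplete":
            self.queue.popleft()
            return "result"
        return None

q = PipelineQueue()
q.send_query()
print(q.on_message("CommandComplete"))  # result
print(q.on_message("CloseComplete"))    # None: consumed silently
print(len(q.queue))                     # 0: queue drained, safe to go idle
```

Only once the queue is fully drained (CloseComplete seen) does the model consider the connection idle, which mirrors why PQgetResult should only reach IDLE after the Close has been seen.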
Re: doc: BRIN indexes and autosummarize
On Tue, Jun 28, 2022 at 05:22:34PM -0600, Roberto Mello wrote: > Here's a patch to clarify the BRIN indexes documentation, particularly with > regards to autosummarize, vacuum and autovacuum. It basically breaks down a > big blob of a paragraph into multiple paragraphs for clarity, plus explicitly > tells how summarization happens manually or automatically. See also this older thread https://www.postgresql.org/message-id/flat/20220224193520.gy9...@telsasoft.com -- Justin
Re: Hardening PostgreSQL via (optional) ban on local file system access
Ah, I see. My counterclaim was that there are lots of use cases where one
would want to be extra sure that *only* they, as the owner of the database,
can access the database as a superuser and not someone else. Even if there
is some obscure way for that "someone else" to either connect as a
superuser or to escalate privileges to superuser.

And what I propose would be a means to achieve that, at the expense of
extra steps when starting to act as a superuser.

In a nutshell, this would be equivalent to two-factor authentication for
acting as a superuser:
1) you must be able to log in as a user with the superuser attribute
2) you must present proof that you can access the underlying file system

Cheers,
Hannu Krosing

On Wed, Jun 29, 2022 at 12:48 PM Laurenz Albe wrote:
>
> On Wed, 2022-06-29 at 00:05 -0700, Andres Freund wrote:
> > On 2022-06-29 08:51:10 +0200, Laurenz Albe wrote:
> > > On Tue, 2022-06-28 at 16:27 -0700, Andres Freund wrote:
> > > > > Experience shows that 99% of the time one can run PostgreSQL just fine
> > > > > without a superuser
> > > >
> > > > IME that's not at all true. It might not be needed interactively, but that's
> > > > not all the same as not being needed at all.
> > >
> > > I also disagree with that. Not having a superuser is one of the pain
> > > points with using a hosted database: no untrusted procedural languages,
> > > no untrusted extensions (unless someone hacked up PostgreSQL or provided
> > > a workaround akin to a SECURITY DEFINER function), etc.
> >
> > I'm not sure what exactly you're disagreeing with? I'm not saying that
> > superuser isn't needed interactively in general, just that there are
> > reasonably common scenarios in which that's the case.
>
> I was unclear, sorry. I agreed with you that you can't do without superuser
> and disagreed with the claim that 99% of the time nobody needs superuser
> access.
>
> Yours,
> Laurenz Albe
Re: Implementing Incremental View Maintenance
> On Apr 22, 2022, at 1:58 PM, Yugo NAGATA wrote:
>
> On Fri, 22 Apr 2022 11:29:39 +0900
> Yugo NAGATA wrote:
>
>> Hi,
>>
>> On Fri, 1 Apr 2022 11:09:16 -0400
>> Greg Stark wrote:
>>
>>> This patch has bitrotted due to some other patch affecting trigger.c.
>>>
>>> Could you post a rebase?
>>>
>>> This is the last week of the CF before feature freeze so time is of the
>>> essence.
>>
>> I attached a rebased patch-set.
>>
>> Also, I made the following changes from the previous version.
>>
>> 1. Fix to not use a new deptype
>>
>> In the previous patch, we introduced a new deptype 'm' into pg_depend.
>> This deptype was used for looking for IVM triggers to be removed at
>> REFRESH WITH NO DATA. However, we decided to not use it, to reduce
>> unnecessary changes in the core code. Currently, the trigger name and
>> dependent objclass are used at that time instead of it.
>>
>> As a result, the number of patches is reduced to nine from ten.
>>
>> 2. Bump the version numbers in psql and pg_dump
>>
>> This feature's target is PG 16 now.
>
> Sorry, I reverted this change. It was too early to bump up the
> version number.
>
> --
> Yugo NAGATA

Hi, Nagata-san

I read the v27 version of your patch and have some new comments that I want
to discuss with you.

1. How about using DEPENDENCY_INTERNAL instead of DEPENDENCY_AUTO when
recording the dependency on triggers created by IVM? (The related code is
at the end of CreateIvmTrigger.) Otherwise, a user can use SQL to drop a
trigger and corrupt the IVM; DEPENDENCY_INTERNAL is also semantically more
correct.

A crash case looks like:
create table t (a int);
create incremental materialized view s as select * from t;
drop trigger "IVM_trigger_";
insert into t values (1);

2. In get_matching_condition_string, considering NULL values, we can not
use the simple = operator. But how about 'record = record'? record_eq
treats NULL = NULL as equal, and it should be faster than the current
implementation since there is only one comparison. Below is my simple
implementation of this; variables are named arbitrarily.
I tested some cases and it's OK.

static char *
get_matching_condition_string(List *keys)
{
	StringInfoData match_cond;
	ListCell   *lc;

	/* If there is no key columns, the condition is always true. */
	if (keys == NIL)
		return "true";
	else
	{
		StringInfoData s1;
		StringInfoData s2;

		initStringInfo(&match_cond);
		initStringInfo(&s1);
		initStringInfo(&s2);

		/* Considering NULL values, we can not use simple = operator. */
		appendStringInfo(&s1, "ROW(");
		appendStringInfo(&s2, "ROW(");

		foreach (lc, keys)
		{
			Form_pg_attribute attr = (Form_pg_attribute) lfirst(lc);
			char	   *resname = NameStr(attr->attname);
			char	   *mv_resname = quote_qualified_identifier("mv", resname);
			char	   *diff_resname = quote_qualified_identifier("diff", resname);

			appendStringInfo(&s1, "%s", mv_resname);
			appendStringInfo(&s2, "%s", diff_resname);

			if (lnext(lc))
			{
				appendStringInfo(&s1, ", ");
				appendStringInfo(&s2, ", ");
			}
		}
		appendStringInfo(&s1, ")::record");
		appendStringInfo(&s2, ")::record");

		appendStringInfo(&match_cond, "%s operator(pg_catalog.=) %s",
						 s1.data, s2.data);

		return match_cond.data;
	}
}

3. Considering TRUNCATE on base tables: the IVM will not refresh; maybe
raising an error would be better.

4. In IVM_immediate_before, I know that locking the base table with
ExclusiveLock is there to handle concurrent updates to the IVM correctly,
but how about locking it only when we actually need to maintain the MV,
i.e. in IVM_immediate_maintenance? In this way you don't have to lock
multiple times.

5. Why do we need CreateIndexOnIMMV? Is it an optimization? It seems like
when maintaining the MV, the index may not be used, because our match
conditions can't use the simple = operator.

Looking forward to your early reply to answer my above doubts, thank you a lot!

Regards,
Yajun Hu
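To see why the record comparison suggested above helps: in SQL, `NULL = NULL` is not true, while row-wise record comparison treats NULLs as equal, so a single `ROW(...)::record = ROW(...)::record` test covers all key columns at once. Here is a Python sketch of the same condition-string construction, for illustration only; it skips the identifier quoting that quote_qualified_identifier performs in the C version.

```python
# Illustrative Python sketch (not the C implementation above) of building
# the NULL-safe matching condition: wrap the key columns of both sides in
# ROW(...)::record and compare once with pg_catalog.= .

def get_matching_condition_string(keys):
    # With no key columns the condition is always true, as in the C code.
    if not keys:
        return "true"
    s1 = "ROW(" + ", ".join(f"mv.{k}" for k in keys) + ")::record"
    s2 = "ROW(" + ", ".join(f"diff.{k}" for k in keys) + ")::record"
    return f"{s1} operator(pg_catalog.=) {s2}"

print(get_matching_condition_string(["a", "b"]))
# ROW(mv.a, mv.b)::record operator(pg_catalog.=) ROW(diff.a, diff.b)::record
```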
Re: better error description on logical replication
On Wed, Jun 29, 2022 at 4:30 PM Marcos Pegoraro wrote:
>
> I'm using 14.4.
>

This additional information will be available in 15, as it was committed as
part of commit abc0910e.

--
With Regards,
Amit Kapila.
Re: better error description on logical replication
I'm using 14.4. These are some lines with that error, and the context field
is empty. And yes, if the context had the info you showed, that would be fine.

2022-06-28 08:18:23.600 -03,,,20915,,62b9c77b.51b3,1328,,2022-06-27 12:06:35 -03,4/690182,433844252,ERROR,23505,"duplicate key value violates unique constraint ""pkcustomer""","Key (customer_id)=(530540) already exists.""","logical replication worker",,5539589780750922391
2022-06-28 08:18:23.609 -03,,,20377,,62bae37f.4f99,1,,2022-06-28 08:18:23 -03,4/690184,0,LOG,0,"logical replication apply worker for subscription ""sub_replica_5"" has started","","logical replication worker",,0
2022-06-28 08:18:23.929 -03,,,2009,,62b35392.7d9,88468,,2022-06-22 14:38:26 -03,,0,LOG,0,"background worker ""logical replication worker"" (PID 20915) exited with exit code 1","","postmaster",,0
2022-06-28 08:18:24.151 -03,,,20377,,62bae37f.4f99,2,,2022-06-28 08:18:23 -03,4/690187,433844253,ERROR,23505,"duplicate key value violates unique constraint ""pkcustomer""","Key (customer_id)=(530540) already exists.""","logical replication worker",,6675519194010520265
2022-06-28 08:18:24.160 -03,,,2009,,62b35392.7d9,88469,,2022-06-22 14:38:26 -03,,0,LOG,0,"background worker ""logical replication worker"" (PID 20377) exited with exit code 1","","postmaster",,0
2

thanks
Marcos

On Tue, Jun 28, 2022 at 11:52 PM Amit Kapila wrote:
> On Tue, Jun 28, 2022 at 5:50 PM Marcos Pegoraro wrote:
> >
> > I don't know how to create a patch, maybe someday, but for now I'm just
> > sending this little problem in case somebody can solve it.
> >
> > In a multi-schema environment where several tables have the same
> > structure, it is a little bit hard to know which one already has that
> > primary key.
> >
> > In the log on the replica server I now see:
> > Message: duplicate key value violates unique constraint "pkcustomer"
> > Detail: Key (customer_id)=(530540) already exists.
> >
> > So, I know which table it is, but I don't know which schema it belongs to.
> >
> On which version, have you tried this?
> In HEAD, I am getting the below information:
>
> ERROR:  duplicate key value violates unique constraint "idx_t1"
> DETAIL:  Key (c1)=(1) already exists.
> CONTEXT:  processing remote data for replication origin "pg_16388"
> during "INSERT" for replication target relation "public.t1" in
> transaction 739 finished at 0/150D640
>
> You can see that CONTEXT has schema information. Will that serve your
> purpose?
>
> --
> With Regards,
> Amit Kapila.
Re: OpenSSL 3.0.0 compatibility
> On 29 Jun 2022, at 11:44, Jelte Fennema wrote:
>
>> See upthread in ef5c7896-20cb-843f-e91e-0ee5f7fd9...@enterprisedb.com
>
> I saw that section, but I thought that only applied before you
> backpatched the actual fixes to PG13 and below. I mean there's no
> reason anymore not to compile those older versions with OpenSSL 3.0,
> right? If so, it seems confusing for the build to spit out warnings
> that indicate the contrary.

The project isn't automatically fixing compiler warnings or library
deprecation warnings in back-branches. I guess one could make the argument
for this case given how widespread OpenSSL 3.0 is, but it comes with a
significant testing effort to ensure that all back-branches behave
correctly with all versions of OpenSSL, so it's not for free (it should be,
but with OpenSSL I would personally not trust that). Also, PG12 and below
had 0.9.8 as the minimum version.

--
Daniel Gustafsson
https://vmware.com/
Re: Hardening PostgreSQL via (optional) ban on local file system access
On Wed, 2022-06-29 at 00:05 -0700, Andres Freund wrote: > On 2022-06-29 08:51:10 +0200, Laurenz Albe wrote: > > On Tue, 2022-06-28 at 16:27 -0700, Andres Freund wrote: > > > > Experience shows that 99% of the time one can run PostgreSQL just fine > > > > without a superuser > > > > > > IME that's not at all true. It might not be needed interactively, but > > > that's > > > not all the same as not being needed at all. > > > > I also disagree with that. Not having a superuser is one of the pain > > points with using a hosted database: no untrusted procedural languages, > > no untrusted extensions (unless someone hacked up PostgreSQL or provided > > a workaround akin to a SECURITY DEFINER function), etc. > > I'm not sure what exactly you're disagreeing with? I'm not saying that > superuser isn't needed interactively in general, just that there are > reasonably common scenarios in which that's the case. I was unclear, sorry. I agreed with you that you can't do without superuser and disagreed with the claim that 99% of the time nobody needs superuser access. Yours, Laurenz Albe
Re: Support logical replication of DDLs
On Wed, Jun 29, 2022 at 3:24 PM houzj.f...@fujitsu.com wrote:
>
> On Wednesday, June 29, 2022 11:07 AM Amit Kapila wrote:
> >
> > On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila wrote:
> > >
> > > 5.
> > > +static ObjTree *
> > > +deparse_CreateStmt(Oid objectId, Node *parsetree)
> > > {
> > > ...
> > > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0);
> > > + if (node->tablespacename)
> > > +     append_string_object(tmp, "tablespace", node->tablespacename);
> > > + else
> > > + {
> > > +     append_null_object(tmp, "tablespace");
> > > +     append_bool_object(tmp, "present", false);
> > > + }
> > > + append_object_object(createStmt, "tablespace", tmp);
> > > ...
> > > }
> > >
> > > Why do we need to append the objects (tablespace, with clause, etc.)
> > > when they are not present in the actual CREATE TABLE statement? The
> > > reason to ask this is that this makes the string that we want to send
> > > downstream much longer than the actual statement given by the user on
> > > the publisher.
> >
> > After thinking some more on this, it seems the user may want to
> > optionally change some of these attributes; for example, on the
> > subscriber, it may want to associate the table with a different
> > tablespace. I think to address that we can append these additional
> > attributes optionally, say via an additional parameter
> > (append_all_options/append_all_attributes or something like that) in
> > exposed APIs like deparse_utility_command().
>
> I agree and will research this part.

Okay, note that similar handling would be required at other places like
deparse_ColumnDef.

A few other comments on v9-0001-Functions-to-deparse-DDL-commands:

1.
+static ObjElem *new_bool_object(bool value);
+static ObjElem *new_string_object(char *value);
+static ObjElem *new_object_object(ObjTree *value);
+static ObjElem *new_array_object(List *array);
+static ObjElem *new_integer_object(int64 value);
+static ObjElem *new_float_object(float8 value);

Here, new_object_object() seems to be declared out-of-order (not in sync
with its order of definition). Similarly, see all the other append_*
functions.

2. The function printTypmod in ddl_deparse.c appears to be the same as the
function with the same name in format_type.c. If so, isn't it better to
have a single copy of that function?

3. As I pointed out yesterday, there is some similarity between
format_type_extended and format_type_detailed. Can we try to extract the
common functionality? If this is feasible, it is better to do this as a
separate patch. Also, this can obviate the need to expose printTypmod from
format_type.c. I am not really sure if this will be better than the current
one or not, but it seems worth trying.

4.
new_objtree_VA()
{
...
 switch (type)
+ {
+ case ObjTypeBool:
+ elem = new_bool_object(va_arg(args, int));
+ break;
+ case ObjTypeString:
+ elem = new_string_object(va_arg(args, char *));
+ break;
+ case ObjTypeObject:
+ elem = new_object_object(va_arg(args, ObjTree *));
+ break;
+ case ObjTypeArray:
+ elem = new_array_object(va_arg(args, List *));
+ break;
+ case ObjTypeInteger:
+ elem = new_integer_object(va_arg(args, int64));
+ break;
+ case ObjTypeFloat:
+ elem = new_float_object(va_arg(args, double));
+ break;
+ case ObjTypeNull:
+ /* Null params don't have a value (obviously) */
+ elem = new_null_object();
...

I feel the ObjTypes here should be handled in the same order as they are
defined in ObjType.

5. There appears to be common code among the new_*_object() functions. Can
we just have one function instead, new_object(ObjType, void *val)? In the
function, based on the type, we can typecast the value. Is there a reason
why that won't be better than the current one?

6.
deparse_ColumnDef()
{
...
 /* Composite types use a slightly simpler format string */
+ if (composite)
+ column = new_objtree_VA("%{name}I %{coltype}T %{collation}s",
+ 3,
+ "type", ObjTypeString, "column",
+ "name", ObjTypeString, coldef->colname,
+ "coltype", ObjTypeObject,
+ new_objtree_for_type(typid, typmod));
+ else
+ column = new_objtree_VA("%{name}I %{coltype}T %{default}s %{not_null}s %{collation}s",
+ 3,
+ "type", ObjTypeString, "column",
+ "name", ObjTypeString, coldef->colname,
+ "coltype", ObjTypeObject,
+ new_objtree_for_type(typid, typmod));
...
}

To avoid repeating the same arguments, we can define fmtstr for composite
and non-composite types, as the patch is doing in deparse_CreateStmt().

7. It is not clear from the comments or otherwise what should be considered
the default format string in functions like deparse_ColumnDef() or
deparse_CreateStmt().

8.
+ * FIXME --- actually, what about default values?
+ */
+static ObjTree *
+deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef *coldef)

I think we need to handle default values for this FIXME.

--
With Regards,
Amit Kapila.
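To make the format strings being reviewed in this thread easier to follow: the deparse code builds object trees whose format strings, such as "%{name}I %{coltype}T", are later expanded by substituting each named element. Here is a toy Python expander for illustration only; it is not the actual ddl_deparse expansion code, and it implements only identifier quoting for the conversion letters (the real code distinguishes I = identifier, D = qualified name, T = type, s = string, and more).

```python
import re

# Toy expander (not the actual ddl_deparse code) for format strings like
# "%{name}I %{coltype}T": each %{elem}X placeholder is replaced by the
# value stored under "elem" in the object tree.  Only the I (identifier)
# conversion is given real behavior here: double-quote and escape.

def expand(fmt, objtree):
    def repl(m):
        name, conv = m.group(1), m.group(2)
        value = objtree.get(name, "")
        if conv == "I":                     # quote as an SQL identifier
            return '"' + str(value).replace('"', '""') + '"'
        return str(value)
    # %{elem}X, optionally with a separator spec such as %{table_elements:, }s
    return re.sub(r"%\{(\w+)(?::[^}]*)?\}(\w)", repl, fmt)

col = {"name": "id", "coltype": "pg_catalog.int4", "collation": ""}
print(expand("%{name}I %{coltype}T %{collation}s", col).strip())
# "id" pg_catalog.int4
```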
Re: margay fails assertion in stats/dsa/dsm code
On Wed, Jun 29, 2022 at 4:00 PM Thomas Munro wrote:
> I suppose this could indicate that the machine and/or RAM disk is
> overloaded/swapping and one of those open() or unlink() calls is
> taking a really long time, and that could be fixed with some system
> tuning.

Hmm, I take that bit back. Every backend that starts up is trying to attach
to the same segment, the one with the new pgstats stuff in it (once the
small space in the main shmem segment is used up and we create a DSM
segment). There's no fairness/queue, random back-off or guarantee of
progress in that librt lock code, so you can get into lock-step with other
backends retrying, and although some waiter always gets to make progress,
any given backend can lose every round and run out of retries. Even when
you're lucky and don't fail with an undocumented incomprehensible error,
it's very slow, and I'd consider filing a bug report about that.

A work-around on PostgreSQL would be to set dynamic_shared_memory_type to
mmap (= we just open our own files and map them directly), and making
pg_dynshmem a symlink to something under /tmp (or some other RAM disk) to
avoid touching regular disk file systems.
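For what it's worth, the missing "random back-off" mentioned above is a standard technique for avoiding exactly this lock-step retry pattern. A generic sketch follows (Python, purely illustrative, not PostgreSQL or librt code): each failed attempt doubles the backoff ceiling, and the actual delay is drawn uniformly below the ceiling so that colliding processes spread out instead of retrying in unison.

```python
import random

# Generic randomized-exponential-backoff sketch (not PostgreSQL code).
# Each failed attempt doubles the ceiling, capped at cap_ms, and the
# actual sleep is drawn uniformly below the ceiling so colliding
# processes desynchronize instead of retrying in lock-step.

def backoff_delays(attempts, base_ms=1, cap_ms=1000, rng=random.Random(42)):
    delays = []
    ceiling = base_ms
    for _ in range(attempts):
        delays.append(rng.uniform(0, ceiling))  # "full jitter"
        ceiling = min(ceiling * 2, cap_ms)
    return delays

d = backoff_delays(8)
print(len(d), all(x <= 1000 for x in d))  # 8 True
```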
RE: Support logical replication of DDLs
On Tuesday, June 28, 2022 11:27 AM Amit Kapila wrote:
> On Sun, Jun 26, 2022 at 11:47 PM Alvaro Herrera wrote:
> >
> > On 2022-Jun-22, vignesh C wrote:
> >
> > > 1) Creation of temporary table fails infinitely in the subscriber.
> > > CREATE TEMPORARY TABLE temp1 (a int primary key);
> > >
> > > The above statement is converted to the below format:
> > > CREATE TEMPORARY TABLE pg_temp.temp1 (a pg_catalog.int4 ,
> > > CONSTRAINT temp1_pkey PRIMARY KEY (a));
> > > While handling the creation of temporary table in the worker, the
> > > worker fails continuously with the following error:
> > > 2022-06-22 14:24:01.317 IST [240872] ERROR: schema "pg_temp" does
> > > not exist
> >
> > Perhaps one possible fix is to change the JSON format string used in
> > deparse_CreateStmt. Currently, the following is used:
> >
> > + if (node->ofTypename)
> > + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D "
> > + "OF %{of_type}T %{table_elements}s "
> > + "%{with_clause}s %{on_commit}s %{tablespace}s";
> > + else
> > + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D "
> > + "(%{table_elements:, }s) %{inherits}s "
> > + "%{with_clause}s %{on_commit}s %{tablespace}s";
> > +
> > + createStmt =
> > + new_objtree_VA(fmtstr, 1,
> > + "persistence", ObjTypeString,
> > + get_persistence_str(relation->rd_rel->relpersistence));
> >
> > (Note that the word for the "persistence" element here comes straight
> > from relation->rd_rel->relpersistence.) Maybe it would be more
> > appropriate to set the schema to empty when the table is temp, since
> > the temporary-ness is in the %{persistence} element, and thus there is
> > no need to schema-qualify the table name.
> >
> > However, that would still replicate a command that involves a
> > temporary table, which perhaps should not be considered fit for
> > replication.
> > So another school of thought is that if the
> > %{persistence} is set to TEMPORARY, then it would be better to skip
> > replicating the command altogether.
>
> +1. I think it doesn't make sense to replicate temporary tables.
> Similarly, we don't need to replicate unlogged tables.

I agree that we don't need to replicate temporary tables. For unlogged
tables, one thing I noticed is that we always replicate the DDL action on
unlogged tables in streaming replication. So, to be consistent, maybe we
need to generate WAL for DDL on unlogged tables as well?

Best regards,
Hou zj
Re: OpenSSL 3.0.0 compatibility
> See upthread in ef5c7896-20cb-843f-e91e-0ee5f7fd9...@enterprisedb.com I saw that section, but I thought that only applied before you backpatched the actual fixes to PG13 and below. I mean there's no reason anymore not to compile those older versions with OpenSSL 3.0, right? If so, it seems confusing for the build to spit out warnings that indicate the contrary.
Re: OpenSSL 3.0.0 compatibility
> On 29 Jun 2022, at 11:02, Jelte Fennema wrote:
>
> On Wed, 29 Jun 2022 at 10:55, Daniel Gustafsson wrote:
>> These have now been pushed to 14 through to 10 ahead of next week's releases
>
> I upgraded my OS to Ubuntu 22.04 and it seems that the "Define
> OPENSSL_API_COMPAT" commit was never backported
> (4d3db13621be64fbac2faf7c01c4879d20885c1b). I now get various
> deprecation warnings when compiling PG13 on Ubuntu 22.04, because of
> OpenSSL 3.0. Was this simply forgotten, or is there a reason why it
> wasn't backported?

See upthread in ef5c7896-20cb-843f-e91e-0ee5f7fd9...@enterprisedb.com;
below is the relevant portion:

>> 13 and older will, when compiled against OpenSSL 3.0.0, produce a fair amount
>> of compiler warnings on usage of deprecated functionality, but there isn't
>> really anything we can do, as suppressing that is beyond the scope of a
>> backpatchable fix IMHO.
>
> Right, that's just a matter of adjusting the compiler warnings.
>
> Earlier in this thread, I had suggested backpatching the OPENSSL_API_COMPAT
> definition to PG13, but now I'm thinking I wouldn't bother, since that still
> wouldn't help with anything older.

--
Daniel Gustafsson
https://vmware.com/
Re: Fix proposal for comparaison bugs in PostgreSQL::Version
On Tue, 28 Jun 2022 18:17:40 -0400
Andrew Dunstan wrote:

> On 2022-06-28 Tu 16:53, Jehan-Guillaume de Rorthais wrote:
> > ...
> > A better fix would be to store the version internally as a version_num,
> > which is trivial to compute and compare. Please find in attachment an
> > implementation of this.
> >
> > The patch is a bit bigger because it improves the devel version handling
> > to support rc/beta/alpha comparison, like 14rc2 > 14rc1.
> >
> > Moreover, it adds a bunch of TAP tests to check various use cases.
>
> Nice catch, but this looks like massive overkill. I think we can very
> simply fix the test in just a few lines of code, instead of a 190 line
> fix and a 130 line TAP test.

I explained why the patch was a little bit larger than required: it fixes
the bugs and does a little bit more. The _version_cmp sub is shorter and
easier to understand, I use multi-line code where I could probably fold it
into a one-liner, added some comments... Anyway, I don't feel the number of
lines changed is "massive". But I can probably remove some code and shrink
some other parts if it is really important...

Moreover, to be honest, I don't mind the number of additional lines of TAP
tests. Especially since they run really, really fast and don't hurt
day-to-day devs, as they are independent from the other TAP tests anyway.
It could be 1k lines: if it runs fast, is meaningful and helps avoid future
regressions, I would welcome the addition.

If we really want to save some bytes, I have a fix worth two lines of code
that looks more readable to me than fixing _version_cmp:

+++ b/src/test/perl/PostgreSQL/Version.pm
@@ -92,9 +92,13 @@ sub new

 	# Split into an array
 	my @numbers = split(/\./, $arg);

+	# make sure all digits of the array-represented version are set so we can
+	# keep the _version_cmp code as a "simple" digit-to-digit comparison loop
+	$numbers[$_] += 0 for 0..3;
+
 	# Treat development versions as having a minor/micro version one less than
 	# the first released version of that branch.
- push @numbers, -1 if ($devel); + $numbers[3] = -1 if $devel; $devel ||= ""; But again, in my humble opinion, the internal version array representation is more a burden we should replace by the version_num... > It was never intended to be able to compare markers like rc1 vs rc2, and > I don't see any need for it. If you can show me a sane use case I'll > have another look, but right now it seems quite unnecessary. I don't have a practical use case right now, but I thought the module would be more complete with these little few more line of codes. Now, keep in mind these TAP modules might help external projects, not just core. In fact, I wonder what was your original use case to support devel/alpha/beta/rc versions, especially since it was actually not working? Should we just get rid of this altogether and wait for an actual use case? Cheers,
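The version_num idea discussed above is easy to see in miniature. The following is an illustrative sketch (in Python, purely for discussion — the actual module is Perl's PostgreSQL::Version): map a version string such as "14rc2" or "9.6.3" onto a fixed-length tuple, after which comparison is plain tuple comparison and 14rc2 > 14rc1 falls out naturally. The suffix rank values and the helper name are assumptions of this sketch, not taken from the patch.

```python
import re

# Rank pre-release markers below the corresponding final release.
# These particular rank values are illustrative assumptions.
PRERELEASE_RANK = {"devel": -4, "alpha": -3, "beta": -2, "rc": -1}

def version_key(v):
    """Turn a PostgreSQL version string into a comparable tuple."""
    m = re.match(r"^(\d+(?:\.\d+)*)(?:(devel)|(alpha|beta|rc)(\d+))?$", v)
    if not m:
        raise ValueError(f"unparseable version: {v}")
    nums = [int(x) for x in m.group(1).split(".")]
    nums += [0] * (4 - len(nums))      # pad so every key has four slots
    sub = 0
    if m.group(2):                     # "devel"
        nums[3] = PRERELEASE_RANK["devel"]
    elif m.group(3):                   # "alpha"/"beta"/"rc" plus a number
        nums[3] = PRERELEASE_RANK[m.group(3)]
        sub = int(m.group(4))
    return tuple(nums) + (sub,)
```

With such a key, version_key("14rc2") > version_key("14rc1") and version_key("14rc1") < version_key("14"), which is exactly the ordering the TAP tests above want to pin down.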
Re: Lazy JIT IR code generation to increase JIT speed with partitions
Hi Alvaro,

That's a very interesting case and might indeed be fixed, or at least
improved, by this patch.

I tried to reproduce this, but at least when running a simple, serial
query with increasing numbers of functions, the time spent per function is
linear or even slightly sub-linear (same as Tom observed in [1]).

I also couldn't reproduce the JIT runtimes you shared when running the
attached catalog query. The catalog query ran serially for me with the
following JIT stats:

JIT:
  Functions: 169
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 12.223 ms, Inlining 17.323 ms, Optimization 388.491 ms, Emission 283.464 ms, Total 701.501 ms

Is it possible that the query ran in parallel for you? For parallel
queries, every worker JITs all of the functions it uses. Even though the
workers might JIT the functions in parallel, the time reported in the
EXPLAIN ANALYZE output is the sum of the time spent by all workers.

With this patch applied, the JIT time drops significantly, as many of the
generated functions remain unused:

JIT:
  Modules: 15
  Functions: 26
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.931 ms, Inlining 0.722 ms, Optimization 67.195 ms, Emission 70.347 ms, Total 140.195 ms

Of course, this does not prove that the nonlinearity you observed went
away. Could you share how you ran the query, so that I can reproduce your
numbers on master and then compare them with the patched version? Also,
which LLVM version did you run with? I'm currently running with LLVM 13.

Thanks!

--
David Geier
(ServiceNow)

On Mon, Jun 27, 2022 at 5:37 PM Alvaro Herrera wrote:
> On 2021-Jan-18, Luc Vlaming wrote:
>
> > I would like this topic to somehow progress and was wondering what other
> > benchmarks / tests would be needed to have some progress? I've so far
> > provided benchmarks for small(ish) queries and some tpch numbers,
> > assuming those would be enough.
>
> Hi, some time ago I reported a case[1] where our JIT implementation does
> a very poor job and perhaps the changes that you're making could explain
> what is going on, and maybe even fix it:
>
> [1] https://postgr.es/m/20241706.wqq7xoyigwa2@alvherre.pgsql
>
> The query for which I investigated the problem involved some pg_logical
> metadata tables, so I didn't post it anywhere public; but the blog post
> I found later contains a link to a query that shows the same symptoms,
> and which is luckily still available online:
> https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580
> I attach it here in case it goes missing sometime.
>
> --
> Álvaro Herrera 48°01'N 7°57'E
> https://www.EnterpriseDB.com/
Re: OpenSSL 3.0.0 compatibility
On Wed, 29 Jun 2022 at 10:55, Daniel Gustafsson wrote:

> These have now been pushed to 14 through to 10 ahead of next week's
> releases

I upgraded my OS to Ubuntu 22.04 and it seems that the "Define
OPENSSL_API_COMPAT" commit was never backported
(4d3db13621be64fbac2faf7c01c4879d20885c1b). I now get various deprecation
warnings when compiling PG13 on Ubuntu 22.04 because of OpenSSL 3.0. Was
this simply forgotten, or is there a reason why it wasn't backported?
Re: Prevent writes on large objects in read-only transactions
Hello Michael-san,

Thank you for reviewing the patch. I attached the updated patch.

On Thu, 16 Jun 2022 17:31:22 +0900 Michael Paquier wrote:

> Looking at the docs of large objects, as of "Client Interfaces", we
> mention that any action must take place in a transaction block.
> Shouldn't we add a note that no write operations are allowed in a
> read-only transaction?

I added a description about read-only transactions to the docs.

> + if (mode & INV_WRITE)
> + PreventCommandIfReadOnly("lo_open() in write mode");
> Nit. This breaks translation. I think that it could be switched to
> "lo_open(INV_WRITE)" instead as the flag name is documented.

Changed it as you suggested.

> The patch is forgetting a refresh for largeobject_1.out.

I added changes for largeobject_1.out.

> --- INV_READ = 0x2
> --- INV_WRITE = 0x4
> +-- INV_READ = 0x4
> +-- INV_WRITE = 0x2
> Good catch! This one is kind of independent, so I have fixed it
> separately, in all the expected output files.

Thanks!

--
Yugo NAGATA

diff --git a/doc/src/sgml/lobj.sgml b/doc/src/sgml/lobj.sgml
index b767ba1d05..2dbc95c4ad 100644
--- a/doc/src/sgml/lobj.sgml
+++ b/doc/src/sgml/lobj.sgml
@@ -114,7 +114,8 @@
 All large object manipulation using these functions must
 take place within an SQL transaction block, since large object
 file descriptors are only valid for the duration of
-a transaction.
+a transaction. Write operations, including lo_open
+with INV_WRITE mode, are not allowed in a read-only transaction.
diff --git a/src/backend/libpq/be-fsstubs.c b/src/backend/libpq/be-fsstubs.c
index 5804532881..043e47b93f 100644
--- a/src/backend/libpq/be-fsstubs.c
+++ b/src/backend/libpq/be-fsstubs.c
@@ -93,6 +93,9 @@ be_lo_open(PG_FUNCTION_ARGS)
 	elog(DEBUG4, "lo_open(%u,%d)", lobjId, mode);
 #endif
 
+	if (mode & INV_WRITE)
+		PreventCommandIfReadOnly("lo_open(INV_WRITE)");
+
 	/*
 	 * Allocate a large object descriptor first.  This will also create
 	 * 'fscxt' if this is the first LO opened in this transaction.
@@ -179,6 +182,8 @@ lo_write(int fd, const char *buf, int len)
 	int			status;
 	LargeObjectDesc *lobj;
 
+	PreventCommandIfReadOnly("lo_write");
+
 	if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
@@ -245,6 +250,8 @@ be_lo_creat(PG_FUNCTION_ARGS)
 {
 	Oid			lobjId;
 
+	PreventCommandIfReadOnly("lo_creat()");
+
 	lo_cleanup_needed = true;
 	lobjId = inv_create(InvalidOid);
@@ -256,6 +263,8 @@ be_lo_create(PG_FUNCTION_ARGS)
 {
 	Oid			lobjId = PG_GETARG_OID(0);
 
+	PreventCommandIfReadOnly("lo_create()");
+
 	lo_cleanup_needed = true;
 	lobjId = inv_create(lobjId);
@@ -306,6 +315,8 @@ be_lo_unlink(PG_FUNCTION_ARGS)
 {
 	Oid			lobjId = PG_GETARG_OID(0);
 
+	PreventCommandIfReadOnly("lo_unlink()");
+
 	/*
 	 * Must be owner of the large object.  It would be cleaner to check this
 	 * in inv_drop(), but we want to throw the error before not after closing
@@ -368,6 +379,8 @@ be_lowrite(PG_FUNCTION_ARGS)
 	int			bytestowrite;
 	int			totalwritten;
 
+	PreventCommandIfReadOnly("lowrite()");
+
 	bytestowrite = VARSIZE_ANY_EXHDR(wbuf);
 	totalwritten = lo_write(fd, VARDATA_ANY(wbuf), bytestowrite);
 	PG_RETURN_INT32(totalwritten);
@@ -413,6 +426,8 @@ lo_import_internal(text *filename, Oid lobjOid)
 	LargeObjectDesc *lobj;
 	Oid			oid;
 
+	PreventCommandIfReadOnly("lo_import()");
+
 	/*
 	 * open the file to be read in
 	 */
@@ -561,6 +576,8 @@ be_lo_truncate(PG_FUNCTION_ARGS)
 	int32		fd = PG_GETARG_INT32(0);
 	int32		len = PG_GETARG_INT32(1);
 
+	PreventCommandIfReadOnly("lo_truncate()");
+
 	lo_truncate_internal(fd, len);
 	PG_RETURN_INT32(0);
 }
@@ -571,6 +588,8 @@ be_lo_truncate64(PG_FUNCTION_ARGS)
 	int32		fd = PG_GETARG_INT32(0);
 	int64		len = PG_GETARG_INT64(1);
 
+	PreventCommandIfReadOnly("lo_truncate64()");
+
 	lo_truncate_internal(fd, len);
 	PG_RETURN_INT32(0);
 }
@@ -815,6 +834,8 @@ be_lo_from_bytea(PG_FUNCTION_ARGS)
 	LargeObjectDesc *loDesc;
 	int			written PG_USED_FOR_ASSERTS_ONLY;
 
+	PreventCommandIfReadOnly("lo_from_bytea()");
+
 	lo_cleanup_needed = true;
 	loOid = inv_create(loOid);
 	loDesc = inv_open(loOid, INV_WRITE, CurrentMemoryContext);
@@ -837,6 +858,8 @@ be_lo_put(PG_FUNCTION_ARGS)
 	LargeObjectDesc *loDesc;
 	int			written PG_USED_FOR_ASSERTS_ONLY;
 
+	PreventCommandIfReadOnly("lo_put()");
+
 	lo_cleanup_needed = true;
 	loDesc = inv_open(loOid, INV_WRITE, CurrentMemoryContext);
diff --git a/src/test/regress/expected/largeobject.out b/src/test/regress/expected/largeobject.out
index 60d7b991c1..31fba2ff9d 100644
--- a/src/test/regress/expected/largeobject.out
+++ b/src/test/regress/expected/largeobject.out
@@ -506,6 +506,55 @@ SELECT lo_create(2121);
 (1 row)
 
 COMMENT ON LARGE OBJECT 2121 IS 'testing comments';
+-- Test writes on large objects in read-only transactions
+START TRANSACTION READ ONLY;
+-- INV_READ ... ok
+SELECT lo_open(2121, x'4'::int);
+ lo_open
Re: CREATE INDEX CONCURRENTLY on partitioned index
Justin Pryzby wrote on 2022-06-28 21:33:
> Hi,
>
> On Thu, Feb 10, 2022 at 06:07:08PM +0300, Alexander Pyhalov wrote:
> > I've rebased patches and tried to fix issues I've seen. I've fixed
> > reference after table_close() in the first patch (can be seen while
> > building with CPPFLAGS='-DRELCACHE_FORCE_RELEASE').

Rebased patches on the current master. They still require proper review.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

From 5c11849ceb2a1feb0e44dbdf30cc27de0282a659 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov
Date: Mon, 28 Feb 2022 10:50:58 +0300
Subject: [PATCH 5/5] Mark intermediate partitioned indexes as valid

---
 src/backend/commands/indexcmds.c       | 33 ++-
 src/test/regress/expected/indexing.out | 80 +-
 src/test/regress/sql/indexing.sql      |  8 +++
 3 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index d09f0390413..d3ced6265b6 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3139,6 +3139,7 @@ static void
 ReindexPartitions(Oid relid, ReindexParams *params, bool isTopLevel)
 {
 	List	   *partitions = NIL;
+	List	   *inhpartindexes = NIL;
 	char		relkind = get_rel_relkind(relid);
 	char	   *relname = get_rel_name(relid);
 	char	   *relnamespace = get_namespace_name(get_rel_namespace(relid));
@@ -3193,6 +3194,17 @@ ReindexPartitions(Oid relid, ReindexParams *params, bool isTopLevel)
 		char		partkind = get_rel_relkind(partoid);
 		MemoryContext old_context;
 
+		/* Create a list of invalid inherited partitioned indexes */
+		if (partkind == RELKIND_PARTITIONED_INDEX)
+		{
+			if (partoid == relid || get_index_isvalid(partoid))
+				continue;
+
+			old_context = MemoryContextSwitchTo(reindex_context);
+			inhpartindexes = lappend_oid(inhpartindexes, partoid);
+			MemoryContextSwitchTo(old_context);
+		}
+
 		/*
 		 * This discards partitioned tables, partitioned indexes and foreign
 		 * tables.
@@ -3237,9 +3249,28 @@ ReindexPartitions(Oid relid, ReindexParams *params, bool isTopLevel)
 		Oid			tableoid = IndexGetRelation(relid, false);
 		List	   *child_tables = find_all_inheritors(tableoid, ShareLock, NULL);
 
-		/* Both lists include their parent relation as well as any intermediate partitioned rels */
+		/*
+		 * Both lists include their parent relation as well as any
+		 * intermediate partitioned rels
+		 */
 		if (list_length(inhoids) == list_length(child_tables))
+		{
 			index_set_state_flags(relid, INDEX_CREATE_SET_VALID);
+
+			/* Mark any intermediate partitioned index as valid */
+			foreach(lc, inhpartindexes)
+			{
+				Oid			partoid = lfirst_oid(lc);
+
+				Assert(get_rel_relkind(partoid) == RELKIND_PARTITIONED_INDEX);
+				Assert(!get_index_isvalid(partoid));
+
+				/* Can't mark an index valid without marking it ready */
+				index_set_state_flags(partoid, INDEX_CREATE_SET_READY);
+				CommandCounterIncrement();
+				index_set_state_flags(partoid, INDEX_CREATE_SET_VALID);
+			}
+		}
 	}
 
 	/*
diff --git a/src/test/regress/expected/indexing.out b/src/test/regress/expected/indexing.out
index a4ccae50de3..b4f1aea6fca 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -57,6 +57,8 @@ create table idxpart11 partition of idxpart1 for values from (0) to (10) partiti
 create table idxpart111 partition of idxpart11 default partition by range(a);
 create table idxpart partition of idxpart111 default partition by range(a);
 create table idxpart2 partition of idxpart for values from (10) to (20);
+create table idxpart3 partition of idxpart for values from (30) to (40) partition by range(a);
+create table idxpart31 partition of idxpart3 default;
 insert into idxpart2 values(10),(10); -- not unique
 create index concurrently on idxpart (a); -- partitioned
 create index concurrently on idxpart1 (a); -- partitioned and partition
@@ -76,7 +78,7 @@
 Partition key: RANGE (a)
 Indexes:
     "idxpart_a_idx" btree (a)
     "idxpart_a_idx1" UNIQUE, btree (a) INVALID
-Number of partitions: 2 (Use \d+ to list them.)
+Number of partitions: 3 (Use \d+ to list them.)
 \d idxpart1
          Partitioned table "public.idxpart1"
@@ -88,11 +90,59 @@ Number of partitions: 2 (Use \d+ to list them.)
 Partition of: idxpart FOR VALUES FROM (0) TO (10)
 Partition key: RANGE (a)
 Indexes:
-    "idxpart1_a_idx" btree (a) INVALID
+    "idxpart1_a_idx" btree (a)
     "idxpart1_a_idx1" btree (a)
     "idxpart1_a_idx2" UNIQUE, btree (a) INVALID
 Number of partitions: 1 (Use \d+ to list them.)
+\d idxpart11
+          Partitioned table "public.idxpart11"
+ Column |  Type   | Collation | Nullable | Default
+--------+---------+-----------+----------+---------
+ a      | integer |           |          |
+ b      | integer |           |          |
+ c      | text    |           |          |
+Partition of: idxpart1 FOR VALUES FROM (0) TO (10)
+Partition key: RANGE (a)
+Indexes:
+    "idxpart11_a_idx"
Re: Hardening PostgreSQL via (optional) ban on local file system access
The idea is to allow superuser, but only in case you *already* have access
to the file system. You could think of it as two-factor authentication for
using superuser.

So in the simplest implementation it would be:

touch $PGDATA/allow_superuser
psql
hannuk=# CREATE EXTENSION ...
rm $PGDATA/allow_superuser

and in a more sophisticated implementation it could be:

terminal 1:
psql
hannuk=# select pg_backend_pid();
 pg_backend_pid
----------------
        1749025
(1 row)

terminal 2:
echo 1749025 > $PGDATA/allow_superuser

back to terminal 1, still connected to the backend with pid 1749025:
$ CREATE EXTENSION ...

... and then clean up the sentinel file afterwards, or just make it valid
for N minutes from creation.

Cheers,
Hannu Krosing

On Wed, Jun 29, 2022 at 8:51 AM Laurenz Albe wrote:
>
> On Tue, 2022-06-28 at 16:27 -0700, Andres Freund wrote:
> > > Experience shows that 99% of the time one can run PostgreSQL just fine
> > > without a superuser
> >
> > IME that's not at all true. It might not be needed interactively, but that's
> > not all the same as not being needed at all.
>
> I also disagree with that. Not having a superuser is one of the pain
> points with using a hosted database: no untrusted procedural languages,
> no untrusted extensions (unless someone hacked up PostgreSQL or provided
> a workaround akin to a SECURITY DEFINER function), etc.
>
> Yours,
> Laurenz Albe
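The sentinel-file flow Hannu describes above can be sketched as a small check routine. This is a hypothetical illustration of the semantics only (the file name, the PID-matching rule, and the TTL are assumptions for discussion, not an actual PostgreSQL interface):

```python
import os
import time

def superuser_action_allowed(pgdata: str, backend_pid: int,
                             ttl_seconds: int = 300) -> bool:
    """Allow a superuser-only action only if someone with file-system
    access recently created $PGDATA/allow_superuser, optionally naming
    this backend's PID (the two-factor idea from the discussion above).
    Illustrative sketch; not real PostgreSQL behavior."""
    path = os.path.join(pgdata, "allow_superuser")
    try:
        mtime = os.stat(path).st_mtime
        with open(path) as f:
            content = f.read().strip()
    except FileNotFoundError:
        return False                 # no sentinel file: deny
    if time.time() - mtime > ttl_seconds:
        return False                 # sentinel expired: deny
    # An empty file allows any backend; otherwise the PID must match,
    # mirroring the "echo 1749025 > $PGDATA/allow_superuser" variant.
    return content == "" or content == str(backend_pid)
```

A real implementation would also have to decide when the check happens (per statement, per transaction, or per privileged operation) and how to avoid races between the check and the operation it gates.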
Re: Hardening PostgreSQL via (optional) ban on local file system access
Hi, On 2022-06-29 08:51:10 +0200, Laurenz Albe wrote: > On Tue, 2022-06-28 at 16:27 -0700, Andres Freund wrote: > > > Experience shows that 99% of the time one can run PostgreSQL just fine > > > without a superuser > > > > IME that's not at all true. It might not be needed interactively, but that's > > not all the same as not being needed at all. > > I also disagree with that. Not having a superuser is one of the pain > points with using a hosted database: no untrusted procedural languages, > no untrusted extensions (unless someone hacked up PostgreSQL or provided > a workaround akin to a SECURITY DEFINER function), etc. I'm not sure what exactly you're disagreeing with? I'm not saying that superuser isn't needed interactively in general, just that there are reasonably common scenarios in which that's the case. Greetings, Andres Freund
Re: Can we do something to help stop users mistakenly using force_parallel_mode?
On Wed, 2022-06-29 at 15:23 +1200, David Rowley wrote:
> Over on [1] I noticed that the user had set force_parallel_mode to
> "on" in the hope that would trick the planner into making their query
> run more quickly. Of course, that's not what they want since that GUC
> is only there to inject some parallel nodes into the plan in order to
> verify the tuple communication works.
>
> I get the idea that Robert might have copped some flak about this at
> some point, given that he wrote the blog post at [2].
>
> The user would have realised this if they'd read the documentation
> about the GUC. However, I imagine they only went as far as finding a
> GUC with a name which appears to be exactly what they need. I mean,
> what else could force_parallel_mode possibly do?
>
> Should we maybe rename it to something less tempting? Maybe
> debug_parallel_query?
>
> I wonder if \dconfig *parallel* is going to make force_parallel_mode
> even easier to find once PG15 is out.
>
> [1] https://www.postgresql.org/message-id/DB4PR02MB8774E06D595D3088BE04ED92E7B99%40DB4PR02MB8774.eurprd02.prod.outlook.com
> [2] https://www.enterprisedb.com/postgres-tutorials/using-forceparallelmode-correctly-postgresql

I share the sentiment, but at the same time I am worried about an
unnecessary compatibility break. The parameter is not in
"postgresql.conf" and is documented as a "developer option", which should
already be warning enough. Perhaps some stronger wording in the
documentation would be beneficial.

I have little sympathy for people who set unusual parameters without even
glancing at the documentation.

Yours,
Laurenz Albe
Re: PostgreSQL 15 beta 2 release announcement draft
> Upgrading to PostgreSQL 15 Beta 2
> ---------------------------------
>
> To upgrade to PostgreSQL 15 Beta 2 from an earlier version of PostgreSQL,
> you will need to use a strategy similar to upgrading between major versions
> of PostgreSQL (e.g. `pg_upgrade` or `pg_dump` / `pg_restore`). For more
> information, please visit the documentation section on
> [upgrading](https://www.postgresql.org/docs/15/static/upgrading.html).

Is the major-version upgrade strategy still needed if they are upgrading
from 15 Beta 1?
Re: Hardening PostgreSQL via (optional) ban on local file system access
On Tue, 2022-06-28 at 16:27 -0700, Andres Freund wrote: > > Experience shows that 99% of the time one can run PostgreSQL just fine > > without a superuser > > IME that's not at all true. It might not be needed interactively, but that's > not all the same as not being needed at all. I also disagree with that. Not having a superuser is one of the pain points with using a hosted database: no untrusted procedural languages, no untrusted extensions (unless someone hacked up PostgreSQL or provided a workaround akin to a SECURITY DEFINER function), etc. Yours, Laurenz Albe
Re: CREATE INDEX CONCURRENTLY on partitioned index
Justin Pryzby wrote on 2022-06-28 21:33:
> Hi,
>
> On Thu, Feb 10, 2022 at 06:07:08PM +0300, Alexander Pyhalov wrote:
> > I've rebased patches and tried to fix issues I've seen. I've fixed
> > reference after table_close() in the first patch (can be seen while
> > building with CPPFLAGS='-DRELCACHE_FORCE_RELEASE').
>
> Thanks for finding that. The patches other than 0001 are more
> experimental, and need someone to check if it's even a good approach to
> use, so I kept them separate from the essential patch.
>
> Your latest 0005 patch (mark intermediate partitioned indexes as valid)
> is probably fixing a bug in my SKIPVALID patch, right ?
>
> I'm not sure whether the SKIPVALID patch should be merged into 0001, and
> I've been awaiting feedback on the main patch before handling progress
> reporting.

Hi. I think it's more about fixing the ReindexPartitions-to-set-indisvalid
patch, as we should also mark intermediate indexes as valid when the
reindex succeeds.

> Sorry for not responding sooner. The patch saw no activity for ~11
> months, so I wasn't prepared to pick it up in March, at least not
> without guidance from a committer.
>
> Would you want to take over this patch ? I wrote it following someone's
> question, but don't expect that I'd use the feature myself. I can help
> review it or try to clarify the organization of my existing patches (but
> still haven't managed to work my way through your amendments to my
> patches).

Yes, I'm glad to work on the patches, as this is a very important feature
for us.

--
Best regards,
Alexander Pyhalov,
Postgres Professional
Re: PostgreSQL 15 beta 2 release announcement draft
On 29-06-2022 at 02:04, Jonathan S. Katz wrote:

Hi,

"not advise you to run PostgreSQL 15 Beta 1" should be "not advise you to
run PostgreSQL 15 Beta 2".

Erik