Re: missing GRANT on pg_subscription columns
On Mon, Jun 7, 2021 at 2:38 PM Amit Kapila wrote:
>
> On Thu, Jun 3, 2021 at 10:39 PM Tom Lane wrote:
> >
> > "Euler Taveira" writes:
> > > I was checking the GRANT on pg_subscription and noticed that the command
> > > is not correct. There is a comment that says "All columns of
> > > pg_subscription except subconninfo are readable". However, there are
> > > columns that aren't included: oid and subsynccommit. It seems an
> > > oversight in the commits 6f236e1eb8c and 887227a1cc8.
> >
> > Ugh.
> >
> > > There are monitoring tools and data collectors that aren't using a
> > > superuser to read catalog information (I usually recommend using
> > > pg_monitor). Hence, you cannot join pg_subscription with relations such
> > > as pg_subscription_rel or pg_stat_subscription because column oid has no
> > > column-level privilege. I'm attaching a patch to fix it (indeed, 2
> > > patches because of additional columns for v14). We should add
> > > instructions in the minor version release notes too.
> >
> > I agree with fixing this in HEAD. But given that this has been wrong
> > since v10 with zero previous complaints, I doubt that it is worth the
> > complication of trying to do something about it in the back branches.
> > Maybe we could just adjust the docs there, instead.
>
> This sounds reasonable to me. Euler, can you provide the doc updates
> for back-branches?

The attached patch has the documentation changes for the back-branches. As
there is no specific reason for this, I have just mentioned "Additionally
normal users can't access columns oid and subsynccommit." The same patch
applies down to the v10 branch.

Regards,
Vignesh

From 37f9fc48baa2c233d16ee8ac1e8547680cd05b04 Mon Sep 17 00:00:00 2001
From: vignesh
Date: Mon, 28 Jun 2021 10:06:58 +0530
Subject: [PATCH v1] Documentation for normal users not having permission for
 columns oid and subsynccommit in pg_subscription catalog table.

Documentation for normal users not having permission for columns oid and
subsynccommit in pg_subscription catalog table.
---
 doc/src/sgml/catalogs.sgml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 4dff3f60a2..f6b5c2e562 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7457,7 +7457,9 @@ SCRAM-SHA-256$iteration count:
    Access to the column subconninfo is revoked from
-   normal users, because it could contain plain-text passwords.
+   normal users, because it could contain plain-text passwords. Additionally
+   normal users can't access columns oid and
+   subsynccommit.
--
2.25.1
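For context, the fix on HEAD amounts to a column-level GRANT; the following is a sketch only (the exact column list is branch-dependent and subconninfo of course stays revoked), together with the kind of join that the missing privilege on oid was blocking:

```sql
-- Sketch only: grant read access on the previously missed columns.
-- The real grants live in the catalog setup (system_views.sql on HEAD);
-- column names here follow the thread, not a specific branch.
GRANT SELECT (oid, subsynccommit) ON pg_subscription TO public;

-- With oid readable, a non-superuser (e.g. a pg_monitor member) can join:
SELECT s.subname, sr.srrelid::regclass, sr.srsubstate
FROM pg_subscription s
JOIN pg_subscription_rel sr ON sr.srsubid = s.oid;
```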
Re: Optionally automatically disable logical replication subscriptions on error
On Mon, Jun 21, 2021 at 11:26 AM Mark Dilger wrote:
>
> > On Jun 20, 2021, at 7:17 PM, Masahiko Sawada wrote:
> >
> > I will submit the patch.
>
> Great, thanks!

I've submitted the patches on that thread[1]. There are three patches:
skipping the transaction on the subscriber side, reporting error details in
the errcontext, and reporting the error details to the stats collector.
Feedback is very welcome.

[1] https://www.postgresql.org/message-id/CAD21AoBU4jGEO6AXcykQ9y7tat0RrB5--8ZoJgfcj%2BLPs7nFZQ%40mail.gmail.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Skipping logical replication transactions on subscriber side
On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada wrote:
>
> On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada wrote:
> >
> > > Now, if this function is used by super
> > > users then we can probably trust that they provide the XIDs that we
> > > can trust to be skipped but OTOH making a restriction to allow these
> > > functions to be used by superusers might restrict the usage of this
> > > repair tool.
> >
> > If we specify the subscription id or name, maybe we can allow also the
> > owner of the subscription to do that operation?
>
> Ah, the owner of the subscription must be superuser.

I've attached PoC patches.

The 0001 patch introduces the ability to skip transactions on the subscriber
side. We can specify an XID for the subscription with a command like ALTER
SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The implementation seems
straightforward except for setting the origin state. After skipping the
transaction we have to update the session origin state so that we can start
streaming from the transaction after the one we just skipped, in case the
server crashes or the apply worker restarts. Normally we set the origin state
via the commit WAL record. However, since we skip all changes, we don't write
any WAL even though we call CommitTransaction() at the end of the skipped
transaction. So the patch sets the origin state on the transaction that
updates the pg_subscription system catalog to reset the skip XID. I think we
need a discussion of this part.

With the 0002 and 0003 patches, we report the error information in the server
logs and the stats view, respectively. The 0002 patch adds errcontext for
messages raised while applying the changes:

ERROR:  duplicate key value violates unique constraint "hoge_pkey"
DETAIL:  Key (c)=(1) already exists.
CONTEXT:  during apply of "INSERT" for relation "public.hoge" in transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

The 0003 patch adds the pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error information
to the stats collector if an error happens during applying changes. We can
check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only the columns required for the skipping-transaction feature to the
view for now. Please note that these patches are meant to evaluate the
concept we've discussed so far; they don't have the doc updates yet.

Regards,

[1] https://www.postgresql.org/message-id/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

v1-0003-Add-pg_stat_logical_replication_error-statistics-.patch
Description: Binary data

v1-0002-Add-errcontext-to-errors-of-the-applying-logical-.patch
Description: Binary data

v1-0001-Add-ALTER-SUBSCRIPTION-SET-SKIP-TRANSACTION.patch
Description: Binary data
Re: Deadlock risk while inserting directly into partition?
On Thu, Jun 24, 2021 at 7:27 AM David Rowley wrote:
> On Wed, 23 Jun 2021 at 21:07, Amit Kapila wrote:
> > I noticed that while inserting directly into a partition table we
> > compute the PartitionCheckExpr by traversing all the parent partitions
> > via
> > ExecPartitionCheck()->RelationGetPartitionQual()->generate_partition_qual().
> > We take AccessShareLock on parent tables while generating qual.
> >
> > Now, on the other hand, while dropping constraint on a partitioned
> > table, we take the lock from parent to all the child tables.
> >
> > I think taking locks in opposite directions can lead to deadlock in
> > these operations.
>
> I wonder if it's possible to do any better here? Surely when
> traversing from child to parent we must lock the child before checking
> what the parent relation is.

I remember there was a discussion where I proposed to document the deadlock
hazard that exists when performing DML directly on partitions. The proposal
didn't get enough attention, perhaps because it was in the middle of a long
reply about other concerns:

https://www.postgresql.org/message-id/16db1458-67cf-4add-736e-31b053115e8e%40lab.ntt.co.jp

Maybe a good idea to add a line or 2 in 5.11. Table Partitioning?

--
Amit Langote
EDB: http://www.enterprisedb.com
Re: PQconnectdb/PQerrorMessage changed behavior on master
27.06.2021 23:07, Tom Lane wrote:
>> While trying to use sqlsmith with postgres compiled from the master
>> branch, I've found that the PQerrorMessage() function now returns
>> a non-informational but not empty error message after a successful
>> PQconnectdb() call.
> Yeah, see thread here:
>
> https://www.postgresql.org/message-id/20210506162651.GJ27406%40telsasoft.com
>
> sqlsmith is definitely Doing It Wrong there, but there's a
> reasonable question whether such coding is common.
Thanks for the info! I agree that sqlsmith's check is incorrect; nonetheless
I was puzzled by the incomplete error message.

Best regards,
Alexander
Re: Deadlock risk while inserting directly into partition?
On Fri, Jun 25, 2021 at 10:26 AM David Rowley wrote:
> On Thu, 24 Jun 2021 at 12:32, David Rowley wrote:
> > The overhead of taking these locks is pretty significant for
> > partitioned tables with lots of partitions where only 1 of them
> > survives run-time partition pruning. That's really terrible for
> > people that want to PREPARE queries and just look up a single row from
> > a single partition. That seems like a pretty big use case that we're
> > just terrible at today.
>
> I wonder, since we can't delay taking locks until after run-time
> pruning due to being unable to invalidate cached plans, maybe instead
> we could tag on any PartitionPruneInfo onto the PlannedStmt itself and
> do the init plan run-time prune run during AcquireExecutorLocks().

This is exactly what I was mulling over when working on [1] at some point
last year, after an off-list discussion with Robert (he suggested the idea
IIRC), though I never quite finished writing a patch. I have planned to
revisit this topic ("locking overhead in generic plans") for v15, now that we
have *some* proposals mentioned in [1] committed to v14, so I can look into
this.

> A lock would need to be taken on each partitioned table before we
> prune for it. So if there was multi-level partitioning, we'd need to
> lock the partitioned table, do pruning for that partitioned table,
> then lock any sub-partitioned tables before doing pruning on those.
>
> I don't immediately see why it couldn't be made to work, it's just
> that it adds quite a lot of complexity to what's being done in
> AcquireExecutorLocks(), which today is a very simple function.

Yeah, AcquireExecutorLocks()'s current method of finding the set of relations
to lock is very simple -- just scan the range table (PlannedStmt.rtable). If
we're to remove prunable leaf partitions from that set, maybe we'd have to
find a way to remove them from PlannedStmt.rtable as part of running the
"init" pruning, which we'd have to do anyway, because perhaps the executor
proper (mainly InitPlan) should also see the shrunken version of the range
table. Not to mention the complexity of getting the "init" pruning itself to
run outside a full-blown executor context.

Anyway, do you agree with starting a thread to discuss possible approaches to
attack this?

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/ca+hiwqg7zrubmmih3wpsbz4s0h2ehywrnxeducky5hr3fwz...@mail.gmail.com
Re: fdatasync performance problem with large number of DB files
On Tue, Jun 22, 2021 at 5:01 PM Thomas Munro wrote:
> On Fri, Jun 18, 2021 at 1:11 PM Justin Pryzby wrote:
> > Thomas, could you comment on this ?
>
> Sorry, I missed that. It is initially a confusing proposal, but after
> trying it out (that is: making recovery_init_sync_method PGC_SIGHUP
> and testing a scenario where you want to make the next crash use it
> that way and without the change), I agree. +1 from me.

... and pushed.
Re: Fix uninitialized copy_data var (src/backend/commands/subscriptioncmds.c)
On Mon, Jun 28, 2021 at 10:17:55AM +1000, Peter Smith wrote:
> IIUC for the case ALTER_SUBSCRIPTION_DROP_PUBLICATION it looks like
> the uninitialized copy_data local stack var would remain uninitialized
> (undefined) still at the time it is passed at
> AlterSubscription_refresh(sub, copy_data);

Yes, that's wrong. AlterSubscription_refresh() would happily look at this
uninitialized value when performing a refresh with this command. That's the
only code path using parse_subscription_options() with this pattern.

Applied on HEAD.

--
Michael

signature.asc
Description: PGP signature
Re: Farewell greeting
Hi Tsunakawa-san,

On 2021/06/27 16:41, tsunakawa.ta...@fujitsu.com wrote:
> I'm moving to another company from July 1st. I may possibly not be allowed
> to do open source activities there, so let me say goodbye once. Thank you
> all for your help. I really enjoyed learning PostgreSQL and participating
> in its development.
>
> It's a pity that I may not be able to be part of PostgreSQL's great history
> until it becomes the most popular database (in the DB-Engines ranking.)
> However, if possible, I'd like to come back as just MauMau.
>
> I hope you all will succeed.

Good luck in your future career! :-D
See you at the Japanese PG developers' meetup.

Thanks,
Tatsuro Yamada
Re: Farewell greeting
On Sun, Jun 27, 2021 at 07:41:19AM +, tsunakawa.ta...@fujitsu.com wrote:
> It's a pity that I may not be able to be part of PostgreSQL's great
> history until it becomes the most popular database (in the
> DB-Engines ranking.) However, if possible, I'd like to come back as
> just MauMau.

This is an open community, so there would be nothing preventing you from
coming back. Good luck on your next position, and all the best.

--
Michael

signature.asc
Description: PGP signature
Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
I've just realized that this VM has a strange storage configuration. It's
using LVM thin pools, which I don't use anywhere else. Someone else set this
up, and I think I've literally never used pools before. Some time ago, the
pool ran out of space, and I ran LVM repair on it. It seems very possible
that's the issue. A latent problem might've been tickled by pg_upgrade
--link.

That said, the relevant table is the active "alarms" table, and it would've
gotten plenty of DML with no issue for months running v13. Feel free to
dismiss this report if it seems dubious.

--
Justin
RE: Farewell greeting
> I'm moving to another company from July 1st. I may possibly not be allowed
> to do open source activities there, so let me say goodbye once. Thank you
> all for your help. I really enjoyed learning PostgreSQL and participating
> in its development.

Thank you for everything, Tsunakawa-san. Good luck!!!

Regards
Sho Kato

-----Original Message-----
From: tsunakawa.ta...@fujitsu.com
Sent: Sunday, June 27, 2021 4:41 PM
To: pgsql-hackers@lists.postgresql.org
Cc: MauMau
Subject: Farewell greeting

Hello all,

I'm moving to another company from July 1st. I may possibly not be allowed to
do open source activities there, so let me say goodbye once. Thank you all
for your help. I really enjoyed learning PostgreSQL and participating in its
development.

It's a pity that I may not be able to be part of PostgreSQL's great history
until it becomes the most popular database (in the DB-Engines ranking.)
However, if possible, I'd like to come back as just MauMau.

I hope you all will succeed.

Regards
Takayuki Tsunakawa
Re: What is "wraparound failure", really?
On Mon, Jun 28, 2021 at 8:36 AM Peter Geoghegan wrote: > "The sole disadvantage of increasing autovacuum_freeze_max_age (and > vacuum_freeze_table_age along with it) is that the pg_xact and > pg_commit_ts subdirectories of the database cluster will take more > space..." Just by the way, if we're updating this sentence, it continues "because it must store..." but it should surely be "because they must store...".
Re: Fix uninitialized copy_data var (src/backend/commands/subscriptioncmds.c)
On Fri, Jun 25, 2021 at 11:55 PM Ranier Vilela wrote:
>
> https://github.com/postgres/postgres/commit/3af10943ce21450e299b3915b9cad47cd90369e9
> fixes some issues with subscriptioncmds.c, but IMHO still lacks this issue.

I have not tested this, and gcc gave no warnings about it, but just by visual
code inspection I do agree with you that this looks like a problem, even in
the latest code.

IIUC for the case ALTER_SUBSCRIPTION_DROP_PUBLICATION it looks like the
uninitialized copy_data local stack var would remain uninitialized
(undefined) still at the time it is passed at
AlterSubscription_refresh(sub, copy_data);

--
Kind Regards,
Peter Smith.
Fujitsu Australia.
Re: What is "wraparound failure", really?
On 6/27/21 4:36 PM, Peter Geoghegan wrote:
> The wraparound failsafe mechanism added by commit 1e55e7d1 had minimal
> documentation -- just a basic description of how the GUCs work. I
> think that it certainly merits some discussion under "25.1. Routine
> Vacuuming" -- more specifically under "25.1.5. Preventing Transaction
> ID Wraparound Failures". One reason why this didn't happen in the
> original commit was that I just didn't know where to start with it.
>
> The docs in question have said this since 2006's commit 48188e16 first
> added autovacuum_freeze_max_age:
>
> "The sole disadvantage of increasing autovacuum_freeze_max_age (and
> vacuum_freeze_table_age along with it) is that the pg_xact and
> pg_commit_ts subdirectories of the database cluster will take more
> space..."
>
> This sentence seems completely unreasonable to me. It seems to just
> ignore the huge disadvantage of increasing autovacuum_freeze_max_age:
> the *risk* that the system will stop being able to allocate new XIDs
> because GetNewTransactionId() errors out with "database is not
> accepting commands to avoid wraparound data loss...". Sure, it's
> possible to take a lot of risk here without it ever blowing up in your
> face. And if it doesn't blow up then the downside really is zero. This
> is hardly a sensible way to talk about this important risk. Or any
> risk at all.
>
> At first I thought that the sentence was not just misguided -- it
> seemed downright bizarre. I thought that it was directly at odds with
> the title "Preventing Transaction ID Wraparound Failures". I thought
> that the whole point of this section was how not to have a wraparound
> failure (as I understand the term), and yet we seem to deliberately
> ignore the single most important practical aspect of making sure that
> that doesn't happen. But I now suspect that the basic definitions have
> been mixed up in a subtle but important way.
>
> What the documentation calls a "wraparound failure" seems to be rather
> different to what I thought that that meant. As I said, I thought that
> that meant the condition of being unable to get new transaction IDs
> (at least until the DBA runs VACUUM in single user mode). But the
> documentation in question seems to actually define it as "the
> condition of an old MVCC snapshot failing to see a version from the
> distant past, because somehow an XID wraparound suddenly makes it look
> as if it's in the distant future rather than in the past". It's
> actually talking about a subtly different thing, so the "sole
> disadvantage" sentence is not actually bizarre. It does still seem
> impractical and confusing, though.
>
> I strongly suspect that my interpretation of what "wraparound failure"
> means is actually the common one. Of course the system is never under
> any circumstances allowed to give totally wrong answers to queries, no
> matter what -- users should be able to take that much for granted.
> What users care about here is sensibly managing XIDs as a resource --
> preventing "XID exhaustion" while being conservative, but not
> ridiculously conservative. Could the documentation be completely
> misleading users here?
>
> I have two questions:
>
> 1. Do I have this right? Is there really confusion about what a
> "wraparound failure" means, or is the confusion mine alone?
>
> 2. How do I go about integrating discussion of the failsafe here?
> Anybody have thoughts on that?

AIUI, actual wraparound (i.e. an xid crossing the event horizon so it appears
to be in the future) is no longer possible. But it once was a very real
danger. Maybe the docs haven't quite caught up.

In practical terms, there is an awful lot of headroom between the default for
autovacuum_freeze_max_age and any danger of major anti-wraparound measures.
Say you increase it to 1bn from the default 200m. That still leaves you ~1bn
transactions of headroom.
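That headroom can be watched directly from the catalogs; a typical monitoring query along these lines (standard functions, nothing version-specific assumed) shows how far each database is from its next forced anti-wraparound autovacuum:

```sql
-- Distance of each database from a forced anti-wraparound autovacuum;
-- the hard stop (XID exhaustion) is at roughly 2.1 billion.
SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::bigint
           - age(datfrozenxid) AS xids_until_forced_autovacuum
FROM pg_database
ORDER BY xids_until_forced_autovacuum;
```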
cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
On Sun, Jun 27, 2021 at 03:08:13PM -0700, Peter Geoghegan wrote:
> Can you please amcheck all of the indexes?

ts=# SELECT bt_index_check('child.alarms_null_alarm_clear_time_idx'::regclass);
ERROR:  item order invariant violated for index "alarms_null_alarm_clear_time_idx"
DETAIL:  Lower index tid=(1,77) (points to heap tid=(29,9)) higher index tid=(1,78) (points to heap tid=(29,9)) page lsn=80/4B9C69D0.

ts=# SELECT itemoffset, ctid, itemlen, nulls, vars, dead, htid FROM bt_page_items('child.alarms_null_alarm_clear_time_idx', 1);
 itemoffset |  ctid  | itemlen | nulls | vars | dead |  htid
------------+--------+---------+-------+------+------+--------
...
         77 | (29,9) |      16 | t     | f    | f    | (29,9)
         78 | (29,9) |      16 | t     | f    | f    | (29,9)

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.alarms_null', 29));
 lp | lp_off | lp_flags | lp_len | t_xmin |  t_xmax  | t_field3 | t_ctid  | t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid
----+--------+----------+--------+--------+----------+----------+---------+-------------+------------+--------+----------+-------
  1 |   6680 |        1 |   1512 |  88669 | 27455486 |       44 | (29,1)  |        8225 |      10691 |     32 | 11101000 |
  2 |      6 |        2 |      0 |        |          |          |         |             |            |        |          |
  3 |   5168 |        1 |   1512 |  87374 | 27455479 |       37 | (29,3)  |        8225 |      10691 |     32 | 11101000 |
  4 |   4192 |        1 |    976 | 148104 | 27574887 |        0 | (29,4)  |        8225 |      10695 |     32 | 11101000 |
  5 |     10 |        2 |      0 |        |          |          |         |             |            |        |          |
  6 |   3216 |        1 |    976 | 148137 | 27574888 |        0 | (29,6)  |       40993 |      10695 |     32 | 11101000 |
  7 |      8 |        2 |      0 |        |          |          |         |             |            |        |          |
  8 |   2240 |        1 |    976 |  47388 | 27574858 |        7 | (29,8)  |       40993 |      10695 |     32 | 11101000 |
  9 |      0 |        3 |      0 |        |          |          |         |             |            |        |          |
 10 |   1264 |        1 |    976 | 148935 | 27574889 |        0 | (29,10) |       40993 |      10695 |     32 | 11101000 |
 11 |      0 |        3 |      0 |        |          |          |         |             |            |        |          |
 12 |      0 |        3 |      0 |        |          |          |         |             |            |        |          |
(12 rows)

(gdb) fr 4
#4  0x00509a14 in _bt_insertonpg (rel=rel@entry=0x7f6dfd3cd628, itup_key=itup_key@entry=0x2011b40, buf=15, cbuf=cbuf@entry=0, stack=stack@entry=0x2011bd8, itup=0x2011c00, itup@entry=0x200d608, itemsz=16, newitemoff=2, postingoff=62, split_only_page=split_only_page@entry=false) at nbtinsert.c:1174
1174	in nbtinsert.c
(gdb) p page
$5 = 0x7f6de58e0e00 "\200"
(gdb) dump binary memory /tmp/dump_block.page page (page + 8192)

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(pg_read_binary_file('/tmp/dump_block.page')) WHERE t_xmin IS NOT NULL;
 lp  | lp_off | lp_flags | lp_len | t_xmin  |   t_xmax   | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
-----+--------+----------+--------+---------+------------+----------+--------+-------------+------------+--------+--------+-------
   1 |   8152 |        1 |     24 | 1048576 | 2685931521 |        0 | (0,0)  |           0 |        120 |      1 |        |
   2 |   7288 |        1 |    864 | 1048576 | 2740985997 |        0 | (0,0)  |           0 |          1 |      0 |        |
  67 |   6368 |        1 |    920 | 1048576 | 2744656022 |        0 | (0,0)  |          33 |          4 |      0 |        |
 137 |   5056 |        1 |   1312 | 1048576 | 2770346200 |        0 | (0,0)  |          69 |          6 |      0 |        |
 142 |   4608 |        1 |    448 | 1048576 | 2713722952 |        0 | (0,0)  |         107 |          4 |      0 |        |
(5 rows)

I didn't change the kernel here, nor on the previous bug report - it was
going to be my "next step", until I found the
Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
Did you also change the kernel on upgrade? I recall that that was a factor on the other recent bug thread. Peter Geoghegan (Sent from my phone)
Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
Would also be good to get a raw binary copy of the page image in question. Hopefully the data isn't confidential. Same gdb procedure as before. Thanks Peter Geoghegan (Sent from my phone)
Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
Can you please amcheck all of the indexes? Peter Geoghegan (Sent from my phone)
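For anyone reproducing this, amcheck's verification can be run over every B-tree index in the database with a query along these lines (a sketch based on the amcheck documentation; requires the amcheck extension):

```sql
-- Run the basic page-level checks on all btree indexes, largest first.
CREATE EXTENSION IF NOT EXISTS amcheck;

SELECT c.relname AS index_name,
       bt_index_check(index => c.oid, heapallindexed => true)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree'
  AND c.relpersistence != 't'     -- skip other sessions' temp indexes
ORDER BY c.relpages DESC;
```

Setting heapallindexed => true additionally verifies that every heap tuple has a matching index entry, at the cost of a slower check.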
pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
This is crashing repeatedly during insert/update immediately after upgrading
an instance to v14, from v13.3. In case it matters, the cluster was
originally initdb at 13.2.

TRAP: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", Line: 1062, PID: 28580)
postgres: telsasoft ts 127.0.0.1(52250) INSERT(ExceptionalCondition+0x8d)[0x967d1d]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(_bt_swap_posting+0x2cd)[0x507cdd]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x509a14]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(_bt_doinsert+0xcb7)[0x50d0b7]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(btinsert+0x52)[0x5130f2]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(ExecInsertIndexTuples+0x231)[0x687b81]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x6b8718]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x6b9297]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(standard_ExecutorRun+0x142)[0x688b32]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x82da8a]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x82e673]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x82e936]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(PortalRun+0x2eb)[0x82ec8b]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(PostgresMain+0x1f97)[0x82c777]
postgres: telsasoft ts 127.0.0.1(52250) INSERT[0x48f71a]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(PostmasterMain+0x1138)[0x794c98]
postgres: telsasoft ts 127.0.0.1(52250) INSERT(main+0x6f2)[0x491292]
< 2021-06-27 23:46:43.257 CAT >DETAIL:  Failed process was running: INSERT INTO alarms(...
#3  0x00507cdd in _bt_swap_posting (newitem=newitem@entry=0x2011c00, oposting=oposting@entry=0x7f6de58e2a78, postingoff=postingoff@entry=62) at nbtdedup.c:1062
        nhtids = <optimized out>
        replacepos = 0x2011dac ""
        nposting = 0x2011c28
        __func__ = "_bt_swap_posting"
#4  0x00509a14 in _bt_insertonpg (rel=rel@entry=0x7f6dfd3cd628, itup_key=itup_key@entry=0x2011b40, buf=15, cbuf=cbuf@entry=0, stack=stack@entry=0x2011bd8, itup=0x2011c00, itup@entry=0x200d608, itemsz=16, newitemoff=2, postingoff=62, split_only_page=split_only_page@entry=false) at nbtinsert.c:1174
        itemid = 0x7f6de58e0e1c
        page = 0x7f6de58e0e00 "\200"
        opaque = 0x7f6de58e2df0
        isleaf = true
        isroot = false
        isrightmost = false
        isonly = false
        oposting = 0x7f6de58e2a78
        origitup = <optimized out>
        nposting = 0x0
        __func__ = "_bt_insertonpg"
#5  0x0050d0b7 in _bt_doinsert (rel=rel@entry=0x7f6dfd3cd628, itup=itup@entry=0x200d608, checkUnique=checkUnique@entry=UNIQUE_CHECK_NO, indexUnchanged=indexUnchanged@entry=false, heapRel=heapRel@entry=0x7f6dfd48ba80) at nbtinsert.c:257
        newitemoff = 0
        is_unique = false
        insertstate = {itup = 0x200d608, itemsz = 16, itup_key = 0x2011b40, buf = 15, bounds_valid = true, low = 2, stricthigh = 3, postingoff = 62}
        itup_key = <optimized out>
        checkingunique = <optimized out>
#6  0x005130f2 in btinsert (rel=0x7f6dfd3cd628, values=<optimized out>, isnull=<optimized out>, ht_ctid=0x212e250, heapRel=0x7f6dfd48ba80, checkUnique=UNIQUE_CHECK_NO, indexUnchanged=false, indexInfo=0x200d2e8) at nbtree.c:199
        result = <optimized out>
        itup = 0x200d608
#7  0x00687b81 in ExecInsertIndexTuples (resultRelInfo=resultRelInfo@entry=0x200cd90, slot=slot@entry=0x212e220, estate=estate@entry=0x212c6b0, update=update@entry=false, noDupErr=noDupErr@entry=false, specConflict=specConflict@entry=0x0, arbiterIndexes=arbiterIndexes@entry=0x0) at execIndexing.c:415

(gdb) p *newitem
$2 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 22}, ip_posid = 4}, t_info = 32784}
(gdb) p *oposting
$3 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 16}, ip_posid = 8333}, t_info = 41824}

I will save a copy of the data dir and see if reindexing helps. Let me know
if there's anything else I can provide.

--
Justin
What is "wraparound failure", really?
The wraparound failsafe mechanism added by commit 1e55e7d1 had minimal
documentation -- just a basic description of how the GUCs work. I think that
it certainly merits some discussion under "25.1. Routine Vacuuming" -- more
specifically under "25.1.5. Preventing Transaction ID Wraparound Failures".
One reason why this didn't happen in the original commit was that I just
didn't know where to start with it.

The docs in question have said this since 2006's commit 48188e16 first added
autovacuum_freeze_max_age:

"The sole disadvantage of increasing autovacuum_freeze_max_age (and
vacuum_freeze_table_age along with it) is that the pg_xact and pg_commit_ts
subdirectories of the database cluster will take more space..."

This sentence seems completely unreasonable to me. It seems to just ignore
the huge disadvantage of increasing autovacuum_freeze_max_age: the *risk*
that the system will stop being able to allocate new XIDs because
GetNewTransactionId() errors out with "database is not accepting commands to
avoid wraparound data loss...". Sure, it's possible to take a lot of risk
here without it ever blowing up in your face. And if it doesn't blow up then
the downside really is zero. This is hardly a sensible way to talk about this
important risk. Or any risk at all.

At first I thought that the sentence was not just misguided -- it seemed
downright bizarre. I thought that it was directly at odds with the title
"Preventing Transaction ID Wraparound Failures". I thought that the whole
point of this section was how not to have a wraparound failure (as I
understand the term), and yet we seem to deliberately ignore the single most
important practical aspect of making sure that that doesn't happen. But I now
suspect that the basic definitions have been mixed up in a subtle but
important way.

What the documentation calls a "wraparound failure" seems to be rather
different to what I thought that that meant. As I said, I thought that that
meant the condition of being unable to get new transaction IDs (at least
until the DBA runs VACUUM in single user mode). But the documentation in
question seems to actually define it as "the condition of an old MVCC
snapshot failing to see a version from the distant past, because somehow an
XID wraparound suddenly makes it look as if it's in the distant future rather
than in the past". It's actually talking about a subtly different thing, so
the "sole disadvantage" sentence is not actually bizarre. It does still seem
impractical and confusing, though.

I strongly suspect that my interpretation of what "wraparound failure" means
is actually the common one. Of course the system is never under any
circumstances allowed to give totally wrong answers to queries, no matter
what -- users should be able to take that much for granted. What users care
about here is sensibly managing XIDs as a resource -- preventing "XID
exhaustion" while being conservative, but not ridiculously conservative.
Could the documentation be completely misleading users here?

I have two questions:

1. Do I have this right? Is there really confusion about what a "wraparound
failure" means, or is the confusion mine alone?

2. How do I go about integrating discussion of the failsafe here? Anybody
have thoughts on that?

--
Peter Geoghegan
Re: Overflow hazard in pgbench
I wrote:
> ... according to the C99
> spec this code is broken, because the compiler is allowed to assume
> that signed integer overflow doesn't happen, whereupon the second
> if-block is provably unreachable.  The failure still represents a gcc
> bug, because we're using -fwrapv which should disable that assumption.
> However, not all compilers have that switch, so it'd be better to code
> this in a spec-compliant way.

BTW, for grins I tried building today's HEAD without -fwrapv, using

gcc version 11.1.1 20210531 (Red Hat 11.1.1-3) (GCC)

which is the newest version I have at hand. Not very surprisingly, that
reproduced the failure shown on moonjelly. However, after adding the patch I
proposed, "make check-world" passed! I was not expecting that result; I
supposed we still had lots of lurking assumptions of traditional C overflow
handling.

I'm not in any hurry to remove -fwrapv, because (a) this result doesn't show
that we have no such assumptions, only that they must be lurking in darker,
poorly-tested corners, and (b) I'm not aware of any reason to think that
removing -fwrapv would provide benefits worth taking any risks for. But we
may be closer to being able to do without that switch than I thought.

			regards, tom lane
Re: PQconnectdb/PQerrorMessage changed behavior on master
Alexander Lakhin writes:
> While trying to use sqlsmith with postgres compiled from the master
> branch, I've found that the PQerrorMessage() function now returns a
> non-informational but non-empty error message after a successful
> PQconnectdb() call.

Yeah, see thread here:

https://www.postgresql.org/message-id/20210506162651.GJ27406%40telsasoft.com

sqlsmith is definitely Doing It Wrong there, but there's a reasonable question whether such coding is common.

regards, tom lane
PQconnectdb/PQerrorMessage changed behavior on master
Hello,

While trying to use sqlsmith with postgres compiled from the master branch, I've found that the PQerrorMessage() function now returns a non-informational but non-empty error message after a successful PQconnectdb() call.

conn = PQconnectdb(conninfo.c_str());
char *errmsg = PQerrorMessage(conn);

returns

'connection to server on socket "/tmp/ody8OuOaqV/.s.PGSQL.59860" failed: '

The affected sqlsmith code: https://github.com/anse1/sqlsmith/blob/master/postgres.cc#L305

Best regards, Alexander
Re: PITR promote bug: Checkpointer writes to older timeline
I wrote: > It sure looks like recovering a prepared > transaction creates a transient state in which a new backend will > compute a broken snapshot. Oh, after further digging this is the same issue discussed here: https://www.postgresql.org/message-id/flat/20210422203603.fdnh3fu2mmfp2iov%40alap3.anarazel.de regards, tom lane
Re: PITR promote bug: Checkpointer writes to older timeline
I wrote:
> Buildfarm member hornet just reported a failure in this test:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2021-06-27%2013%3A40%3A57
> It's not clear whether this is a problem with the test case or an
> actual server bug, but I'm leaning to the latter theory. My gut
> feel is it's some problem in the "snapshot scalability" work. It
> doesn't look the same as the known open issue, but maybe related?

Hmm, the plot thickens. I scraped the buildfarm logs for similar-looking assertion failures back to last August, when the snapshot scalability patches went in. The first such failure is not until 2021-03-24 (see attachment), and they all look to be triggered by 023_pitr_prepared_xact.pl. It sure looks like recovering a prepared transaction creates a transient state in which a new backend will compute a broken snapshot.

regards, tom lane

    sysname    | branch |      snapshot       |     stage     | l
---------------+--------+---------------------+---------------+---
 calliphoridae | HEAD   | 2021-03-24 06:50:09 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "/mnt/resource/andres/bf/calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/procarray.c", Line: 2463, PID: 3890215)
 francolin     | HEAD   | 2021-03-29 16:21:58 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "/mnt/resource/andres/bf/francolin/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/procarray.c", Line: 2463, PID: 1861665)
 moonjelly     | HEAD   | 2021-04-01 15:25:38 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2463, PID: 2345153)
 francolin     | HEAD   | 2021-04-07 12:30:08 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "/mnt/resource/andres/bf/francolin/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/procarray.c", Line: 2468, PID: 3257637)
 fairywren     | HEAD   | 2021-04-20 03:04:04 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "C:/tools/msys64/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/procarray.c", Line: 2094, PID: 94824)
 mantid        | HEAD   | 2021-04-25 10:07:06 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2094, PID: 2820886)
 thorntail     | HEAD   | 2021-04-29 07:18:09 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/storage/ipc/procarray.c", Line: 2094, PID: 3099560)
 mantid        | HEAD   | 2021-05-03 13:07:06 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2094, PID: 1163004)
 mantid        | HEAD   | 2021-05-10 01:07:07 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2468, PID: 2812704)
 hornet        | HEAD   | 2021-06-27 13:40:57 | recoveryCheck | TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2492, PID: 11862234)
(10 rows)
Re: PITR promote bug: Checkpointer writes to older timeline
Michael Paquier writes:
> I have been working on that over the last couple of days, and applied
> a fix down to 10. One thing that I did not like in the test was the
> use of compare() to check if the contents of the WAL segment before
> and after the timeline jump remained the same as this would have been
> unstable with any concurrent activity. Instead, I have added a phase
> at the end of the test with an extra checkpoint and recovery triggered
> once, which is enough to reproduce the PANIC reported at the top of
> the thread.

Buildfarm member hornet just reported a failure in this test:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2021-06-27%2013%3A40%3A57

the critical bit being

2021-06-27 17:35:46.504 UTC [11862234:1] [unknown] LOG: connection received: host=[local]
2021-06-27 17:35:46.505 UTC [18350260:12] LOG: recovering prepared transaction 734 from shared memory
TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", File: "procarray.c", Line: 2492, PID: 11862234)
2021-06-27 17:35:46.511 UTC [14876838:4] LOG: database system is ready to accept connections

It's not clear whether this is a problem with the test case or an actual server bug, but I'm leaning to the latter theory. My gut feel is it's some problem in the "snapshot scalability" work. It doesn't look the same as the known open issue, but maybe related?

regards, tom lane
PoC: using sampling to estimate joins / complex conditions
Hi,

estimating joins is one of the significant gaps related to extended statistics, and I've been regularly asked about what we might do about that. This is an early experimental patch that I think might help us with improving this, possibly even in PG15.

Note: I do not claim this is exactly how it should be implemented, but it's probably sufficient to demonstrate the pros/cons of various alternative approaches, etc.

In short, the patch samples the tables and uses those samples to estimate selectivity for scans and joins. The samples are collected during planning, which may be quite expensive - random I/O for each query, etc. It'd be possible to build them during analyze, but that'd require solving serialization, tweaking CREATE STATISTICS to handle join queries, etc. I decided to keep the PoC simple. It still uses CREATE STATISTICS with a new "sample" kind, instructing the optimizer to use sampling when estimating clauses on the attributes.

A little example demonstrating what the patch does:

create table t (a int, b int, c int);
insert into t select mod(i,10), mod(i,20), mod(i,40) from generate_series(1,1000) s(i);
analyze t;

-- estimate without any statistics / sampling
explain analyze select * from t where a = 0 and b = 0 and c = 0;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..229055.00 rows=1361 width=12) (actual time=0.025..761.571 rows=25 loops=1)
   Filter: ((a = 0) AND (b = 0) AND (c = 0))
   Rows Removed by Filter: 975
 Planning Time: 0.471 ms
 Execution Time: 901.182 ms
(5 rows)

-- enable sampling on those columns
create statistics s (sample) on a, b, c from t;

explain analyze select * from t where a = 0 and b = 0 and c = 0;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..229055.00 rows=250390 width=12) (actual time=0.307..717.937 rows=25 loops=1)
   Filter: ((a = 0) AND (b = 0) AND (c = 0))
   Rows Removed by Filter: 975
 Planning Time: 194.528 ms
 Execution Time: 851.832 ms
(5 rows)

Of course, in this case a MCV would work well too, because there are very few combinations in (a,b,c) - a sample would work even when that's not the case, and it has various other benefits (can estimate almost any expression while MCV supports only a subset, etc.)

Now, let's look at a join between a fact and a dimension table:

create table f (d1 int, d2 int, f1 int, f2 int, f3 int);
create table d (d1 int, d2 int, d3 int, d4 int, d5 int, primary key (d1, d2));
insert into d select i, i, mod(i,100), mod(i,100), mod(i,100) from generate_series(0,999) s(i);
insert into f select mod(i,1000), mod(i,1000), mod(i,100), mod(i,100), mod(i,100) from generate_series(1,100) s(i);
analyze f, d;

explain analyze select * from f join d using (d1,d2) where f1 < 50 and f2 < 50 and d3 < 50 and d4 < 50;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Hash Join  (cost=25.75..22717.01 rows=63 width=32) (actual time=3.197..861.899 rows=50 loops=1)
   Hash Cond: ((f.d1 = d.d1) AND (f.d2 = d.d2))
   ->  Seq Scan on f  (cost=0.00..21370.00 rows=251669 width=20) (actual time=0.033..315.401 rows=50 loops=1)
         Filter: ((f1 < 50) AND (f2 < 50))
         Rows Removed by Filter: 50
   ->  Hash  (cost=22.00..22.00 rows=250 width=20) (actual time=3.139..3.141 rows=500 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 34kB
         ->  Seq Scan on d  (cost=0.00..22.00 rows=250 width=20) (actual time=0.018..1.706 rows=500 loops=1)
               Filter: ((d3 < 50) AND (d4 < 50))
               Rows Removed by Filter: 500
 Planning Time: 0.806 ms
 Execution Time: 1099.229 ms
(12 rows)

So, not great - underestimated by 1x is likely to lead to inefficient plans.

And now with the samples enabled on both sides:

create statistics s1 (sample) on d1, d2, f1, f2, f3 from f;
create statistics s2 (sample) on d1, d2, d3, d4, d5 from d;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Hash Join  (cost=29.50..24057.25 rows=503170 width=32) (actual time=0.630..837.483 rows=50 loops=1)
   Hash Cond: ((f.d1 = d.d1) AND (f.d2 = d.d2))
   ->  Seq Scan on f  (cost=0.00..21370.00 rows=503879 width=20) (actual time=0.008..301.584 rows=50 loops=1)
         Filter: ((f1 < 50) AND (f2 < 50))
         Rows Removed by Filter: 50
   ->  Hash  (cost=22.00..22.00 rows=500 width=20) (actual time=0.616..0.618 rows=500 loops=1)
         Buckets: 1024  Batches: 1  Memory
Re: pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
>> However, if slurp_file fails it raises an exception and aborts the
>> whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
>> keep the eval, as attached. I tested it by changing the file name so
>> that the slurp fails.
>
> Seem quite unnecessary. We haven't found that to be an issue elsewhere
> in the code where slurp_file is used. And in the present case we know
> the file exists because we got its name from list_files().

Fine with me!

-- Fabien.
Re: pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
>> Seem quite unnecessary. We haven't found that to be an issue elsewhere
>> in the code where slurp_file is used. And in the present case we know
>> the file exists because we got its name from list_files().
>
> Agreed. That's an exchange between a hard failure mid-test and a
> failure while letting the whole test run. Here, we expect the test to
> find the log file all the time, so a hard failure does not sound like
> a bad thing to me either.

Ok, fine with me!

-- Fabien.
Re: pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
On Sat, Jun 26, 2021 at 11:01:07AM -0400, Andrew Dunstan wrote:
> On 6/26/21 2:47 AM, Fabien COELHO wrote:
>> However, if slurp_file fails it raises an exception and aborts the
>> whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
>> keep the eval, as attached. I tested it by changing the file name so
>> that the slurp fails.
>
> Seem quite unnecessary. We haven't found that to be an issue elsewhere
> in the code where slurp_file is used. And in the present case we know
> the file exists because we got its name from list_files().

Agreed. That's an exchange between a hard failure mid-test and a failure while letting the whole test run. Here, we expect the test to find the log file all the time, so a hard failure does not sound like a bad thing to me either.
--
Michael

signature.asc
Description: PGP signature
Re: pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
On 6/26/21 2:47 AM, Fabien COELHO wrote:
> Hello Andrew & Michaël,
>
> My 0.02€:
>
>> There's a whole lot wrong with this code. To start with, why is that
>> unchecked eval there.
>
> Yep. The idea was that other tests would go on being collected eg if
> the file is not found, but it should have been checked anyway.
>
>> And why is it reading in log files on its own instead of using
>> TestLib::slurp_file, which, among other things, normalizes line
>> endings?
>
> Indeed.
>
> However, if slurp_file fails it raises an exception and aborts the
> whole TAP unexpectedly, which is pretty unclean. So I'd suggest to
> keep the eval, as attached. I tested it by changing the file name so
> that the slurp fails.

Seem quite unnecessary. We haven't found that to be an issue elsewhere in the code where slurp_file is used. And in the present case we know the file exists because we got its name from list_files().

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: pgsql: Fix pattern matching logic for logs in TAP tests of pgbench
Hello Andrew & Michaël,

My 0.02€:

> There's a whole lot wrong with this code. To start with, why is that
> unchecked eval there.

Yep. The idea was that other tests would go on being collected eg if the file is not found, but it should have been checked anyway.

> And why is it reading in log files on its own instead of using
> TestLib::slurp_file, which, among other things, normalizes line
> endings?

Indeed.

However, if slurp_file fails it raises an exception and aborts the whole TAP unexpectedly, which is pretty unclean. So I'd suggest to keep the eval, as attached. I tested it by changing the file name so that the slurp fails.

There's a very good chance that this latter is the issue. It only affects msys which is why you didn't see an issue on MSVC.

> And also, why does it carefully unlink the log files so that any trace
> of what's gone wrong is deleted?

> Based on the little I've seen this file needs a serious code review.

Probably: My very old perl expertise is fading away because I'm not using it much these days. Cannot say I miss it:-)

-- Fabien.

diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 781cc08fb1..6a82a3494d 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1190,29 +1190,32 @@ sub check_pgbench_logs
 	ok(@logs == $nb, "number of log files");
 	ok(grep(/\/$prefix\.\d+(\.\d+)?$/, @logs) == $nb, "file name format");
 
-	my $log_number = 0;
 	for my $log (sort @logs)
 	{
-		# Check the contents of each log file.
-		my $contents_raw = slurp_file($log);
+		eval {
+			# Check the contents of each log file.
+			my $contents_raw = slurp_file($log);
 
-		my @contents = split(/\n/, $contents_raw);
-		my $clen = @contents;
-		ok( $min <= $clen && $clen <= $max,
-			"transaction count for $log ($clen)");
-		my $clen_match = grep(/$re/, @contents);
-		ok($clen_match == $clen, "transaction format for $prefix");
+			my @contents = split(/\n/, $contents_raw);
+			my $clen = @contents;
+			ok( $min <= $clen && $clen <= $max,
+				"transaction count for $log ($clen)");
+			my $clen_match = grep(/$re/, @contents);
+			ok($clen_match == $clen, "transaction format for $prefix");
 
-		# Show more information if some logs don't match
-		# to help with debugging.
-		if ($clen_match != $clen)
-		{
-			foreach my $log (@contents)
+			# Show more information if some logs don't match
+			# to help with debugging.
+			if ($clen_match != $clen)
 			{
-				print "# Log entry not matching: $log\n"
-					unless $log =~ /$re/;
+				foreach my $log (@contents)
+				{
+					print "# Log entry not matching: $log\n"
+						unless $log =~ /$re/;
+				}
 			}
-		}
+		};
+		# Tell if an exception was raised
+		ok(0, "Error while checking log file \"$log\": $@") if $@;
 	}
 	return;
 }
RE: [Patch] change the default value of shared_buffers in postgresql.conf.sample
> On the whole this seems pretty cosmetic so I'm inclined to leave it
> alone. But if we were going to do anything I think that adjusting both
> initdb.c and guc.c to use 128MB might be the most appropriate thing.

Thank you for your suggestions. initdb.c and guc.c have been modified together.

Best Regards!
Zhangjie

-----Original Message-----
From: Tom Lane
Sent: Thursday, June 24, 2021 11:50 PM
To: Zhang, Jie/张 杰
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: [Patch] change the default value of shared_buffers in postgresql.conf.sample

"zhangj...@fujitsu.com" writes:
> In PostgreSQL 14, the default value of shared_buffers is 128MB, but in
> postgresql.conf.sample, the default value of shared_buffers is 32MB.
> I think the following changes should be made.
> File: postgresql\src\backend\utils\misc\postgresql.conf.sample
> #shared_buffers = 32MB => #shared_buffers = 128MB

As submitted, this patch breaks initdb, which is looking for the exact string "#shared_buffers = 32MB". We could adjust that too of course, but I'm dubious first that any change is needed, and second that this is the right one:

1. Since initdb will replace that string, users will never see this entry as-is in live databases. So is it worth doing anything?

2. The *actual*, hard-wired, default in guc.c is 1024 (8MB), not either of these numbers. So maybe the sample file ought to use that instead. Or maybe we should change that value too ... it's surely as obsolete as can be.

On the whole this seems pretty cosmetic so I'm inclined to leave it alone. But if we were going to do anything I think that adjusting both initdb.c and guc.c to use 128MB might be the most appropriate thing.

regards, tom lane

postgresql.conf.sample_0625.patch
Description: postgresql.conf.sample_0625.patch
Overflow hazard in pgbench
moonjelly just reported an interesting failure [1]. It seems that with the latest bleeding-edge gcc, this code is misoptimized:

    /* check random range */
    if (imin > imax)
    {
        pg_log_error("empty range given to random");
        return false;
    }
    else if (imax - imin < 0 || (imax - imin) + 1 < 0)
    {
        /* prevent int overflows in random functions */
        pg_log_error("random range is too large");
        return false;
    }

such that the second if-test doesn't fire. Now, according to the C99 spec this code is broken, because the compiler is allowed to assume that signed integer overflow doesn't happen, whereupon the second if-block is provably unreachable. The failure still represents a gcc bug, because we're using -fwrapv which should disable that assumption. However, not all compilers have that switch, so it'd be better to code this in a spec-compliant way.

I suggest applying the attached in branches that have the required functions.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=moonjelly&dt=2021-06-26%2007%3A03%3A17

regards, tom lane

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e61055b6b7..c4023bfa27 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2450,7 +2450,8 @@ evalStandardFunc(CState *st,
 		case PGBENCH_RANDOM_ZIPFIAN:
 			{
 				int64		imin,
-							imax;
+							imax,
+							delta;
 
 				Assert(nargs >= 2);
@@ -2464,7 +2465,8 @@
 					pg_log_error("empty range given to random");
 					return false;
 				}
-				else if (imax - imin < 0 || (imax - imin) + 1 < 0)
+				else if (pg_sub_s64_overflow(imax, imin, &delta) ||
+						 pg_add_s64_overflow(delta, 1, &delta))
 				{
 					/* prevent int overflows in random functions */
 					pg_log_error("random range is too large");
Re: Farewell greeting
On Sun, Jun 27, 2021 at 12:41 AM tsunakawa.ta...@fujitsu.com wrote: > I'm moving to another company from July 1st. I may possibly not be allowed > to do open source activities there, so let me say goodbye once. It's a pity that you may not be around anymore. I wish you all the best. -- Peter Geoghegan
Re: Composite types as parameters
Elijah Stone writes:
> On Sat, 26 Jun 2021, Tom Lane wrote:
>> If it still doesn't work, please provide a more concrete example.
> Thanks, unfortunately adding the explicit cast doesn't help. I've
> attached a minimal runnable example.

So your problem is that you're explicitly saying that the input is of generic-RECORD type. You should let the server infer its type, instead, which it can easily do from context in this example. That is, pass zero as the type OID, or leave out the paramTypes array altogether. The example works for me with this change:

@@ -30,13 +30,13 @@
 	// error:
 	check(PQexecParams(c, "INSERT INTO tab VALUES($1, 8);",
-		1, &(Oid){RECORDOID}, &(const char*){recbuf},
+		1, &(Oid){0}, &(const char*){recbuf},
 		&(int){rec - recbuf}, &(int){1/*binary*/}, 1/*binary result*/));
 
 	// error as well:
 	check(PQexecParams(c, "INSERT INTO tab VALUES($1::some_record, 8);",
-		1, &(Oid){RECORDOID}, &(const char*){recbuf},
+		1, &(Oid){0}, &(const char*){recbuf},
 		&(int){rec - recbuf}, &(int){1}, 1));

In more complicated cases you might need to fetch the composite type's actual OID and pass that. But I'd go with the lazy approach until forced to do differently.

> Is there a way to discover the OID of a composite type? And is the wire
> format the same as for a generic record?

Same as for any other type: SELECT 'mytypename'::regtype::oid. And yes.

regards, tom lane
Re: Deparsing rewritten query
On Sun, Jun 27, 2021 at 11:14:05AM -0400, Tom Lane wrote:
> Julien Rouhaud writes:
>> Agreed. On the other hand having such a function in core may ensure
>> that any significant change in those areas will keep an API to
>> retrieve the final query representation.
>
> My point is precisely that I'm unwilling to make such a promise.
>
> I do not buy that this capability is worth very much, given that
> we've gotten along fine without it for twenty-plus years. If you
> want to have it as an internal, might-change-at-any-time API,
> that seems all right.

I'm totally fine with a might-change-at-any-time API, but not with a might-disappear-at-any-time API. If exposing get_query_def() can become virtually useless in a few releases, with no hope of getting the required in-core changes for retrieving the final query representation, I agree that we can stop the discussion here.
Re: Deparsing rewritten query
Julien Rouhaud writes:
> On Sun, Jun 27, 2021 at 10:34:52AM -0400, Tom Lane wrote:
>> It's not very hard to imagine someday moving view expansion into the
>> planner on efficiency grounds, leaving the rewriter handling only the
>> rare uses of INSERT/UPDATE/DELETE rules.
> Agreed. On the other hand having such a function in core may ensure
> that any significant change in those areas will keep an API to
> retrieve the final query representation.

My point is precisely that I'm unwilling to make such a promise.

I do not buy that this capability is worth very much, given that we've gotten along fine without it for twenty-plus years. If you want to have it as an internal, might-change-at-any-time API, that seems all right. If you're trying to lock it down as something that will be there forevermore, you're likely to end up with nothing.

regards, tom lane
Re: Deparsing rewritten query
On Sun, Jun 27, 2021 at 10:34:52AM -0400, Tom Lane wrote:
> I'm not really excited by this, as it seems like it's exposing internal
> decisions we could change someday; to wit, first that there is any such
> thing as a separate rewriting pass

Sure, but the fact that views will significantly change the query being executed from the one written is not an internal decision. In my opinion knowing what the final "real" query will be is still a valid concern, whether we have a rewriting pass or not.

> and second that its output is interpretable as pure SQL. (TBH, I'm not
> 100% sure that the second assumption is true even today, although I
> know there are ancient comments that claim that.)

I totally agree. Note that there was at least one gotcha handled in this patch: rewritten views didn't get an alias, which is mandatory for an SQL query.

> It's not very hard to imagine someday moving view expansion into the
> planner on efficiency grounds, leaving the rewriter handling only the
> rare uses of INSERT/UPDATE/DELETE rules.

Agreed. On the other hand having such a function in core may ensure that any significant change in those areas will keep an API to retrieve the final query representation.

> If there's functions in ruleutils.c that we'd need to make public to
> let somebody write a debugging extension that does this kind of thing,
> I'd be happier with that approach than with creating a core-server SQL
> function for it. There might be more than one use-case for the
> exposed bits.

It would mean exposing at least get_query_def(). I thought that exposing this function was already suggested and refused, but I may be wrong. Maybe other people would like to have nearby functions exposed too.

Note that if we go this way, we would still need at least something like this patch's chunk in rewriteHandler.c applied, as otherwise the vast majority of rewritten and deparsed queries won't be valid.
code fork June 28th
I am planning on forking the tree so we can start adding developments for Postgres 15 in the upcoming commitfest. This will be done tomorrow, June 28, late morning US East coast time. I will be following the procedures laid out in src/tools/RELEASE_CHANGES under the heading "Starting a New Development Cycle". cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Deparsing rewritten query
Julien Rouhaud writes:
> As an alternative, maybe we could expose a simple SRF that would take
> care of rewriting the query and deparsing the resulting query tree(s)?

I'm not really excited by this, as it seems like it's exposing internal decisions we could change someday; to wit, first that there is any such thing as a separate rewriting pass, and second that its output is interpretable as pure SQL. (TBH, I'm not 100% sure that the second assumption is true even today, although I know there are ancient comments that claim that.) It's not very hard to imagine someday moving view expansion into the planner on efficiency grounds, leaving the rewriter handling only the rare uses of INSERT/UPDATE/DELETE rules.

If there's functions in ruleutils.c that we'd need to make public to let somebody write a debugging extension that does this kind of thing, I'd be happier with that approach than with creating a core-server SQL function for it. There might be more than one use-case for the exposed bits.

regards, tom lane
Re: Deparsing rewritten query
On Sun, Jun 27, 2021 at 10:06:00AM -0300, Ranier Vilela wrote:
> 1. strcmp(sql, "") could not be replaced by sql[0] == '\0'?

It's a debugging leftover that I forgot to remove. There's no point trying to catch an empty string, as e.g. a single space would be exactly the same, and both would be caught by the next (and correct) test.

> 3. initStringInfo() inside a loop, wouldn't it be exaggerated? each
> time call palloc0.

initStringInfo calls palloc, not palloc0. It's unlikely to make any difference. Rules have been strongly discouraged for many years, and if you have enough of them to make a noticeable difference here, you probably have bigger problems. But no objection to resetting the StringInfo instead.
Re: Deparsing rewritten query
> Hi,
>
> I sometimes have to deal with queries referencing multiple and/or
> complex views. In such cases, it's quite troublesome to figure out what
> is the query really executed. Debug_print_rewritten isn't really useful
> for non trivial queries, and manually doing the view expansion isn't
> great either.
>
> While not being ideal, I wouldn't mind using a custom extension for
> that but this isn't an option as get_query_def() is private and isn't
> likely to change.
>
> As an alternative, maybe we could expose a simple SRF that would take
> care of rewriting the query and deparsing the resulting query tree(s)?
>
> I'm attaching a POC patch for that, adding a new pg_get_query_def(text)
> SRF.

+1

If you don't mind, I made small corrections to your patch.

1. strcmp(sql, "") could not be replaced by sql[0] == '\0'?
2. the error message would not be: "empty statement not allowed"?
3. initStringInfo() inside a loop, wouldn't it be exaggerated? each time call palloc0.

regards, Ranier Vilela

v1-0002-Add-pg_get_query_def-to-deparse-and-print-a-rewri.patch
Description: Binary data
Re: pglz compression performance, take two
> On 20 Mar 2021, at 00:35, Mark Dilger wrote:
>
>> On Jan 21, 2021, at 6:48 PM, Justin Pryzby wrote:
>>
>> @cfbot: rebased
>> <0001-Reorganize-pglz-compression-code.patch>
>
> Review comments.

Thanks for the review, Mark! And sorry for such a long delay, I've been trying to figure out a way to do things in a less platform-dependent manner. And here's what I've come up with.

We use pglz_read32() not the way xxhash and lz4 do - we really do not need to get a 4-byte value, we only need to compare 4 bytes at once. So, essentially, we need to compare two implementations of a 4-byte comparison:

bool cmp_a(const void *ptr1, const void *ptr2)
{
    return *(const uint32_t *) ptr1 == *(const uint32_t *) ptr2;
}

bool cmp_b(const void *ptr1, const void *ptr2)
{
    return memcmp(ptr1, ptr2, 4) == 0;
}

Variant B is more portable. Inspecting it in Godbolt's Compiler Explorer I've found out that for GCC 7.1+ it generates assembly without a memcmp() call. For x86-64 and ARM64 the assembly of cmp_b is identical to that of cmp_a. So I think maybe we could just stick with version cmp_b instead of optimising for ARM6 and similar architectures like Arduino.

I've benchmarked the patch with "REINDEX table pgbench_accounts" on pgbench -i of scale 100. wal_compression was on, other settings were default. Without the patch it takes ~11055.077 ms on my machine, with the patch it takes ~9512.411 ms, a 14% speedup overall.

PFA v5. Thanks!

Best regards, Andrey Borodin.

v5-0001-Reorganize-pglz-compression-code.patch
Description: Binary data
Re: Farewell greeting
On Sun, Jun 27, 2021 at 07:41:19AM +, tsunakawa.ta...@fujitsu.com wrote: > > I'm moving to another company from July 1st. I may possibly not be allowed > to do open source activities there, so let me say goodbye once. Thank you > all for your help. I really enjoyed learning PostgreSQL and participating in > its development. > > It's a pity that I may not be able to part of PostgreSQL's great history > until it becomes the most popular database (in the DB-Engines ranking.) > However, if possible, I'd like to come back as just MauMau. > > I hope you all will succeed. Wish you all the best!
Re: Farewell greeting
> Hello all, > > > I'm moving to another company from July 1st. I may possibly not be allowed > to do open source activities there, so let me say goodbye once. Thank you > all for your help. I really enjoyed learning PostgreSQL and participating in > its development. > > It's a pity that I may not be able to part of PostgreSQL's great history > until it becomes the most popular database (in the DB-Engines ranking.) > However, if possible, I'd like to come back as just MauMau. > > I hope you all will succeed. > > > Regards > Takayuki Tsunakawa Good luck, Tsunakawa-san! -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Farewell greeting
Hello all,

I'm moving to another company from July 1st. I may possibly not be allowed to do open source activities there, so let me say goodbye once. Thank you all for your help. I really enjoyed learning PostgreSQL and participating in its development.

It's a pity that I may not be able to be part of PostgreSQL's great history until it becomes the most popular database (in the DB-Engines ranking). However, if possible, I'd like to come back as just MauMau.

I hope you all will succeed.

Regards
Takayuki Tsunakawa