Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Sorry, I was confused about minRecoveryPoint. I have reconsidered a bit.

At Tue, 14 Jun 2016 20:31:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160614.203111.229211034.horiguchi.kyot...@lab.ntt.co.jp>
> > > After looking more closely, I found that the minRecoveryPoint
> > > tends to be too small as the backup end point, and up to the
> > > record at the lastReplayedRecPtr can affect the pages on disk and
> > > they can go into the backup just taken.
> > >
> > > My conclusion here is that do_pg_stop_backup should return
> > > lastReplayedRecPtr, not minRecoveryPoint.
> >
> > I have been thinking quite a bit about this patch, and this logic
> > sounds quite right to me. When stopping the backup we need to let the
> > user know up to which point it needs to replay WAL, and relation pages
> > are touched up to lastReplayedEndRecPtr.
>
> Yes, but by design, the changes in buffers don't go to disk
> until buffer replacement occurs, which updates minRecoveryPoint. My
> understanding is that the problem is that a restartpoint that is
> not accompanied by buffer updates advances only the redo point
> of the last checkpoint and doesn't update minRecoveryPoint, which
> may be behind the redo point at that time.
>
> It seems to me that we could more aggressively advance the
> minRecoveryPoint (but must not let it go too far..), but it is
> right for it to aim a bit smaller than the ideal location.

That was wrong. minRecoveryPoint should be greater than or equal to the
maximum buffer-touching LSN reached in previous recoveries; it can be
equal to replayEndRecPtr, but may lag behind it if no actual
modification of page files happens afterwards. xlog.c works that way.
A minRecoveryPoint smaller than the redo point of the last checkpoint
with no buffer flush is allowable from this point of view, but it is
not proper as the end point of a backup.
If we skip recording the last checkpoint position when it eventually
causes no buffer flush, minRecoveryPoint is again usable for the
purpose. However, that causes repeated restartpoint attempts on the
same (skipped) checkpoint record. As a consequence, we can also solve
this problem by explicitly updating minRecoveryPoint for an executed
restartpoint that entailed no buffer flush. The attached patch takes
this approach and also solves the problem. Which one do you think is
preferable? Or any other solution?

This patch updates minRecoveryPoint only for restartpoints that caused
no buffer flush, but such a restriction might not be necessary.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

>From 5927a196295eea63424011c15d7359ed8141812c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Wed, 15 Jun 2016 12:00:33 +0900
Subject: [PATCH] Advance minRecoveryPoint for an executed empty restartpoint.

Currently, a restartpoint with no buffer flush doesn't update
minRecoveryPoint but does update lastCheckPoint. This can cause
do_pg_stop_backup() to return a stop LSN smaller than the start LSN
given by do_pg_start_backup(), which leads to failure in taking a base
backup. This patch lets CreateRestartPoint update minRecoveryPoint if
an executed restartpoint is accompanied by no buffer flush.
---
 src/backend/access/transam/xlog.c     | 9 +
 src/test/recovery/t/001_stream_rep.pl | 5 +
 2 files changed, 14 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e4645a3..7697223 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8797,6 +8797,15 @@ CreateRestartPoint(int flags)
 	CheckPointGuts(lastCheckPoint.redo, flags);
 
 	/*
+	 * basebackup needs minRecoveryPoint to be greater than or equal to the
+	 * redo point of this checkpoint. If no buffer is flushed so far,
+	 * minRecoveryPoint has not advanced during this checkpoint. So explicitly
+	 * advance it to there for the case.
+	 */
+	if (CheckpointStats.ckpt_bufs_written == 0)
+		UpdateMinRecoveryPoint(lastCheckPointRecPtr, false);
+
+	/*
 	 * Remember the prior checkpoint's redo pointer, used later to determine
 	 * the point at which we can truncate the log.
 	 */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 7b42f21..cee4768 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -24,6 +24,11 @@ $node_standby_1->start;
 # pg_basebackup works on a standby).
 $node_standby_1->backup($backup_name);
 
+# Take a second backup of the standby while the master is offline.
+$node_master->stop;
+$node_standby_1->backup('my_backup_2');
+$node_master->start;
+
 # Create second standby node linking to standby 1
 my $node_standby_2 = get_new_node('standby_2');
 $node_standby_2->init_from_backup($node_standby_1, $backup_name,
-- 
1.8.3.1

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increase message string buffer size of watch command of psql
Ioseph Kim writes:
> On 2016-06-15 01:56, Tom Lane wrote:
>> I take it from the vast silence that nobody particularly cares one way
>> or the other. On reflection I think that this would be a good change
>> to make, so I'll go do so unless I hear complaints soon. regards, tom
>> lane

> I propose changing from asctime() to the SQL current_timestamp value;
> then users can change the date format with the SET command and DateStyle.

That would require an additional SQL query each time through the loop,
which seems like undue expense. It's also not terribly friendly to the
goal of localization, I should think, given the limited number of
datestyle options and the fact that none of them actually change day or
month names to non-English choices. And it would imply changing the
timestamps from psql's timezone to the backend's. While that might have
been a good way to do it in a green field, it's not the way \watch has
worked in the past, and given the lack of complaints I'm disinclined to
change that.

			regards, tom lane
Re: [HACKERS] increase message string buffer size of watch command of psql
On 2016-06-15 01:56, Tom Lane wrote:
> I take it from the vast silence that nobody particularly cares one way
> or the other. On reflection I think that this would be a good change
> to make, so I'll go do so unless I hear complaints soon. regards, tom
> lane

I propose changing from asctime() to the SQL current_timestamp value;
then users can change the date format with the SET command and DateStyle.

Regards, Ioseph.
Re: [HACKERS] 10.0
On 06/14/2016 05:08 PM, Cat wrote:
> We have the capability to provide (semi-)structured data. Might be an
> idea to make greater use of it.

postgres=# SELECT * from to_json(row(current_setting('server_version_num'))) as version;

Sincerely,

jD

--
Command Prompt, Inc.  http://the.postgres.company/  +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Re: [HACKERS] 10.0
On 15/06/16 02:08, Cat wrote:
> is it possible to introduce a JSONB output to it.

No thanks.

--
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr  PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On 2016/06/15 0:50, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 4:06 AM, Amit Langote wrote:
>> You're right. It indeed should be possible to push down the ft1-ft2 join.
>> However it could not be done without also modifying
>> build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do
>> upthread). Attached new version of the patch with the following changes:
>>
>> 1) Modified the check in foreign_join_ok() so that it is not overly
>> restrictive. Previously, it used ph_needed as the highest level at which
>> the PHV is needed (by definition) and checked using it whether it is above
>> the join under consideration, which ended up being overkill. ISTM, we
>> can just decide from joinrel->relids of the join under consideration
>> whether we are above the lowest level where the PHV could be evaluated
>> (ph_eval_at) and stop if so. So in the example you provided, it won't now
>> reject {ft1, ft2}, but will {ft4, ft1, ft2}.
>>
>> 2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in
>> the targetlist to deparse by using the PVC_RECURSE_PLACEHOLDERS flag of
>> pull_var_clause(). That is because we do not yet support anything other
>> than Vars in deparseExplicitTargetList() (+1 to your patch to change the
>> Assert to elog).
>
> OK, I committed this version with some cosmetic changes. I ripped out
> the header comment of foreign_join_ok(), which is just a nuisance to
> maintain. It unnecessarily recapitulates the tests contained within
> the function, requiring us to update the comments in two places
> instead of one every time we make a change, for no real gain. And I
> rewrote the comment you added.

Thanks.

Regards,
Amit
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 01:38:44PM -0700, Joshua D. Drake wrote:
> On 06/14/2016 12:46 PM, Jim Nasby wrote:
>
> > Any ideas on naming for such a function? version_detail()? I suppose
> > while we're at this we might as well provide the compile details as well.
>
> version(detail) or version(verbose)

If we're looking at forward-only changes, is it possible to introduce a
JSONB output for it? Then people can rip out whichever component they
want at will. For example:

  { "full": 10.0, "major": 10, "patchlevel": 0 }

and whatever else may be pertinent. I used numeric types above, but they
can be strings if that works better.

We have the capability to provide (semi-)structured data. Might be an
idea to make greater use of it.

--
"A search of his car uncovered pornography, a homemade sex aid, women's
stockings and a Jack Russell terrier."
- http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-118083480
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 12:44 AM, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 8:11 AM, Robert Haas wrote:
>>>> I noticed that the tuples that it reported were always offset 1 in a
>>>> page, and that the page always had a maxoff over a couple of hundred,
>>>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>>>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>>>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>>>> It did that because HEAP_XMAX_COMMITTED was not set and
>>>> TransactionIdIsInProgress returned true for xmax.
>>>
>>> So this seems like it might be a visibility map bug rather than a bug
>>> in the test code, but I'm not completely sure of that. How was it
>>> legitimate to mark the page as all-visible if a tuple on the page
>>> still had a live xmax? If xmax is live and not just a locker then the
>>> tuple is not visible to the transaction that wrote xmax, at least.
>>
>> Ah, wait a minute. I see how this could happen. Hang on, let me
>> update the pg_visibility patch.
>
> The problem should be fixed in the attached revision of
> pg_check_visible. I think what happened is:
>
> 1. pg_check_visible computed an OldestXmin.
> 2. Some transaction committed.
> 3. VACUUM computed a newer OldestXmin and marked a page all-visible with it.
> 4. pg_check_visible then used its older OldestXmin to check the
> visibility of tuples on that page, and saw delete-in-progress as a
> result.
>
> I added a guard against a similar scenario involving xmin in the last
> version of this patch, but forgot that we need to protect xmax in the
> same way. With this version of the patch, I can no longer get any
> TIDs to pop up out of pg_check_visible in my testing. (I haven't run
> your test script for lack of the proper Python environment...)

I can still reproduce the problem with this new patch. What I see is
that the OldestXmin, the new RecomputedOldestXmin and the tuple's xmax
are all the same.
--
Thomas Munro
http://www.enterprisedb.com
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 5:58 PM, Merlin Moncure wrote:
> On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby wrote:
>> On 6/14/16 3:56 PM, Tom Lane wrote:
>>> Jim Nasby writes:
>>>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>>>> This seems kind of silly, because anybody who is writing code that
>>>>> might have to run against an existing version of the database won't
>>>>> be able to use it. The one thing that absolutely has to be
>>>>> cross-version is the method of determining which version you're
>>>>> running against.
>>>
>>>> We're talking about a function that doesn't currently exist anyway.
>>>
>>> Huh? We're talking about PQserverVersion(), comparisons to
>>> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
>>> trying to supplant them seems like a giant waste of time.
>>>
>>> On the other hand, parsing fields out of version() mechanically has
>>> been deprecated for as long as those other APIs have existed (which is
>>> since 8.0 or so). version() is only meant for human consumption, so I
>>> see no reason it shouldn't just start returning "10.0", "10.1", etc.
>>> If that breaks anyone's code, well, they should have switched to one
>>> of the easier methods years ago.
>>
>> The original post was:
>>
>>> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3
>>
>> and \df *version* on my HEAD doesn't show any other options.
>
> Right. It's the only way to handle things well at the SQL level; that,
> and pg_settings approaches. In other words, there is no easier way. I
> think it's pretty reasonable to assume there's a lot more code
> interfacing with the database from SQL than from C.

We could stand to be more explicit here. The docs for version() indicate
that server_version_num should be used for "machine processing".
The implied correct way to access the canonical server version is thus:

SELECT current_setting('server_version_num');

I'd say we should at least provide the above as an example; the reader
can find their way to Chapter 18.1 if they are curious about
alternatives.

On the topic of option "D", I'd be fine with functions: and ; but that
is independent of this discussion.

Option E: Give the DBA control. If they know they have one or more
misbehaving applications, but it is out of their control to patch the
code to work properly, they can change this supposedly human-readable
output to conform to the historical x.y.z format. This would be
disabled by default. Humans can easily interpret both versions, so
please save the comment about not having GUCs that influence
user-visible behavior. If your argument for changing the format
outright is "it's for human consumption only", then apparently this
output should not be considered important enough to adhere to that
rule. Non-humans depending on its format are subject to our, or the
DBA's, whims.

David J.
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby wrote:
> On 6/14/16 3:56 PM, Tom Lane wrote:
>> Jim Nasby writes:
>>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>>> This seems kind of silly, because anybody who is writing code that
>>>> might have to run against an existing version of the database won't be
>>>> able to use it. The one thing that absolutely has to be cross-version
>>>> is the method of determining which version you're running against.
>>
>>> We're talking about a function that doesn't currently exist anyway.
>>
>> Huh? We're talking about PQserverVersion(), comparisons to
>> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
>> trying to supplant them seems like a giant waste of time.
>>
>> On the other hand, parsing fields out of version() mechanically has
>> been deprecated for as long as those other APIs have existed (which is
>> since 8.0 or so). version() is only meant for human consumption, so I
>> see no reason it shouldn't just start returning "10.0", "10.1", etc.
>> If that breaks anyone's code, well, they should have switched to one
>> of the easier methods years ago.
>
> The original post was:
>
>> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3
>
> and \df *version* on my HEAD doesn't show any other options.

Right. It's the only way to handle things well at the SQL level; that,
and pg_settings approaches. In other words, there is no easier way. I
think it's pretty reasonable to assume there's a lot more code
interfacing with the database from SQL than from C.

merlin
Re: [HACKERS] 10.0
On 6/14/16 3:56 PM, Tom Lane wrote:
> Jim Nasby writes:
>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>> This seems kind of silly, because anybody who is writing code that
>>> might have to run against an existing version of the database won't be
>>> able to use it. The one thing that absolutely has to be cross-version
>>> is the method of determining which version you're running against.
>
>> We're talking about a function that doesn't currently exist anyway.
>
> Huh? We're talking about PQserverVersion(), comparisons to
> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
> trying to supplant them seems like a giant waste of time.
>
> On the other hand, parsing fields out of version() mechanically has
> been deprecated for as long as those other APIs have existed (which is
> since 8.0 or so). version() is only meant for human consumption, so I
> see no reason it shouldn't just start returning "10.0", "10.1", etc.
> If that breaks anyone's code, well, they should have switched to one
> of the easier methods years ago.

The original post was:

> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3

and \df *version* on my HEAD doesn't show any other options.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Use of index for 50% column restriction
On Tue, Jun 14, 2016 at 03:24:12PM -0500, Jim Nasby wrote:
> On 6/8/16 4:36 PM, Bruce Momjian wrote:
> > Just a follow-up, but even with a randomized correlation order, it
> > seems 25% restrictivity generates a Bitmap Index Scan:
>
> AFAIK we do the bitmap heap scan in heap order, thereby eliminating the
> effect of correlation?

High correlation would still cause us to access fewer heap pages than
random data.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] 10.0
On 6/14/16 3:38 PM, Joshua D. Drake wrote:
> On 06/14/2016 12:46 PM, Jim Nasby wrote:
>> Any ideas on naming for such a function? version_detail()? I suppose
>> while we're at this we might as well provide the compile details as well.
>
> version(detail) or version(verbose)

I don't think that makes as much sense as a different function name,
since the output is fundamentally different than version().

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
[HACKERS] Should pg_export_snapshot() and currtid() be tagged parallel-unsafe?
Digging through the sqlsmith logging db, I noticed the following errors:

ERROR:  cannot update SecondarySnapshot during a parallel operation
ERROR:  cannot assign XIDs during a parallel operation

Queries raising the first one always contain calls to currtid() or
currtid2(). Queries raising the second one always contain a call to
pg_export_snapshot(). Patch to tag them as unsafe attached.

regards,
Andreas

diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f33c3ff..6a65e77 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -1347,9 +1347,9 @@
 DESCR("trigger to suppress updates when new and old records match");
 DATA(insert OID = 1292 (  tideq PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tideq _null_ _null_ _null_ ));
-DATA(insert OID = 1293 (  currtid PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
+DATA(insert OID = 1293 (  currtid PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
-DATA(insert OID = 1294 (  currtid2 PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
+DATA(insert OID = 1294 (  currtid2 PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
 DATA(insert OID = 1265 (  tidne PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidne _null_ _null_ _null_ ));
 DATA(insert OID = 2790 (  tidgt PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidgt _null_ _null_ _null_ ));
@@ -3135,7 +3135,7 @@
 DESCR("xlog filename, given an xlog location");
 DATA(insert OID = 3165 (  pg_xlog_location_diff PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1700 "3220 3220" _null_ _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
 DESCR("difference in bytes, given two xlog locations");
-DATA(insert OID = 3809 (  pg_export_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f v r 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
+DATA(insert OID = 3809 (  pg_export_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
 DATA(insert OID = 3810 (  pg_is_in_recovery PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_is_in_recovery _null_ _null_ _null_ ));
Re: [HACKERS] 10.0
Jim Nasby writes:
> On 6/14/16 3:01 PM, Robert Haas wrote:
>> This seems kind of silly, because anybody who is writing code that
>> might have to run against an existing version of the database won't be
>> able to use it. The one thing that absolutely has to be cross-version
>> is the method of determining which version you're running against.

> We're talking about a function that doesn't currently exist anyway.

Huh? We're talking about PQserverVersion(), comparisons to
PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
trying to supplant them seems like a giant waste of time.

On the other hand, parsing fields out of version() mechanically has been
deprecated for as long as those other APIs have existed (which is since
8.0 or so). version() is only meant for human consumption, so I see no
reason it shouldn't just start returning "10.0", "10.1", etc. If that
breaks anyone's code, well, they should have switched to one of the
easier methods years ago.

			regards, tom lane
Re: [HACKERS] 10.0
On 6/14/16 3:01 PM, Robert Haas wrote:
>> D) Add a version function to 10.0 that returns both parts separately.
>>
>> My vote is D. Parsing version() output is a wart, but coming out with a
>> split output version of that in 9.6 that still has to support 3 numbers
>> would also be a wart. We've lived with the parsing wart this long, so
>> lets just add an explicit output version to 10.0.
>>
>> Any ideas on naming for such a function? version_detail()? I suppose
>> while we're at this we might as well provide the compile details as well.
>
> This seems kind of silly, because anybody who is writing code that
> might have to run against an existing version of the database won't be
> able to use it. The one thing that absolutely has to be cross-version
> is the method of determining which version you're running against.

We're talking about a function that doesn't currently exist anyway. So
no matter what, you won't be able to use it if you're interested in
<10.0 (or <9.6 if we went with one of the other proposals). Unless
folks were thinking this is something that would be backpatched?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Use of index for 50% column restriction
On 6/8/16 4:36 PM, Bruce Momjian wrote:
> Just a follow-up, but even with a randomized correlation order, it seems
> 25% restrictivity generates a Bitmap Index Scan:

AFAIK we do the bitmap heap scan in heap order, thereby eliminating the
effect of correlation?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] 10.0
On 06/14/2016 12:46 PM, Jim Nasby wrote:
> Any ideas on naming for such a function? version_detail()? I suppose
> while we're at this we might as well provide the compile details as well.

version(detail) or version(verbose)

JD

--
Command Prompt, Inc.  http://the.postgres.company/  +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Re: [HACKERS] gettimeofday is at the end of its usefulness?
On 6/8/16 9:56 AM, Tom Lane wrote:
> Thom Brown writes:
>> On 15 May 2014 at 19:56, Bruce Momjian wrote:
>>> On Tue, May 13, 2014 at 06:58:11PM -0400, Tom Lane wrote:
>>>> A recent question from Tim Kane prompted me to measure the overhead
>>>> costs of EXPLAIN ANALYZE, which I'd not checked in awhile. Things
>>>> are far worse than I thought. On my current server (by no means
>>>> lavish hardware: Xeon E5-2609 @2.40GHz) a simple seqscan can run at
>>>> something like 110 nsec per row:
>
>> Did this idea die, or is it still worth considering?
>
> We still have a problem, for sure. I'm not sure that there was any
> consensus on what to do about it. Using clock_gettime(CLOCK_REALTIME)
> if available would be a straightforward change that should ameliorate
> gettimeofday()'s 1-usec-precision-limit problem; but it doesn't do
> anything to fix the excessive-overhead problem. The ideas about the
> latter were all over the map, and none of them looked easy. If you're
> feeling motivated to work on this area, feel free.

Semi-related: someone (Robert, I think) recently mentioned investigating
"vectorized" executor nodes, where multiple tuples would be processed in
one shot. If we had that, presumably the EXPLAIN penalty would be a moot
point.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Prepared statements and generic plans
On Tue, Jun 14, 2016 at 08:20:12AM -0400, ''br...@momjian.us' *EXTERN*' wrote:
> > This has caused confusion in the past, see
> > https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com
> >
> > > Right. Updated patch attached.
> >
> > I am happy with the patch as it is.

Good. Patch applied. Thanks for the assistance.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] parallel.c is not marked as test covered
On Tue, Jun 14, 2016 at 1:14 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote:
>>> FWIW, I follow all of your reasoning except this. If we believe that
>>> the parallel worker context line is useful, then it is a bug that
>>> plpgsql suppresses it. If we don't believe it's useful, then we should
>>> get rid of it. "Do nothing" is simply not a consistent stance here.
>
>> Well, if PL/pgSQL suppresses context and nobody's complained about
>> that up until now, fixing it doesn't seem any more urgent than any
>> other bug we've had for N releases.
>
> I have not dug into the code enough to find out exactly what's happening
> in Peter's complaint, but it seems like it would be a good idea to find
> that out before arguing along these lines. It seems entirely likely
> to me that this is somehow parallel-query-specific. Even if it isn't,
> I don't buy your argument. Adding a new case in which context is
> suppressed is a perfectly reasonable basis for thinking that an old
> bug has acquired new urgency.

OK. I find this whole thing slightly puzzling because Noah wrote this
in the test file:

-- Provoke error in worker.  The original message CONTEXT contains a worker
-- PID that must be hidden in the test output.  PL/pgSQL conveniently
-- substitutes its own CONTEXT.

The phrasing of the comment implies that (1) the behavior is desirable,
at least for the purpose at hand, and (2) the behavior is unrelated to
parallel query. However, it's surely possible that (2) is false and (1)
is not the consensus. Noah is usually pretty careful about this sort of
thing, but he might have made a mistake. Considering that, I decided to
take a look. I wasn't able in a quick test to verify that this behavior
doesn't happen without parallel query. I also wasn't able to figure out
exactly why it was happening. I did verify that this statement generates
an error that is propagated to the leader.
The expected output looks like this: ERROR: invalid input syntax for integer: "BA" CONTEXT: SQL statement "select stringu1::int2 from tenk1 where unique1 = 1" PL/pgSQL function inline_code_block line 5 at SQL statement The first line of context begins with "SQL statement" and I believe that's coming from spi.c. The second line of the context seems to come from PL/pgsql. But what accounts for the failure of the "parallel query" line to appear in the context message? This could be explained by _SPI_error_callback flushing the existing context in its handler, but it doesn't seem to do that. I guess this needs some more looking at. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 3:46 PM, Jim Nasby wrote: > On 6/13/16 2:12 PM, Merlin Moncure wrote: >> >> A) make a variant of version() that returns major/minor/bugfix as >> separate fields with minor being set to 0 for all released versions >> 10.0 and beyond. We could then add a NOTE to the version function and >> other places suggesting to use that instead for 9.6. >> >> B) Preserve x.y.z as returned by version() and show server_version for >> those usages only, with 'y' being always 0 for 10.0 and beyond. For >> all other purposes (marketing/documentation/etc) that structure would >> *not* be preserved, and we would put notes in the documentation >> describing why the extra digit is there. >> >> C) Do neither A or B, and let our users discover such issues on their own. > > > D) Add a version function to 10.0 that returns both parts separately. > > My vote is D. Parsing version() output is a wart, but coming out with a > split output version of that in 9.6 that still has to support 3 numbers > would also be a wart. We've lived with the parsing wart this long, so let's > just add an explicit output version to 10.0. > > Any ideas on naming for such a function? version_detail()? I suppose while > we're at this we might as well provide the compile details as well. This seems kind of silly, because anybody who is writing code that might have to run against an existing version of the database won't be able to use it. The one thing that absolutely has to be cross-version is the method of determining which version you're running against. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
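[Editor's illustration, not code from the thread: Robert's point is that version-detection code must work against every release a client might connect to, which means tolerating both the historical three-part numbering ("9.6.3") and the two-part scheme proposed for 10.0 and later. A minimal, hedged sketch of such tolerant parsing:]

```python
def parse_pg_version(version_string):
    """Split a PostgreSQL version string into (major, minor).

    Pre-10 releases use three numbers ("9.6.3": major series "9.6",
    minor 3); the scheme discussed for 10 and later uses two
    ("10.4": major series "10", minor 4).
    """
    parts = version_string.split(".")
    if int(parts[0]) >= 10:
        # Two-part scheme: MAJOR.MINOR
        return parts[0], int(parts[1]) if len(parts) > 1 else 0
    # Historical three-part scheme: MAJOR.MAJOR.MINOR
    return ".".join(parts[:2]), int(parts[2]) if len(parts) > 2 else 0
```

A client that parses this way keeps working whichever naming the project settles on; relying on a new server-side function, as Robert notes, would not help against existing servers.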
Re: [HACKERS] WIP: Data at rest encryption
On 6/12/16 2:13 AM, Ants Aasma wrote: On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote: > 1. Instead of doing the entire database files encryption, how about > providing user an option to protect only some particular tables that > wants the encryption at table/tablespace level. This not only provides > an option to the user, it reduces the performance impact on tables > that doesn't need any encryption. The problem with this approach > is that every xlog record needs to validate to handle the encryption > /decryption, instead of at page level. Is there a real need for this? The customers I have talked to want to encrypt the whole database and my goal is to make the feature fast enough to make that feasible for pretty much everyone. I guess switching encryption off per table would be feasible, but the key setup would still need to be done at server startup. Per record encryption would result in some additional information leakage though. Overall I thought it would not be worth it, but I'm willing to have my mind changed on this. I actually design with this in mind. Tables that contain sensitive info go into designated schemas, partly so that you can blanket move all of those to an encrypted tablespace (or safer would be to move things not in those schemas to an unencrypted tablespace). Since that can be done with an encrypted filesystem maybe that's good enough. (It's not really clear to me what this buys us over an encrypted FS, other than a feature comparison checkmark...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Parallel safety tagging of extension functions
On Tue, Jun 14, 2016 at 1:51 PM, Robert Haas wrote: > On Tue, Jun 14, 2016 at 6:37 AM, Andreas Karlsson wrote: >> I have rebased all my patches on the current master now (and skipped the >> extensions I previously listed). > > Thanks, this is really helpful. It was starting to get hard to keep > track of what hadn't been applied yet. I decided to prioritize > getting committed the patches where the extension version had already > been bumped by 749a787c5b25ae33b3d4da0ef12aa05214aa73c7, so I've now > committed the patches for cube, hstore, intarray, ltree, pg_trgm, and > seg. I've now also committed the patches for sslinfo, unaccent, uuid-ossp, and xml2. I took a look at the patch for tsearch2, but I think token_type() is mismarked. You have it marked PARALLEL SAFE but it seems to depend on the result of GetCurrentParser(), which returns a backend-private state variable. That was the only clear mistake I found, but I tend to think that changing the markings on anything defined by UNSUPPORTED_FUNCTION() is pretty silly, because there's no point in going to extra planner effort to generate a parallel plan only to error out as soon as we try to execute it. I think you should leave all of those out of the patch. I also took a look at the patch for tablefunc. I think that you've got the markings right here, but I think that it would be good to add PARALLEL UNSAFE explicitly to the 1.1 version of the file for the functions that are unsafe, and add a comment like "-- query might do anything" or some other indication as to why they are so marked, for the benefit of future readers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 10.0
On 6/13/16 2:12 PM, Merlin Moncure wrote: A) make a variant of version() that returns major/minor/bugfix as separate fields with minor being set to 0 for all released versions 10.0 and beyond. We could then add a NOTE to the version function and other places suggesting to use that instead for 9.6. B) Preserve x.y.z as returned by version() and show server_version for those usages only, with 'y' being always 0 for 10.0 and beyond. For all other purposes (marketing/documentation/etc) that structure would *not* be preserved, and we would put notes in the documentation describing why the extra digit is there. C) Do neither A or B, and let our users discover such issues on their own. D) Add a version function to 10.0 that returns both parts separately. My vote is D. Parsing version() output is a wart, but coming out with a split output version of that in 9.6 that still has to support 3 numbers would also be a wart. We've lived with the parsing wart this long, so let's just add an explicit output version to 10.0. Any ideas on naming for such a function? version_detail()? I suppose while we're at this we might as well provide the compile details as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Parallel safety tagging of extension functions
On Tue, Jun 14, 2016 at 6:37 AM, Andreas Karlsson wrote: > I have rebased all my patches on the current master now (and skipped the > extensions I previously listed). Thanks, this is really helpful. It was starting to get hard to keep track of what hadn't been applied yet. I decided to prioritize getting committed the patches where the extension version had already been bumped by 749a787c5b25ae33b3d4da0ef12aa05214aa73c7, so I've now committed the patches for cube, hstore, intarray, ltree, pg_trgm, and seg. However, in pg_trgm, I changed some of the functions that you had marked as PARALLEL RESTRICTED to be PARALLEL SAFE, because I didn't see any reason why they needed to be PARALLEL RESTRICTED. It's OK for a parallel-safe function to depend on GUC values, because those are synchronized from the leader to all worker processes. Random global variables won't necessarily be kept in sync, but anything controlled by the GUC mechanism will be. If there's some other reason you think those functions aren't parallel-safe, please let me know. I'm still in favor of rejecting the adminpack patch. To me, that just seems like attaching a larger magazine to the gun pointed at your foot. I can't deny that in a strict sense those functions are parallel-safe, but I think they are better left alone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallel.c is not marked as test covered
Robert Haas writes: > On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote: >> FWIW, I follow all of your reasoning except this. If we believe that the >> parallel worker context line is useful, then it is a bug that plpgsql >> suppresses it. If we don't believe it's useful, then we should get >> rid of it. "Do nothing" is simply not a consistent stance here. > Well, if PL/pgsql suppresses context and nobody's complained about > that up until now, fixing it doesn't seem any more urgent than any > other bug we've had for N releases. I have not dug into the code enough to find out exactly what's happening in Peter's complaint, but it seems like it would be a good idea to find that out before arguing along these lines. It seems entirely likely to me that this is somehow parallel-query-specific. Even if it isn't, I don't buy your argument. Adding a new case in which context is suppressed is a perfectly reasonable basis for thinking that an old bug has acquired new urgency. regards, tom lane
Re: [HACKERS] Rename max_parallel_degree?
On Tue, Jun 14, 2016 at 12:16 PM, Tom Lane wrote: > Robert Haas writes: >> Of course, it would be nice if we could make these counters 64-bit >> integers, but we can't, because we don't rely on 64-bit reads and >> writes to be atomic on all platforms. So instead they'll have to be >> uint32. That means they could wrap (if you really work at it) but >> subtraction will still return the right answer, so it's OK. > > OK ... > >> If we >> want to allow the number of parallel workers started to be available >> for statistical purposes, we can keep to uint32 values for that >> (parallel_register_count_lo and parallel_register_count_hi, for >> example), and increment the second one whenever the first one rolls >> over to zero. > > And that's going to be atomic how exactly? The only process that can look at that structure without taking a lock is the postmaster. And the postmaster would only examine parallel_register_count_lo. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
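[Editor's illustration, not code from the thread: the claim quoted above, that subtraction of two uint32 counters still returns the right answer even after the register counter has wrapped, follows from modular arithmetic. Python is used here only to model C unsigned behavior; `u32` stands in for the truncation C performs automatically:]

```python
UINT32_MAX = 2**32 - 1

def u32(x):
    """Truncate to uint32, the way C unsigned arithmetic wraps."""
    return x & UINT32_MAX

# Model counters where registrations have rolled past 2^32 but
# terminations have not yet caught up.
register_count = u32(UINT32_MAX + 5)   # wrapped around to 4
terminate_count = UINT32_MAX - 2       # not yet wrapped

# Modular subtraction still yields the true number of active
# workers, because (a - b) mod 2^32 is wrap-insensitive as long as
# the true difference fits in 32 bits.
active = u32(register_count - terminate_count)
assert active == 7
```

This is exactly why a pair of uint32 counters suffices for "currently active" bookkeeping even without 64-bit atomic reads.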
Re: [HACKERS] parallel.c is not marked as test covered
On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: >>> On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut >>> wrote: Elsewhere in this thread I suggested getting rid of the parallel worker context by default (except for debugging), but if we do want to keep it, then it seems to be a bug that a PL/pgSQL function can just eliminate it. > >> This is currently listed as an open item, but it doesn't seem very >> actionable to me. The fact that PL/plpgsql chucks the existing >> context instead of appending to it is presumably a property of >> PL/plpgsql, not parallel query, and changing that seems like it ought >> to be out of scope for 9.6. > > FWIW, I follow all of your reasoning except this. If we believe that the > parallel worker context line is useful, then it is a bug that plpgsql > suppresses it. If we don't believe it's useful, then we should get > rid of it. "Do nothing" is simply not a consistent stance here. Well, if PL/pgsql suppresses context and nobody's complained about that up until now, fixing it doesn't seem any more urgent than any other bug we've had for N releases. That would go on the 9.6 open items list in the section entitled "Older Bugs", where it would have plenty of company. Any time somebody wants to fix one of those, they can, and that would be great, but there's no more or less urgency right now than, say, four months ago, or six months from now. It can't be said that this open item is holding up the release if it's just a rediscovery of an existing behavior which somebody happens not to like. On the other hand, if PL/pgsql does not suppress context in general but suppresses only this one particular bit of context from parallel query, then that is probably a bug in new code which should be fixed before release. But I don't think that's what is being argued. Am I confused?
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increase message string buffer size of watch command of psql
I wrote: > Robert Haas writes: >> On Sun, Jun 12, 2016 at 10:55 AM, Ioseph Kim wrote: >>> Increase size of this title, please. >>> 50 bytes is so small for multi language. >>> And. I suggest that date string might be local language, >>> or current_timestamp string. >> This was already changed in commit dea2b5960. > Well, we did part of that, but it's still using asctime(). Should we > change that to strftime(..."%c"...) to be less English-centric? > (The result seems to be the same in C locale. pg_controldata has done > it that way for a long time, with few complaints.) If we want to do so, > now would be the time, since 9.6 already whacked around the format > of \watch output. I take it from the vast silence that nobody particularly cares one way or the other. On reflection I think that this would be a good change to make, so I'll go do so unless I hear complaints soon. regards, tom lane
Re: [HACKERS] parallel.c is not marked as test covered
Robert Haas writes: > On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: >> On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut >> wrote: >>> Elsewhere in this thread I suggested getting rid of the parallel worker >>> context by default (except for debugging), but if we do want to keep it, >>> then it seems to be a bug that a PL/pgSQL function can just eliminate it. > This is currently listed as an open item, but it doesn't seem very > actionable to me. The fact that PL/plpgsql chucks the existing > context instead of appending to it is presumably a property of > PL/plpgsql, not parallel query, and changing that seems like it ought > to be out of scope for 9.6. FWIW, I follow all of your reasoning except this. If we believe that the parallel worker context line is useful, then it is a bug that plpgsql suppresses it. If we don't believe it's useful, then we should get rid of it. "Do nothing" is simply not a consistent stance here. regards, tom lane
Re: [HACKERS] parallel.c is not marked as test covered
On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: > On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut > wrote: >> Regarding the patch that ended up being committed, I wonder if it is >> intentional that PL/pgSQL overwrites the context from the parallel worker. >> Shouldn't the context effectively look like >> >> ERROR: message >> CONTEXT: parallel worker >> CONTEXT: PL/pgSQL function >> >> Elsewhere in this thread I suggested getting rid of the parallel worker >> context by default (except for debugging), but if we do want to keep it, >> then it seems to be a bug that a PL/pgSQL function can just eliminate it. > > Several people have suggested getting rid of that now, so maybe we > should just go ahead and do it. > > How would we make it available for debugging, though? That seems like > something people will want. This is currently listed as an open item, but it doesn't seem very actionable to me. The fact that PL/plpgsql chucks the existing context instead of appending to it is presumably a property of PL/plpgsql, not parallel query, and changing that seems like it ought to be out of scope for 9.6. As far as the context is concerned, the reason why that's there is because I felt (and still feel, to some extent) that users might experience errors that happen inside a parallel worker and it might be important for debugging purposes to know that. Suppose, for example, that you mislabel a function as PARALLEL SAFE when in fact it relies on some backend-local state (and should therefore be PARALLEL RESTRICTED). When it runs in a worker, it might generate an error message like this: ERROR: can't generate random numbers because you haven't specified a seed ...to which the user will reply, "oh yes I did; in fact I ran SELECT magic_setseed(42) just before I ran the offending query!". They'll then contact an expert (hopefully) who will very possibly spend a lot of time troubleshooting the wrong thing.
If the message says: ERROR: can't generate random numbers because you haven't specified a seed CONTEXT: parallel worker, PID 62413 ...then the expert has a very good chance of guessing what has gone wrong right away. Now, against this scenario, there is every possibility that this message will just annoy a lot of people. It is certainly annoying for regression test authors because the PID changes and, even if we took the PID out, any given error might sometimes happen in the leader rather than the worker, depending on timing. I am not aware of a concrete reason why it will annoy people in other situations, but there may well be one (or more). The simplest thing we could do here is nothing. We can wait to see what happens with PostgreSQL 9.6 and then decide whether to make a change. Or we can do something now. We can eliminate the context entirely and take our chances with scenarios like the one mentioned above. Or we can add a new GUC, something like hide_parallel_context, which suppresses it and turn that off when running regression tests, or even by default. It's all just a matter of how much it bothers you to sometimes get an extra context line vs. how much annoyance you think will be caused by people wasting time troubleshooting because they didn't realize parallel query was involved. Personally, I'm very doubtful about our ability to paper over artifacts like this. Sophisticated users are going to end up having to understand that there are parallel workers and that they propagate their messages back to the leader, which appends its own context. We often seem to assume users won't find out about internal details like that and so we don't make an effort to expose and document them, but I don't think that often works out. There are a whole lot of people who know that they need to run EXPLAIN 6 times to find out what's really going on. I don't expect this case to be any different. 
I'm actually sort of flattered that parallel query is working well enough that multiple people think they might not ever need to know whether an error comes from the worker or from the leader, but I'm rather doubtful that will be true in practice. All that having been said, I don't know everything and this is just a judgement call. I exercised my judgement and that's how we got here. If there's a consensus that my judgement should be overruled, I'm fine with that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
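[Editor's illustration, not code from the thread: the mechanics under debate — the leader appending a "parallel worker" CONTEXT line to a propagated error, and an intervening handler replacing rather than extending the context list — can be modeled with a toy stack. This is a sketch of the concept only, not PostgreSQL's elog machinery:]

```python
class ErrorContextStack:
    """Toy model of error-context callbacks: each frame contributes
    one CONTEXT line to the reported error."""

    def __init__(self):
        self.frames = []

    def push(self, line):
        self.frames.append(line)

    def render(self):
        return ["CONTEXT: " + f for f in self.frames]

# An error raised in a worker arrives at the leader with an extra
# context line identifying the worker...
stack = ErrorContextStack()
stack.push('SQL statement "select ..."')
stack.push("parallel worker, PID 62413")
assert len(stack.render()) == 2

# ...but a handler that *replaces* the frame list (rather than
# appending its own frame) loses the worker line, which is the
# suppression behavior discussed in this thread.
stack.frames = ["PL/pgSQL function inline_code_block line 5"]
assert "parallel worker" not in " ".join(stack.render())
```

The debate above is whether that worker line is valuable enough (for diagnosing mislabeled PARALLEL SAFE functions, say) that dropping it should count as a bug.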
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
Amit Kapila writes: > On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane wrote: >> ... I got a core dump in the window.sql test: >> which I think may be another manifestation of the failure-to-apply-proper-pathtarget issue we're looking at in this thread. Or maybe it's just >> an unjustified assumption in make_partialgroup_input_target that the >> input path must always have some sortgrouprefs assigned. > I don't see this problem after your recent commit > - 89d53515e53ea080029894939118365b647489b3. Are you still getting it? No. I am suspicious that there's still a bug there, ie we're looking at the wrong pathtarget; but the regression test doesn't actually crash. That might only be because we don't choose the parallelized path. regards, tom lane
Re: [HACKERS] Rename max_parallel_degree?
Robert Haas writes: > Of course, it would be nice if we could make these counters 64-bit > integers, but we can't, because we don't rely on 64-bit reads and > writes to be atomic on all platforms. So instead they'll have to be > uint32. That means they could wrap (if you really work at it) but > subtraction will still return the right answer, so it's OK. OK ... > If we > want to allow the number of parallel workers started to be available > for statistical purposes, we can keep to uint32 values for that > (parallel_register_count_lo and parallel_register_count_hi, for > example), and increment the second one whenever the first one rolls > over to zero. And that's going to be atomic how exactly? regards, tom lane
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
Robert Haas writes: > On Mon, Jun 13, 2016 at 6:21 PM, Tom Lane wrote: >> I think this is bad because it forces any external creators of >> UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody >> if anyone is out of sync on whether to set the flag. So I'd rather keep >> set_grouped_rel_consider_parallel(), even if all it does is the above. >> And make it global not static. Ditto for the other upperrels. > I'm slightly mystified by this because we really shouldn't be setting > that flag more than once. We don't want to do that work repeatedly, > just once, and prior to adding any paths to the rel. Are you > imagining that somebody would try to create grouped paths before we > finish scan/join planning? Exactly. The original pathification design contemplated that FDWs and other extensions could inject Paths for the upperrels whenever they felt like it, specifically during relation path building. Even if you think that's silly, it *must* be possible to do it during GetForeignUpperPaths, else that hook is completely useless. Therefore, adding any data other than new Paths to the upperrel during create_grouping_paths is a broken design unless there is some easy, reliable, and not too expensive way for an FDW to make the same thing happen a bit earlier. See the commentary in optimizer/README starting at line 922. I generally don't like rel->consider_parallel at all for basically this reason, but it may be too late to undo that aspect of the parallel join planning design. I will settle for having an API call that allows FDWs to get the flag set correctly. Another possibility is to set the flag before calling GetForeignUpperPaths, but that might be messy too. regards, tom lane
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On Tue, Jun 14, 2016 at 4:06 AM, Amit Langote wrote: > On 2016/06/14 6:51, Robert Haas wrote: >> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote: >>> Although I have done a bit of review of this patch, it needs more >>> thought than I have so far had time to give it. I will update again >>> by Tuesday. >> >> I've reviewed this a bit further and have discovered an infelicity. >> The following is all with the patch applied. >> >> By itself, this join can be pushed down: >> >> contrib_regression=# EXPLAIN SELECT 13 FROM ft1 RIGHT JOIN ft2 ON >> ft1.c1 = ft2.c1; >> QUERY PLAN >> -- >> Foreign Scan (cost=100.00..137.66 rows=822 width=4) >>Relations: (public.ft2) LEFT JOIN (public.ft1) >> (2 rows) >> >> That's great. However, when that query is used as the outer rel of a >> left join, it can't: >> >> contrib_regression=# explain verbose select * from ft4 LEFT JOIN >> (SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1) q on true; >> QUERY PLAN >> - >> Nested Loop Left Join (cost=353.50..920.77 rows=41100 width=19) >>Output: ft4.c1, ft4.c2, ft4.c3, (13) >>-> Foreign Scan on public.ft4 (cost=100.00..102.50 rows=50 width=15) >> Output: ft4.c1, ft4.c2, ft4.c3 >> Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" >>-> Materialize (cost=253.50..306.57 rows=822 width=4) >> Output: (13) >> -> Hash Left Join (cost=253.50..302.46 rows=822 width=4) >>Output: 13 >>Hash Cond: (ft2.c1 = ft1.c1) >>-> Foreign Scan on public.ft2 (cost=100.00..137.66 >> rows=822 width=4) >> Output: ft2.c1 >> Remote SQL: SELECT "C 1" FROM "S 1"."T 1" >>-> Hash (cost=141.00..141.00 rows=1000 width=4) >> Output: ft1.c1 >> -> Foreign Scan on public.ft1 >> (cost=100.00..141.00 rows=1000 width=4) >>Output: ft1.c1 >>Remote SQL: SELECT "C 1" FROM "S 1"."T 1" >> (18 rows) >> >> Of course, because of the PlaceHolderVar, there's no way to push down >> the ft1-ft2-ft4 join to the remote side. But we could still push down >> the ft1-ft2 join and then locally perform the join between the result >> and ft4.
However, the proposed fix doesn't allow that, because >> ph_eval_at is (b 4 5) and relids for the ft1-ft2 join is also (b 4 5), >> and so the bms_is_subset(phinfo->ph_eval_at, relids) test returns >> true. > > You're right. It indeed should be possible to push down ft1-ft2 join. > However it could not be done without also modifying > build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do > upthread). Attached new version of the patch with following changes: > > 1) Modified the check in foreign_join_ok() so that it is not overly > restrictive. Previously, it used ph_needed as the highest level at which > the PHV is needed (by definition) and checked using it whether it is above > the join under consideration, which ended up being an overkill. ISTM, we > can just decide from joinrel->relids of the join under consideration > whether we are above the lowest level where the PHV could be evaluated > (ph_eval_at) and stop if so. So in the example you provided, it won't now > reject {ft1, ft2}, but will {ft4, ft1, ft2}. > > 2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in > the targetlist to deparse by using PVC_RECURSE_PLACEHOLDER flag of > pull_var_clause(). That is because we do not yet support anything other > than Vars in deparseExplicitTargetList() (+1 to your patch to change the > Assert to elog). OK, I committed this version with some cosmetic changes. I ripped out the header comment to foreign_join_ok(), which is just a nuisance to maintain. It unnecessarily recapitulates the tests contained within the function, requiring us to update the comments in two places instead of one every time we make a change for no real gain. And I rewrote the comment you added. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane wrote: > > Robert Haas writes: > > On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane wrote: > >> BTW, decent regression tests could be written without the need to create > >> enormous tables if the minimum rel size in create_plain_partial_paths() > >> could be configured to something less than 1000 blocks. I think it's > >> fairly crazy that that arbitrary constant is hard-wired anyway. Should > >> we make it a GUC? > > > That was proposed before, and I didn't do it mostly because I couldn't > > think of a name for it that didn't sound unbelievably corny. > > min_parallel_relation_size, or min_parallelizable_relation_size, or > something like that? > You are right that such a variable will make it simpler to write tests for parallel query. I have implemented such a GUC and chose to keep the name as min_parallel_relation_size. One thing to note is that in function create_plain_partial_paths(), currently it is using PG_INT32_MAX/3 for parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so as to be consistent with guc.c. I am not sure if it is advisable to use PG_INT32_MAX in guc.c as other similar parameters use INT_MAX. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com min_parallel_relation_size_v1.patch Description: Binary data
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
David Rowley writes: > Do you think it's worth also applying the attached so as to have > ressortgroupref set to NULL more consistently, instead of sometimes > NULL and other times allocated to memory wastefully filled with zeros? Meh --- that seems to add complication without buying a whole lot. regards, tom lane
Re: [HACKERS] Use of CREATE OR REPLACE in bloom--1.0.sql
Andreas Karlsson writes: > I do not think that extension SQL scripts should be using CREATE OR > REPLACE FUNCTION like bloom--1.0.sql currently does. I suspect that this > might just be a typo. It's definitely a bug. Grepping around found another instance in sslinfo, and I also noticed the lack of a standard header on this script and others. Pushed, thanks for noticing that! regards, tom lane
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Bernd Helmle writes: > --On 14 June 2016 10:32:13 + Albe Laurenz > wrote: >> I first thought of using the internal ROWID column that's probably >> similar to your case, but that wouldn't fit into a tid's 6 bytes, and I >> found that I could only add resjunk columns for existing columns of the >> table. >> Making the internal ROWID an explicit column in the foreign table seemed >> just too ugly. > The Informix FDW uses SelfItemPointerAttributeNumber. Luckily the Informix > ROWID is a 4 byte encoded identifier (3 first significant bytes are the > logical page number, last significant bytes is the slot number within that > page). Maybe you can find a way of logically addressing your data, too? It > only needs to fit within 6 bytes, afaik. There's been some speculation about allowing FDWs to control the type of the CTID column created for a foreign table, but it hasn't gotten past the speculation stage yet. regards, tom lane
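[Editor's illustration, not code from the thread: the 6-byte budget everyone keeps citing comes from ItemPointerData's layout, a 4-byte block number plus a 2-byte offset. A remote row identifier can ride in a tid only if it packs into that split, as the Informix ROWID happens to. A pack/unpack sketch of the constraint (the big-endian byte order here is arbitrary, chosen for illustration):]

```python
import struct

def pack_rowid_as_tid(page, slot):
    """Encode a (page, slot) pair into 6 bytes, ItemPointer-style:
    a uint32 block number followed by a uint16 offset."""
    if not (0 <= page < 2**32 and 0 <= slot < 2**16):
        raise ValueError("row identifier does not fit in a 6-byte tid")
    return struct.pack(">IH", page, slot)

def unpack_tid(tid_bytes):
    """Recover (page, slot) from the 6-byte encoding."""
    return struct.unpack(">IH", tid_bytes)

tid = pack_rowid_as_tid(123456, 7)
assert len(tid) == 6
assert unpack_tid(tid) == (123456, 7)
```

Identifiers wider than this (an 8-byte surrogate key, say) simply cannot be smuggled through a tid, which is what motivates both the hidden-pseudo-column hack in this thread and Tom's note about letting FDWs control the CTID column's type.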
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
A very quick and dirty hack I did in src/backend/optimizer/plan/initsplan.c
(in 9.5.3):

--- initsplan.c.orig	2016-06-14 19:08:27.0 +0600
+++ initsplan.c	2016-06-14 19:10:55.0 +0600
@@ -185,9 +185,12 @@
 		if (IsA(node, Var))
 		{
 			Var		   *var = (Var *) node;
-			RelOptInfo *rel = find_base_rel(root, var->varno);
+			RelOptInfo *rel;
 			int			attno = var->varattno;
 
+			if (var->varno == INDEX_VAR)
+				continue;
+
+			rel = find_base_rel(root, var->varno);
 			if (bms_is_subset(where_needed, rel->relids))
 				continue;
 			Assert(attno >= rel->min_attr && attno <= rel->max_attr);

And then in my FDW I add the tuple id column like this:

static void
MyAddForeignUpdateTargets(Query *parsetree,
			  RangeTblEntry *target_rte,
			  Relation target_relation)
{
	Var		   *var;
	TargetEntry *tle;

	/* Make a Var representing the desired value */
	var = makeVar(INDEX_VAR,	/* instead of parsetree->resultRelation */
		      target_relation->rd_att->natts + 1,
		      INT8OID,
		      -1,
		      InvalidOid,
		      0);

	/* Wrap it in a resjunk TLE with the right name ... */
	tle = makeTargetEntry((Expr *) var,
			      list_length(parsetree->targetList) + 1,
			      pstrdup(MY_FDW_TUPLE_ID),
			      true);

	/* ... and add it to the query's targetlist */
	parsetree->targetList = lappend(parsetree->targetList, tle);
}

I was able to run successfully a couple of very simple tests with these.
This seems to indicate that tweaking the core to handle this case properly
is doable. The question is if this approach is conceptually correct and
if so what are the other required places to patch.

Regards,
Aleksey
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Tue, Jun 14, 2016 at 2:16 AM, Tom Lane wrote:
>
> Robert Haas writes:
> > In practice, we don't yet have the ability for
> > parallel-safe paths from subqueries to affect planning at higher query
> > levels, but that's because the pathification stuff landed too late in
> > the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.
>

Won't the patch as written allow parallel-restricted things to be pushed
to workers for UNION ALL kind of queries?

Explain verbose
Select * from (SELECT c1+1 FROM t1 UNION ALL SELECT c1+1 FROM t2 where c1 < 10) ss(a);

In above kind of queries, set_rel_consider_parallel() might set
consider_parallel as true for rel, but later in set_append_rel_size(),
it might pull some unsafe clauses into the target of childrel.
Basically, I am wondering about the same problem as we discussed here [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
--On 14. Juni 2016 10:32:13 + Albe Laurenz wrote:

> I first thought of using the internal ROWID column that's probably
> similar to your case, but that wouldn't fit into a tid's 6 bytes, and I
> found that I could only add resjunk columns for existing columns of the
> table.
> Making the internal ROWID an explicit column in the foreign table seemed
> just too ugly.

The Informix FDW uses SelfItemPointerAttributeNumber. Luckily the Informix
ROWID is a 4 byte encoded identifier (3 first significant bytes are the
logical page number, last significant bytes is the slot number within that
page). Maybe you can find a way of logically addressing your data, too?
It only needs to fit within 6 bytes, afaik.

--
Thanks

	Bernd
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 8:11 AM, Robert Haas wrote:
>>> I noticed that the tuples that it reported were always offset 1 in a
>>> page, and that the page always had a maxoff over a couple of hundred,
>>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>>> It did that because HEAP_XMAX_COMMITTED was not set and
>>> TransactionIdIsInProgress returned true for xmax.
>>
>> So this seems like it might be a visibility map bug rather than a bug
>> in the test code, but I'm not completely sure of that. How was it
>> legitimate to mark the page as all-visible if a tuple on the page
>> still had a live xmax? If xmax is live and not just a locker then the
>> tuple is not visible to the transaction that wrote xmax, at least.
>
> Ah, wait a minute. I see how this could happen. Hang on, let me
> update the pg_visibility patch.

The problem should be fixed in the attached revision of pg_check_visible.
I think what happened is:

1. pg_check_visible computed an OldestXmin.
2. Some transaction committed.
3. VACUUM computed a newer OldestXmin and marked a page all-visible with it.
4. pg_check_visible then used its older OldestXmin to check the visibility
   of tuples on that page, and saw delete-in-progress as a result.

I added a guard against a similar scenario involving xmin in the last
version of this patch, but forgot that we need to protect xmax in the
same way. With this version of the patch, I can no longer get any TIDs
to pop up out of pg_check_visible in my testing. (I haven't run your
test script for lack of the proper Python environment...)
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company From 18815b0d6fcfc2048e47f104ef85ee981687d4de Mon Sep 17 00:00:00 2001 From: Robert Haas Date: Fri, 10 Jun 2016 14:42:46 -0400 Subject: [PATCH 1/2] Add integrity-checking functions to pg_visibility. The new pg_check_visible() and pg_check_frozen() functions can be used to verify that the visibility map bits for a relation's data pages match the actual state of the tuples on those pages. Amit Kapila and Robert Haas, reviewed by Andres Freund. Additional testing help by Thomas Munro. --- contrib/pg_visibility/Makefile| 2 +- contrib/pg_visibility/pg_visibility--1.0--1.1.sql | 17 ++ contrib/pg_visibility/pg_visibility--1.0.sql | 52 contrib/pg_visibility/pg_visibility--1.1.sql | 67 + contrib/pg_visibility/pg_visibility.c | 313 ++ contrib/pg_visibility/pg_visibility.control | 2 +- doc/src/sgml/pgvisibility.sgml| 28 +- src/tools/pgindent/typedefs.list | 1 + 8 files changed, 427 insertions(+), 55 deletions(-) create mode 100644 contrib/pg_visibility/pg_visibility--1.0--1.1.sql delete mode 100644 contrib/pg_visibility/pg_visibility--1.0.sql create mode 100644 contrib/pg_visibility/pg_visibility--1.1.sql diff --git a/contrib/pg_visibility/Makefile b/contrib/pg_visibility/Makefile index fbbaa2e..379591a 100644 --- a/contrib/pg_visibility/Makefile +++ b/contrib/pg_visibility/Makefile @@ -4,7 +4,7 @@ MODULE_big = pg_visibility OBJS = pg_visibility.o $(WIN32RES) EXTENSION = pg_visibility -DATA = pg_visibility--1.0.sql +DATA = pg_visibility--1.1.sql pg_visibility--1.0--1.1.sql PGFILEDESC = "pg_visibility - page visibility information" ifdef USE_PGXS diff --git a/contrib/pg_visibility/pg_visibility--1.0--1.1.sql b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql new file mode 100644 index 000..2c97dfd --- /dev/null +++ b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql @@ -0,0 +1,17 @@ +/* contrib/pg_visibility/pg_visibility--1.0--1.1.sql */ + +-- complain if script is sourced in psql, 
rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION pg_visibility UPDATE TO '1.1'" to load this file. \quit + +CREATE FUNCTION pg_check_frozen(regclass, t_ctid OUT tid) +RETURNS SETOF tid +AS 'MODULE_PATHNAME', 'pg_check_frozen' +LANGUAGE C STRICT; + +CREATE FUNCTION pg_check_visible(regclass, t_ctid OUT tid) +RETURNS SETOF tid +AS 'MODULE_PATHNAME', 'pg_check_visible' +LANGUAGE C STRICT; + +REVOKE ALL ON FUNCTION pg_check_frozen(regclass) FROM PUBLIC; +REVOKE ALL ON FUNCTION pg_check_visible(regclass) FROM PUBLIC; diff --git a/contrib/pg_visibility/pg_visibility--1.0.sql b/contrib/pg_visibility/pg_visibility--1.0.sql deleted file mode 100644 index da511e5..000 --- a/contrib/pg_visibility/pg_visibility--1.0.sql +++ /dev/null @@ -1,52 +0,0 @@ -/* contrib/pg_visibility/pg_visibility--1.0.sql */ - --- complain if script is sourced in psql, rather than via CREATE EXTENSION -\echo Use "CREATE EXTENSION pg_visibility" to load this file. \quit - --- Show visibility map information. -CREATE FUNCTION pg_visibility_map(regclass, blkno bigint, -
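For anyone following along, once the patch above is applied the new checking functions are used like this (a short sketch; `foo` stands in for any relation to verify):

```sql
-- Upgrade an existing installation of the extension to get the new functions.
ALTER EXTENSION pg_visibility UPDATE TO '1.1';

-- Each function returns the TIDs of tuples whose on-page state
-- contradicts the relation's visibility map bits; an empty result
-- means no inconsistency was found.
SELECT t_ctid FROM pg_check_visible('foo'::regclass);
SELECT t_ctid FROM pg_check_frozen('foo'::regclass);
```

Both functions are restricted to superusers by default, since the upgrade script revokes access from PUBLIC.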
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
On Tue, Jun 14, 2016 at 8:31 PM, Kyotaro HORIGUCHI wrote:
>> +# Take a second backup of the standby while the master is offline.
>> +$node_master->stop;
>> +$node_standby_1->backup('my_backup_2');
>> +$node_master->start;
>
> I'm not sure that adding the test case for a particular bug like
> this is appropriate. But it would be acceptable because it
> doesn't take long time and it is separate from standard checks.

We already take a backup from a standby when the master is connected;
it should not cost much in terms of time.

> It seems to me that we could more aggressively advance the
> minRecoveryPoint (but must not let it go too far..), but it is
> right for it to aim a bit smaller than the ideal location.

It may be risky to propose such a change for a backpatch. Anyway, in any
case there is no guarantee that when using the low-level backup routines
pg_start/stop_backup with a custom backup method the minimum recovery
point will be correct. pg_basebackup does that a bit more carefully if I
recall correctly (too lazy to look at the code now :)).
--
Michael
Re: [HACKERS] Prepared statements and generic plans
On Tue, Jun 14, 2016 at 08:37:12AM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> > However, for the wire protocol prepare/execute, how do you do EXPLAIN?
> > The only way I can see doing it is to put the EXPLAIN in the prepare
> > query, but I wasn't sure that works. So, I just wrote and tested the
> > attached C program and it properly output the explain information, e.g.
> >
> >	res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0,
> >			NULL);
> > ---
> > generated:
> >
> >                          QUERY PLAN
> > -----------------------------------------------------------
> >  Seq Scan on pg_language  (cost=0.00..1.04 rows=4 width=114)
> >
> > so that works --- good.
>
> Hm, yes.
>
> Were you just curious or is it relevant for the documentation update?

I was curious because if there was no way to do it, I should document
that.

> >>> Looking at how the code behaves, it seems custom plans that are _more_
> >>> expensive (plus planning cost) than the generic plan switch to the
> >>> generic plan after five executions, as now documented. Custom plans
> >>> that are significantly _cheaper_ than the generic plan _never_ use the
> >>> generic plan.
> >>
> >> Yes, that's what the suggested documentation improvement says as well,
> >> right?
> >
> > Yes. What is odd is that it isn't the plan of the actual supplied
> > parameters that is cheaper, just the generic plan that assumes each
> > distinct value in the query is equally likely to be used. So, when we
> > say the generic plan is cheaper, it is just comparing the custom plan
> > with the supplied parameters vs. the generic plan --- it is not saying
> > that running the supplied constants with the generic plan will execute
> > faster, because in fact we might be using a sub-optimal generic plan.
>
> Right, that's why it is important to document that it is estimates that
> are compared, not actual costs.
>
> This has caused confusion in the past, see
> https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com
>
> > Right. Updated patch attached.
>
> I am happy with the patch as it is.

Good.

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
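The five-execution behavior discussed above is easy to observe from psql (a sketch; the table and data here are made up for illustration):

```sql
CREATE TABLE words (w text);
INSERT INTO words SELECT 'common' FROM generate_series(1, 10000);
INSERT INTO words VALUES ('rare');
ANALYZE words;

PREPARE q(text) AS SELECT count(*) FROM words WHERE w = $1;

-- For the first five executions the planner builds a custom plan and
-- EXPLAIN shows the parameter folded in as a constant:
EXPLAIN EXECUTE q('rare');

-- From the sixth execution on, if the generic plan's *estimated* cost
-- is not higher than the average custom-plan estimate, the plan starts
-- showing $1 instead of the constant -- the cached generic plan is in use:
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
```

Note that, per the discussion above, the comparison is between estimated costs only, never measured execution times.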
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 8:08 AM, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 2:53 AM, Thomas Munro wrote:
>> On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
>>> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>>>> How about changing the return tuple of heap_prepare_freeze_tuple to
>>>>> a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing
>>>>> needed"
>>>>
>>>> Yes, I think something like that sounds about right.
>>>
>>> Here's a patch. I took the approach of adding a separate bool out
>>> parameter instead. I am also attaching an update of the
>>> check-visibility patch which responds to assorted review comments and
>>> adjusting it for the problems found on Friday which could otherwise
>>> lead to false positives. I'm still getting occasional TIDs from the
>>> pg_check_visible() function during pgbench runs, though, so evidently
>>> not all is well with the world.
>>
>> I'm still working out how half this stuff works, but I managed to get
>> pg_check_visible() to spit out a row every few seconds with the
>> following brute force approach:
>>
>> CREATE TABLE foo (n int);
>> INSERT INTO foo SELECT generate_series(1, 10);
>>
>> Three client threads (see attached script):
>> 1. Run VACUUM in a tight loop.
>> 2. Run UPDATE foo SET n = n + 1 in a tight loop.
>> 3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and
>> print out any rows it produces.
>>
>> I noticed that the tuples that it reported were always offset 1 in a
>> page, and that the page always had a maxoff over a couple of hundred,
>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>> It did that because HEAP_XMAX_COMMITTED was not set and
>> TransactionIdIsInProgress returned true for xmax.
>
> So this seems like it might be a visibility map bug rather than a bug
> in the test code, but I'm not completely sure of that. How was it
> legitimate to mark the page as all-visible if a tuple on the page
> still had a live xmax? If xmax is live and not just a locker then the
> tuple is not visible to the transaction that wrote xmax, at least.

Ah, wait a minute. I see how this could happen. Hang on, let me
update the pg_visibility patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 2:53 AM, Thomas Munro wrote:
> On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
>> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>>> How about changing the return tuple of heap_prepare_freeze_tuple to
>>>> a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing
>>>> needed"
>>>
>>> Yes, I think something like that sounds about right.
>>
>> Here's a patch. I took the approach of adding a separate bool out
>> parameter instead. I am also attaching an update of the
>> check-visibility patch which responds to assorted review comments and
>> adjusting it for the problems found on Friday which could otherwise
>> lead to false positives. I'm still getting occasional TIDs from the
>> pg_check_visible() function during pgbench runs, though, so evidently
>> not all is well with the world.
>
> I'm still working out how half this stuff works, but I managed to get
> pg_check_visible() to spit out a row every few seconds with the
> following brute force approach:
>
> CREATE TABLE foo (n int);
> INSERT INTO foo SELECT generate_series(1, 10);
>
> Three client threads (see attached script):
> 1. Run VACUUM in a tight loop.
> 2. Run UPDATE foo SET n = n + 1 in a tight loop.
> 3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and
> print out any rows it produces.
>
> I noticed that the tuples that it reported were always offset 1 in a
> page, and that the page always had a maxoff over a couple of hundred,
> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
> true but HeapTupleSatisfiesVacuum on the first tuple returned
> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
> It did that because HEAP_XMAX_COMMITTED was not set and
> TransactionIdIsInProgress returned true for xmax.

So this seems like it might be a visibility map bug rather than a bug
in the test code, but I'm not completely sure of that. How was it
legitimate to mark the page as all-visible if a tuple on the page
still had a live xmax? If xmax is live and not just a locker then the
tuple is not visible to the transaction that wrote xmax, at least.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Hello, thank you for looking at this.

At Fri, 10 Jun 2016 17:39:59 +0900, Michael Paquier wrote in
> On Thu, Jun 9, 2016 at 9:55 PM, Kyotaro HORIGUCHI wrote:
> > Hello, I found that pg_basebackup from a replication standby
> > fails after the following steps, on 9.3 and the master.
> >
> > - start a replication master
> > - start a replication standby
> > - stop the master in the mode other than immediate.
> >
> > pg_basebackup to the standby will fail with the following error.
> >
> >> pg_basebackup: could not get transaction log end position from server:
> >> ERROR: could not find any WAL files
>
> Indeed, and you could just do the following to reproduce the failure
> with the recovery test suite, so I would suggest adding this test in
> the patch:
> --- a/src/test/recovery/t/001_stream_rep.pl
> +++ b/src/test/recovery/t/001_stream_rep.pl
> @@ -24,6 +24,11 @@ $node_standby_1->start;
>  # pg_basebackup works on a standby).
>  $node_standby_1->backup($backup_name);
>
> +# Take a second backup of the standby while the master is offline.
> +$node_master->stop;
> +$node_standby_1->backup('my_backup_2');
> +$node_master->start;

I'm not sure that adding a test case for a particular bug like this is
appropriate. But it would be acceptable because it doesn't take a long
time and it is separate from the standard checks.

> > After looking more closely, I found that the minRecoveryPoint
> > tends to be too small as the backup end point, and up to the
> > record at the lastReplayedRecPtr can affect the pages on disk and
> > they can go into the backup just taken.
> >
> > My conclusion here is that do_pg_stop_backup should return
> > lastReplayedRecPtr, not minRecoveryPoint.
>
> I have been thinking quite a bit about this patch, and this logic
> sounds quite right to me. When stopping the backup we need to let the
> user know up to which point it needs to replay WAL, and relation pages
> are touched up to lastReplayedEndRecPtr.

Yes, but by design, the changes in buffers don't go to disk until buffer
replacement occurs, which updates minRecoveryPoint. My understanding is
that the problem is that a restart point that is not accompanied by
buffer updates advances only the redo point of the last checkpoint and
doesn't update minRecoveryPoint, which may be behind the redo point at
the time.

It seems to me that we could more aggressively advance the
minRecoveryPoint (but must not let it go too far..), but it is right
for it to aim a bit lower than the ideal location. So I proposed the
patch as a solution instead of changing minRecoveryPoint's movement.
As a result, the explanation for it is accurate enough, but it obscures
the root cause.

> This LSN could be greater
> than the minimum recovery point as there is no record to mark the end
> of the backup, and pg_control has normally, well surely, being backup
> up last but that's not a new problem it would as well have been backup
> up before the minimum recovery point has been reached...

Yes. It would cause no new problem except that the amount of WAL files
to be copied could be larger than the ideal amount.

> Still, perhaps we could improve the documentation regarding that? Say
> it is recommended to enforce the minimum recovery point in pg_control
> to the stop backup LSN to ensure that the backup recovers to a
> consistent state when taking a backup from a standby.

If I understand that correctly, I don't find an explicit mention of the
minimum recovery point (and there should not be one, I suppose) in the
PITR and pg_basebackup pages of the documentation. The point we need to
reach for a consistent state from a backup is not the "Minimum recovery
ending point" in pg_control; that holds the consistency point of the
previous recovery. What we need for a backup is the WAL files up to the
LSN returned by (do_)pg_stop_backup(). All of those files should have
been archived on the master and should be automatically transferred by
pg_basebackup with the -X/-x option. (I don't know what to do when
-X/-x is not specified for pg_basebackup to a standby, though..)

> I am attaching an updated patch with the test btw.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
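To restate the failure mode compactly: a backup is only usable if the stop LSN returned by do_pg_stop_backup() is at or past the start LSN, and under this bug a restartpoint that flushed no buffers can leave the stop location (derived from minRecoveryPoint) behind the start location (the last checkpoint's redo point). Expressed with the pg_lsn type (9.4+; the LSN values below are invented purely for illustration):

```sql
-- Healthy case: stop LSN at or past the start LSN, backup is valid.
SELECT '0/3000060'::pg_lsn >= '0/3000028'::pg_lsn AS backup_valid;  -- true

-- Buggy case: stop LSN behind the start LSN; pg_basebackup then fails
-- with "could not find any WAL files" since no WAL range covers the backup.
SELECT '0/3000028'::pg_lsn >= '0/3000060'::pg_lsn AS backup_valid;  -- false
```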
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Mon, Jun 13, 2016 at 8:11 PM, Tom Lane wrote:
>
> Amit Kapila writes:
> > On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane wrote:
> >> I think the real question here is why the code removed by 04ae11f62
> >> was wrong. It was unsafe to use apply_projection_to_path, certainly,
> >> but using create_projection_path directly would have avoided the
> >> stated problem. And it's very unclear that this new patch doesn't
> >> bring back that bug in a different place.
>
> > This new patch still doesn't seem to be right, but it won't bring back the
> > original problem because apply_projection_to_path will be only done if
> > grouped_rel is parallel_safe which means it doesn't have any
> > parallel-unsafe or parallel-restricted clause in quals or target list.
>
> The problem cited in 04ae11f62's commit message is that
> apply_projection_to_path would overwrite the paths' pathtargets in-place,
> causing problems if they'd been used for other purposes elsewhere.

The main reason for removal of that code is that there was no check there
to prevent assigning parallel-restricted clauses to the pathtarget of
partial paths. I think the same is indicated in the commit message as
well, if we focus on this part of it: "especially because this code has
no check that the PathTarget is in fact parallel-safe." For that reason,
I don't see how your suggestion to avoid the problem by using
create_projection_path can work.

> I do
> not share your confidence that using apply_projection_to_path within
> create_grouping_paths is free of such a hazard.

Fair enough; let me try to explain the problem and some ways to solve it
in more detail. The main thing that got missed by me in the patch
related to commit 04ae11f62 is that the partial path list of the rel
also needs to have a scanjoin_target. I was under the assumption that
create_grouping_paths would take care of assigning appropriate path
targets for the parallel paths generated by it. If we look,
create_grouping_paths() does take care of adding the appropriate path
targets for the paths generated by that function. However, it doesn't do
anything for partial paths.

The patch sent by me yesterday [1], which was trying to assign
partial_grouping_target to partial paths, won't be the right fix,
because (a) partial_grouping_target includes Aggregates (refer
make_partialgroup_input_target) which we don't want for partial paths;
(b) it is formed using the grouping_target passed into
create_grouping_paths(), whereas we need the path target formed from
final_target for partial paths (as partial paths are scan/join
relations), as is done for the main path list (in grouping_planner():
/* Forcibly apply that target to all the Paths for the scan/join rel. */).

Now, I think we have the following ways to solve it:

(a) Check whether the scanjoin_target contains any parallel-restricted
clause before applying it to the partial path list in grouping_planner.
However, that could lead to duplicate checks for parallel-restricted
clauses in some cases: once in apply_projection_to_path() for the main
pathlist and then again before applying to partial paths. I think we can
avoid that by having an additional parameter in
apply_projection_to_path() which can indicate whether the
parallel-safety check was done inside that function and what its result
was.

(b) Generate the appropriate scanjoin_target for the partial path list
in create_grouping_paths using final_target. However, I think this might
lead to some duplicate code in create_grouping_paths(), as we might have
to do something similar to what we have done in grouping_planner for
generating such a target. I think if we want we can pass scanjoin_target
to create_grouping_paths() as well. Again, we have to check
scanjoin_target for parallel-safety before applying it to partial paths,
but we need to do that only when grouped_rel is considered
parallel-safe.

Thoughts?

[1] - https://www.postgresql.org/message-id/CAA4eK1Jm6HrJAPX26xyLGWes%2B%2Br5%3DdOyOGRWeTa4q8NKd-UhVQ%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Rename max_parallel_degree?
On 14/06/2016 04:09, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 5:42 AM, Julien Rouhaud wrote:
>> Agreed, and fixed in attached v3.
>
> I don't entirely like the new logic in RegisterDynamicBackgroundWorker.

I'm not that happy with it either. We can avoid iterating over every
slot if the feature isn't activated, though (max_parallel_workers >=
max_worker_processes).

> I wonder if we can't drive this off
> of a couple of counters, instead of having the process registering the
> background worker iterate over every slot. Suppose we add two
> counters to BackgroundWorkerArray, parallel_register_count and
> parallel_terminate_count. Whenever a backend successfully registers a
> parallel worker, it increments parallel_register_count. Whenever the
> postmaster marks a parallel worker slot as no longer in use, it
> increments parallel_terminate_count. Then, the number of active
> parallel workers is just parallel_register_count -
> parallel_terminate_count. (We can't have the postmaster and the
> backends share the same counter, because then it would need locking,
> and the postmaster can't try to take spinlocks - can't even use
> atomics, because those might be emulated using spinlocks.)

I wanted to maintain counters at first, but it seemed more invasive, and
I thought that max_parallel_workers would be useful in environments with
lots of parallel workers and dynamic workers in use, where finding a
free slot would require iterating over most of the slots most of the
time anyway. I'm of course also ok with maintaining counters.

> If we want to allow the number of parallel workers started to be available
> for statistical purposes, we can keep two uint32 values for that
> (parallel_register_count_lo and parallel_register_count_hi, for
> example), and increment the second one whenever the first one rolls
> over to zero.

I didn't think about monitoring. I'm not sure this counter would be
really helpful without also having the number of times a parallel worker
couldn't be launched (and I'd really like to have that one).

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
[HACKERS] Use of CREATE OR REPLACE in bloom--1.0.sql
Hi,

I do not think that extension SQL scripts should be using CREATE OR
REPLACE FUNCTION like bloom--1.0.sql currently does. I suspect that this
might just be a typo. I have attached the tiny patch which fixes this.

Andreas

diff --git a/contrib/bloom/bloom--1.0.sql b/contrib/bloom/bloom--1.0.sql
index 87b5442..132a550 100644
--- a/contrib/bloom/bloom--1.0.sql
+++ b/contrib/bloom/bloom--1.0.sql
@@ -1,4 +1,4 @@
-CREATE OR REPLACE FUNCTION blhandler(internal)
+CREATE FUNCTION blhandler(internal)
 RETURNS index_am_handler
 AS 'MODULE_PATHNAME'
 LANGUAGE C;
Re: [HACKERS] Parallel safety tagging of extension functions
On 06/07/2016 05:54 PM, Andreas Karlsson wrote:
> On 06/07/2016 05:44 PM, Robert Haas wrote:
>> cube: I think we need a new extension version.
>> hstore: Does not apply for me.
>> intarray: Does not apply for me.
>
> Those three and ltree, pg_trgm, and seg depend on my patch with fixes
> for the GiST/GIN function signatures in
> https://www.postgresql.org/message-id/574f091a.3050...@proxel.se.
> The reason for the dependency is that I also mark those functions with
> changed signatures as parallel safe.

I have rebased all my patches on the current master now (and skipped the
extensions I previously listed).

Andreas

Attachments (application/gzip):
parallel-contrib-v5-adminpack.patch.gz, parallel-contrib-v5-cube.patch.gz,
parallel-contrib-v5-dblink.patch.gz, parallel-contrib-v5-hstore.patch.gz,
parallel-contrib-v5-intarray.patch.gz, parallel-contrib-v5-ltree.patch.gz,
parallel-contrib-v5-pg_trgm.patch.gz,
parallel-contrib-v5-pg_visibility.patch.gz, parallel-contrib-v5-seg.patch.gz,
parallel-contrib-v5-sslinfo.patch.gz, parallel-contrib-v5-tablefunc.patch.gz,
parallel-contrib-v5-tsearch2.patch.gz, parallel-contrib-v5-unaccent.patch.gz,
parallel-contrib-v5-uuid-ossp.patch.gz, parallel-contrib-v5-xml2.patch.gz
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Aleksey Demakov wrote: > I have a data store where tuples have unique identities that normally are not > visible. > I also have a FDW to work with this data store. As per the docs to implement > updates > for this data store I have AddForeignUpdateTargets() function that adds an > artificial > column to the target list. > > It seems to me that I cannot re-use a system attribute number for this > artificial resjunk > column (as, for instance, postgres_fdw uses SelfItemPointerAttributeNumber). > These > attributes have specific meaning not compatible with my tuple identity. > > On other hand using a regular AttrNumber might confuse the query planner. In > contrast > e..g with Oracle FDW that can use a unique key to identify the row, in my > data store > the tuple identity is normally not visible. So the data planner might break > if it sees a > Var node with an unexpected varattno number. > > What is the best approach to handle such a case? > > 1. Give up on this entirely and require a unique key for any table used thru > FDW. > > 2. Force the FDW to expose the identity column either explicitly by the user > who > creates a foreign table or automatically adding it in the corresponding > trigger > (preferably still making it hidden for normal scans). > > 3. Modify the postgresql core to nicely handle the case of an unknown target > column added by a FDW. > > 4. Something else? When implementing this for oracle_fdw, I went with your solution #1. The downside is that anything that does not have a unique key cannot be modified. I first thought of using the internal ROWID column that's probably similar to your case, but that wouldn't fit into a tid's 6 bytes, and I found that I could only add resjunk columns for existing columns of the table. Making the internal ROWID an explicit column in the foreign table seemed just too ugly. I don't know if #3 would be difficult, but it sure would make things easier for FDW developers. 
Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Hi all,

I have a data store where tuples have unique identities that normally are not visible. I also have an FDW to work with this data store. As per the docs, to implement updates for this data store I have an AddForeignUpdateTargets() function that adds an artificial column to the target list.

It seems to me that I cannot re-use a system attribute number for this artificial resjunk column (as, for instance, postgres_fdw uses SelfItemPointerAttributeNumber). These attributes have specific meaning not compatible with my tuple identity.

On the other hand, using a regular AttrNumber might confuse the query planner. In contrast, e.g. with Oracle FDW, which can use a unique key to identify the row, in my data store the tuple identity is normally not visible. So the planner might break if it sees a Var node with an unexpected varattno number.

What is the best approach to handle such a case?

1. Give up on this entirely and require a unique key for any table used through an FDW.
2. Force the FDW to expose the identity column, either explicitly by the user who creates the foreign table or automatically by adding it in the corresponding trigger (preferably still keeping it hidden from normal scans).
3. Modify the PostgreSQL core to handle the case of an unknown target column added by an FDW gracefully.
4. Something else?

Regards, Aleksey
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On 14 June 2016 at 04:07, Tom Lane wrote:
> Just as an experiment to see what would happen, I did
>
> - int parallel_threshold = 1000;
> + int parallel_threshold = 1;
>
> and ran the regression tests. I got a core dump in the window.sql test:
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x00664dbc in make_partialgroup_input_target (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:4307
> 4307    Index sgref = final_target->sortgrouprefs[i];
> (gdb) bt
> #0 0x00664dbc in make_partialgroup_input_target (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:4307
> #1 create_grouping_paths (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:3420
> #2 0x00667405 in grouping_planner (root=0x1795018, inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
> #3 0x00668c80 in subquery_planner (glob=, parse=0x1703580, parent_root=, hasRecursion=, tuple_fraction=0) at planner.c:769
> #4 0x00668ea5 in standard_planner (parse=0x1703580, cursorOptions=256, boundParams=0x0) at planner.c:308
> #5 0x006691b6 in planner (parse=, cursorOptions=, boundParams=) at planner.c:178
> #6 0x006fb069 in pg_plan_query (querytree=0x1703580, cursorOptions=256, boundParams=0x0) at postgres.c:798
> (gdb) p debug_query_string
> $1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"

I see you've committed a fix for this. Thank you. I thought it would be good to be consistent with the ressortgroupref handling, and I quite like your fix in that regard. Do you think it's worth also applying the attached, so as to have ressortgroupref set to NULL more consistently, instead of sometimes NULL and other times allocated to memory wastefully filled with zeros? The patch also saves allocating the array's memory when it's not needed.
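The memory-saving idea being proposed — leave the sortgrouprefs pointer NULL until a nonzero ressortgroupref actually appears, instead of always allocating a zero-filled array — can be sketched roughly in Python (dicts stand in for targetlist entries; this is an illustration, not the planner code):

```python
def build_sortgrouprefs(tlist):
    """Return a list of sortgroupref values for the targetlist,
    or None when every entry is zero -- mirroring the proposal to
    keep the pointer NULL rather than allocate an all-zero array."""
    refs = None
    for i, entry in enumerate(tlist):
        ref = entry.get("ressortgroupref", 0)
        if ref != 0:
            if refs is None:
                refs = [0] * len(tlist)  # allocate lazily, only on first use
            refs[i] = ref
    return refs

# No entry carries a sortgroupref: nothing is allocated.
assert build_sortgrouprefs([{}, {}]) is None
# One entry does: the whole array is built, zero-filled elsewhere.
assert build_sortgrouprefs([{}, {"ressortgroupref": 1}]) == [0, 1]
```

Callers then treat a None array the same as "no entry has a sortgroupref", which is what makes the NULL-vs-zeros distinction consistent.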
-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

make_pathtarget_from_tlist.patch
Description: Binary data
Re: [HACKERS] Prepared statements and generic plans
Bruce Momjian wrote:
> However, for the wire protocol prepare/execute, how do you do EXPLAIN? The only way I can see doing it is to put the EXPLAIN in the prepare query, but I wasn't sure that works. So, I just wrote and tested the attached C program and it properly output the explain information, e.g.
>
> res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, NULL);
>
> generated:
>
> QUERY PLAN
> Seq Scan on pg_language (cost=0.00..1.04 rows=4 width=114)
>
> so that works --- good.

Hm, yes. Were you just curious, or is it relevant for the documentation update?

>>> Looking at how the code behaves, it seems custom plans that are _more_ expensive (plus planning cost) than the generic plan switch to the generic plan after five executions, as now documented. Custom plans that are significantly _cheaper_ than the generic plan _never_ use the generic plan.
>>
>> Yes, that's what the suggested documentation improvement says as well, right?
>
> Yes. What is odd is that it isn't the plan of the actual supplied parameters that is cheaper, just the generic plan that assumes each distinct value in the query is equally likely to be used. So, when we say the generic plan is cheaper, it is just comparing the custom plan with the supplied parameters vs. the generic plan --- it is not saying that running the supplied constants with the generic plan will execute faster, because in fact we might be using a sub-optimal generic plan.

Right, that's why it is important to document that it is estimates that are compared, not actual costs. This has caused confusion in the past; see https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com

> Right. Updated patch attached.

I am happy with the patch as it is.
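The selection behavior described above — custom plans for the first five executions, then a switch to the generic plan only if its estimated cost (allowing for the saved planning cost) is no worse — can be caricatured in a few lines. This is a deliberate simplification of what plancache.c actually does, with illustrative names and thresholds:

```python
def choose_custom_plan(num_custom_plans, avg_custom_cost,
                       generic_cost, planning_cost):
    """Return True to build a custom plan for this execution,
    False to fall back to the cached generic plan."""
    # Always plan the first five executions with the supplied
    # parameters, accumulating cost estimates along the way.
    if num_custom_plans < 5:
        return True
    # Afterwards, keep re-planning only if the average *estimated*
    # custom cost, plus the price of planning every time, still
    # beats the generic estimate. Note these are estimates being
    # compared, never measured execution times.
    return avg_custom_cost + planning_cost < generic_cost

# First executions: always custom.
assert choose_custom_plan(0, 0.0, 100.0, 10.0) is True
# Custom plans much cheaper than generic: never switch.
assert choose_custom_plan(5, 50.0, 100.0, 10.0) is True
# Custom barely cheaper, but re-planning eats the gain: go generic.
assert choose_custom_plan(5, 95.0, 100.0, 10.0) is False
```

The third case is exactly the subtlety in the thread: the generic plan "wins" on estimates even though it may be sub-optimal for the actual constants supplied.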
Yours, Laurenz Albe
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On 2016/06/14 6:51, Robert Haas wrote:
> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote:
>> Although I have done a bit of review of this patch, it needs more thought than I have so far had time to give it. I will update again by Tuesday.
>
> I've reviewed this a bit further and have discovered an infelicity. The following is all with the patch applied.
>
> By itself, this join can be pushed down:
>
> contrib_regression=# EXPLAIN SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1;
>                        QUERY PLAN
> --------------------------------------------------
>  Foreign Scan  (cost=100.00..137.66 rows=822 width=4)
>    Relations: (public.ft2) LEFT JOIN (public.ft1)
> (2 rows)
>
> That's great. However, when that query is used as the outer rel of a left join, it can't:
>
> contrib_regression=# explain verbose select * from ft4 LEFT JOIN (SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1) q on true;
>                        QUERY PLAN
> --------------------------------------------------
>  Nested Loop Left Join  (cost=353.50..920.77 rows=41100 width=19)
>    Output: ft4.c1, ft4.c2, ft4.c3, (13)
>    ->  Foreign Scan on public.ft4  (cost=100.00..102.50 rows=50 width=15)
>          Output: ft4.c1, ft4.c2, ft4.c3
>          Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
>    ->  Materialize  (cost=253.50..306.57 rows=822 width=4)
>          Output: (13)
>          ->  Hash Left Join  (cost=253.50..302.46 rows=822 width=4)
>                Output: 13
>                Hash Cond: (ft2.c1 = ft1.c1)
>                ->  Foreign Scan on public.ft2  (cost=100.00..137.66 rows=822 width=4)
>                      Output: ft2.c1
>                      Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
>                ->  Hash  (cost=141.00..141.00 rows=1000 width=4)
>                      Output: ft1.c1
>                      ->  Foreign Scan on public.ft1  (cost=100.00..141.00 rows=1000 width=4)
>                            Output: ft1.c1
>                            Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
> (18 rows)
>
> Of course, because of the PlaceHolderVar, there's no way to push down the ft1-ft2-ft4 join to the remote side. But we could still push down the ft1-ft2 join and then locally perform the join between the result and ft4.
> However, the proposed fix doesn't allow that, because ph_eval_at is (b 4 5) and relids for the ft1-ft2 join is also (b 4 5), and so the bms_is_subset(phinfo->ph_eval_at, relids) test returns true.

You're right. It indeed should be possible to push down the ft1-ft2 join. However, it could not be done without also modifying build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do upthread). Attached is a new version of the patch with the following changes:

1) Modified the check in foreign_join_ok() so that it is not overly restrictive. Previously, it used ph_needed as the highest level at which the PHV is needed (by definition) and checked using it whether it is above the join under consideration, which ended up being overkill. ISTM we can just decide from joinrel->relids of the join under consideration whether we are above the lowest level where the PHV could be evaluated (ph_eval_at) and stop if so. So in the example you provided, it won't now reject {ft1, ft2}, but will reject {ft4, ft1, ft2}.

2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in the targetlist to deparse by using the PVC_RECURSE_PLACEHOLDERS flag of pull_var_clause(). That is because we do not yet support anything other than Vars in deparseExplicitTargetList() (+1 to your patch to change the Assert to elog).

Thanks, Amit

[1] https://www.postgresql.org/message-id/CAFjFpRfQRKNCvfPj8p%3DD9%2BDVZeuTfSN3hnGowKho%3DrKCSeD9dw%40mail.gmail.com

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 7d2512c..1abff3f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -731,7 +731,9 @@ build_tlist_to_deparse(RelOptInfo *foreignrel)
 	 * We require columns specified in foreignrel->reltarget->exprs and those
 	 * required for evaluating the local conditions.
 	 */
-	tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs);
+	tlist = add_to_flat_tlist(tlist,
+							  pull_var_clause((Node *) foreignrel->reltarget->exprs,
+											  PVC_RECURSE_PLACEHOLDERS));
 	tlist = add_to_flat_tlist(tlist,
 							  pull_var_clause((Node *) fpinfo->local_conds,
 											  PVC_RECURSE_PLACEHOLDERS));
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 1de0bc4..73900d9 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2202,6 +2202,64 @@ SELECT t1.c1, t2.c1 FROM (ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1)) left join (ft
          Remote SQL: SELECT c1 FROM "S 1"."T 4"
 (27 rows)
 
+-- non-Var items
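The revised check in (1) can be sketched with Python frozensets standing in for Relids bitmapsets (names are illustrative, not the actual foreign_join_ok() code): pushdown stops as soon as the join's relids become a proper superset of some PlaceHolderVar's ph_eval_at.

```python
def join_is_pushable(joinrel_relids, ph_eval_at_sets):
    """Allow pushdown only while the join under consideration is
    not above the lowest level (ph_eval_at) at which some
    PlaceHolderVar could be evaluated, i.e. reject once joinrel's
    relids are a proper superset of any ph_eval_at."""
    for ph_eval_at in ph_eval_at_sets:
        if ph_eval_at < joinrel_relids:  # proper-subset test
            return False
    return True

ph_eval_at = frozenset({"ft1", "ft2"})
# The ft1-ft2 join itself is still pushable...
assert join_is_pushable(frozenset({"ft1", "ft2"}), [ph_eval_at])
# ...but the join with ft4, above the PHV's evaluation level, is not.
assert not join_is_pushable(frozenset({"ft4", "ft1", "ft2"}), [ph_eval_at])
```

This reproduces the intended outcome for the example in the thread: {ft1, ft2} is accepted and {ft4, ft1, ft2} is rejected, so the ft1-ft2 join goes to the remote side and the join against ft4 is performed locally.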
Re: [HACKERS] Possible gaps/garbage in the output of XLOG reader
On Thu, Apr 9, 2015 at 7:05 PM, Antonin Houska wrote:
> While playing with xlogreader, I was lucky enough to see one of the many record validations fail. After having some fun with gdb, I found out that in some cases the reader does not enforce enough data to be in state->readBuf before copying into state->readRecordBuf starts. This should not happen if the callback always reads XLOG_BLCKSZ bytes, but in fact only *reqLen* is the mandatory size of the chunk delivered.
>
> There are probably various ways to fix this problem. Attached is what I did in my environment. I hit the problem on 9.4.1, but the patch seems to apply to master too.

This looks similar to what is discussed here: https://www.postgresql.org/message-id/flat/caboikdpspbymig6j01dkq6om2+bnkxhtpkoyqhm2a4oywgk...@mail.gmail.com

And there is a different patch that takes a lower-level approach, which seems to me a cleaner way of calculating the record offset when it goes through several XLOG pages. Perhaps you could help review it? It is attached to the next commitfest.

-- Michael
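The hazard described above can be sketched abstractly (this is a toy model, not the real xlogreader API): the page-read callback is only obliged to supply reqLen bytes, so a reader that copies a fixed page-sized slice may read past the delivered data. The safe version copies no more than what was actually delivered:

```python
def copy_record_chunk(read_buf, delivered, need):
    """read_buf holds 'delivered' valid bytes supplied by the
    page-read callback, which only guarantees reqLen ('need')
    bytes -- possibly fewer than a full XLOG_BLCKSZ page. Copying
    a fixed page-sized slice would include undefined bytes, so
    validate the contract and copy only the requested amount."""
    if delivered < need:
        raise ValueError("callback delivered %d bytes but %d were required"
                         % (delivered, need))
    return read_buf[:need]

# Only 150 bytes were delivered; copying the 100 needed is fine,
# while a blind 8192-byte copy would have read garbage.
page = b"R" * 150
assert copy_record_chunk(page, 150, 100) == b"R" * 100
```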
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>> How about changing the return tuple of heap_prepare_freeze_tuple to a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing needed"
>>
>> Yes, I think something like that sounds about right.
>
> Here's a patch. I took the approach of adding a separate bool out parameter instead. I am also attaching an update of the check-visibility patch which responds to assorted review comments and adjusts it for the problems found on Friday which could otherwise lead to false positives. I'm still getting occasional TIDs from the pg_check_visible() function during pgbench runs, though, so evidently not all is well with the world.

I'm still working out how half this stuff works, but I managed to get pg_check_visible() to spit out a row every few seconds with the following brute-force approach:

CREATE TABLE foo (n int);
INSERT INTO foo SELECT generate_series(1, 10);

Three client threads (see attached script):
1. Run VACUUM in a tight loop.
2. Run UPDATE foo SET n = n + 1 in a tight loop.
3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and print out any rows it produces.

I noticed that the tuples it reported were always at offset 1 in a page, that the page always had a maxoff over a couple of hundred, and that we called record_corrupt_item because VM_ALL_VISIBLE returned true but HeapTupleSatisfiesVacuum on the first tuple returned HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE. It did that because HEAP_XMAX_COMMITTED was not set and TransactionIdIsInProgress returned true for xmax. Not sure how much of this was already obvious! I will poke at it some more tomorrow.
-- Thomas Munro http://www.enterprisedb.com

# create table foo (n int);
# insert into foo select generate_series(1, 10);

import psycopg2
import threading
import time

keep_running = True

def vacuum_thread(dsn):
    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(0)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("VACUUM")

def update_thread(dsn):
    conn = psycopg2.connect(dsn)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("UPDATE foo SET n = n + 1")
        conn.commit()

def check_thread(dsn):
    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(0)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("SELECT pg_check_visible('foo'::regclass)")
        for row in cursor:
            print row
        conn.rollback()

if __name__ == "__main__":
    dsn = "dbname=postgres"
    t1 = threading.Thread(target = vacuum_thread, args = [dsn])
    t1.start()
    t2 = threading.Thread(target = update_thread, args = [dsn])
    t2.start()
    t3 = threading.Thread(target = check_thread, args = [dsn])
    t3.start()
    try:
        time.sleep(86400)
    except KeyboardInterrupt:
        keep_running = False
    t1.join()
    t2.join()
    t3.join()
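The inconsistency the script catches can be stated as a tiny invariant check (a toy version of the pg_check_visible logic described above, with illustrative names): if the visibility map claims a page is all-visible, every tuple on it must be HEAPTUPLE_LIVE.

```python
def check_page(vm_all_visible, tuple_states):
    """Return the 1-based offsets of tuples that contradict the
    visibility map: on a page the VM marks all-visible, any state
    other than HEAPTUPLE_LIVE (e.g. DELETE_IN_PROGRESS during a
    concurrent UPDATE) is a corrupt item to report."""
    if not vm_all_visible:
        return []  # nothing to cross-check when the VM bit is clear
    return [off for off, state in enumerate(tuple_states, start=1)
            if state != "HEAPTUPLE_LIVE"]

# The race observed in the thread: offset 1 is DELETE_IN_PROGRESS
# on a page the VM still calls all-visible, so offset 1 is reported.
assert check_page(True, ["HEAPTUPLE_DELETE_IN_PROGRESS",
                         "HEAPTUPLE_LIVE"]) == [1]
# With the VM bit clear, the same state is unremarkable.
assert check_page(False, ["HEAPTUPLE_DELETE_IN_PROGRESS"]) == []
```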
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On Tue, Jun 14, 2016 at 4:10 AM, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 5:51 PM, Robert Haas wrote:
>> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote:
>>> Although I have done a bit of review of this patch, it needs more thought than I have so far had time to give it. I will update again by Tuesday.
>>
>> I've reviewed this a bit further and have discovered an infelicity.
>
> Also, independent of the fix for this particular issue, I think it would be smart to apply the attached patch to promote the assertion that failed here to an elog(). If we have more bugs of this sort, now or in the future, I'd like to catch them even in non-assert-enabled builds by getting a sensible error rather than just by failing an assertion. I think it's our general practice to check node types with elog() rather than Assert() when the nodes are coming from some far-distant part of the code, which is certainly the case here.
>
> I plan to commit this without delay unless there are vigorous, well-reasoned objections.

Fine with me. It serves the purpose for which I added the Assert, but in a better manner. Maybe the error message could read "non-Var nodes/targets/expressions not expected in target list"; I am not sure what we call individual (whole) members of a target list.

-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company