Re: [HACKERS] Multiple false-positive warnings from Valgrind
On Fri, Mar 31, 2017 at 04:40:07PM +0300, Aleksander Alekseev wrote: > > > And it seems to me that this is caused by the routines of OpenSSL. > > > When building without --with-openssl, using the fallback > > > implementations of SHA256 and RAND_bytes I see no warnings generated > > > by scram_build_verifier... I think it makes most sense to discard that > > > from the list of open items. > > > > FWIW a document of the function says that, > > > > https://www.openssl.org/docs/man1.0.1/crypto/RAND_bytes.html > > > > > The contents of buf is mixed into the entropy pool before > > > retrieving the new pseudo-random bytes unless disabled at compile > > > time (see FAQ). > > > > This isn't saying that RAND_bytes does the same thing but > > something similar can be happening there. > > OK, turned out that warnings regarding uninitialized values disappear > after removing --with-openssl. That's a good thing. Does this remove the noise under --with-openssl? --- a/src/port/pg_strong_random.c +++ b/src/port/pg_strong_random.c @@ -104,7 +104,10 @@ pg_strong_random(void *buf, size_t len) */ #if defined(USE_OPENSSL_RANDOM) if (RAND_bytes(buf, len) == 1) + { + VALGRIND_MAKE_MEM_DEFINED(buf, len); return true; + } return false; /* > What about all these memory leak reports [1]? If I see them should I just > ignore them or, if reports look false positive, suggest a patch that > modifies a Valgrind suppression file? In other words what is current > consensus in community regarding Valgrind and it's reports? Pass --leak-check=no; PostgreSQL intentionally leaks a lot. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making clausesel.c Smarter
On Fri, Feb 24, 2017 at 7:00 AM, David Rowleywrote: > I've stumbled over an interesting query out in the wild where the > query was testing a nullable timestamptz column was earlier than some > point in time, and also checking the column IS NOT NULL. > > The use pattern here was that records which required processing had > this timestamp set to something other than NULL, a worker would come > along and process those, and UPDATE the record to NULL to mark the > fact that it was now processed. So what we are left with was a table > with a small number of rows with a value in this timestamp column, and > an ever-growing number of rows with a NULL value. > > A highly simplified version of the query was checking for records that > required processing before some date, say: > > SELECT * FROM a WHERE ts < '2017-02-24' AND ts IS NOT NULL; > > Of course, the ts IS NOT NULL is not required here, but I can > understand how someone might make the mistake of adding this. The > simple solution to the problem was to have that null test removed, but > that seemingly innocent little mistake caused some pain due to very > slow running queries which held on to a nested loop plan 33 times > longer than it should have been doing. Basically what was happening > here is that clauselist_selectivity() was applying the selectivity > with the ~0.97 null_fact from pg_statistic over the top of the already > applied estimate on the number of rows below the constant timestamp. > > Since the diagnosis of this problem was not instant, and some amount > of pain was suffered before the solution was found, I wondered if it > might be worth smartening up the planner a bit in this area. > > We're already pretty good at handling conditions like: SELECT * FROM a > WHERE x < 10 and x < 1; where we'll effectively ignore the x < 10 > estimate since x < 1 is more restrictive, so if we just build on that > ability a bit we could get enough code to cover the above case. > > I've attached a draft patch which improves the situation here. I thought to take a quick look at this patch. I'll probably take a deeper look later, but some initial comments: -typedef struct RangeQueryClause +typedef struct CachedSelectivityClause { -struct RangeQueryClause *next;/* next in linked list */ +struct CachedSelectivityClause *next;/* next in linked list */ Node *var;/* The common variable of the clauses */ -boolhave_lobound;/* found a low-bound clause yet? */ -boolhave_hibound;/* found a high-bound clause yet? */ +intselmask;/* Bitmask of which sel types are stored */ Selectivity lobound;/* Selectivity of a var > something clause */ Selectivity hibound;/* Selectivity of a var < something clause */ -} RangeQueryClause; As seems customary in other code, perhaps you should define some HAS_X() macros for dealing with the selmask. Something like #SELMASK_HAS_LOBOUND(selmask) (((selmask) & CACHESEL_LOBOUND) != 0) etc.. +static void addCachedSelectivityRangeVar(CachedSelectivityClause **cslist, Node *var, bool varonleft, bool isLTsel, Selectivity s2); You're changing clause -> var throughout the code when dealing with nodes, but looking at their use, they hold neither. All those addCachedSelectivity functions are usually passed expression nodes. If you're renaming, perhaps you should rename to "expr". On Fri, Feb 24, 2017 at 7:00 AM, David Rowley wrote: > Now one thing I was unsure of in the patch was this "Other clauses" > concept that I've come up with. In the patch we have: > > default: > addCachedSelectivityOtherClause(, var, s2); > break; > > I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My > imagination here won't stretch far enough to imagine a OpExpr which > passes with a NULL operand. If it did my logic is broken, and if > that's the case then I think limiting "Others" to mean F_EQSEL and > F_NEQSEL would be the workaround. While not specifically applicable only to "Others", something needs consideration here: User-defined functions can be nonstrict. An OpExpr involving such a user-defined function could possibly pass on null. I would suggest avoiding making the assumption that it doesn't unless the operator is strict. One could argue that such an operator should provide its own selectivity estimator, but the strictness check shouldn't be too difficult to add, and I think it's a better choice. So you'd have to check that: default: if (op_strict(expr->opno) && func_strict(expr->opfuncid)) addCachedSelectivityOtherClause(, var, s2); else s1 = s1 * s2; break; So, I went ahead and did that. Considering this setup: createdb pgtest cat
Re: [HACKERS] Table collision in join.sql and aggregates.sql
D'oh! The "temp" declaration had been removed from our test since we don't use temp tables. I missed that when applying it to the community code. You can ignore me now. On Fri, Mar 31, 2017 at 4:01 PM Tom Lanewrote: > Douglas Doole writes: > > As we've been merging our code with 9.6, a couple developers have had > > one-off failures in the join.sql and aggregates.sql test because the > tables > > T1, T2 and T3 have the wrong definitions. > > > Digging into it, I found that both files create the tables T1, T2, and T3 > > for a short period of time and then drop them. Since the > parallel_schedule > > file puts these two files into the same group, they can run concurrently. > > If it happens that the the two files hit the T1, T2, T3 tests at the same > > time, then you see the table definition problems. > > Hmmm ... that would indeed be a problem, except that aggregate.sql's > tables are temp tables, which should mean that they are in a schema > different from the one that join.sql is putting its tables in. Are you > sure you've identified the issue correctly? Because if this doesn't > work, there are an awful lot of similar hazards elsewhere in the > regression tests. > > regards, tom lane >
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 04:19, Andres Freund wrote: > On 2017-03-31 21:30:17 -0400, Robert Haas wrote: >> On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinek >>wrote: Hmm, I don't know if there's any good reason not to just use strcmp(), but sure, OK. Committed and back-patched. >>> >>> Hmm culicidae still fails, this time only in parallel worker code. This >>> didn't happen on my machine which is strange. Looking at the code, we >>> are passing the fps->entrypoint as function pointer again so of course >>> it fails. We have some code to load libraries again but even that gets >>> initial entrypoint passed as function pointer >>> (ParallelExtensionTrampoline). I wonder if we'll have to generalize the >>> InternalBGWorkers even more to some kind of internal function name to >>> pointer map and add the parallel entry points there as well. >> >> Argh, I forgot about that. I think we need to use something like >> ParallelExtensionTrampoline all the time, not just for libraries. >> Since effectively, we've determined that postgres itself has the same >> problem. > > I wonder if we shouldn't just introduce a KnownFunctionPointer[] map, > that can be used by various facilities (bgworkers themselves, > parallelism, and probably more coming). Otherwise we'll introduce > mechanisms like 7d8f6986b8 in multiple places. > Yes that's what I meant with what I said above. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!
On 04/01/2017 02:57 AM, Robert Haas wrote: On Fri, Mar 31, 2017 at 7:53 AM, Rafia Sabihwrote: So, it looks like in the problematic area, it is not improving much. Please find the attached file for the query plan of Q20 with and without patch. I haven't evaluated the performance of this query with patch. Yeah, I don't think this patch is fixing whatever is going wrong with Q20. It's just a problem Tomas found while investigating that issue. Unfortunately, I don't think we know what's really going on here yet. Yeah, it's just a side-bug. I'll look into the issue again once I get back home from pgconf.us, it's a little bit difficult to investigate this over ssh on the in-flight wifi :-/ My hunch is that the two foreign keys interfere in some strange way, i.e. we should apply just one and end up applying both, or something like that. Or perhaps it gets confused by the join-to-aggregated-rel?. It's a bit weird, though, because the patch was originally meant to help with multi-column keys. It was then extended to also consider single-column FKs, but that should really be just 1/ndistinct anyway. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Avoid GatherMerge crash when there are no workers.
Hi, On 2017-04-01 01:22:14 +, Robert Haas wrote: > Avoid GatherMerge crash when there are no workers. I think the gather merge code needs a bit more test coverage (sorry to make this a larger theme today). As shown by https://coverage.postgresql.org/src/backend/executor/nodeGatherMerge.c.gcov.html we don't actually merge anything (heap_compare_slots is not exercised). I btw also wonder if it's good that we have a nearly identical copy of heap_compare_slots and a bunch of the calling code in both nodeMergeAppend.c and nodeGatherMerge.c. On the other hand, it's not heavily envolving code. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-31 21:30:17 -0400, Robert Haas wrote: > On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinek >wrote: > >> Hmm, I don't know if there's any good reason not to just use strcmp(), > >> but sure, OK. Committed and back-patched. > > > > Hmm culicidae still fails, this time only in parallel worker code. This > > didn't happen on my machine which is strange. Looking at the code, we > > are passing the fps->entrypoint as function pointer again so of course > > it fails. We have some code to load libraries again but even that gets > > initial entrypoint passed as function pointer > > (ParallelExtensionTrampoline). I wonder if we'll have to generalize the > > InternalBGWorkers even more to some kind of internal function name to > > pointer map and add the parallel entry points there as well. > > Argh, I forgot about that. I think we need to use something like > ParallelExtensionTrampoline all the time, not just for libraries. > Since effectively, we've determined that postgres itself has the same > problem. I wonder if we shouldn't just introduce a KnownFunctionPointer[] map, that can be used by various facilities (bgworkers themselves, parallelism, and probably more coming). Otherwise we'll introduce mechanisms like 7d8f6986b8 in multiple places. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-04-01 03:57:05 +0200, Petr Jelinek wrote: > On 01/04/17 03:44, Andres Freund wrote: > > On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: > >> On 01/04/17 02:53, Robert Haas wrote: > >>> On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek > >>>wrote: > Right, changed to BGW_MAXLEN. > >>> > >>> Hmm, I don't know if there's any good reason not to just use strcmp(), > >>> but sure, OK. Committed and back-patched. > > > > Cool! > > > > > >> Hmm culicidae still fails, this time only in parallel worker code. > >> This didn't happen on my machine which is strange. > > > > Interesting. Did you reproduce the failure before? I wonder if the > > issue is that you don't build a position independent executable? > > > > I did reproduce the original issue yes, that's how I tested the patch. I > just copy pasted CFLAGS from the buildfarm configuration of culicidae TBH. That's not necessarily sufficient, unless you also use a sufficiently new debian - they changed gcc to default to PIE being enabled by default. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint
Hi, On Sat, Apr 1, 2017 at 7:06 AM, Ashutosh Sharmawrote: > On Sat, Apr 1, 2017 at 6:00 AM, Robert Haas wrote: >> On Fri, Mar 31, 2017 at 6:09 PM, Ashutosh Sharma >> wrote: >>> Well, That is another option but the main idea was to be inline with >>> the btree code. >> >> That's not a bad goal in principal, but _bt_killitems() doesn't have >> any similar argument. > > It was there but later got removed with some patch (may be the patch > for reducing pinning and buffer content lock for btree scans). The > following commit has this changes. > > commit 09cb5c0e7d6fbc9dee26dc429e4fc0f2a88e5272 > Author: Tom Lane > Date: Sun May 7 01:21:30 2006 + > >> >> (Also, speaking of consistency, why did we end up with >> _hash_kill_items, with an underscore between kill and items, and >> _bt_killitems, without one?) > > That is just to follow the naming convention in hash.h Please check > the review comments for this at - [1]. > >> >>> Moreover, I am not sure if acquiring lwlock inside >>> hashendscan (basically the place where we are trying to close down the >>> things) would look good. >> >> Well, if that's not a good thing to do, hiding it inside some other >> function doesn't make it better. > > okay, agreed. I will submit the patch very shortly. As suggested, I am now acquiring lock inside the caller function. Attached is the patch having this changes. Thanks. > > > [1] - > https://www.postgresql.org/message-id/cf6abd8a-77b5-f1c7-8e50-5bef461e0522%40redhat.com -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c index 34cc08f..b835f77 100644 --- a/src/backend/access/hash/hash.c +++ b/src/backend/access/hash/hash.c @@ -476,9 +476,17 @@ hashrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys, HashScanOpaque so = (HashScanOpaque) scan->opaque; Relation rel = scan->indexRelation; - /* Before leaving current page, deal with any killed items */ + /* + * Before leaving current page, deal with any killed items. + * Also, ensure that we acquire lock on current page before + * calling _hash_kill_items. + */ if (so->numKilled > 0) + { + LockBuffer(so->hashso_curbuf, BUFFER_LOCK_SHARE); _hash_kill_items(scan); + LockBuffer(so->hashso_curbuf, BUFFER_LOCK_UNLOCK); + } _hash_dropscanbuf(rel, so); @@ -507,9 +515,17 @@ hashendscan(IndexScanDesc scan) HashScanOpaque so = (HashScanOpaque) scan->opaque; Relation rel = scan->indexRelation; - /* Before leaving current page, deal with any killed items */ + /* + * Before leaving current page, deal with any killed items. + * Also, ensure that we acquire lock on current page before + * calling _hash_kill_items. + */ if (so->numKilled > 0) + { + LockBuffer(so->hashso_curbuf, BUFFER_LOCK_SHARE); _hash_kill_items(scan); + LockBuffer(so->hashso_curbuf, BUFFER_LOCK_UNLOCK); + } _hash_dropscanbuf(rel, so); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 03:44, Andres Freund wrote: > On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: >> On 01/04/17 02:53, Robert Haas wrote: >>> On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek >>>wrote: Right, changed to BGW_MAXLEN. >>> >>> Hmm, I don't know if there's any good reason not to just use strcmp(), >>> but sure, OK. Committed and back-patched. > > Cool! > > >> Hmm culicidae still fails, this time only in parallel worker code. >> This didn't happen on my machine which is strange. > > Interesting. Did you reproduce the failure before? I wonder if the > issue is that you don't build a position independent executable? > I did reproduce the original issue yes, that's how I tested the patch. I just copy pasted CFLAGS from the buildfarm configuration of culicidae TBH. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: > On 01/04/17 02:53, Robert Haas wrote: > > On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek > >wrote: > >> Right, changed to BGW_MAXLEN. > > > > Hmm, I don't know if there's any good reason not to just use strcmp(), > > but sure, OK. Committed and back-patched. Cool! > Hmm culicidae still fails, this time only in parallel worker code. > This didn't happen on my machine which is strange. Interesting. Did you reproduce the failure before? I wonder if the issue is that you don't build a position independent executable? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint
On Sat, Apr 1, 2017 at 6:00 AM, Robert Haaswrote: > On Fri, Mar 31, 2017 at 6:09 PM, Ashutosh Sharma > wrote: >> Well, That is another option but the main idea was to be inline with >> the btree code. > > That's not a bad goal in principal, but _bt_killitems() doesn't have > any similar argument. It was there but later got removed with some patch (may be the patch for reducing pinning and buffer content lock for btree scans). The following commit has this changes. commit 09cb5c0e7d6fbc9dee26dc429e4fc0f2a88e5272 Author: Tom Lane Date: Sun May 7 01:21:30 2006 + > > (Also, speaking of consistency, why did we end up with > _hash_kill_items, with an underscore between kill and items, and > _bt_killitems, without one?) That is just to follow the naming convention in hash.h Please check the review comments for this at - [1]. > >> Moreover, I am not sure if acquiring lwlock inside >> hashendscan (basically the place where we are trying to close down the >> things) would look good. > > Well, if that's not a good thing to do, hiding it inside some other > function doesn't make it better. okay, agreed. I will submit the patch very shortly. [1] - https://www.postgresql.org/message-id/cf6abd8a-77b5-f1c7-8e50-5bef461e0522%40redhat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] A better way to expand hash indexes.
On Fri, Mar 31, 2017 at 1:15 AM, Mithun Cywrote: > Thanks, I have tried to fix all of the comments. Thanks. Hmm, don't the changes to contrib/pageinspect/expected/hash.out indicate that you've broken something? The hash index has only 4 buckets, so the new code shouldn't be doing anything differently, but you've got highmask and lowmask changing for some reason. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinekwrote: >> Hmm, I don't know if there's any good reason not to just use strcmp(), >> but sure, OK. Committed and back-patched. > > Hmm culicidae still fails, this time only in parallel worker code. This > didn't happen on my machine which is strange. Looking at the code, we > are passing the fps->entrypoint as function pointer again so of course > it fails. We have some code to load libraries again but even that gets > initial entrypoint passed as function pointer > (ParallelExtensionTrampoline). I wonder if we'll have to generalize the > InternalBGWorkers even more to some kind of internal function name to > pointer map and add the parallel entry points there as well. Argh, I forgot about that. I think we need to use something like ParallelExtensionTrampoline all the time, not just for libraries. Since effectively, we've determined that postgres itself has the same problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 02:53, Robert Haas wrote: > On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek >wrote: >> Right, changed to BGW_MAXLEN. > > Hmm, I don't know if there's any good reason not to just use strcmp(), > but sure, OK. Committed and back-patched. > Hmm culicidae still fails, this time only in parallel worker code. This didn't happen on my machine which is strange. Looking at the code, we are passing the fps->entrypoint as function pointer again so of course it fails. We have some code to load libraries again but even that gets initial entrypoint passed as function pointer (ParallelExtensionTrampoline). I wonder if we'll have to generalize the InternalBGWorkers even more to some kind of internal function name to pointer map and add the parallel entry points there as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] crashes due to setting max_parallel_workers=0
On Tue, Mar 28, 2017 at 11:08 PM, Tomas Vondrawrote: > Maybe. It depends on how valuable it's to keep Gather and GatherMerge > similar - having nreaders in one and not the other seems a bit weird. But > maybe the refactoring would remove it from both nodes? Yeah, it appears to be the case that both Gather and GatherMerge inevitably end up with nworkers_launched == nreaders, which seems dumb. If we're going to clean this up, I think we should make them both match. > Also, it does not really solve the issue that we're using 'nreaders' or > 'nworkers_launched' to access array with one extra element. I'm not clear that there's any fundamental solution to that problem other than trying to clarify it through comments. Meantime, I think it's not good to leave this crashing, so I pushed Rushabh's v2 patch for the actual crash for now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!
On Fri, Mar 31, 2017 at 7:53 AM, Rafia Sabihwrote: > So, it looks like in the problematic area, it is not improving much. > Please find the attached file for the query plan of Q20 with and > without patch. I haven't evaluated the performance of this query with > patch. Yeah, I don't think this patch is fixing whatever is going wrong with Q20. It's just a problem Tomas found while investigating that issue. Unfortunately, I don't think we know what's really going on here yet. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinekwrote: > Right, changed to BGW_MAXLEN. Hmm, I don't know if there's any good reason not to just use strcmp(), but sure, OK. Committed and back-patched. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 31/03/17 15:42, Robert Haas wrote: > On Tue, Mar 28, 2017 at 7:39 PM, Petr Jelinek >wrote: >> Sigh, forgot git add for the docs, so one more try... > > +if (strncmp(worker->bgw_library_name, "postgres", 8) != 0) > +return NULL; > > I think that's not right. You don't want to match postgresshdkgjsdglkjs. > Right, changed to BGW_MAXLEN. > Aside from that, these look good to me. > Cool. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 74b9f0d7063071dd443fa820984fba477f4446cc Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Tue, 28 Mar 2017 23:24:34 +0200 Subject: [PATCH] Use library_name/function_name for loading main entry point of internal bgworkers --- src/backend/access/transam/parallel.c| 7 +-- src/backend/postmaster/bgworker.c| 76 ++-- src/include/access/parallel.h| 2 + src/test/modules/worker_spi/worker_spi.c | 6 ++- 4 files changed, 73 insertions(+), 18 deletions(-) diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c index cde0ed3..d2bed72 100644 --- a/src/backend/access/transam/parallel.c +++ b/src/backend/access/transam/parallel.c @@ -109,7 +109,6 @@ static dlist_head pcxt_list = DLIST_STATIC_INIT(pcxt_list); /* Private functions. */ static void HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg); static void ParallelExtensionTrampoline(dsm_segment *seg, shm_toc *toc); -static void ParallelWorkerMain(Datum main_arg); static void WaitForParallelWorkersToExit(ParallelContext *pcxt); @@ -456,7 +455,9 @@ LaunchParallelWorkers(ParallelContext *pcxt) BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION; worker.bgw_start_time = BgWorkerStart_ConsistentState; worker.bgw_restart_time = BGW_NEVER_RESTART; - worker.bgw_main = ParallelWorkerMain; + worker.bgw_main = NULL; + sprintf(worker.bgw_library_name, "postgres"); + sprintf(worker.bgw_function_name, "ParallelWorkerMain"); worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(pcxt->seg)); worker.bgw_notify_pid = MyProcPid; memset(_extra, 0, BGW_EXTRALEN); @@ -928,7 +929,7 @@ AtEOXact_Parallel(bool isCommit) /* * Main entrypoint for parallel workers. */ -static void +void ParallelWorkerMain(Datum main_arg) { dsm_segment *seg; diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c index 52bc4e9..5ea5abf 100644 --- a/src/backend/postmaster/bgworker.c +++ b/src/backend/postmaster/bgworker.c @@ -16,6 +16,7 @@ #include "miscadmin.h" #include "libpq/pqsignal.h" +#include "access/parallel.h" #include "postmaster/bgworker_internals.h" #include "postmaster/postmaster.h" #include "storage/barrier.h" @@ -94,6 +95,25 @@ struct BackgroundWorkerHandle static BackgroundWorkerArray *BackgroundWorkerData; /* + * List of internal background workers. These are used for mapping the + * function name to actual function when building with EXEC_BACKEND and also + * to allow these to be loaded outside of shared_preload_libraries. + */ +typedef struct InternalBGWorkerMain +{ + char *bgw_function_name; + bgworker_main_type bgw_main; +} InternalBGWorkerMain; + +static const InternalBGWorkerMain InternalBGWorkers[] = { + {"ParallelWorkerMain", ParallelWorkerMain}, + /* Dummy entry marking end of the array. */ + {NULL, NULL} +}; + +static bgworker_main_type GetInternalBgWorkerMain(BackgroundWorker *worker); + +/* * Calculate shared memory needed. */ Size @@ -695,22 +715,27 @@ StartBackgroundWorker(void) #endif } + /* For internal workers set the entry point to known function address. */ + entrypt = GetInternalBgWorkerMain(worker); + /* - * If bgw_main is set, we use that value as the initial entrypoint. - * However, if the library containing the entrypoint wasn't loaded at - * postmaster startup time, passing it as a direct function pointer is not - * possible. To work around that, we allow callers for whom a function - * pointer is not available to pass a library name (which will be loaded, - * if necessary) and a function name (which will be looked up in the named + * Otherwise, if bgw_main is set, we use that value as the initial + * entrypoint. This does not work well EXEC_BACKEND outside Windows but + * we keep the logic for backwards compatibility. In other cases use + * the entry point specified by library name (which will be loaded, if + * necessary) and a function name (which will be looked up in the named * library). */ - if (worker->bgw_main != NULL) - entrypt = worker->bgw_main; - else - entrypt = (bgworker_main_type) - load_external_function(worker->bgw_library_name, - worker->bgw_function_name, - true, NULL); + if (entrypt == NULL) + { + if (worker->bgw_main != NULL) + entrypt = worker->bgw_main; + else + entrypt = (bgworker_main_type) +
Re: [HACKERS] [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint
On Fri, Mar 31, 2017 at 6:09 PM, Ashutosh Sharmawrote: > Well, That is another option but the main idea was to be inline with > the btree code. That's not a bad goal in principal, but _bt_killitems() doesn't have any similar argument. (Also, speaking of consistency, why did we end up with _hash_kill_items, with an underscore between kill and items, and _bt_killitems, without one?) > Moreover, I am not sure if acquiring lwlock inside > hashendscan (basically the place where we are trying to close down the > things) would look good. Well, if that's not a good thing to do, hiding it inside some other function doesn't make it better. I think it's fine, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 01/04/17 01:57, Petr Jelinek wrote: > On 01/04/17 01:20, Tom Lane wrote: >> Petr Jelinekwrites: >>> But the pg_subscription_rel is also not accessed on heap_open, the >>> problematic code is called from heap_drop_with_catalog. And VACUUM FULL >>> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it >>> goes through performDeletion so through dependency info which is what I >>> mean by everything else does this). >> >> Hmmm. What the heap_drop_with_catalog call is being applied to is a >> short-lived table that is not pg_class. It happens to own the disk >> storage that used to belong to pg_class, but it is not pg_class; >> in particular it doesn't have the same OID, and it is not what would >> be looked in if someone happened to need to fetch a pg_class row >> at that point. So there's no circularity involved. >> >> On further reflection it seems like you were right, the problem is >> taking a self-exclusive lock on pg_subscription_rel during low-level >> catalog operations. We're going to have to find a way to reduce that >> lock strength, or we're going to have a lot of deadlock problems. >> > > Well we have heavy lock because we were worried about failure scenarios > in our dumb upsert in SetSubscriptionRelState which does cache lookup > and if it finds something it updates, otherwise inserts. We have similar > pattern in other places but they are called from DDL statements usually > so the worst that can happen is DDL fails with weird errors when done > concurrently with same name so using RowExclusiveLock is okay. > > That being said, looking at use-cases for SetSubscriptionRelState that's > basically CREATE SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH and tablesync > worker. So the DDL thing applies to first ones as well and tablesync > should not be running in case the record does not exist so it's fine if > it fails. In terms of RemoveSubscriptionRel that's only called from > heap_drop_with_catalog and tablesync holds relation lock so there is no > way heap_drop_with_catalog will happen on the same relation. This leads > me to thinking that RowExclusiveLock is fine for both > SetSubscriptionRelState and RemoveSubscriptionRel as long as we document > that callers should be aware that SetSubscriptionRelState has > concurrency issues and fail on unique index check. > And a simple patch to do so. Peter do you see any problem with doing this? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From a12259efe7b4d0af570e7d2d7cb48b249158a1bd Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Sat, 1 Apr 2017 02:20:16 +0200 Subject: [PATCH] Use weaker locks when updating subscription relation state --- src/backend/catalog/pg_subscription.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c index e7a1634..b0bd248 100644 --- a/src/backend/catalog/pg_subscription.c +++ b/src/backend/catalog/pg_subscription.c @@ -215,6 +215,9 @@ textarray_to_stringlist(ArrayType *textarray) /* * Set the state of a subscription table. + * + * The insert or update logic in this function is not concurrency safe so + * it may raise an error. */ Oid SetSubscriptionRelState(Oid subid, Oid relid, char state, @@ -227,7 +230,7 @@ SetSubscriptionRelState(Oid subid, Oid relid, char state, Datum values[Natts_pg_subscription_rel]; /* Prevent concurrent changes. */ - rel = heap_open(SubscriptionRelRelationId, ShareRowExclusiveLock); + rel = heap_open(SubscriptionRelRelationId, RowExclusiveLock); /* Try finding existing mapping. */ tup = SearchSysCacheCopy2(SUBSCRIPTIONRELMAP, @@ -358,8 +361,7 @@ RemoveSubscriptionRel(Oid subid, Oid relid) HeapTuple tup; int nkeys = 0; - /* Prevent concurrent changes (see SetSubscriptionRelState()). */ - rel = heap_open(SubscriptionRelRelationId, ShareRowExclusiveLock); + rel = heap_open(SubscriptionRelRelationId, RowExclusiveLock); if (OidIsValid(subid)) { @@ -387,7 +389,7 @@ RemoveSubscriptionRel(Oid subid, Oid relid) } heap_endscan(scan); - heap_close(rel, ShareRowExclusiveLock); + heap_close(rel, RowExclusiveLock); } -- 2.7.4 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Fri, Mar 31, 2017 at 2:14 PM, Mike Palmiottowrote: > Attached you will find two patches, which were rebased on master as of > 156d388 (applied with `git am --revert [patch file]`). The first gets > rid of some pesky compiler warnings and the second implements the > sepgsql handling of partitioned tables. 0001 has the problem that we have a firm rule against putting any #includes whatsoever before "postgres.h". This stdbool issue has come up before, though, and I fear we're going to need to do something about it. 0002 looks extremely straightforward, but I wonder if we could get one of the people on this list who knows about sepgsql to have a look? (Stephen Frost, Joe Conway, KaiGai Kohei...) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
On Fri, Mar 31, 2017 at 3:03 PM, Brandur Leachwrote: > Thank you Peter for the assist here and great sleuthing in > the disassembly, and thanks Teodor for committing! > > Neha pointed out (thanks as well) a couple typos upthread > that I hadn't gotten around to fixing. I've attached a new > three line patch to sort those out. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 01/04/17 01:57, Petr Jelinek wrote: > On 01/04/17 01:20, Tom Lane wrote: >> Petr Jelinekwrites: >>> But the pg_subscription_rel is also not accessed on heap_open, the >>> problematic code is called from heap_drop_with_catalog. And VACUUM FULL >>> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it >>> goes through performDeletion so through dependency info which is what I >>> mean by everything else does this). >> >> Hmmm. What the heap_drop_with_catalog call is being applied to is a >> short-lived table that is not pg_class. It happens to own the disk >> storage that used to belong to pg_class, but it is not pg_class; >> in particular it doesn't have the same OID, and it is not what would >> be looked in if someone happened to need to fetch a pg_class row >> at that point. So there's no circularity involved. >> >> On further reflection it seems like you were right, the problem is >> taking a self-exclusive lock on pg_subscription_rel during low-level >> catalog operations. We're going to have to find a way to reduce that >> lock strength, or we're going to have a lot of deadlock problems. >> > > Well we have heavy lock because we were worried about failure scenarios > in our dump upsert in SetSubscriptionRelState which does cache lookup *dumb (ie, like me ;) ) -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 01/04/17 01:20, Tom Lane wrote: > Petr Jelinekwrites: >> But the pg_subscription_rel is also not accessed on heap_open, the >> problematic code is called from heap_drop_with_catalog. And VACUUM FULL >> pg_class calls heap_drop_with_catalog() when doing the heap swap (and it >> goes through performDeletion so through dependency info which is what I >> mean by everything else does this). > > Hmmm. What the heap_drop_with_catalog call is being applied to is a > short-lived table that is not pg_class. It happens to own the disk > storage that used to belong to pg_class, but it is not pg_class; > in particular it doesn't have the same OID, and it is not what would > be looked in if someone happened to need to fetch a pg_class row > at that point. So there's no circularity involved. > > On further reflection it seems like you were right, the problem is > taking a self-exclusive lock on pg_subscription_rel during low-level > catalog operations. We're going to have to find a way to reduce that > lock strength, or we're going to have a lot of deadlock problems. > Well we have heavy lock because we were worried about failure scenarios in our dump upsert in SetSubscriptionRelState which does cache lookup and if it finds something it updates, otherwise inserts. We have similar pattern in other places but they are called from DDL statements usually so the worst that can happen is DDL fails with weird errors when done concurrently with same name so using RowExclusiveLock is okay. That being said, looking at use-cases for SetSubscriptionRelState that's basically CREATE SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH and tablesync worker. So the DDL thing applies to first ones as well and tablesync should not be running in case the record does not exist so it's fine if it fails. In terms of RemoveSubscriptionRel that's only called from heap_drop_with_catalog and tablesync holds relation lock so there is no way heap_drop_with_catalog will happen on the same relation. This leads me to thinking that RowExclusiveLock is fine for both SetSubscriptionRelState and RemoveSubscriptionRel as long as we document that callers should be aware that SetSubscriptionRelState has concurrency might fail on unique index check. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection
On Sat, Apr 1, 2017 at 1:06 AM, Alvaro Herrerawrote: > Ashutosh Bapat wrote: >> Please add this to the next commitfest. > > If this cannot be reproduced in 9.6, then it must be added to the > Open Items wiki page instead. The behavior reported can be reproduced further down (just tried on 9.3, gave up below). Like Tsunakawa-san, I am surprised to see that an elog() message is exposed to the user. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
I had a quick look at this on the flight back from PGConf.US. On Fri, Mar 31, 2017 at 10:40 AM, Anastasia Lubennikovawrote: > But I haven't considered the possibility of index_form_tuple() failure. > Fixed in this version of the patch. Now it creates a copy of tupledesc to > pass it to index_form_tuple. I think that we need to be 100% sure that index_truncate_tuple() will not generate an IndexTuple that is larger than the original. Otherwise, you could violate the "1/3 of page size exceeded" thing. We need to catch that when the user actually inserts an oversized value. After that, it's always too late. (See my remarks to Tom on other thread about this, too.) > We'd discussed with other reviewers, they suggested index_truncate_tuple() > instead of index_reform_tuple(). > I think that this name reflects the essence of the function clear enough and > don't feel like renaming it again. +1. Feedback so far: * index_truncate_tuple() should have as an argument the number of attributes. No need to "#include utils/rel.h" that way. * I think that we should store this (the number of attributes), and use it directly when comparing, per my remarks to Tom over on that other thread. We should also use the free bit within IndexTupleData.t_info, to indicate that the IndexTuple was truncated, just to make it clear to everyone that might care that that's how these truncated IndexTuples need to be represented. Doing this would have no real impact on your patch, because for you this will be 100% redundant. It will help external tools, and perhaps another, more general suffix truncation patch that comes in the future. We should try very hard to have a future-proof on-disk representation. I think that this is quite possible. * I suggest adding a "can't happen" defensive check + error that checks that the tuple returned by index_truncate_tuple() is sized <= the original. This cannot be allowed to ever happen. (Not that I think it will.) * I see a small bug. You forgot to teach _bt_findsplitloc() about truncation. It does this currently, which you did not update: /* * The first item on the right page becomes the high key of the left page; * therefore it counts against left space as well as right space. */ leftfree -= firstrightitemsz; I think that this accounting needs to be fixed. * Note sure about one thing. What's the reason for this change? > - /* Log left page */ > - if (!isleaf) > - { > - /* > -* We must also log the left page's high key, because the right > -* page's leftmost key is suppressed on non-leaf levels. Show it > -* as belonging to the left page buffer, so that it is not stored > -* if XLogInsert decides it needs a full-page image of the left > -* page. > -*/ > - itemid = PageGetItemId(origpage, P_HIKEY); > - item = (IndexTuple) PageGetItem(origpage, itemid); > - XLogRegisterBufData(0, (char *) item, > MAXALIGN(IndexTupleSize(item))); > - } > + /* > +* We must also log the left page's high key, because the right > +* page's leftmost key is suppressed on non-leaf levels. Show it > +* as belonging to the left page buffer, so that it is not stored > +* if XLogInsert decides it needs a full-page image of the left > +* page. > +*/ > + itemid = PageGetItemId(origpage, P_HIKEY); > + item = (IndexTuple) PageGetItem(origpage, itemid); > + XLogRegisterBufData(0, (char *) item, MAXALIGN(IndexTupleSize(item))); Is this related to the problem that you mentioned to me that you'd fixed when we spoke in person earlier today? You said something about WAL logging, but I don't recall any details. I don't remember seeing this change in prior versions. Anyway, whatever the reason for doing this on the leaf level now, the comments should be updated to explain it. * Speaking of WAL-logging, I think that there is another bug in btree_xlog_split(). You didn't change this existing code at all: /* * On leaf level, the high key of the left page is equal to the first key * on the right page. */ if (isleaf) { ItemId hiItemId = PageGetItemId(rpage, P_FIRSTDATAKEY(ropaque)); left_hikey = PageGetItem(rpage, hiItemId); left_hikeysz = ItemIdGetLength(hiItemId); } It seems like this was missed when you changed WAL-logging, since you do something for this on the logging side, but not here, on the replay side. No? That's all I have for now. Maybe I can look again later, or tomorrow. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
Petr Jelinekwrites: > But the pg_subscription_rel is also not accessed on heap_open, the > problematic code is called from heap_drop_with_catalog. And VACUUM FULL > pg_class calls heap_drop_with_catalog() when doing the heap swap (and it > goes through performDeletion so through dependency info which is what I > mean by everything else does this). Hmmm. What the heap_drop_with_catalog call is being applied to is a short-lived table that is not pg_class. It happens to own the disk storage that used to belong to pg_class, but it is not pg_class; in particular it doesn't have the same OID, and it is not what would be looked in if someone happened to need to fetch a pg_class row at that point. So there's no circularity involved. However ... by that same token, it ought to be okay to consult pg_subscription_rel during that drop step. Indeed, if we put an IsCatalogRelation check in there, it wouldn't fire, because the target table has OID 16409 according to your stack trace. So that line of thought is indeed not very helpful. On further reflection it seems like you were right, the problem is taking a self-exclusive lock on pg_subscription_rel during low-level catalog operations. We're going to have to find a way to reduce that lock strength, or we're going to have a lot of deadlock problems. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
On Sat, Apr 1, 2017 at 4:55 AM, Tom Lanewrote: > Robert Haas writes: >> On Fri, Mar 31, 2017 at 3:31 PM, Andres Freund wrote: >>> Hm. Wonder if something like that shouldn't be backpatched - because >>> otherwise using postgres_fdw from an old server against a newer one will >>> do weird stuff. I don't know what kind of policy we've committed to >>> with postgresImportForeignSchema... > >> I don't think I'd like to promise that postgres_fdw will always be >> forward-compatible. Backward-compatibility is hard enough already. Thanks for the commit. > Unless I'm missing something, the behavior will be that an older > version will simply ignore remote partitioned tables (they will not > pass the relkind filter in the query). Seems pretty fail-soft, > so I think it's fine. Yeah, I would suggest to revisit that if we get actual complaints, but I would not push much in favor of it. It's not an area where nothing can be done to improve the user experience. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Table collision in join.sql and aggregates.sql
Douglas Doolewrites: > As we've been merging our code with 9.6, a couple developers have had > one-off failures in the join.sql and aggregates.sql test because the tables > T1, T2 and T3 have the wrong definitions. > Digging into it, I found that both files create the tables T1, T2, and T3 > for a short period of time and then drop them. Since the parallel_schedule > file puts these two files into the same group, they can run concurrently. > If it happens that the the two files hit the T1, T2, T3 tests at the same > time, then you see the table definition problems. Hmmm ... that would indeed be a problem, except that aggregate.sql's tables are temp tables, which should mean that they are in a schema different from the one that join.sql is putting its tables in. Are you sure you've identified the issue correctly? Because if this doesn't work, there are an awful lot of similar hazards elsewhere in the regression tests. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 01/04/17 00:52, Tom Lane wrote: > Petr Jelinekwrites: >> On 31/03/17 21:00, Tom Lane wrote: >>> Looking at dependency info isn't going to fix this, it only moves the >>> unsafe catalog access somewhere else (ie pg_depend instead of >>> pg_subscription_rel). I suspect the only safe solution is doing an >>> IsCatalogRelation or similar test pretty early in the logical replication >>> code paths. > >> I don't follow, everything else does dependency info check in this >> situation, how is this any different? > > What do you mean by "everything else"? The key point here is that > access to the bootstrap catalogs like pg_class and pg_attribute can't > be dependent on accesses to other catalogs, or we get into circularities. > We certainly aren't trying to look in pg_depend when we do a heap_open. > > (Or, if you tell me that we are now because the logical replication > patch added it, I'm going to start agitating for reverting the patch > and sending it back for redesign.) But the pg_subscription_rel is also not accessed on heap_open, the problematic code is called from heap_drop_with_catalog. And VACUUM FULL pg_class calls heap_drop_with_catalog() when doing the heap swap (and it goes through performDeletion so through dependency info which is what I mean by everything else does this). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
Petr Jelinekwrites: > On 31/03/17 21:00, Tom Lane wrote: >> Looking at dependency info isn't going to fix this, it only moves the >> unsafe catalog access somewhere else (ie pg_depend instead of >> pg_subscription_rel). I suspect the only safe solution is doing an >> IsCatalogRelation or similar test pretty early in the logical replication >> code paths. > I don't follow, everything else does dependency info check in this > situation, how is this any different? What do you mean by "everything else"? The key point here is that access to the bootstrap catalogs like pg_class and pg_attribute can't be dependent on accesses to other catalogs, or we get into circularities. We certainly aren't trying to look in pg_depend when we do a heap_open. (Or, if you tell me that we are now because the logical replication patch added it, I'm going to start agitating for reverting the patch and sending it back for redesign.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Table collision in join.sql and aggregates.sql
As we've been merging our code with 9.6, a couple developers have had one-off failures in the join.sql and aggregates.sql test because the tables T1, T2 and T3 have the wrong definitions. Digging into it, I found that both files create the tables T1, T2, and T3 for a short period of time and then drop them. Since the parallel_schedule file puts these two files into the same group, they can run concurrently. If it happens that the the two files hit the T1, T2, T3 tests at the same time, then you see the table definition problems. I took the easy way of solving this and renamed the tables in aggregates.sql to AGG1, AGG2, and AGG3. (Picked on aggregates.sql since it had the T1, T2, T3 tests added in 9.6.) Doug - Salesforce *** a/src/test/regress/expected/aggregates.out --- b/src/test/regress/expected/aggregates.out *** *** 957,1023 LINE 1: select (select max(min(unique1)) from int8_tbl) from tenk1; -- -- Test removal of redundant GROUP BY columns -- ! create temp table t1 (a int, b int, c int, d int, primary key (a, b)); ! create temp table t2 (x int, y int, z int, primary key (x, y)); ! create temp table t3 (a int, b int, c int, primary key(a, b) deferrable); -- Non-primary-key columns can be removed from GROUP BY ! explain (costs off) select * from t1 group by a,b,c,d; ! QUERY PLAN ! -- HashAggregate Group Key: a, b !-> Seq Scan on t1 (3 rows) -- No removal can happen if the complete PK is not present in GROUP BY ! explain (costs off) select a,c from t1 group by a,c,d; ! QUERY PLAN ! -- HashAggregate Group Key: a, c, d !-> Seq Scan on t1 (3 rows) -- Test removal across multiple relations explain (costs off) select * ! from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y ! group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.y,t2.z; ! QUERY PLAN ! --- Group !Group Key: t1.a, t1.b, t2.x, t2.y -> Merge Join ! Merge Cond: ((t1.a = t2.x) AND (t1.b = t2.y)) ! -> Index Scan using t1_pkey on t1 ! -> Index Scan using t2_pkey on t2 (6 rows) ! -- Test case where t1 can be optimized but not t2 ! explain (costs off) select t1.*,t2.x,t2.z ! from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y ! group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.z; ! QUERY PLAN ! --- HashAggregate !Group Key: t1.a, t1.b, t2.x, t2.z -> Merge Join ! Merge Cond: ((t1.a = t2.x) AND (t1.b = t2.y)) ! -> Index Scan using t1_pkey on t1 ! -> Index Scan using t2_pkey on t2 (6 rows) -- Cannot optimize when PK is deferrable ! explain (costs off) select * from t3 group by a,b,c; ! QUERY PLAN ! -- HashAggregate Group Key: a, b, c !-> Seq Scan on t3 (3 rows) ! drop table t1; ! drop table t2; ! drop table t3; -- -- Test combinations of DISTINCT and/or ORDER BY -- --- 957,1023 -- -- Test removal of redundant GROUP BY columns -- ! create temp table agg1 (a int, b int, c int, d int, primary key (a, b)); ! create temp table agg2 (x int, y int, z int, primary key (x, y)); ! create temp table agg3 (a int, b int, c int, primary key(a, b) deferrable); -- Non-primary-key columns can be removed from GROUP BY ! explain (costs off) select * from agg1 group by a,b,c,d; !QUERY PLAN ! HashAggregate Group Key: a, b !-> Seq Scan on agg1 (3 rows) -- No removal can happen if the complete PK is not present in GROUP BY ! explain (costs off) select a,c from agg1 group by a,c,d; !QUERY PLAN ! HashAggregate Group Key: a, c, d !-> Seq Scan on agg1 (3 rows) -- Test removal across multiple relations explain (costs off) select * ! from agg1 inner join agg2 on agg1.a = agg2.x and agg1.b = agg2.y ! group by agg1.a,agg1.b,agg1.c,agg1.d,agg2.x,agg2.y,agg2.z; ! QUERY PLAN ! --- Group !Group Key: agg1.a, agg1.b, agg2.x, agg2.y -> Merge Join ! Merge Cond: ((agg1.a = agg2.x) AND (agg1.b = agg2.y)) ! -> Index Scan using agg1_pkey on agg1 ! -> Index Scan using agg2_pkey on agg2 (6 rows) ! -- Test case where agg1 can be optimized but not agg2 ! explain (costs off) select agg1.*,agg2.x,agg2.z ! from agg1 inner join agg2 on agg1.a = agg2.x and agg1.b = agg2.y ! group by agg1.a,agg1.b,agg1.c,agg1.d,agg2.x,agg2.z; ! QUERY PLAN ! --- HashAggregate !Group Key: agg1.a, agg1.b, agg2.x, agg2.z -> Merge Join ! Merge Cond: ((agg1.a =
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 31/03/17 21:00, Tom Lane wrote: > Petr Jelinekwrites: >> On 31/03/17 20:23, Tom Lane wrote: >>> No, the problematic part is that there is any heap_open happening at >>> all. That open could very easily result in a recursive attempt to read >>> pg_class, for example, which is going to be fatal if we're in the midst >>> of vacuum full'ing or reindex'ing pg_class. It's frankly astonishing >>> to me that this patch seems to have survived testing under >>> CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are >>> preventing such recursive lookups. > >> Hmm okay, so the solution is to either use standard dependency info for >> this so that it's only called for tables that are actually know to be >> subscribed or have some exceptions in the current code to call the >> function only for user catalogs. Any preferences? > > Looking at dependency info isn't going to fix this, it only moves the > unsafe catalog access somewhere else (ie pg_depend instead of > pg_subscription_rel). I suspect the only safe solution is doing an > IsCatalogRelation or similar test pretty early in the logical replication > code paths. > I don't follow, everything else does dependency info check in this situation, how is this any different? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LWLock optimization for multicore Power machines
Alexander Korotkovwrites: >> It seems that on this platform definition of atomics should be provided by >> fallback.h. But it doesn't because I already defined >> PG_HAVE_ATOMIC_U32_SUPPORT >> in arch-ppc.h. I think in this case we shouldn't provide ppc-specific >> implementation of pg_atomic_fetch_mask_add_u32(). However, I don't know >> how to do this assuming arch-ppc.h is included before compiler-specific >> headers. Thus, in arch-ppc.h we don't know yet if we would find >> implementation of atomics for this platform. One possible solution is to >> provide assembly implementation for all atomics in arch-ppc.h. > BTW, implementation for all atomics in arch-ppc.h would be too invasive and > shouldn't be considered for v10. > However, I made following workaround: declare pg_atomic_uint32 and > pg_atomic_fetch_mask_add_u32_impl() only when we know that generic-gcc.h > would declare gcc-based atomics. I don't have a well-informed opinion on whether this is a reasonable thing to do, but I imagine Andres does. > Could you, please, check it on Apple PPC? It does compile and pass "make check" on prairiedog. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning vs ON CONFLICT
On Fri, Mar 31, 2017 at 5:44 PM, Robert Haaswrote: > And, indeed, you could get an unique constraint or exclusion error > because of an index on the child even though it's not global to the > partitioning hierarchy. So maybe we can support this after all, but > having messed it up once, I'm inclined to think we should postpone > this to v11, think it over some more, a Fine by me. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint
On Sat, Apr 1, 2017 at 1:08 AM, Robert Haaswrote: > On Mon, Mar 27, 2017 at 5:39 AM, Ashutosh Sharma > wrote: >> Thanks for reporting this problem. Could you please let me know on for >> how long did you run sqlsmith to get this crash. However, I have found >> the reason for this crash. This is basically happening when trying to >> retrieve the tuples using cursor. Basically the current hash index >> scan work tuple-at-a-time which means once it finds tuple on page, it >> releases lock from the page but keeps pin on it and finally returns >> the tuple. When the requested number of tuples are processed there is >> no lock on the page that was being scanned but yes there is a pin on >> it. Finally, when trying to close a cursor at the end of scan, if any >> killed tuples has been identified we try to first mark these items as >> dead with the help of _hash_kill_items(). But, since we only have pin >> on this page, the assert check 'LWLockHeldByMe()' fails. > > Instead of adding bool haveLock to _hash_kill_items, how about just > having the caller acquire the lock before calling the function and > release it afterwards? Since the acquire is at the beginning of the > function and the release at the end, there seems to be no point in > putting the acquire/release inside the function rather than in the > caller. Well, That is another option but the main idea was to be inline with the btree code. Moreover, I am not sure if acquiring lwlock inside hashendscan (basically the place where we are trying to close down the things) would look good. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning vs ON CONFLICT
On Fri, Mar 31, 2017 at 5:33 PM, Peter Geogheganwrote: > In my opinion, for the very limited ON CONFLICT DO NOTHING + no > inference specification case, the implementation should not care about > the presence or absence of unique indexes within or across partitions. Hmm. That's an interesting point. The documentation says: ON CONFLICT can be used to specify an alternative action to raising a unique constraint or exclusion constraint violation error. And, indeed, you could get an unique constraint or exclusion error because of an index on the child even though it's not global to the partitioning hierarchy. So maybe we can support this after all, but having messed it up once, I'm inclined to think we should postpone this to v11, think it over some more, and try to make sure that our second try doesn't crash... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation improvements for partitioning
On Tue, Mar 28, 2017 at 6:35 AM, Amit Langotewrote: > Attached updated patch. Committed with editing here and there. I just left out the thing about UNION ALL views, which seemed to brief a treatment to deserve its own subsection. Maybe a longer explanation of that is worthwhile or maybe it's not, but that can be a separate patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning vs ON CONFLICT
On Fri, Mar 31, 2017 at 4:47 PM, Robert Haaswrote: > /* > * Open partition indices (remember we do not support ON CONFLICT in > * case of partitioned tables, so we do not need support information > * for speculative insertion) > */ > > Part of the question here is definitional. Peter rightly pointed out > upthread that we support INSERT .. ON CONFLICT in an inheritance > situation, but that is different, because it infers whether there is a > conflict in the particular child into which you are trying to insert, > not whether there is a conflict across the whole hierarchy. I would say that it doesn't infer anything at all, in the sense that infer_arbiter_indexes() returns very early. It's then implied that whatever constraint index OIDs that the executor later happens to find will have speculative insertions. The optimizer doesn't try to predict what that will look like within the executor, or even on a foreign postgres_fdw server in the case of foreign tables. Foreign table indexes are not known to the local installation, which is one reason for this. You could INSERT ... ON CONFLICT DO NOTHING (no inference specification) into an ordinary table with no indexes, and that also works fine. (It's just silly.) > More or > less by definition, trying to insert into the room of the partitioning > hierarchy is a different beast: it should consider uniqueness across > the whole hierarchy in determining whether there is a conflict and, as > Simon pointed out in the second email on the thread, we lack a > mechanism to do that. In my opinion, for the very limited ON CONFLICT DO NOTHING + no inference specification case, the implementation should not care about the presence or absence of unique indexes within or across partitions. It might be sloppy for an application developer to do a whole lot of this, but that's not a judgement I think we can make for them. I don't feel strongly about this, though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)
"David G. Johnston"writes: > On Fri, Mar 31, 2017 at 10:40 AM, Tom Lane wrote: >> I think the benefit is reduction of user confusion. Admittedly, since >> Paul is the first person I can remember ever having complained about it, >> maybe nobody else is confused. > After going back-and-forth on this (and not being able to independently > come to the conclusion that what we are adhering to is actually a typo) I'm > going to toss my +1 in with Robert's. OK, the consensus seems clearly against changing this in the back branches, so I'll just fix it in HEAD. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign tables don't enforce the partition constraint
On Fri, Mar 31, 2017 at 5:11 AM, Ashutosh Bapatwrote: > Per https://www.postgresql.org/docs/devel/static/sql-createforeigntable.html, > constraints on the foreign table should represent a constraint that is > being enforced by the remote server. Right. This is user error. Having the *local* server try to enforce the constraint would slow down the system without guaranteeing anything, because somebody could modify the table on the remote server directly. > Similarly, a partition constraint > should also be enforced at the foreign server. Probably we should > update documentation of create foreign table to mention this. That is a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning vs ON CONFLICT
On Thu, Mar 30, 2017 at 5:54 AM, Amit Langotewrote: > I found out why the crash occurs, but while I was trying to fix it, I > started growing doubtful about the way this is being handled currently. > > Patch to fix the crash would be to pass 'true' instead of 'false' for > speculative when ExecSetupPartitionTupleRouting() calls ExecOpenIndices() > on leaf partitions. That will initialize the information needed when > ExecInsert() wants check for conflicts using the constraint-enforcing > indexes. If we do initialize the speculative insertion info (which will > fix the crash), ExecCheckIndexConstraints() will be called on a given leaf > partition's index to check if there is any conflict. But since the insert > was performed on the root table, conflicts should be checked across all > the partitions, which won't be the case. Even though the action is > NOTHING, the check for conflicts still uses only that one leaf partition's > index, which seems insufficient. > > Commit 8355a011a0 enabled specifying ON CONFLICT DO NOTHING on when > inserting into a partitioned root table, but given the above, I think we > might need to reconsider it. It seems obvious in retrospect that the commit in question was not quite up to the mark, considering that it didn't even update the comment here: /* * Open partition indices (remember we do not support ON CONFLICT in * case of partitioned tables, so we do not need support information * for speculative insertion) */ Part of the question here is definitional. Peter rightly pointed out upthread that we support INSERT .. ON CONFLICT in an inheritance situation, but that is different, because it infers whether there is a conflict in the particular child into which you are trying to insert, not whether there is a conflict across the whole hierarchy. More or less by definition, trying to insert into the room of the partitioning hierarchy is a different beast: it should consider uniqueness across the whole hierarchy in determining whether there is a conflict and, as Simon pointed out in the second email on the thread, we lack a mechanism to do that. If somebody wants to consider only conflicts within a specific partition, they can use INSERT .. ON CONFLICT with the name of that partition, and that'll work fine; if they target the parent, that should really apply globally to the hierarchy, which we can't support. So I think you (Amit) are right, and we should revert this commit. We can try again to make this work in a future release once we've had a chance to think about it some more. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioned tables and relfilenode
On Fri, Mar 31, 2017 at 4:33 AM, Kyotaro HORIGUCHIwrote: > I have no more comment on this. Thank you. I committed this with a few tweaks. I simplified the wording in the documentation a bit, removed or adjusted a couple of comments, and slightly changed the way the logic works in parseRelOptions in a way that I think is simpler. Thanks for reviewing, and thanks to Maksim as well, and thanks to Amit for writing the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Mar 31, 2017 2:17 PM, "Peter Geoghegan"wrote: On Fri, Mar 31, 2017 at 2:11 PM, Tom Lane wrote: >> The patch does actually store truncated/key-only tuples in the hi keys / >> non-leaf-nodes, which don't need the "included" columns. > > Hm. Since index tuples lack any means of indicating the actual number > of columns included (ie there's no equivalent of the natts field that > exists in HeapTupleHeaders), I think that this is an unreasonably > dangerous design. It'd be better to store nulls for the missing > fields. That would force a null bitmap to be included whether or > not there were nulls in the key columns, but if we're only doing it > once per page that doesn't seem like much cost. We're doing it once per page for the leaf page high key, but that's used as the downlink in the second phase of a B-Tree page split -- it's directly copied. So, including a NULL bitmap would make Anastasia's patch significantly less useful, since that now has to be in every internal page, and might imply that you end up with less fan-in much of the time (e.g., int4 attribute is truncated from the end relative to leaf IndexTuple format). BTW, what about the 1/3 of a page restriction on tuple size? I think that truncation has to strictly guarantee that the resulting tuple will be no larger than the original. Otherwise, you get into trouble here. But, that's still not my main objection to using the null bitmap. -- Peter Geoghegan (Sent from my phone)
Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
Robert Haaswrites: > On Fri, Mar 31, 2017 at 3:31 PM, Andres Freund wrote: >> Hm. Wonder if something like that shouldn't be backpatched - because >> otherwise using postgres_fdw from an old server against a newer one will >> do weird stuff. I don't know what kind of policy we've committed to >> with postgresImportForeignSchema... > I don't think I'd like to promise that postgres_fdw will always be > forward-compatible. Backward-compatibility is hard enough already. Unless I'm missing something, the behavior will be that an older version will simply ignore remote partitioned tables (they will not pass the relkind filter in the query). Seems pretty fail-soft, so I think it's fine. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
I don't see why you'd necessarily need to know where in the index the tuple came from under my proposal. Besides, as I already pointed out, we already hard code minus infinity handling within internal pages during tuple comparisons. Anastasia's patch could modify nbtree in exactly the same way as suffix truncation would. I see no conflict there. Quite the opposite, in fact. -- Peter Geoghegan (Sent from my phone)
Re: [HACKERS] [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint
On Mon, Mar 27, 2017 at 5:39 AM, Ashutosh Sharmawrote: > Thanks for reporting this problem. Could you please let me know on for > how long did you run sqlsmith to get this crash. However, I have found > the reason for this crash. This is basically happening when trying to > retrieve the tuples using cursor. Basically the current hash index > scan work tuple-at-a-time which means once it finds tuple on page, it > releases lock from the page but keeps pin on it and finally returns > the tuple. When the requested number of tuples are processed there is > no lock on the page that was being scanned but yes there is a pin on > it. Finally, when trying to close a cursor at the end of scan, if any > killed tuples has been identified we try to first mark these items as > dead with the help of _hash_kill_items(). But, since we only have pin > on this page, the assert check 'LWLockHeldByMe()' fails. Instead of adding bool haveLock to _hash_kill_items, how about just having the caller acquire the lock before calling the function and release it afterwards? Since the acquire is at the beginning of the function and the release at the end, there seems to be no point in putting the acquire/release inside the function rather than in the caller. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Peter Geogheganwrites: > On Fri, Mar 31, 2017 at 2:32 PM, Tom Lane wrote: >> ... I also think we're setting up a situation where tools >> like amcheck and pg_filedump are going to be unable to cope, because >> figuring out what a particular tuple contains is going to require context >> they haven't got. It just seems way too dangerous. > Why wouldn't they have the context? Up to now, we have always had the property that you could decode an index tuple given only the tuple and its relation's tupdesc (and likewise for heap tuples). This patch breaks that: now you need to know where in the index the tuple came from. That's a property that I think we will regret losing. No doubt we can make it work for some value of "work", but there's going to be broken tools, and other opportunity costs down the road. > There is a free bit within IndexTupleData.t_info that could indicate > that this is what happened. Yeah, I was just looking at that, but it's the only bit we've got left. Don't think we can use it both to deal with Anastasia's patch and your suffix truncation idea. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
On Fri, Mar 31, 2017 at 3:31 PM, Andres Freundwrote: >> Committed after rewording the documentation. > > Hm. Wonder if something like that shouldn't be backpatched - because > otherwise using postgres_fdw from an old server against a newer one will > do weird stuff. I don't know what kind of policy we've committed to > with postgresImportForeignSchema... I don't think I'd like to promise that postgres_fdw will always be forward-compatible. Backward-compatibility is hard enough already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
On 2017-03-31 15:25:19 -0400, Robert Haas wrote: > On Fri, Mar 31, 2017 at 12:51 AM, Amit Langote >wrote: > > Thanks, no more comments from my side. > > Committed after rewording the documentation. Hm. Wonder if something like that shouldn't be backpatched - because otherwise using postgres_fdw from an old server against a newer one will do weird stuff. I don't know what kind of policy we've committed to with postgresImportForeignSchema... Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] brin autosummarization -- autovacuum "work items"
On Tue, Mar 21, 2017 at 4:10 PM, Alvaro Herrerawrote: > Well, the number of work items is currently fixed; but if you have many > BRIN indexes, then you'd overflow (lose requests). By using DSA I am > making it easy to patch this afterwards so that an arbitrary number of > requests can be recorded. But that could also use an arbitrarily large amount of memory, and any leaks will be cluster-lifespan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
On Fri, Mar 31, 2017 at 12:51 AM, Amit Langotewrote: > Thanks, no more comments from my side. Committed after rewording the documentation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Fri, Mar 31, 2017 at 3:03 PM, Tom Lanewrote: > The reason it works fine for heap tuples is that heap tuple headers > explicitly record the number of attributes in the tuple. Index > tuples don't. Per my previous mail, I think we can change things so that Index tuples effectively record that in all relevant cases. i.e., in what I called separator key index tuples -- high key tuples in leaf pages, and all internal page index tuples (high keys and downlinks/internal items). We already store a minus infinity downlink on internal pages, which doesn't bother diagnostic tools at all, despite being its own special case without real Datum values. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] few fts functions for jsonb
On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthali...@gmail.com> wrote: On 31 March 2017 at 00:01, Andrew Dunstanwrote: > > I have just noticed as I was writing/testing the non-existent docs for > this patch that it doesn't supply variants of to_tsvector that take a > regconfig as the first argument. Is there a reason for that? Why > should the json(b) versions be different from the text versions? No, there is no reason, I just missed that. Here is a new version of the patch (only the functions part) to add those variants. Congratulations with patch committed, who will write an addition documentation? I think we need to touch FTS and JSON parts.
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Thu, Mar 30, 2017 at 4:13 AM, Pavan Deolaseewrote: > > > On Thu, Mar 30, 2017 at 3:29 PM, Dilip Kumar > wrote: > >> On Wed, Mar 29, 2017 at 11:51 AM, Pavan Deolasee >> wrote: >> > Thanks. I think your patch of tracking interesting attributes seems ok >> too >> > after the performance issue was addressed. Even though we can still >> improve >> > that further, at least Mithun confirmed that there is no significant >> > regression anymore and in fact for one artificial case, patch does >> better >> > than even master. >> >> I was trying to compile these patches on latest >> head(f90d23d0c51895e0d7db7910538e85d3d38691f0) for some testing but I >> was not able to compile it. >> >> make[3]: *** [postgres.bki] Error 1 >> > > Looks like OID conflict to me.. Please try rebased set. > broken again on oid conflicts for 3373 to 3375 from the monitoring permissions commi 25fff40798fc4. After bumping those, I get these compiler warnings: heapam.c: In function 'heap_delete': heapam.c:3298: warning: 'root_offnum' may be used uninitialized in this function heapam.c: In function 'heap_update': heapam.c:4311: warning: 'root_offnum' may be used uninitialized in this function heapam.c:4311: note: 'root_offnum' was declared here heapam.c:3784: warning: 'root_offnum' may be used uninitialized in this function heapam.c: In function 'heap_lock_tuple': heapam.c:5087: warning: 'root_offnum' may be used uninitialized in this function And I get a regression test failure, attached. Cheers, Jeff regression.diffs Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Robert Haaswrites: > On Fri, Mar 31, 2017 at 2:32 PM, Tom Lane wrote: >> It just seems way too dangerous. > So, we end up with heap tuples with different numbers of attributes > all the time, whenever you add a column. It works fine - on decoding, > the additional columns will be treated as null (unless somebody > commits Serge Rielau's patch, which regrettably nobody has gotten > around to reviewing). Why is this case different? The reason it works fine for heap tuples is that heap tuple headers explicitly record the number of attributes in the tuple. Index tuples don't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
Hello, Thank you Peter for the assist here and great sleuthing in the disassembly, and thanks Teodor for committing! Neha pointed out (thanks as well) a couple typos upthread that I hadn't gotten around to fixing. I've attached a new three line patch to sort those out. Brandur On Wed, Mar 29, 2017 at 1:31 PM, Teodor Sigaevwrote: > Thank you, pushed > > Excellent! I've attached a new (and hopefully final) >> version of the patch. >> > > -- > Teodor Sigaev E-mail: teo...@sigaev.ru >WWW: > http://www.sigaev.ru/ > 0001-Fix-two-typos-introduced-by-macaddr-SortSupport.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
Petr Jelinekwrites: > On 31/03/17 20:23, Tom Lane wrote: >> No, the problematic part is that there is any heap_open happening at >> all. That open could very easily result in a recursive attempt to read >> pg_class, for example, which is going to be fatal if we're in the midst >> of vacuum full'ing or reindex'ing pg_class. It's frankly astonishing >> to me that this patch seems to have survived testing under >> CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are >> preventing such recursive lookups. > Hmm okay, so the solution is to either use standard dependency info for > this so that it's only called for tables that are actually know to be > subscribed or have some exceptions in the current code to call the > function only for user catalogs. Any preferences? Looking at dependency info isn't going to fix this, it only moves the unsafe catalog access somewhere else (ie pg_depend instead of pg_subscription_rel). I suspect the only safe solution is doing an IsCatalogRelation or similar test pretty early in the logical replication code paths. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] parallel explain analyze support not exercised
Hi, As visible in [1], the explain analyze codepaths of parallel query isn't exercised in the tests. That used to be not entirely trivial if the output was to be displayed (due to timing), but we should be able to do that now that we have the SUMMARY option. E.g. SET max_parallel_workers = 0; EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM blarg2 WHERE generate_series < 0; ┌───┐ │QUERY PLAN │ ├───┤ │ Gather (actual rows=0 loops=1)│ │ Workers Planned: 10 │ │ Workers Launched: 0 │ │ -> Parallel Seq Scan on blarg2 (actual rows=0 loops=1) │ │ Filter: (generate_series < 0) │ │ Rows Removed by Filter: 1000 │ └───┘ should be reproducible. I'd suggest additionally adding one tests that throws the EXPLAIN output away, but actually enables paralellism. Greetings, Andres Freund [1] https://coverage.postgresql.org/src/backend/executor/execParallel.c.gcov.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
I wrote: > Andres Freundwrites: >> It'd be useful for FieldStore - we'd not have to error out anymore if >> the number of columns changes (which I previously had "solved" by using >> MaxHeapAttributeNumber sized values/nulls array). > Ah, I see. But in that case you're not really truncating the tuple > --- what you want is to be allowed to pass undersized datum/nulls > arrays and have the missing columns be taken as nulls. No, scratch that: you can't use an "extended" heap_form_tuple to solve that problem, because it's not only the form-tuple end but the deform-tuple end that needs fullsize arrays. Otherwise, we'd be losing trailing-column values in the deform-and-reform cycle. So I still don't see that there's a valid application for this feature. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] brin autosummarization -- autovacuum "work items"
Here's a version of this patch which I consider final. Thanks for your tips on using DSA. No crashes now. I am confused about not needing dsa_attach the second time around. If I do that, the dsa handle has been 0x7f'd, which I don't understand since it is supposed to be allocated in TopMemoryContext. I didn't dig too deep to try and find what is causing that behavior. Once we do, it's easy to remove the dsa_detach/dsa_attach calls. I added a new SQL-callable function to invoke summarization of an individual page range. That is what I wanted to do in vacuum (rather than a scan of the complete index), and it seems independently useful. I also removed the behavior that on index creation the final partial block range is always summarized. It's pointless. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml index 5bf11dc..5140a38 100644 --- a/doc/src/sgml/brin.sgml +++ b/doc/src/sgml/brin.sgml @@ -74,9 +74,14 @@ tuple; those tuples remain unsummarized until a summarization run is invoked later, creating initial summaries. This process can be invoked manually using the - brin_summarize_new_values(regclass) function, - or automatically when VACUUM processes the table. + brin_summarize_range(regclass, bigint) or + brin_summarize_new_values(regclass) functions; + automatically when VACUUM processes the table; + or by automatic summarization executed by autovacuum, as insertions + occur. (This last trigger is disabled by default and can be enabled + with the autosummarize parameter.) + diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 6887eab..25c18d1 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -19685,6 +19685,13 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup()); +brin_summarize_range(index regclass, blockNumber bigint) + + integer + summarize the page range covering the given block, if not already summarized + + + gin_clean_pending_list(index regclass) bigint @@ -19700,7 +19707,8 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup()); that are not currently summarized by the index; for any such range it creates a new summary index tuple by scanning the table pages. It returns the number of new page range summaries that were inserted -into the index. +into the index. brin_summarize_range does the same, except +it only summarizes the range that covers the given block number. diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 7163b03..83ee7d3 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -382,7 +382,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] -BRIN indexes accept a different parameter: +BRIN indexes accept different parameters: @@ -396,6 +396,16 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] + + +autosummarize + + + Defines whether a summarization run is invoked for the previous page + range whenever an insertion is detected on the next one. + + + diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index b22563b..707d04e 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -26,6 +26,7 @@ #include "catalog/pg_am.h" #include "miscadmin.h" #include "pgstat.h" +#include "postmaster/autovacuum.h" #include "storage/bufmgr.h" #include "storage/freespace.h" #include "utils/builtins.h" @@ -60,10 +61,12 @@ typedef struct BrinOpaque BrinDesc *bo_bdesc; } BrinOpaque; +#define BRIN_ALL_BLOCKRANGES InvalidBlockNumber + static BrinBuildState *initialize_brin_buildstate(Relation idxRel, BrinRevmap *revmap, BlockNumber pagesPerRange); static void terminate_brin_buildstate(BrinBuildState *state); -static void brinsummarize(Relation index, Relation heapRel, +static void brinsummarize(Relation index, Relation heapRel, BlockNumber pageRange, double *numSummarized, double *numExisting); static void form_and_insert_tuple(BrinBuildState *state); static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a, @@ -126,8 +129,11 @@ brinhandler(PG_FUNCTION_ARGS) * with those of the new tuple. If the tuple values are not consistent with * the summary tuple, we need to update the index tuple. * + * If autosummarization is enabled, check if we need to summarize the previous + * page range. + * * If the range is not currently summarized (i.e. the revmap returns NULL for - * it), there's nothing to do. + * it), there's nothing to do for this tuple. */ bool
Re: [HACKERS] bumping HASH_VERSION to 3
Robert Haaswrites: > Forcing a reindex in v10 kills three birds > with one stone: > - No old, not logged, possibly corrupt hash indexes floating around > after an upgrade to v10. > - Can remove the backward-compatibility code added by > 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around > forever. > - No need to worry about doing an in-place upgrade of the metapage for > the above-mentioned patch. > Thoughts? +1, as long as we're clear on what will happen when pg_upgrade'ing an installation containing hash indexes. I think a minimum requirement is that it succeed and be able to start up, and allow the user to manually REINDEX such indexes afterwards. Bonus points for: 1. teaching pg_upgrade to create a script containing the required REINDEX commands. (I think it's produced scripts for similar requirements in the past.) 2. marking the index invalid so that the system would silently ignore it until it's been reindexed. I think there might be adequate infrastructure for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter of getting pg_upgrade to hack the indexes' catalog state. (If not, it's probably not worth the trouble.) A variant on that might just be to not transfer over hash indexes, leaving the user to CREATE them rather than REINDEX them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] parallel bitmapscan isn't exercised in regression tests
Hi, The parallel code-path isn't actually exercised in the tests added in [1], as evidenced by [2] (they just explain). That imo needs to be fixed. Greetings, Andres Freund [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f35742ccb7aa53ee3ed8416bbb378b0c3eeb6bb9 [2] https://coverage.postgresql.org/src/backend/executor/nodeBitmapHeapscan.c.gcov.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Fri, Mar 31, 2017 at 2:32 PM, Tom Lanewrote: > I think you are failing to get the point. I am not on about whether > we need a few bytes more or less, I am on about whether the patch > even works. As an example, it's quite unclear what the width of > such an index tuple's nulls bitmap will be if it exists, and there > certainly will be cases where it must exist because of nulls in the > keys columns. I also think we're setting up a situation where tools > like amcheck and pg_filedump are going to be unable to cope, because > figuring out what a particular tuple contains is going to require context > they haven't got. It just seems way too dangerous. Why wouldn't they have the context? I think that we can use the page offset for internal items to indicate the number of attributes that were truncated in each case. That field is currently unused in all relevant cases (for "separator" IndexTuples). This wouldn't be the first time we put a magic value into an item pointer offset. That detail would be redundant for Anastasia's patch, but we can imagine a future in which that isn't the case. There is a free bit within IndexTupleData.t_info that could indicate that this is what happened. I am not going to comment on how dangerous any of this may or may not be just yet, but FWIW I think it would be worth using that bit to allow suffix truncation to work. Suffix truncation is a widely used technique, that Anastasia's patch happens to be a special case of. It's a technique that is almost as old as B-Trees themselves. The use of a dedicated bit probably wouldn't be necessary, but perhaps it makes things safer for the patch. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Fri, Mar 31, 2017 at 2:14 PM, Mike Palmiottowrote: > On Mon, Mar 27, 2017 at 12:09 PM, Mike Palmiotto > wrote: >> On Mon, Mar 27, 2017 at 11:46 AM, Robert Haas wrote: >>> >>> Note that sepgsql hasn't been updated to work with RLS yet, either, >>> but we didn't regard that as an open item for RLS, or if we did the >>> resolution was just to document it. I am not opposed to giving a >>> little more time to get this straightened out, but if a patch doesn't >>> show up fairly soon then I think we should just document that sepgsql >>> doesn't support partitioned tables in v10. sepgsql has a fairly >>> lengthy list of implementation restrictions already, so one more is >>> not going to kill anybody -- or if it will then that person should >>> produce a patch soon. >> >> Okay, I'll make sure I get something fleshed out today or tomorrow. > > Apologies for the delay. I was waffling over whether to reference > PartitionedRelationId in sepgsql, but ended up deciding to just handle > RELKIND_PARTITIONED_TABLE and treat the classOid as > RelationRelationId. Seeing as there is a relid in pg_class which > corresponds to the partitioned table, this chosen route seemed > acceptable. Newest patches remove cruft from said waffling. No need to include pg_partitioned_table.h if we're not referencing PartitionedRelationId. > > Here is a demonstration of the partitioned table working with sepgsql hooks: > https://gist.github.com/anonymous/b10f476a95ae9cdd39b83ef872d4b1e6 > > Attached you will find two patches, which were rebased on master as of > 156d388 (applied with `git am --revert [patch file]`). The first gets > rid of some pesky compiler warnings and the second implements the > sepgsql handling of partitioned tables. That should have read `git am --reject [patch file]`. Apologies for the inaccuracy. Thanks, -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com From 82ff9a4e18d3baefa5530b8515e46cf8225519de Mon Sep 17 00:00:00 2001 From: Mike Palmiotto Date: Tue, 28 Mar 2017 16:44:54 + Subject: [PATCH 1/2] Silence some sepgsql compiler warnings selinux/label.h includes stdbool.h, which redefines the bool type and results in a warning: assignment from incompatible pointer type for sepgsql_fmgr_hook. Move selinux/label.h above postgres.h, so the bool type is properly defined. Additionally, sepgsql throws compiler warnings due to possibly uninitialized tclass in code paths for indexes. Set tclass to a bogus -1 to silence these warnings. --- contrib/sepgsql/label.c| 4 ++-- contrib/sepgsql/relation.c | 8 ++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c index 1a8f884..a404cd7 100644 --- a/contrib/sepgsql/label.c +++ b/contrib/sepgsql/label.c @@ -8,6 +8,8 @@ * * - */ +#include + #include "postgres.h" #include "access/heapam.h" @@ -37,8 +39,6 @@ #include "sepgsql.h" -#include - /* * Saved hook entries (if stacked) */ diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c index ab98a9b..fd6fe8b 100644 --- a/contrib/sepgsql/relation.c +++ b/contrib/sepgsql/relation.c @@ -300,8 +300,10 @@ sepgsql_relation_post_create(Oid relOid) tclass = SEPG_CLASS_DB_VIEW; break; case RELKIND_INDEX: - /* deal with indexes specially; no need for tclass */ + /* other indexes are handled specially below; set tclass to -1 to + * silence compiler warning */ sepgsql_index_modify(relOid); + tclass = -1; goto out; default: /* ignore other relkinds */ @@ -432,7 +434,9 @@ sepgsql_relation_drop(Oid relOid) /* ignore indexes on toast tables */ if (get_rel_namespace(relOid) == PG_TOAST_NAMESPACE) return; - /* other indexes are handled specially below; no need for tclass */ + /* other indexes are handled specially below; set tclass to -1 to + * silence compiler warning */ + tclass = -1; break; default: /* ignore other relkinds */ -- 2.7.4 From b3a44aa37b5724ea88fb0d1110ba18be6618e283 Mon Sep 17 00:00:00 2001 From: Mike Palmiotto Date: Wed, 29 Mar 2017 14:59:37 + Subject: [PATCH 2/2] Add partitioned table support to sepgsql Account for RELKIND_PARTITIONED_RELATIONS in sepgsql and treat the objects like regular relations. This allows for proper create/alter/drop hook behavior for partitioned tables. --- contrib/sepgsql/label.c| 3 ++- contrib/sepgsql/relation.c | 33 +++-- 2 files changed, 25 insertions(+), 11 deletions(-) diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c index a404cd7..88a04a4 100644 --- a/contrib/sepgsql/label.c +++ b/contrib/sepgsql/label.c @@ -779,7 +779,8 @@ exec_object_restorecon(struct selabel_handle * sehnd, Oid catalogId) case
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Fri, Mar 31, 2017 at 2:32 PM, Tom Lanewrote: > Peter Geoghegan writes: >> On Fri, Mar 31, 2017 at 2:11 PM, Tom Lane wrote: >>> Hm. Since index tuples lack any means of indicating the actual number >>> of columns included (ie there's no equivalent of the natts field that >>> exists in HeapTupleHeaders), I think that this is an unreasonably >>> dangerous design. It'd be better to store nulls for the missing >>> fields. That would force a null bitmap to be included whether or >>> not there were nulls in the key columns, but if we're only doing it >>> once per page that doesn't seem like much cost. > >> We're doing it once per page for the leaf page high key, but that's >> used as the downlink in the second phase of a B-Tree page split -- >> it's directly copied. So, including a NULL bitmap would make >> Anastasia's patch significantly less useful, > > I think you are failing to get the point. I am not on about whether > we need a few bytes more or less, I am on about whether the patch > even works. As an example, it's quite unclear what the width of > such an index tuple's nulls bitmap will be if it exists, and there > certainly will be cases where it must exist because of nulls in the > keys columns. I also think we're setting up a situation where tools > like amcheck and pg_filedump are going to be unable to cope, because > figuring out what a particular tuple contains is going to require context > they haven't got. It just seems way too dangerous. So, we end up with heap tuples with different numbers of attributes all the time, whenever you add a column. It works fine - on decoding, the additional columns will be treated as null (unless somebody commits Serge Rielau's patch, which regrettably nobody has gotten around to reviewing). Why is this case different? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 31/03/17 20:23, Tom Lane wrote: > Petr Jelinekwrites: >> On 31/03/17 19:35, Tom Lane wrote: >>> At the very least, it would be a good idea to exclude the system >>> catalogs from logical replication, wouldn't it? > >> They are excluded. > > Well, the exclusion is apparently not checked early enough. I would say > that touches of system catalogs should never result in any attempts to > access the logical relation infrastructure, but what we have here is > such an access. > >> The problematic part is that the pg_subscription_rel manipulation >> functions acquire ShareRowExclusiveLock and not the usual >> RowExclusiveLock because there were some worries about concurrency. > > No, the problematic part is that there is any heap_open happening at > all. That open could very easily result in a recursive attempt to read > pg_class, for example, which is going to be fatal if we're in the midst > of vacuum full'ing or reindex'ing pg_class. It's frankly astonishing > to me that this patch seems to have survived testing under > CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are > preventing such recursive lookups. > Hmm okay, so the solution is to either use standard dependency info for this so that it's only called for tables that are actually know to be subscribed or have some exceptions in the current code to call the function only for user catalogs. Any preferences? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Hi Thomas, On 2017-03-31 17:53:12 +1300, Thomas Munro wrote: > Thanks very much to Rafia for testing, and to Andres for his copious > review feedback. Here's a new version. Changes: I've not looked at that aspect, but one thing I think would be good is to first add patch that increases coverage of nodeHash[join].c to nearly 100%. There's currently significant bits of nodeHash.c that aren't covered (skew optimization, large tuples). https://coverage.postgresql.org/src/backend/executor/nodeHash.c.gcov.html https://coverage.postgresql.org/src/backend/executor/nodeHashjoin.c.gcov.html - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Peter Geogheganwrites: > On Fri, Mar 31, 2017 at 2:11 PM, Tom Lane wrote: >> Hm. Since index tuples lack any means of indicating the actual number >> of columns included (ie there's no equivalent of the natts field that >> exists in HeapTupleHeaders), I think that this is an unreasonably >> dangerous design. It'd be better to store nulls for the missing >> fields. That would force a null bitmap to be included whether or >> not there were nulls in the key columns, but if we're only doing it >> once per page that doesn't seem like much cost. > We're doing it once per page for the leaf page high key, but that's > used as the downlink in the second phase of a B-Tree page split -- > it's directly copied. So, including a NULL bitmap would make > Anastasia's patch significantly less useful, I think you are failing to get the point. I am not on about whether we need a few bytes more or less, I am on about whether the patch even works. As an example, it's quite unclear what the width of such an index tuple's nulls bitmap will be if it exists, and there certainly will be cases where it must exist because of nulls in the keys columns. I also think we're setting up a situation where tools like amcheck and pg_filedump are going to be unable to cope, because figuring out what a particular tuple contains is going to require context they haven't got. It just seems way too dangerous. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Fri, Mar 31, 2017 at 12:20:51PM -0500, Kevin Grittner wrote: > On Thu, Mar 30, 2017 at 11:51 AM, Kevin Grittnerwrote: > > > New version attached. It needs some of these problem cases added > > to the testing, and a mention in the docs that only C and plpgsql > > triggers can use the feature so far. I'll add those tomorrow. > > Done and attached. > > Now the question is, should it be pushed? Yes. Among other things, that'll get it buildfarm tested and give people interested in other PLs better visibility. That, and I suspect that people will start using this infrastructure for some very cool projects. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
31.03.2017 20:57, Robert Haas: On Fri, Mar 31, 2017 at 1:40 PM, Anastasia Lubennikovawrote: First of all, I want to thank you and Robert for reviewing this patch. Your expertise in postgres subsystems is really necessary for features like this. I just wonder, why don't you share your thoughts and doubts till the "last call". I haven't done any significant technical work other than review patches in 14 months, and in the last several months I've often worked 10 and 12 hour days to get more review done. I think at one level you've got a fair complaint here - it's hard to get things committed, and this patch probably didn't get as much attention as it deserved. It's not so easy to know how to fix that. I'm pretty sure "tell Andres and Robert to work harder" isn't it. *off-topic* No complaints from me, I understand how difficult is reviewing and highly appreciate your work. The problem is that not all developers are qualified enough to do a review. I've tried to make a course about postrges internals. Something like "Deep dive into postgres codebase for hackers". And it turned out to be really helpful for new developers. So, I wonder, maybe we could write some tips for new reviewers and testers as well. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bumping HASH_VERSION to 3
On 03/31/2017 02:17 PM, Robert Haas wrote: Starting a new thread about this to get more visibility. Despite the extensive work that has been done on hash indexes this release, we have thus far not made any change to the on-disk format that is not nominally backward-compatible. Commit 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new hash indexes, but included backward-compatibility code so that old indexes would continue to work. However, I'd like to also commit Mithun Cy's patch to expand hash indexes more gradually -- latest version in http://postgr.es/m/cad__oujd-ibxm91zcqziayftwqjxnfqgmv361v9zke83s6i...@mail.gmail.com -- and that's not backward-compatible. It would be possible to write code to convert the old metapage format to the new metapage format introduced by that patch, and it wouldn't be very hard, but I think it would be better to NOT do that, and instead force everybody upgrading to v10 to rebuild all of their hash indexes. If we don't do that, then we'll never know whether instances of hash index corruption reported against v10 or higher are caused by defects in the new code, because there's always the chance that the hash index could have been built on a pre-v10 version, got corrupted because of the lack of WAL-logging, and then been brought up to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds with one stone: - No old, not logged, possibly corrupt hash indexes floating around after an upgrade to v10. - Can remove the backward-compatibility code added by 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around forever. - No need to worry about doing an in-place upgrade of the metapage for the above-mentioned patch. Thoughts? +1 Best regards, Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Fri, Mar 31, 2017 at 11:16 PM, Robert Haaswrote: > On Fri, Mar 31, 2017 at 10:24 AM, Pavan Deolasee > wrote: > > Having worked on it for some time now, I can say that WARM uses pretty > much > > the same infrastructure that HOT uses for cleanup/pruning tuples from the > > heap. So the risk of having a bug which can eat your data from the heap > is > > very low. Sure, it might mess up with indexes, return duplicate keys, not > > return a row when it should have. Not saying they are not bad bugs, but > > probably much less severe than someone removing live rows from the heap. > > Yes, that's true. If there's nothing wrong with the way pruning > works, then any other problem can be fixed by reindexing, I suppose. > > Yeah, I think so. > I'm not generally a huge fan of on-off switches for things like this, > but I know Simon likes them. I think the question is how much they > really insulate us from bugs. For the hash index patch, for example, > the only way to really get insulation from bugs added in this release > would be to ship both the old and the new code in separate index AMs > (hash, hash2). The code has been restructured so much in the process > of doing all of this that any other form of on-off switch would be > pretty hit-or-miss whether it actually provided any protection. > > Now, I am less sure about this case, but my guess is that you can't > really have this be something that can be flipped on and off for a > table. Once a table has any WARM updates in it, the code that knows > how to cope with that has to be enabled, and it'll work as well or > poorly as it does. That's correct. Once enabled, we will need to handle the case of two index pointers pointing to the same root. The only way to get rid of that is probably do a complete rewrite/reindex, I suppose. But I was mostly talking about immutable flag at table creation time as rightly guessed. > Now, I understand you to be suggesting a flag at > table-creation time that would, maybe, be immutable after that, but > even then - are we going to run completely unmodified 9.6 code for > tables where that's not enabled, and only go through any of the WARM > logic when it is enabled? Doesn't sound likely. The commits already > made from this patch series certainly affect everybody, and I can't > see us adding switches that bypass > ce96ce60ca2293f75f36c3661e4657a3c79ffd61 for example. I don't think I am going to claim that either. But probably only 5% of the new code would then be involved. Which is a lot less and a lot more manageable. Having said that, I think if we at all do this, we should only do it based on our experiences in the beta cycle, as a last resort. Based on my own experiences during HOT development, long running pgbench tests, with several concurrent clients, subjected to multiple AV cycles and periodic consistency checks, usually brings up issues related to heap corruption. So my confidence level is relatively high on that part of the code. That's not to suggest that there can't be any bugs. Obviously then there are other things such as regression to some workload or additional work required by vacuum etc. And I think we should address them and I'm fairly certain we can do that. It may not happen immediately, but if we provide right knobs, may be those who are affected can fall back to the old behaviour or not use the new code at all while we improve things for them. Some of these things I could have already implemented, but without a clear understanding of whether the feature will get in or not, it's hard to keep putting infinite efforts into the patch. All non-committers go through that dilemma all the time, I'm sure. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Somebody has not thought through subscription locking considerations
Petr Jelinekwrites: > On 31/03/17 19:35, Tom Lane wrote: >> At the very least, it would be a good idea to exclude the system >> catalogs from logical replication, wouldn't it? > They are excluded. Well, the exclusion is apparently not checked early enough. I would say that touches of system catalogs should never result in any attempts to access the logical relation infrastructure, but what we have here is such an access. > The problematic part is that the pg_subscription_rel manipulation > functions acquire ShareRowExclusiveLock and not the usual > RowExclusiveLock because there were some worries about concurrency. No, the problematic part is that there is any heap_open happening at all. That open could very easily result in a recursive attempt to read pg_class, for example, which is going to be fatal if we're in the midst of vacuum full'ing or reindex'ing pg_class. It's frankly astonishing to me that this patch seems to have survived testing under CLOBBER_CACHE_ALWAYS, because it's only the catalog caches that are preventing such recursive lookups. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [BUGS] BUG #14600: Passwords in user mappings leaked by psql \deu+ command
Forwarding message from pgsql-bugs for review Attached a patch which copies the logic from commit 93a6be63a55a8cd0d73b3fa81eb6a46013a3a974. In the current implementation we only consider privileges of the foreign server in determining whether or not to show the user mapping details. This patch copies the same logic (and documentation) used in commit 93a6be63a55a8cd0d73b3fa81eb6a46013a3a974 to not always show the user mapping options. regards, Feike user_mappings_leak.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bumping HASH_VERSION to 3
On Fri, Mar 31, 2017 at 8:17 PM, Robert Haaswrote: > Starting a new thread about this to get more visibility. > > Despite the extensive work that has been done on hash indexes this > release, we have thus far not made any change to the on-disk format > that is not nominally backward-compatible. Commit > 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new > hash indexes, but included backward-compatibility code so that old > indexes would continue to work. However, I'd like to also commit > Mithun Cy's patch to expand hash indexes more gradually -- latest > version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9 > zke83s6i...@mail.gmail.com > -- and that's not backward-compatible. > > It would be possible to write code to convert the old metapage format > to the new metapage format introduced by that patch, and it wouldn't > be very hard, but I think it would be better to NOT do that, and > instead force everybody upgrading to v10 to rebuild all of their hash > indexes. If we don't do that, then we'll never know whether > instances of hash index corruption reported against v10 or higher are > caused by defects in the new code, because there's always the chance > that the hash index could have been built on a pre-v10 version, got > corrupted because of the lack of WAL-logging, and then been brought up > to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds > with one stone: > > - No old, not logged, possibly corrupt hash indexes floating around > after an upgrade to v10. > - Can remove the backward-compatibility code added by > 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around > forever. > - No need to worry about doing an in-place upgrade of the metapage for > the above-mentioned patch. > > Thoughts? > Given the state of hash indexes in <= 9.6, I think this is a reasonable tradeoff. Most people won't be using them at all today. Those that do will have to "pay" with a REINDEX on upgrade. I think the benefits definitely outweigh the cost. So +1 for doing it. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] delta relations in AFTER triggers
On Fri, Mar 31, 2017 at 12:58 PM, Robert Haaswrote: > On Fri, Mar 31, 2017 at 1:50 PM, Tom Lane wrote: >> I would vote for calling the struct typedef EphemeralNamedRelation and >> using the abbreviation ENR (capitalized that way, not as Enr) in related >> function names, where that seemed sensible. I really do *not* like >> "Enr" as a global name. > > Yeah, I had the same thought about capitalization but wasn't sure if > it was worth suggesting. But since Tom did, +1 from me. Will do. -- Kevin Grittner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Fri, Mar 31, 2017 at 2:11 PM, Tom Lanewrote: >> The patch does actually store truncated/key-only tuples in the hi keys / >> non-leaf-nodes, which don't need the "included" columns. > > Hm. Since index tuples lack any means of indicating the actual number > of columns included (ie there's no equivalent of the natts field that > exists in HeapTupleHeaders), I think that this is an unreasonably > dangerous design. It'd be better to store nulls for the missing > fields. That would force a null bitmap to be included whether or > not there were nulls in the key columns, but if we're only doing it > once per page that doesn't seem like much cost. We're doing it once per page for the leaf page high key, but that's used as the downlink in the second phase of a B-Tree page split -- it's directly copied. So, including a NULL bitmap would make Anastasia's patch significantly less useful, since that now has to be in every internal page, and might imply that you end up with less fan-in much of the time (e.g., int4 attribute is truncated from the end relative to leaf IndexTuple format). I've implemented a rough prototype of suffix truncation, that seems to work well enough, and keeps amcheck happy, and I base my remarks on the experience of writing that prototype. Using the NULL bitmap this way was the first thing I tried. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] bumping HASH_VERSION to 3
Starting a new thread about this to get more visibility. Despite the extensive work that has been done on hash indexes this release, we have thus far not made any change to the on-disk format that is not nominally backward-compatible. Commit 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new hash indexes, but included backward-compatibility code so that old indexes would continue to work. However, I'd like to also commit Mithun Cy's patch to expand hash indexes more gradually -- latest version in http://postgr.es/m/cad__oujd-ibxm91zcqziayftwqjxnfqgmv361v9zke83s6i...@mail.gmail.com -- and that's not backward-compatible. It would be possible to write code to convert the old metapage format to the new metapage format introduced by that patch, and it wouldn't be very hard, but I think it would be better to NOT do that, and instead force everybody upgrading to v10 to rebuild all of their hash indexes. If we don't do that, then we'll never know whether instances of hash index corruption reported against v10 or higher are caused by defects in the new code, because there's always the chance that the hash index could have been built on a pre-v10 version, got corrupted because of the lack of WAL-logging, and then been brought up to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds with one stone: - No old, not logged, possibly corrupt hash indexes floating around after an upgrade to v10. - Can remove the backward-compatibility code added by 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around forever. - No need to worry about doing an in-place upgrade of the metapage for the above-mentioned patch. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Mon, Mar 27, 2017 at 12:09 PM, Mike Palmiottowrote: > On Mon, Mar 27, 2017 at 11:46 AM, Robert Haas wrote: >> >> Note that sepgsql hasn't been updated to work with RLS yet, either, >> but we didn't regard that as an open item for RLS, or if we did the >> resolution was just to document it. I am not opposed to giving a >> little more time to get this straightened out, but if a patch doesn't >> show up fairly soon then I think we should just document that sepgsql >> doesn't support partitioned tables in v10. sepgsql has a fairly >> lengthy list of implementation restrictions already, so one more is >> not going to kill anybody -- or if it will then that person should >> produce a patch soon. > > Okay, I'll make sure I get something fleshed out today or tomorrow. Apologies for the delay. I was waffling over whether to reference PartitionedRelationId in sepgsql, but ended up deciding to just handle RELKIND_PARTITIONED_TABLE and treat the classOid as RelationRelationId. Seeing as there is a relid in pg_class which corresponds to the partitioned table, this chosen route seemed acceptable. Here is a demonstration of the partitioned table working with sepgsql hooks: https://gist.github.com/anonymous/b10f476a95ae9cdd39b83ef872d4b1e6 Attached you will find two patches, which were rebased on master as of 156d388 (applied with `git am --revert [patch file]`). The first gets rid of some pesky compiler warnings and the second implements the sepgsql handling of partitioned tables. Thanks, -- Mike Palmiotto Software Engineer Crunchy Data Solutions https://crunchydata.com From 06463a4545c1cd0a2740e201d06a36b78dc2da8c Mon Sep 17 00:00:00 2001 From: Mike Palmiotto Date: Wed, 29 Mar 2017 14:59:37 + Subject: [PATCH 2/2] Add partitioned table support to sepgsql Account for RELKIND_PARTITIONED_RELATIONS in sepgsql and treat the objects like regular relations. This allows for proper create/alter/drop hook behavior for partitioned tables. --- contrib/sepgsql/hooks.c| 1 + contrib/sepgsql/label.c| 4 +++- contrib/sepgsql/relation.c | 34 -- 3 files changed, 28 insertions(+), 11 deletions(-) diff --git a/contrib/sepgsql/hooks.c b/contrib/sepgsql/hooks.c index 93cc8de..89e71e3 100644 --- a/contrib/sepgsql/hooks.c +++ b/contrib/sepgsql/hooks.c @@ -15,6 +15,7 @@ #include "catalog/pg_class.h" #include "catalog/pg_database.h" #include "catalog/pg_namespace.h" +#include "catalog/pg_partitioned_table.h" #include "catalog/pg_proc.h" #include "commands/seclabel.h" #include "executor/executor.h" diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c index a404cd7..546 100644 --- a/contrib/sepgsql/label.c +++ b/contrib/sepgsql/label.c @@ -23,6 +23,7 @@ #include "catalog/pg_class.h" #include "catalog/pg_database.h" #include "catalog/pg_namespace.h" +#include "catalog/pg_partitioned_table.h" #include "catalog/pg_proc.h" #include "commands/dbcommands.h" #include "commands/seclabel.h" @@ -779,7 +780,8 @@ exec_object_restorecon(struct selabel_handle * sehnd, Oid catalogId) case RelationRelationId: relForm = (Form_pg_class) GETSTRUCT(tuple); -if (relForm->relkind == RELKIND_RELATION) +if (relForm->relkind == RELKIND_RELATION || + relForm->relkind == RELKIND_PARTITIONED_TABLE) objtype = SELABEL_DB_TABLE; else if (relForm->relkind == RELKIND_SEQUENCE) objtype = SELABEL_DB_SEQUENCE; diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c index fd6fe8b..93022d4 100644 --- a/contrib/sepgsql/relation.c +++ b/contrib/sepgsql/relation.c @@ -19,6 +19,7 @@ #include "catalog/pg_attribute.h" #include "catalog/pg_class.h" #include "catalog/pg_namespace.h" +#include "catalog/pg_partitioned_table.h" #include "commands/seclabel.h" #include "lib/stringinfo.h" #include "utils/builtins.h" @@ -54,12 +55,14 @@ sepgsql_attribute_post_create(Oid relOid, AttrNumber attnum) ObjectAddress object; Form_pg_attribute attForm; StringInfoData audit_name; + char relkind; /* * Only attributes within regular relation have individual security * labels. */ - if (get_rel_relkind(relOid) != RELKIND_RELATION) + relkind = get_rel_relkind(relOid); + if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE) return; /* @@ -135,8 +138,10 @@ sepgsql_attribute_drop(Oid relOid, AttrNumber attnum) { ObjectAddress object; char *audit_name; + char relkind; - if (get_rel_relkind(relOid) != RELKIND_RELATION) + relkind = get_rel_relkind(relOid); + if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE) return; /* @@ -167,8 +172,11 @@ sepgsql_attribute_relabel(Oid relOid, AttrNumber attnum, { ObjectAddress object; char *audit_name; + char relkind; - if (get_rel_relkind(relOid) != RELKIND_RELATION) + relkind = get_rel_relkind(relOid); + + if (relkind != RELKIND_RELATION && relkind !=
Re: [HACKERS] Somebody has not thought through subscription locking considerations
On 31/03/17 19:35, Tom Lane wrote: > Masahiko Sawadawrites: >> On Fri, Mar 31, 2017 at 9:53 AM, Petr Jelinek >> wrote: >>> On 30/03/17 07:25, Tom Lane wrote: I await with interest an explanation of what "VACUUM FULL pg_class" is doing trying to acquire ShareRowExclusiveLock on pg_subscription_rel, not to mention why a DROP SEQUENCE is holding some fairly strong lock on that relation. > >> VACUUM FULL of any table acquires ShareRowExclusiveLock on >> pg_subscription_rel because when doDeletion removes old heap the >> RemoveSubscriptionRel is called in heap_drop_with_catalog. > > This seems entirely horrid: it *guarantees* deadlock possibilities. > And I wonder what happens when I VACUUM FULL pg_subscription_rel > itself. > > At the very least, it would be a good idea to exclude the system > catalogs from logical replication, wouldn't it? > They are excluded. It works same way for triggers and many other objects so I would not say it's horrid. The problematic part is that the pg_subscription_rel manipulation functions acquire ShareRowExclusiveLock and not the usual RowExclusiveLock because there were some worries about concurrency. I think though that it's not needed though given the access patterns there. It's only updated by CREATE SUBSCRIPTION/ALTER SUBSCRIPTION REFRESH and then by tablesync which holds exclusive lock on the table itself anyway. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Robert Haaswrites: > You might want to have a read through that patch. I think your > opinion would be helpful in determining how close that patch is to > being ready to commit (same for WARM). Well, now that we have an extra week, maybe I'll find the time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Andres Freundwrites: > On 2017-03-31 13:44:52 -0400, Tom Lane wrote: >> Andres Freund writes: >>> The covering indexes patch [1] really needs a version of >>> heap_form_tuple/index_form_tuple that allows to specify the number of >>> columns in the to-be generated tuple. >> I was thinking about that this morning, and wondering why exactly it >> would need such a thing. Certainly we're not going to store such a >> truncated tuple in the index, so I assume that the purpose here is >> just to pass around the tuple internally for some reason. > The patch does actually store truncated/key-only tuples in the hi keys / > non-leaf-nodes, which don't need the "included" columns. Hm. Since index tuples lack any means of indicating the actual number of columns included (ie there's no equivalent of the natts field that exists in HeapTupleHeaders), I think that this is an unreasonably dangerous design. It'd be better to store nulls for the missing fields. That would force a null bitmap to be included whether or not there were nulls in the key columns, but if we're only doing it once per page that doesn't seem like much cost. >> Um, I didn't notice anyplace where that would have helped, certainly >> not on the "form tuple" side. Tuples that don't meet their own tupdesc >> don't seem to have wide application to me. > It'd be useful for FieldStore - we'd not have to error out anymore if > the number of columns changes (which I previously had "solved" by using > MaxHeapAttributeNumber sized values/nulls array). Ah, I see. But in that case you're not really truncating the tuple --- what you want is to be allowed to pass undersized datum/nulls arrays and have the missing columns be taken as nulls. In short, I'd be okay with an extension that allows shortened input arrays, but I think it needs to produce fully valid tuples with nulls in the unsupplied columns. (In the heap case we could indeed achieve that effect by storing the smaller natts value in the header, but not in the index case.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GUC for cleanup indexes threshold.
On Fri, Mar 31, 2017 at 11:44 PM, Robert Haaswrote: > On Wed, Mar 29, 2017 at 2:23 AM, Masahiko Sawada > wrote: >> I was thinking that the status of this patch is still "Needs review" >> because I sent latest version patch[1]. > > I think you're right. > > I took a look at this today. I think there is some problem with the > design of this patch. I originally proposed a threshold based on the > percentage of not-all-visible pages on the theory that we'd just skip > looking at the indexes altogether in that case. But that's not what > the patch does: it only avoids the index *cleanup*, not the index > *vacuum*. Hmm, I guess there is misunderstanding on this thread (my English might have confused you, sorry). I've been proposing the patch that allows lazy vacuum skip only the index cleanup vacuum. Because the problem reported by xu jian[1] is that second vacuum freeze takes long time even if the table is not modified since first vaucuum. The reason is that we cannot skip lazy_cleanup_index even if the table is not changed at all since previous vacuum. If the table has a garbage the lazy vacuum does lazy_vacuum_index, on the other hand, if the table doesn't have garbage, which means that only insertion was execued or the table was not modified, the lazy vacuum does lazy_cleanup_index instead. Since I'd been knew this restriction I proposed it. That's why proposing new GUC parameter name has been "vacuum_cleanup_index_scale_factor". > And the comments in btvacuumcleanup say this: > > /* > * If btbulkdelete was called, we need not do anything, just return the > * stats from the latest btbulkdelete call. If it wasn't called, we must > * still do a pass over the index, to recycle any newly-recyclable pages > * and to obtain index statistics. > * > * Since we aren't going to actually delete any leaf items, there's no > * need to go through all the vacuum-cycle-ID pushups. > */ In current design, we can skip lazy_cleanup_index in case where the number of pages need to be vacuumed is less than the threshold. For example, after frozen whole table by aggressive vacuum the vacuum can skip scanning on the index if only a few insertion is execute on that table. But if a transaction made a garbage on the table, this patch cannot prevent from vacuuming on the index as you mentioned. By skipping index cleanup we could leave recyclable pages that are not marked as a recyclable. But it can be reclaimed by both next index vacuum and next index cleanup becuase btvacuumpage calls RecordFreeIndexPage for recyclable page. So I think it doesn't become a problem. And a next concern pointed by Peter is that we stash an XID when a btree page is deleted, which is used to determine when it's finally safe to recycle the page. I prevent from this situation by not allowing lazy vacuum to skip the index cleanup when aggressive vacuum. But since this makes the advantage of this patch weak I'm considering better idea. > So, if I'm reading this correctly, the only time this patch saves > substantial work - at least in the case of a btree index - is in the > case where there are no dead tuples at all. But if we only want to > avoid the work in that case, then a threshold based on the percentage > of all-visible pages seems like the wrong thing, because the other > stuff btvacuumcleanup() is doing doesn't have anything to do with the > number of all-visible pages. I thought that if the lazy vacuum skipped almost table according to visibility map it means that the state of table is changed just a little and we can skip updating the index statistics. Am I missing something? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] merge psql ef/ev sf/sv handling functions
Hello, While reviewing Corey's \if patch, I complained that there was some amount of copy-paste in "psql/command.c". Here is an attempt at merging some functions which removes 160 lines of code. -- Fabien.diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 94a3cfc..edf875d 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -75,10 +75,8 @@ static backslashResult exec_command_d(PsqlScanState scan_state, bool active_bran const char *cmd); static backslashResult exec_command_edit(PsqlScanState scan_state, bool active_branch, PQExpBuffer query_buf, PQExpBuffer previous_buf); -static backslashResult exec_command_ef(PsqlScanState scan_state, bool active_branch, -PQExpBuffer query_buf); -static backslashResult exec_command_ev(PsqlScanState scan_state, bool active_branch, -PQExpBuffer query_buf); +static backslashResult exec_command_ef_ev(PsqlScanState scan_state, bool active_branch, + PQExpBuffer query_buf, bool is_func); static backslashResult exec_command_echo(PsqlScanState scan_state, bool active_branch, const char *cmd); static backslashResult exec_command_elif(PsqlScanState scan_state, ConditionalStack cstack, @@ -118,10 +116,8 @@ static backslashResult exec_command_s(PsqlScanState scan_state, bool active_bran static backslashResult exec_command_set(PsqlScanState scan_state, bool active_branch); static backslashResult exec_command_setenv(PsqlScanState scan_state, bool active_branch, const char *cmd); -static backslashResult exec_command_sf(PsqlScanState scan_state, bool active_branch, -const char *cmd); -static backslashResult exec_command_sv(PsqlScanState scan_state, bool active_branch, -const char *cmd); +static backslashResult exec_command_sf_sv(PsqlScanState scan_state, bool active_branch, + const char *cmd, bool is_func); static backslashResult exec_command_t(PsqlScanState scan_state, bool active_branch); static backslashResult exec_command_T(PsqlScanState scan_state, bool active_branch); static backslashResult exec_command_timing(PsqlScanState scan_state, bool active_branch); @@ -322,9 +318,9 @@ exec_command(const char *cmd, status = exec_command_edit(scan_state, active_branch, query_buf, previous_buf); else if (strcmp(cmd, "ef") == 0) - status = exec_command_ef(scan_state, active_branch, query_buf); + status = exec_command_ef_ev(scan_state, active_branch, query_buf, true); else if (strcmp(cmd, "ev") == 0) - status = exec_command_ev(scan_state, active_branch, query_buf); + status = exec_command_ef_ev(scan_state, active_branch, query_buf, false); else if (strcmp(cmd, "echo") == 0 || strcmp(cmd, "qecho") == 0) status = exec_command_echo(scan_state, active_branch, cmd); else if (strcmp(cmd, "elif") == 0) @@ -380,9 +376,9 @@ exec_command(const char *cmd, else if (strcmp(cmd, "setenv") == 0) status = exec_command_setenv(scan_state, active_branch, cmd); else if (strcmp(cmd, "sf") == 0 || strcmp(cmd, "sf+") == 0) - status = exec_command_sf(scan_state, active_branch, cmd); + status = exec_command_sf_sv(scan_state, active_branch, cmd, true); else if (strcmp(cmd, "sv") == 0 || strcmp(cmd, "sv+") == 0) - status = exec_command_sv(scan_state, active_branch, cmd); + status = exec_command_sf_sv(scan_state, active_branch, cmd, false); else if (strcmp(cmd, "t") == 0) status = exec_command_t(scan_state, active_branch); else if (strcmp(cmd, "T") == 0) @@ -976,28 +972,29 @@ exec_command_edit(PsqlScanState scan_state, bool active_branch, } /* - * \ef -- edit the named function, or present a blank CREATE FUNCTION - * template if no argument is given + * \ef/\ev -- edit the named function/view, or + * present a blank CREATE FUNCTION/VIEW template if no argument is given */ static backslashResult -exec_command_ef(PsqlScanState scan_state, bool active_branch, -PQExpBuffer query_buf) +exec_command_ef_ev(PsqlScanState scan_state, bool active_branch, + PQExpBuffer query_buf, bool is_func) { backslashResult status = PSQL_CMD_SKIP_LINE; if (active_branch) { - char *func = psql_scan_slash_option(scan_state, - OT_WHOLE_LINE, NULL, true); + char *stuff = psql_scan_slash_option(scan_state, + OT_WHOLE_LINE, NULL, true); int lineno = -1; - if (pset.sversion < 80400) + if (pset.sversion < (is_func ? 80400 : 70400)) { char sverbuf[32]; - psql_error("The server (version %s) does not support editing function source.\n", + psql_error("The server (version %s) does not support editing %s.\n", formatPGVersionNumber(pset.sversion, false, - sverbuf, sizeof(sverbuf))); + sverbuf, sizeof(sverbuf)), + is_func ? "function source" : "view definitions"); status = PSQL_CMD_ERROR; } else if (!query_buf) @@ -1007,36 +1004,43 @@ exec_command_ef(PsqlScanState scan_state, bool active_branch, } else { - Oid foid = InvalidOid; + Oid stuff_oid = InvalidOid;
Re: [HACKERS] WIP: Covering + unique indexes.
31.03.2017 20:47, Andres Freund: Maybe it would be better to modify index_form_tuple() to accept a new argument with a number of attributes, then you can just Assert that this number is never higher than the number of attributes in the TupleDesc. Good point. I agree that this function is a bit strange. I have to set tupdesc->nattrs to support compatibility with index_form_tuple(). I didn't want to add neither a new field to tupledesc nor a new parameter to index_form_tuple(), because they are used widely. But I haven't considered the possibility of index_form_tuple() failure. Fixed in this version of the patch. Now it creates a copy of tupledesc to pass it to index_form_tuple. That seems like it'd actually be a noticeable increase in memory allocator overhead. I think we should just add (as just proposed in separate thread), a _extended version of it that allows to specify the number of columns. The function is called not that often. Only once per page split for indexes with included columns. Doesn't look like dramatic overhead. So I decided that a wrapper function would be more appropriate than refactoring of all index_form_tuple() calls. But index_form_tuple_extended() looks like a better solution. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] delta relations in AFTER triggers
On Fri, Mar 31, 2017 at 1:50 PM, Tom Lanewrote: > Kevin Grittner writes: >> On Thu, Mar 23, 2017 at 7:14 PM, Thomas Munro >> wrote: >>> my only other comment would be a bikeshed issue: >>> Enr isn't a great name for a struct. > >> I know, but EphemeralNamedRelation starts to get kinda long, >> especially when making the normal sorts of concatenations. I >> started making the change and balked when I saw things like >> EphemeralNamedRelationMetadataData and a function named >> EphemeralNamedRelationMetadataGetTupDesc() in place of >> EnrmdGetTupDesc(). A 40 character function name make for a lot of >> line-wrapping to stay within pgindent limits. Any suggestions? > > I would vote for calling the struct typedef EphemeralNamedRelation and > using the abbreviation ENR (capitalized that way, not as Enr) in related > function names, where that seemed sensible. I really do *not* like > "Enr" as a global name. Yeah, I had the same thought about capitalization but wasn't sure if it was worth suggesting. But since Tom did, +1 from me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)
On Fri, Mar 31, 2017 at 7:40 PM, Tom Lanewrote: > Robert Haas writes: > > On Fri, Mar 31, 2017 at 11:29 AM, Tom Lane wrote: > >> The argument for not back-patching a bug fix usually boils down to > >> fear of breaking existing applications, but it's hard to see how > >> removal of a permission check could break a working application --- > >> especially when the permission check is as hard to trigger as this one. > >> How many table owners ever revoke their own REFERENCES permission? > > > Sure, but that argument cuts both ways. If nobody ever does that, who > > will be helped by back-patching this? > > I certainly agree that back-patching this change is pretty low risk. > > I just don't think it has any real benefits. > > I think the benefit is reduction of user confusion. Admittedly, since > Paul is the first person I can remember ever having complained about it, > maybe nobody else is confused. > I think we also need to be extra careful about changing *security related* behavior in back branches, even more so than other behavior. In this case I think it's quite unlikely that it would hit somebody, but the risk is there. And people generally auto-upgrade to the latest minor releases, whereas they at least in theory read the top of the release notes when doing a major upgrade (ok, most people probably don't, but at least some do). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] WIP: Covering + unique indexes.
On Fri, Mar 31, 2017 at 1:40 PM, Anastasia Lubennikovawrote: > First of all, I want to thank you and Robert for reviewing this patch. > Your expertise in postgres subsystems is really necessary for features like > this. > I just wonder, why don't you share your thoughts and doubts till the "last > call". I haven't done any significant technical work other than review patches in 14 months, and in the last several months I've often worked 10 and 12 hour days to get more review done. I think at one level you've got a fair complaint here - it's hard to get things committed, and this patch probably didn't get as much attention as it deserved. It's not so easy to know how to fix that. I'm pretty sure "tell Andres and Robert to work harder" isn't it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)
On Fri, Mar 31, 2017 at 10:40 AM, Tom Lanewrote: > Robert Haas writes: > > On Fri, Mar 31, 2017 at 11:29 AM, Tom Lane wrote: > >> The argument for not back-patching a bug fix usually boils down to > >> fear of breaking existing applications, but it's hard to see how > >> removal of a permission check could break a working application --- > >> especially when the permission check is as hard to trigger as this one. > >> How many table owners ever revoke their own REFERENCES permission? > > > Sure, but that argument cuts both ways. If nobody ever does that, who > > will be helped by back-patching this? > > I certainly agree that back-patching this change is pretty low risk. > > I just don't think it has any real benefits. > > I think the benefit is reduction of user confusion. Admittedly, since > Paul is the first person I can remember ever having complained about it, > maybe nobody else is confused. > After going back-and-forth on this (and not being able to independently come to the conclusion that what we are adhering to is actually a typo) I'm going to toss my +1 in with Robert's. If anyone actually complains about the behavior and not just the documentation we could consider back-patching if any release before 10.0 is still under support. There have been non-bug fix improvements to the docs that didn't get back-patched covering topics more confusing than this. Expecting those learning the system to consult the most recent version of the docs is standard fare here. From a practical perspective the revised current docs will be applicable for past versions as long as one doesn't go a get their REFERENCES permission revoked somehow. If they do, and wonder why, the docs and these list will be able to explain it reasonably well. David J.
Re: [HACKERS] delta relations in AFTER triggers
Kevin Grittnerwrites: > On Thu, Mar 23, 2017 at 7:14 PM, Thomas Munro > wrote: >> my only other comment would be a bikeshed issue: >> Enr isn't a great name for a struct. > I know, but EphemeralNamedRelation starts to get kinda long, > especially when making the normal sorts of concatenations. I > started making the change and balked when I saw things like > EphemeralNamedRelationMetadataData and a function named > EphemeralNamedRelationMetadataGetTupDesc() in place of > EnrmdGetTupDesc(). A 40 character function name make for a lot of > line-wrapping to stay within pgindent limits. Any suggestions? I would vote for calling the struct typedef EphemeralNamedRelation and using the abbreviation ENR (capitalized that way, not as Enr) in related function names, where that seemed sensible. I really do *not* like "Enr" as a global name. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On 2017-03-31 13:44:52 -0400, Tom Lane wrote: > Andres Freundwrites: > > The covering indexes patch [1] really needs a version of > > heap_form_tuple/index_form_tuple that allows to specify the number of > > columns in the to-be generated tuple. > > I was thinking about that this morning, and wondering why exactly it > would need such a thing. Certainly we're not going to store such a > truncated tuple in the index, so I assume that the purpose here is > just to pass around the tuple internally for some reason. The patch does actually store truncated/key-only tuples in the hi keys / non-leaf-nodes, which don't need the "included" columns. > > Previously the faster expression > > evaluation stuff could also have benefited form the same for both > > forming and deforming tuples. > > Um, I didn't notice anyplace where that would have helped, certainly > not on the "form tuple" side. Tuples that don't meet their own tupdesc > don't seem to have wide application to me. It'd be useful for FieldStore - we'd not have to error out anymore if the number of columns changes (which I previously had "solved" by using MaxHeapAttributeNumber sized values/nulls array). Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
On Fri, Mar 31, 2017 at 1:44 PM, Tom Lanewrote: > Andres Freund writes: >> The covering indexes patch [1] really needs a version of >> heap_form_tuple/index_form_tuple that allows to specify the number of >> columns in the to-be generated tuple. > > I was thinking about that this morning, and wondering why exactly it > would need such a thing. Certainly we're not going to store such a > truncated tuple in the index, ... The patch stores it in the index as a high key. You might want to have a read through that patch. I think your opinion would be helpful in determining how close that patch is to being ready to commit (same for WARM). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
On 2017-03-31 20:40:59 +0300, Anastasia Lubennikova wrote: > 30.03.2017 22:11, Andres Freund > > Any objection from reviewers to push both patches? > > First of all, I want to thank you and Robert for reviewing this patch. > Your expertise in postgres subsystems is really necessary for features like > this. > I just wonder, why don't you share your thoughts and doubts till the "last > call". Because there's a lot of other patches? I only looked because Teodor announced he was thinking about committing - I just don't have the energy to look at all patches before they're ready to commit. Unfortunatly "ready-for-committer" is very frequently not actually that :( > > Maybe it would be better to modify index_form_tuple() to accept a new > > argument with a number of attributes, then you can just Assert that > > this number is never higher than the number of attributes in the > > TupleDesc. > Good point. > I agree that this function is a bit strange. I have to set > tupdesc->nattrs to support compatibility with index_form_tuple(). > I didn't want to add neither a new field to tupledesc nor a new > parameter to index_form_tuple(), because they are used widely. > > > But I haven't considered the possibility of index_form_tuple() failure. > Fixed in this version of the patch. Now it creates a copy of tupledesc to > pass it to index_form_tuple. That seems like it'd actually be a noticeable increase in memory allocator overhead. I think we should just add (as just proposed in separate thread), a _extended version of it that allows to specify the number of columns. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Fri, Mar 31, 2017 at 10:24 AM, Pavan Deolaseewrote: > Having worked on it for some time now, I can say that WARM uses pretty much > the same infrastructure that HOT uses for cleanup/pruning tuples from the > heap. So the risk of having a bug which can eat your data from the heap is > very low. Sure, it might mess up with indexes, return duplicate keys, not > return a row when it should have. Not saying they are not bad bugs, but > probably much less severe than someone removing live rows from the heap. Yes, that's true. If there's nothing wrong with the way pruning works, then any other problem can be fixed by reindexing, I suppose. > And we can make it a table level property, keep it off by default, turn it > off on system tables in this release and change the defaults only when we > get more confidence assuming people use it by explicitly turning it on. Now > may be that's not the right approach and keeping it off by default will mean > it receives much less testing than we would like. So we can keep it on in > the beta cycle and then take a call. I went a good length to make it work on > system tables because during HOT development, Tom told me that it better > work for everything or it doesn't work at all. But with WARM it works for > system tables and I know no known bugs, but if we don't want to risk system > tables, we might want to turn it off (just prior to release may be). I'm not generally a huge fan of on-off switches for things like this, but I know Simon likes them. I think the question is how much they really insulate us from bugs. For the hash index patch, for example, the only way to really get insulation from bugs added in this release would be to ship both the old and the new code in separate index AMs (hash, hash2). The code has been restructured so much in the process of doing all of this that any other form of on-off switch would be pretty hit-or-miss whether it actually provided any protection. Now, I am less sure about this case, but my guess is that you can't really have this be something that can be flipped on and off for a table. Once a table has any WARM updates in it, the code that knows how to cope with that has to be enabled, and it'll work as well or poorly as it does. Now, I understand you to be suggesting a flag at table-creation time that would, maybe, be immutable after that, but even then - are we going to run completely unmodified 9.6 code for tables where that's not enabled, and only go through any of the WARM logic when it is enabled? Doesn't sound likely. The commits already made from this patch series certainly affect everybody, and I can't see us adding switches that bypass ce96ce60ca2293f75f36c3661e4657a3c79ffd61 for example. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Thu, Mar 30, 2017 at 10:49 AM, Petr Jelinekwrote: > While reading this thread I am thinking if we could just not do WARM on > TOAST and compressed values if we know there might be regressions there. > I mean I've seen the problem WARM tries to solve mostly on timestamp or > boolean values and sometimes counters so it would still be helpful to > quite a lot of people even if we didn't do TOAST and compressed values > in v1. It's not like not doing WARM sometimes is somehow terrible, we'll > just fall back to current behavior. Good point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow to specify #columns in heap/index_form_tuple
Andres Freundwrites: > The covering indexes patch [1] really needs a version of > heap_form_tuple/index_form_tuple that allows to specify the number of > columns in the to-be generated tuple. I was thinking about that this morning, and wondering why exactly it would need such a thing. Certainly we're not going to store such a truncated tuple in the index, so I assume that the purpose here is just to pass around the tuple internally for some reason. What's wrong with just generating the full tuple, perhaps with nulls for the extra columns if you're concerned about memory space? > Previously the faster expression > evaluation stuff could also have benefited form the same for both > forming and deforming tuples. Um, I didn't notice anyplace where that would have helped, certainly not on the "form tuple" side. Tuples that don't meet their own tupdesc don't seem to have wide application to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)
Robert Haaswrites: > On Fri, Mar 31, 2017 at 11:29 AM, Tom Lane wrote: >> The argument for not back-patching a bug fix usually boils down to >> fear of breaking existing applications, but it's hard to see how >> removal of a permission check could break a working application --- >> especially when the permission check is as hard to trigger as this one. >> How many table owners ever revoke their own REFERENCES permission? > Sure, but that argument cuts both ways. If nobody ever does that, who > will be helped by back-patching this? > I certainly agree that back-patching this change is pretty low risk. > I just don't think it has any real benefits. I think the benefit is reduction of user confusion. Admittedly, since Paul is the first person I can remember ever having complained about it, maybe nobody else is confused. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
On 03/31/2017 06:01 PM, Ashutosh Sharma wrote: It seems like the latest patch(v4) shared by Tomas upthread is an empty patch. If I am not wrong, please share the correct patch. Thanks. OMG, this is the second time I managed to generate an empty patch. I really need to learn not to do that .. Attached is v5 of the patch, hopefully correct this time ;-) regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c index 3d69aa9..bc2d27c 100644 --- a/contrib/pageinspect/btreefuncs.c +++ b/contrib/pageinspect/btreefuncs.c @@ -41,6 +41,7 @@ PG_FUNCTION_INFO_V1(bt_metap); PG_FUNCTION_INFO_V1(bt_page_items); +PG_FUNCTION_INFO_V1(bt_page_items_bytea); PG_FUNCTION_INFO_V1(bt_page_stats); #define IS_INDEX(r) ((r)->rd_rel->relkind == RELKIND_INDEX) @@ -253,14 +254,62 @@ struct user_args OffsetNumber offset; }; +/* + * bt_page_print_tuples + * form a tuple describing index tuple at a given offset + */ +static Datum +bt_page_print_tuples(FuncCallContext *fctx, Page page, OffsetNumber offset) +{ + char *values[6]; + HeapTuple tuple; + ItemId id; + IndexTuple itup; + int j; + int off; + int dlen; + char *dump; + char *ptr; + + id = PageGetItemId(page, offset); + + if (!ItemIdIsValid(id)) + elog(ERROR, "invalid ItemId"); + + itup = (IndexTuple) PageGetItem(page, id); + + j = 0; + values[j++] = psprintf("%d", offset); + values[j++] = psprintf("(%u,%u)", + ItemPointerGetBlockNumberNoCheck(>t_tid), + ItemPointerGetOffsetNumberNoCheck(>t_tid)); + values[j++] = psprintf("%d", (int) IndexTupleSize(itup)); + values[j++] = psprintf("%c", IndexTupleHasNulls(itup) ? 't' : 'f'); + values[j++] = psprintf("%c", IndexTupleHasVarwidths(itup) ? 't' : 'f'); + + ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info); + dlen = IndexTupleSize(itup) - IndexInfoFindDataOffset(itup->t_info); + dump = palloc0(dlen * 3 + 1); + values[j] = dump; + for (off = 0; off < dlen; off++) + { + if (off > 0) + *dump++ = ' '; + sprintf(dump, "%02x", *(ptr + off) & 0xff); + dump += 2; + } + + tuple = BuildTupleFromCStrings(fctx->attinmeta, values); + + return HeapTupleGetDatum(tuple); +} + Datum bt_page_items(PG_FUNCTION_ARGS) { text *relname = PG_GETARG_TEXT_PP(0); uint32 blkno = PG_GETARG_UINT32(1); Datum result; - char *values[6]; - HeapTuple tuple; FuncCallContext *fctx; MemoryContext mctx; struct user_args *uargs; @@ -300,6 +349,11 @@ bt_page_items(PG_FUNCTION_ARGS) if (blkno == 0) elog(ERROR, "block 0 is a meta page"); + if (P_ISMETA(opaque)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("block is a meta page"))); + CHECK_RELATION_BLOCK_RANGE(rel, blkno); buffer = ReadBuffer(rel, blkno); @@ -345,47 +399,8 @@ bt_page_items(PG_FUNCTION_ARGS) if (fctx->call_cntr < fctx->max_calls) { - ItemId id; - IndexTuple itup; - int j; - int off; - int dlen; - char *dump; - char *ptr; - - id = PageGetItemId(uargs->page, uargs->offset); - - if (!ItemIdIsValid(id)) - elog(ERROR, "invalid ItemId"); - - itup = (IndexTuple) PageGetItem(uargs->page, id); - - j = 0; - values[j++] = psprintf("%d", uargs->offset); - values[j++] = psprintf("(%u,%u)", - ItemPointerGetBlockNumberNoCheck(>t_tid), - ItemPointerGetOffsetNumberNoCheck(>t_tid)); - values[j++] = psprintf("%d", (int) IndexTupleSize(itup)); - values[j++] = psprintf("%c", IndexTupleHasNulls(itup) ? 't' : 'f'); - values[j++] = psprintf("%c", IndexTupleHasVarwidths(itup) ? 't' : 'f'); - - ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info); - dlen = IndexTupleSize(itup) - IndexInfoFindDataOffset(itup->t_info); - dump = palloc0(dlen * 3 + 1); - values[j] = dump; - for (off = 0; off < dlen; off++) - { - if (off > 0) -*dump++ = ' '; - sprintf(dump, "%02x", *(ptr + off) & 0xff); - dump += 2; - } - - tuple = BuildTupleFromCStrings(fctx->attinmeta, values); - result = HeapTupleGetDatum(tuple); - - uargs->offset = uargs->offset + 1; - + result = bt_page_print_tuples(fctx, uargs->page, uargs->offset); + uargs->offset++; SRF_RETURN_NEXT(fctx, result); } else @@ -396,6 +411,90 @@ bt_page_items(PG_FUNCTION_ARGS) } } +/*--- + * bt_page_items_bytea() + * + * Get IndexTupleData set in a btree page + * + * Usage: SELECT * FROM bt_page_items(get_raw_page('t1_pkey', 1)); + *--- + */ + +Datum +bt_page_items_bytea(PG_FUNCTION_ARGS) +{ + bytea *raw_page = PG_GETARG_BYTEA_P(0); + Datum result; + FuncCallContext *fctx; + struct user_args *uargs; + int raw_page_size; + + if (!superuser()) + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + (errmsg("must be superuser to use pageinspect functions"; + + if (SRF_IS_FIRSTCALL()) + { +