Re: [HACKERS] [POC] hash partitioning
On Fri, May 12, 2017 at 6:08 PM, amul sul wrote:
> Hi,
>
> Please find the following updated patches attached:

I have done some testing with the new patch; most of the cases worked as
expected, except the one below.  I expect the planner to select only "Seq
Scan on t1", whereas it's scanning both partitions:

create table t (a int, b varchar) partition by hash(a);
create table t1 partition of t for values with (modulus 8, remainder 0);
create table t2 partition of t for values with (modulus 8, remainder 1);

postgres=# explain select * from t where a=8;
                         QUERY PLAN
------------------------------------------------------------
 Append  (cost=0.00..51.75 rows=12 width=36)
   ->  Seq Scan on t1  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 8)
   ->  Seq Scan on t2  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 8)
(5 rows)

Some cosmetic comments.
---
+ RangeVar *rv = makeRangeVarFromNameList(castNode(List, nameEl->arg));
+

Useless hunk.

 /*
- * Build a CREATE SEQUENCE command to create the sequence object, and
- * add it to the list of things to be done before this CREATE/ALTER
- * TABLE.
+ * Build a CREATE SEQUENCE command to create the sequence object, and add
+ * it to the list of things to be done before this CREATE/ALTER TABLE.
  */

It seems that in src/backend/parser/parse_utilcmd.c you have changed the
existing code with pgindent.  I think it's not a good idea to mix pgindent
changes with your patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On Sat, May 13, 2017 at 1:08 AM, Robert Haas wrote:
> On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote:
>
> Maybe a shorter argument for hash partitioning is that not one but two
> different people proposed patches for it within months of the initial
> partitioning patch going in.  When multiple people are thinking about
> implementing the same feature almost immediately after the
> prerequisite patches land, that's a good clue that it's a desirable
> feature.  So I think we should try to solve the problems, rather than
> giving up.

Can we think of defining separate portable hash functions which can be
used for the purpose of hash partitioning?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-column range partition constraint
On Sat, May 13, 2017 at 11:01 AM, Robert Haas wrote:
> On Fri, May 12, 2017 at 12:46 PM, Robert Haas wrote:
>> On Fri, May 12, 2017 at 3:26 AM, Amit Langote wrote:
>>> Fixed.
>>
>> This seems to be the same patch version as last time, so none of the
>> things you mention as fixed are, in fact, fixed.

Really sorry about the mix-up.

> We are kind of running out of time here before beta1, but Amit thinks
> he can get the correct patch posted within about 24 hours, so I'm
> going to wait for that to happen rather than trying to revise his
> previous version myself.  Hence, next update tomorrow.  Argh.

Attached is the correct version.

Thanks,
Amit

0001-Emit-correct-range-partition-constraint-expression.patch
Description: Binary data

0002-Add-pg_get_partition_constraintdef.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On 2017-05-12 21:56:30 -0400, Robert Haas wrote:
> Cheap isn't free, though.  It's got a double-digit percentage overhead
> rather than a large-multiple-of-the-runtime overhead as triggers do,
> but people still won't want to pay it unnecessarily, I think.

That should be partially addressable with reasonable amounts of
engineering though.  Computing the target partition in a
hash-partitioned table can be implemented very efficiently, and adding
infrastructure for multiple bulk insert targets in COPY should be quite
doable as well.  It's also work that's generally useful, not just for
backups.

The bigger issue to me here wrt pg_dump is that partitions can be
restored in parallel, but that'd probably not work as well if they were
dumped separately.  Unless we'd do the re-routing on load, rather than
when dumping - which'd also increase cache locality, by most of the time
(same architecture/encoding/etc) having one backend insert into the same
partition.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Addition of pg_dump --no-publications
On Fri, May 12, 2017 at 10:19 PM, Peter Eisentraut wrote:
> On 5/11/17 21:59, Michael Paquier wrote:
>> On Thu, May 11, 2017 at 3:19 PM, Michael Paquier wrote:
>>> I imagine that pg_dump -s would be the basic operation that users
>>> would do first before creating a subscription on a secondary node,
>>> but what I find surprising is that publications are dumped by
>>> default.  I don't find it confusing that those are actually included
>>> by default, to be consistent with the way subscriptions are handled;
>>> what I find confusing is that there are no options to not dump them,
>>> and no options to bypass their restore.
>>>
>>> So, any opinions about having pg_dump/pg_restore --no-publications?
>>
>> And that's really a boring patch, giving the attached.
>
> committed

Thanks.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-column range partition constraint
On Fri, May 12, 2017 at 12:46 PM, Robert Haas wrote:
> On Fri, May 12, 2017 at 3:26 AM, Amit Langote wrote:
>> Fixed.
>
> This seems to be the same patch version as last time, so none of the
> things you mention as fixed are, in fact, fixed.

We are kind of running out of time here before beta1, but Amit thinks
he can get the correct patch posted within about 24 hours, so I'm going
to wait for that to happen rather than trying to revise his previous
version myself.  Hence, next update tomorrow.  Argh.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 7:36 PM, David Fetter wrote:
> On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote:
>> On 5/12/17 18:13, Alvaro Herrera wrote:
>>> I think for logical replication the tuple should appear as being in
>>> the parent table, not the partition.  No?
>>
>> Logical replication replicates base table to base table.  How those
>> tables are tied together into a partitioned table or an inheritance
>> tree is up to the system catalogs on each side.
>
> This seems like a totally reasonable approach to pg_dump, especially
> in light of the fact that logical replication already (and quite
> reasonably) does it this way.  Hard work has been done to make
> tuple-routing cheap, and this is one of the payoffs.

Cheap isn't free, though.  It's got a double-digit percentage overhead
rather than a large-multiple-of-the-runtime overhead as triggers do,
but people still won't want to pay it unnecessarily, I think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries
On Wed, May 10, 2017 at 10:05 PM, Michael Paquier wrote:
> Regarding the patch, this is way too close to the progress facility
> already in place.  So why don't you extend it for the executor?

I don't think that's a good idea.  The existing progress facility
advertises a fixed number of counters with a command-type-specific
interpretation, but for *query* progress reporting, we really need a
richer model.  A query plan can contain an unbounded number of executor
nodes, and Remi's idea is, quite reasonably, to report something about
each one.

From a design point of view, I think a patch like this has to clear a
number of hurdles.  It needs to:

1. Decide what data to expose.  The sample output looks reasonable in
this case, although the amount of code churn looks really high.

2. Find a way to advertise the data that it exposes.  This patch looks
like it allocates a half-MB of shared memory per backend and gives up
if that's not enough.

3. Find a way to synchronize the access to the data that it exposes.
This patch ignores that problem completely, so multiple progress
report commands running at the same time will behave unpredictably.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Alvaro Herrera writes:
> Tom Lane wrote:
>> Hm.  That would behave less than desirably if getObjectClass() could
>> return a value that wasn't a member of the enum, but it's hard to
>> credit that happening.  I guess I'd vote for removing the default:
>> case from all of the places that have "switch (getObjectClass(object))",
>> as forgetting to add an entry seems like a much larger hazard.

> Ignoring the one in alter.c's AlterObjectNamespace_oid, which only
> handles a small subset of the enum values, that gives the following
> warnings:

> /pgsql/source/master/src/backend/catalog/dependency.c: In function 'doDeletion':
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_ROLE' not handled in switch [-Wswitch]
>   switch (getObjectClass(object))
>   ^
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_DATABASE' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_TBLSPACE' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]

Hm.  I think it's intentional that shared objects are not handled there,
but I wonder whether the lack of SUBSCRIPTION represents a bug.  Anyway,
I think it would be fine to add those to the switch as explicit error
cases.

> /pgsql/source/master/src/backend/commands/tablecmds.c: In function 'ATExecAlterColumnType':
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_AM' not handled in switch [-Wswitch]
>    switch (getObjectClass(&foundObject))
>    ^
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_EVENT_TRIGGER' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION_REL' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]
> /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_TRANSFORM' not handled in switch [-Wswitch]

All of those ought to go into the category that prints "unexpected
object depending on column", I think; certainly "unrecognized object
class" is not a better report, and the fact that we've evidently not
touched this switch in the last few additions of object types provides
fodder for the idea that the default: is unhelpful.  (Not to mention
that not having OCLASS_STATISTIC_EXT here is an outright bug as of
today...)  So losing the default: case here seems good to me.

Alternatively, we could drop the explicit listing of oclasses we're not
expecting to see, and let the default: case be "unexpected object
depending on column".  Treating the default separately is only of value
if we'd expect getObjectDescription to fail, but there's no good reason
for this code to be passing judgment on whether that might happen.

What do we want this to do about statistics objects?  We could either
throw a feature-not-implemented error or try to update the stats object
for the new column type.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote:
> On 5/12/17 18:13, Alvaro Herrera wrote:
>> I think for logical replication the tuple should appear as being in
>> the parent table, not the partition.  No?
>
> Logical replication replicates base table to base table.  How those
> tables are tied together into a partitioned table or an inheritance
> tree is up to the system catalogs on each side.

This seems like a totally reasonable approach to pg_dump, especially in
light of the fact that logical replication already (and quite
reasonably) does it this way.  Hard work has been done to make
tuple-routing cheap, and this is one of the payoffs.

Best,
David.
--
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On 5/12/17 18:13, Alvaro Herrera wrote:
> I think for logical replication the tuple should appear as being in the
> parent table, not the partition.  No?

Logical replication replicates base table to base table.  How those
tables are tied together into a partitioned table or an inheritance tree
is up to the system catalogs on each side.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
Peter Eisentraut wrote:
> On 5/12/17 14:23, Robert Haas wrote:
>> One alternative would be to change the way that we dump and restore
>> the data.  Instead of dumping the data with the individual partitions,
>> dump it all out for the parent and let tuple routing sort it out at
>> restore time.
>
> I think this could be a pg_dump option.  One way it dumps out the
> partitions, and another way it recomputes the partitions.  I think that
> could be well within pg_dump's mandate.

I was thinking the same, but enable that option automatically for hash
partitioning.  Each partition is still dumped separately, but instead of
referencing the specific partition in the TABLE DATA object, reference
the parent table.  This way, the restore can still be parallelized, but
tuple routing is executed for each tuple.

> (cough ... logical replication ... cough)

I think for logical replication the tuple should appear as being in the
parent table, not the partition.  No?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Oh, I see your patch also fixes missing code in getObjectDescription().
>>> We need that.  Is there a decent way to get the compiler to warn about
>>> that oversight?

>> We could remove the default clause.  That results in the expected
>> warning, and no others (i.e. the switch was already complete):

>> /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]

> Hm.  That would behave less than desirably if getObjectClass() could
> return a value that wasn't a member of the enum, but it's hard to
> credit that happening.  I guess I'd vote for removing the default:
> case from all of the places that have "switch (getObjectClass(object))",
> as forgetting to add an entry seems like a much larger hazard.

Ignoring the one in alter.c's AlterObjectNamespace_oid, which only
handles a small subset of the enum values, that gives the following
warnings:

/pgsql/source/master/src/backend/catalog/dependency.c: In function 'doDeletion':
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_ROLE' not handled in switch [-Wswitch]
  switch (getObjectClass(object))
  ^
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_DATABASE' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_TBLSPACE' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]

/pgsql/source/master/src/backend/commands/tablecmds.c: In function 'ATExecAlterColumnType':
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_AM' not handled in switch [-Wswitch]
   switch (getObjectClass(&foundObject))
   ^
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_EVENT_TRIGGER' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION_REL' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handled in switch [-Wswitch]
/pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_TRANSFORM' not handled in switch [-Wswitch]

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Alvaro Herrera writes:
> Tom Lane wrote:
>> Oh, I see your patch also fixes missing code in getObjectDescription().
>> We need that.  Is there a decent way to get the compiler to warn about
>> that oversight?

> We could remove the default clause.  That results in the expected
> warning, and no others (i.e. the switch was already complete):

> /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]

Hm.  That would behave less than desirably if getObjectClass() could
return a value that wasn't a member of the enum, but it's hard to
credit that happening.  I guess I'd vote for removing the default:
case from all of the places that have "switch (getObjectClass(object))",
as forgetting to add an entry seems like a much larger hazard.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Oh, sorry --- I already pushed something about this.

>> That's fine, thanks.

> Oh, I see your patch also fixes missing code in getObjectDescription().
> We need that.  Is there a decent way to get the compiler to warn about
> that oversight?

We could remove the default clause.  That results in the expected
warning, and no others (i.e. the switch was already complete):

/pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handled in switch [-Wswitch]

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Alvaro Herrera writes:
> Tom Lane wrote:
>> Oh, sorry --- I already pushed something about this.

> That's fine, thanks.

Oh, I see your patch also fixes missing code in getObjectDescription().
We need that.  Is there a decent way to get the compiler to warn about
that oversight?

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Although I've not done anything about it here, I'm not happy about the
>>> handling of dependencies for stats objects.

>> Here are two patches regarding handling of dependencies.

> Oh, sorry --- I already pushed something about this.

That's fine, thanks.  I'm glad that this thread got you interested in
looking over the extended statistics stuff.  Thanks for the input.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Alvaro Herrera writes:
> Tom Lane wrote:
>> Although I've not done anything about it here, I'm not happy about the
>> handling of dependencies for stats objects.

> Here are two patches regarding handling of dependencies.

Oh, sorry --- I already pushed something about this.

> The first one implements your first suggestion: add a NORMAL dependency
> on each column, and do away with RemoveStatisticsExt.  This works well
> and should be uncontroversial.

No, I wanted an AUTO dependency there, for exactly the reason you
mention:

> If we only do this, then DROP TABLE needs to say CASCADE if there's a
> statistics object in the table.

I don't think we really want that, do we?  A stats object can't be of
any value if the underlying table is gone.  Perhaps that calculus would
change for cross-table stats, but I don't see why.

> This seems pointless to me, so the second patch throws in an
> additional dependency on the table as a whole, AUTO this time, so that
> if the table is dropped, the statistics goes away without requiring
> cascade.  There is no point in forcing CASCADE for this case, ISTM.
> This works well too, but I admit there might be resistance to doing
> it.  OTOH this is how CREATE INDEX works.

Uh, no it isn't.  Indexes have auto dependencies on the individual
columns they index, and *not* on the table as a whole.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patch to fix documentation about AFTER triggers
Here is a patch to amend the docs here:
https://www.postgresql.org/docs/devel/static/plpgsql-trigger.html

In the example for an AFTER trigger, you see this code:

    --
    -- Create a row in emp_audit to reflect the operation performed on emp,
    -- make use of the special variable TG_OP to work out the operation.
    --
    IF (TG_OP = 'DELETE') THEN
        INSERT INTO emp_audit SELECT 'D', now(), user, OLD.*;
        RETURN OLD;
    ELSIF (TG_OP = 'UPDATE') THEN
        INSERT INTO emp_audit SELECT 'U', now(), user, NEW.*;
        RETURN NEW;
    ELSIF (TG_OP = 'INSERT') THEN
        INSERT INTO emp_audit SELECT 'I', now(), user, NEW.*;
        RETURN NEW;
    END IF;
    RETURN NULL; -- result is ignored since this is an AFTER trigger

What are all those RETURNs doing in there?  The comment on the final
RETURN is correct, so returning NEW or OLD above seems confusing, and
likely a copy/paste error.  This patch just removes those three lines
from the example code.

Thanks!
Paul

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index a6088e9..dc29e7c 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -3999,13 +3999,10 @@ CREATE OR REPLACE FUNCTION process_emp_audit() RETURNS TRIGGER AS $emp_audit$
         --
         IF (TG_OP = 'DELETE') THEN
             INSERT INTO emp_audit SELECT 'D', now(), user, OLD.*;
-            RETURN OLD;
         ELSIF (TG_OP = 'UPDATE') THEN
             INSERT INTO emp_audit SELECT 'U', now(), user, NEW.*;
-            RETURN NEW;
         ELSIF (TG_OP = 'INSERT') THEN
             INSERT INTO emp_audit SELECT 'I', now(), user, NEW.*;
-            RETURN NEW;
         END IF;
         RETURN NULL; -- result is ignored since this is an AFTER trigger
     END;

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] About forced dependencies on catversion.h
backend/access/transam/Makefile contains

# ensure that version checks in xlog.c get recompiled when catversion.h changes
xlog.o: xlog.c $(top_srcdir)/src/include/catalog/catversion.h

which, at the time it was added, was sufficient to ensure you could do
"make check" after bumping catversion and things would work, whether or
not you bother with --enable-depend.

It ain't sufficient anymore though.  First off, pg_rewind.c also has
compiled-in knowledge of CATALOG_VERSION_NO.  So if you try "make
check-world" after naively bumping catversion, the pg_rewind TAP tests
will blow up most spectacularly, because the backend will have been
rebuilt to use the new catversion while pg_rewind won't have.

I started this post with the intention of proposing that we add a
similar forced dependency for pg_rewind.o.  However, grepping revealed a
much bigger hazard, which is that CATALOG_VERSION_NO also factors into
TABLESPACE_VERSION_DIRECTORY, and there are dependencies on that in
about eight different files.  The scary thing there is that a naive test
run will appear to work as long as you didn't rebuild any of those
files; while if you caused a rebuild of just some of them, it's gonna be
a mess.

So what I now think is that we'd be better off removing the forced
dependency shown above.  It's just encouraging people to take unsafe
shortcuts.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Although I've not done anything about it here, I'm not happy about the
> handling of dependencies for stats objects.  I do not think that cloning
> RemoveStatistics as RemoveStatisticsExt is sane at all.  The former is
> meant to deal with cleaning up pg_statistic rows that we know will be
> there, so there's no real need to expend pg_depend overhead to track them.
> For objects that are only loosely connected, the right thing is to use
> the dependency system; in particular, this design has no real chance of
> working well with cross-table stats.  Also, it's really silly to have
> *both* this hard-wired mechanism and a pg_depend entry; the latter is
> surely redundant if you have the former.  IMO we should revert
> RemoveStatisticsExt and instead handle things by making stats objects
> auto-dependent on the individual column(s) they reference (not the whole
> table).
>
> I'm also of the opinion that having an AUTO dependency, rather than
> a NORMAL dependency, on the stats object's schema is the wrong semantics.
> There isn't any other case where you can drop a non-empty schema without
> saying CASCADE, and I'm mystified why this case should act that way.

Here are two patches regarding handling of dependencies.  The first one
implements your first suggestion: add a NORMAL dependency on each
column, and do away with RemoveStatisticsExt.  This works well and
should be uncontroversial.

If we only do this, then DROP TABLE needs to say CASCADE if there's a
statistics object in the table.  This seems pointless to me, so the
second patch throws in an additional dependency on the table as a whole,
AUTO this time, so that if the table is dropped, the statistics goes
away without requiring CASCADE.  There is no point in forcing CASCADE
for this case, ISTM.  This works well too, but I admit there might be
resistance to doing it.  OTOH this is how CREATE INDEX works.

(I considered what would happen with a stats object covering multiple
tables.  I think it's reasonable to drop the stats too in that case,
under the rationale that the object is no longer useful.  Not really
sure about this.)

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

>From 06fad759888d5060f9b4cf4d4f4fdec777350027 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Fri, 12 May 2017 17:16:26 -0300
Subject: [PATCH 1/2] fix dependencies problem

---
 src/backend/catalog/heap.c              | 74 -
 src/backend/catalog/objectaddress.c     | 20 +
 src/backend/commands/statscmds.c        | 18
 src/test/regress/expected/stats_ext.out | 13 --
 src/test/regress/sql/stats_ext.sql      |  9 ++--
 5 files changed, 45 insertions(+), 89 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index ab3d83f29b..ba63939022 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1615,10 +1615,7 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 	heap_close(attr_rel, RowExclusiveLock);

 	if (attnum > 0)
-	{
 		RemoveStatistics(relid, attnum);
-		RemoveStatisticsExt(relid, attnum);
-	}

 	relation_close(rel, NoLock);
 }
@@ -1873,7 +1870,6 @@ heap_drop_with_catalog(Oid relid)
 	 * delete statistics
 	 */
 	RemoveStatistics(relid, 0);
-	RemoveStatisticsExt(relid, 0);

 	/*
 	 * delete attribute tuples
@@ -2784,76 +2780,6 @@ RemoveStatistics(Oid relid, AttrNumber attnum)
 	heap_close(pgstatistic, RowExclusiveLock);
 }

-
-/*
- * RemoveStatisticsExt --- remove entries in pg_statistic_ext for a relation
- *
- * If attnum is zero, remove all entries for rel; else remove only the
- * one(s) involving that column.
- */
-void
-RemoveStatisticsExt(Oid relid, AttrNumber attnum)
-{
-	Relation	pgstatisticext;
-	SysScanDesc scan;
-	ScanKeyData key;
-	HeapTuple	tuple;
-
-	/*
-	 * Scan pg_statistic_ext to delete relevant tuples
-	 */
-	pgstatisticext = heap_open(StatisticExtRelationId, RowExclusiveLock);
-
-	ScanKeyInit(&key,
-				Anum_pg_statistic_ext_stxrelid,
-				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(relid));
-
-	scan = systable_beginscan(pgstatisticext,
-							  StatisticExtRelidIndexId,
-							  true, NULL, 1, &key);
-
-	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-	{
-		bool		delete = false;
-
-		if (attnum == 0)
-			delete = true;
-		else if (attnum != 0)
-		{
-			Form_pg_statistic_ext staForm;
-			int			i;
-
-			/*
-
Re: [HACKERS] CTE inlining
On Fri, May 12, 2017 at 3:39 PM, Bruce Momjian wrote:
> On Tue, May 9, 2017 at 05:14:19PM -0400, Tom Lane wrote:
>> Ilya Shkuratov writes:
>>> Ok, it seems that most people in the discussion agree that removing
>>> the optimization fence is the right thing to do.
>>> Nonetheless I am still hoping to discuss the algorithm and its
>>> implementation.
>>
>> Yeah, so far we've mainly discussed whether to do that and how to
>> control it, not what the actual results would be.
>
> To summarize, it seems we have two options if we want to add fence
> control to CTEs:
>
> 1. add INLINE to disable the CTE fence
> 2. add MATERIALIZE to enable the CTE fence
>
> or some other keywords.  I think most people prefer #2 because:
>
> * most users writing queries prefer #2

Yeah, I think there was rough consensus on this point.  I think it's
fair to assume that most (or at least a majority of) queries will
benefit from inlining; people would want to opt out of, rather than opt
in to, generally good optimization strategies.  This will bite some
people today, but it is not really a backwards-compatibility issue,
since performance is generally not fairly described as a compatibility
criterion.  If this feature lands, we ought to warn people in the
release notes though.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CTE inlining
On Tue, May 9, 2017 at 05:14:19PM -0400, Tom Lane wrote: > Ilya Shkuratov writes: > > Ok, it seems that most people in discussion are agree that removing > > optimization > > fence is a right thing to do. > > Nonetheless I still hoping to discuss the algorithm and its implementation. > > Yeah, so far we've mainly discussed whether to do that and how to control > it, not what the actual results would be. To summarize, it seems we have two options if we want to add fence control to CTEs: 1. add INLINE to disable the CTE fence 2. add MATERIALIZE to enable the CTE fence or some other keywords. I think most people prefer #2 because: * most users writing queries prefer #2 * most users assume full optimization and it seems natural to turn _off_ an optimization via a keyword * while some queries can be inlined, all queries can be materialized, so doing #1 means INLINE would be only a preference, which could be confusing Anyway, I am very glad we are considering addressing this in PG 11. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
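For concreteness, the two options above might be spelled like this. This is proposed syntax only — neither keyword exists at this point, and `big_table`/`key` are invented names for illustration:

```sql
-- Option 1: CTEs remain a fence by default; INLINE disables it
WITH w AS INLINE (SELECT * FROM big_table)
SELECT * FROM w WHERE key = 123;

-- Option 2: CTEs are inlined by default; MATERIALIZE forces the fence
WITH w AS MATERIALIZE (SELECT * FROM big_table)
SELECT * FROM w WHERE key = 123;
```

Under option 2, the unadorned `WITH w AS (...)` form would let the planner push the `key = 123` qual into the CTE, which is the behavior most users appear to expect.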
Re: [HACKERS] WITH clause in CREATE STATISTICS
I wrote: > Tomas Vondra writes: >> On 5/12/17 4:46 AM, Tom Lane wrote: >>> Although I've not done anything about it here, I'm not happy about the >>> handling of dependencies for stats objects. >> Yeah, it's a bit frankensteinian ... > I'm prepared to create a fix for that, but it'd be easier to commit the > current patch first, to avoid merge conflicts. Done. I noted that we weren't making an owner dependency either :-(. Also, we might want to think about letting stats objects be extension members sometime. As things stand, if an extension script does CREATE STATISTICS, the stats object will just be "loose" rather than bound into the extension. We still have docs and nomenclature issues to work on. Anyone want to take point on those? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On 5/12/17 14:23, Robert Haas wrote: > One alternative would be to change the way that we dump and restore > the data. Instead of dumping the data with the individual partitions, > dump it all out for the parent and let tuple routing sort it out at > restore time. I think this could be a pg_dump option. One way it dumps out the partitions, and another way it recomputes the partitions. I think that could be well within pg_dump's mandate. (cough ... logical replication ... cough) > Of course, this isn't very satisfying because now > dump-and-restore hasn't really preserved the state of the database; That depends on whether you consider the partitions to be a user-visible detail or an internal one. The current approach is sort of in the middle, so it makes sense to allow the user to interpret it either way depending on need. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Tab-completing DROP STATISTICS
Tom Lane wrote:
> I did not review the rest of the regression test changes, nor the
> tab-complete changes.  I can however report that DROP STATISTICS
> doesn't successfully complete any stats object names.

The attached patch fixes this problem.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5f47c59f8a..d55656769b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -16681,6 +16681,12 @@ SELECT relname FROM pg_class WHERE pg_table_is_visible(oid);
        is operator family visible in search path
+       pg_statistic_ext_is_visible(stat_oid)
+
+       boolean
+       is statistics object visible in search path
+
+
        pg_table_is_visible(table_oid)
        boolean
@@ -16739,6 +16745,9 @@ SELECT relname FROM pg_class WHERE pg_table_is_visible(oid);
     pg_opfamily_is_visible
+    pg_statistic_ext_is_visible
+
+
     pg_table_is_visible
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index e521bd908a..5b339222af 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -34,6 +34,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
+#include "catalog/pg_statistic_ext.h"
 #include "catalog/pg_ts_config.h"
 #include "catalog/pg_ts_dict.h"
 #include "catalog/pg_ts_parser.h"
@@ -2142,6 +2143,72 @@ get_statistics_oid(List *names, bool missing_ok)
 }
 
 /*
+ * StatisticsObjIsVisible
+ *		Determine whether a statistics object (identified by OID) is visible in
+ *		the current search path.  Visible means "would be found by searching
+ *		for the unqualified statistics object name".
+ */
+bool
+StatisticsObjIsVisible(Oid relid)
+{
+	HeapTuple	stxtup;
+	Form_pg_statistic_ext stxform;
+	Oid			stxnamespace;
+	bool		visible;
+
+	stxtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(stxtup))
+		elog(ERROR, "cache lookup failed for statistics object %u", relid);
+	stxform = (Form_pg_statistic_ext) GETSTRUCT(stxtup);
+
+	recomputeNamespacePath();
+
+	/*
+	 * Quick check: if it ain't in the path at all, it ain't visible. Items in
+	 * the system namespace are surely in the path and so we needn't even do
+	 * list_member_oid() for them.
+	 */
+	stxnamespace = stxform->stxnamespace;
+	if (stxnamespace != PG_CATALOG_NAMESPACE &&
+		!list_member_oid(activeSearchPath, stxnamespace))
+		visible = false;
+	else
+	{
+		/*
+		 * If it is in the path, it might still not be visible; it could be
+		 * hidden by another statistics object of the same name earlier in the
+		 * path.  So we must do a slow check for conflicting objects.
+		 */
+		char	   *stxname = NameStr(stxform->stxname);
+		ListCell   *l;
+
+		visible = false;
+		foreach(l, activeSearchPath)
+		{
+			Oid			namespaceId = lfirst_oid(l);
+
+			if (namespaceId == stxnamespace)
+			{
+				/* Found it first in path */
+				visible = true;
+				break;
+			}
+			if (SearchSysCacheExists2(STATEXTNAMENSP,
+									  PointerGetDatum(stxname),
+									  ObjectIdGetDatum(namespaceId)))
+			{
+				/* Found something else first in path */
+				break;
+			}
+		}
+	}
+
+	ReleaseSysCache(stxtup);
+
+	return visible;
+}
+
+/*
  * get_ts_parser_oid - find a TS parser by possibly qualified name
  *
  * If not found, returns InvalidOid if missing_ok, else throws error
@@ -4236,6 +4303,17 @@ pg_conversion_is_visible(PG_FUNCTION_ARGS)
 }
 
 Datum
+pg_statistic_ext_is_visible(PG_FUNCTION_ARGS)
+{
+	Oid			oid = PG_GETARG_OID(0);
+
+	if (!SearchSysCacheExists1(STATEXTOID, ObjectIdGetDatum(oid)))
+		PG_RETURN_NULL();
+
+	PG_RETURN_BOOL(StatisticsObjIsVisible(oid));
+}
+
+Datum
 pg_ts_parser_is_visible(PG_FUNCTION_ARGS)
 {
 	Oid			oid = PG_GETARG_OID(0);
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 236d876b1c..720e540276 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote: > Yeah, that isn't really appetizing at all. If we were doing physical > partitioning below the user-visible level, we could make it fly. > But the existing design makes the partition boundaries user-visible > which means we have to insist that the partitioning rule is immutable > (in the strongest sense of being invariant across all installations > you could possibly wish to transport data between). I think you're right. > Now, we already have some issues of that sort with range partitioning; > if say you try to transport data to another system with a different > sort ordering, you have probably got problems. But that issue already > existed with the old manual partitioning approach, ie if you have a > CHECK constraint like "(col >= 'foo' && col < 'gob')" then a transfer > to such a system could fail already. So I'm okay with that. But it > seems like hash partitioning is going to introduce new, exciting, and > hard-to-document-or-avoid portability gotchas. Do we really need > to go there? That is a good question. I think it basically amounts to this question: is hash partitioning useful, and if so, for what? Range and list partitioning inherently provide management benefits. For example, if you range-partition your data by year, then when you want to get rid of the oldest year's worth of data, you can drop the entire partition at once, which is likely to be much faster than a DELETE statement. But hash partitioning provides no such benefits because, by design, the distribution of the data among the partitions is essentially random. Dropping one partition will not usually be a useful thing to do because the rows in that partition are logically unconnected.
So, if you have a natural way of partitioning a table by range or list, that's probably going to be superior to hash partitioning. But sometimes there isn't an obviously useful way of breaking up the data, and you still want to partition it in order to reduce the size of the individual tables. That, in turn, allows maintenance operations to be performed on each partition separately. For example, suppose your table is big enough that running CLUSTER or VACUUM FULL on it takes eight hours, but you can't really afford an eight-hour maintenance window when the table gets bloated. However, you can afford (on Sunday nights when activity is lowest) a window of, say, one hour. Well, if you hash-partition the table, you can CLUSTER or VACUUM FULL one partition a week for N weeks until you get to them all. Similarly, if you need to create an index, you can build it on one partition at a time. You can even add the index to one partition to see how well it works -- for example, does it turn too many HOT updates into non-HOT updates, causing bloat? -- and try it out before you go do it across the board. And an unpartitioned table can only accommodate one VACUUM process at a time, but a partitioned table can have one per partition. Partitioning a table also means that you don't need a single disk volume large enough to hold the whole thing; instead, you can put each partition in a separate tablespace or (in some exciting future world where PostgreSQL looks more like a distributed system) perhaps even on another server. For a table that is a few tens of gigabytes, none of this amounts to a hill of beans, but for a table that is a few tens of terabytes, the time it takes to perform administrative operations can become a really big problem. A good example is, say, a table full of orders. Imagine a high-velocity business like Amazon or UPS that has a constant stream of new orders, or a mobile app that has a constant stream of new user profiles.
If that data grows fast enough that the table needs to be partitioned, how should it be done? It's often desirable to create partitions of about equal size and about equal hotness, and range-partitioning on the creation date or order ID number means continually creating new partitions, and having all of the hot data in the same partition. In my experience, it is *definitely* the case that users with very large tables are experiencing significant pain right now. The freeze map changes were a good step towards ameliorating some of that pain, but I think hash partitioning is another step in that direction, and I think there will be other applications as well. Even for users who don't have data that large, there are also interesting query-performance implications of hash partitioning. Joins between very large tables tend to get implemented as merge joins, which often means sorting all the data, which is O(n lg n) and not easy to parallelize. But if those very large tables happened to be compatibly partitioned on the join key, you could do a partitionwise join per the patch Ashutosh proposed, which would be cheaper and easier to do in parallel. Maybe a shorter argument for hash partitioning is that not one but two different people proposed patches for it within
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 02:23:14PM -0400, Robert Haas wrote: > > What about integers? I think we're already assuming two's-complement > arithmetic, which I think means that the only problem with making the > hash values portable for integers is big-endian vs. little-endian. > That sounds solvable-ish. > xxhash produces identical hashes on big-endian and little-endian machines. https://github.com/Cyan4973/xxHash Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
Robert Haas writes: > On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: >> I'd vote that it's not, which means that this whole approach to hash >> partitioning is unworkable. I agree with Andres that demanding hash >> functions produce architecture-independent values will not fly. > If we can't produce architecture-independent hash values, then what's > the other option? Forget hash partitioning. There's no law saying that that's a good idea and we have to have it. With a different set of constraints, maybe we could do it, but I think the existing design decisions have basically locked it out --- and I doubt that hash partitioning is so valuable that we should jettison other desirable properties to get it. > One alternative would be to change the way that we dump and restore > the data. Instead of dumping the data with the individual partitions, > dump it all out for the parent and let tuple routing sort it out at > restore time. Of course, this isn't very satisfying because now > dump-and-restore hasn't really preserved the state of the database; > indeed, you could make it into a hard failure by creating a foreign > key referencing a partition hash partition. Yeah, that isn't really appetizing at all. If we were doing physical partitioning below the user-visible level, we could make it fly. But the existing design makes the partition boundaries user-visible which means we have to insist that the partitioning rule is immutable (in the strongest sense of being invariant across all installations you could possibly wish to transport data between). Now, we already have some issues of that sort with range partitioning; if say you try to transport data to another system with a different sort ordering, you have probably got problems. But that issue already existed with the old manual partitioning approach, ie if you have a CHECK constraint like "(col >= 'foo' && col < 'gob')" then a transfer to such a system could fail already. So I'm okay with that. 
But it seems like hash partitioning is going to introduce new, exciting, and hard-to-document-or-avoid portability gotchas. Do we really need to go there? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: > I'd vote that it's not, which means that this whole approach to hash > partitioning is unworkable. I agree with Andres that demanding hash > functions produce architecture-independent values will not fly. If we can't produce architecture-independent hash values, then what's the other option? One alternative would be to change the way that we dump and restore the data. Instead of dumping the data with the individual partitions, dump it all out for the parent and let tuple routing sort it out at restore time. Of course, this isn't very satisfying because now dump-and-restore hasn't really preserved the state of the database; indeed, you could make it into a hard failure by creating a foreign key referencing a hash partition. After dump-and-restore, the row ends up in some other partition and the foreign key can't be recreated because the relationship no longer holds. This isn't limited to foreign keys, either; similar problems could be created with CHECK constraints or other per-table properties that can vary between one child and another. I basically think it's pretty futile to suppose that we can get away with having a dump and restore move rows around between partitions without having that blow up in some cases. > Maintaining such a property for float8 (and the types that depend on it) > might be possible if you believe that nobody ever uses anything but IEEE > floats, but we've never allowed that as a hard assumption before. I don't know how standard that is. Is there any hardware that anyone's likely to be using that doesn't? TBH, I don't really care if support for obscure, nearly-dead platforms like VAX or whatever doesn't quite work with hash-partitioned tables. In practice, PostgreSQL only sorta works on that kind of platform anyway; there are far bigger problems than this. On the other hand, if there are servers being shipped in 2017 that don't use IEEE floats, that's another problem.
What about integers? I think we're already assuming two's-complement arithmetic, which I think means that the only problem with making the hash values portable for integers is big-endian vs. little-endian. That sounds solvable-ish. > Even architecture dependence isn't the whole scope of the problem. > Consider for example dumping a LATIN1-encoded database and trying > to reload it into a UTF8-encoded database. People will certainly > expect that to be possible, and do you want to guarantee that the > hash of a text value is encoding-independent? No, I think that's expecting too much. I'd be just fine telling people that if you hash-partition on a text column, it may not load into a database with another encoding. If you care about that, don't use hash-partitioning, or don't change the encoding, or dump out the partitions one by one and reload all the rows into the parent table for re-routing, solving whatever problems come up along the way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] renaming "transaction log"
On 5/12/17 12:12, Dagfinn Ilmari Mannsåker wrote: > Peter Eisentraut writes: > >> Committed, adjusting for some of the suggestions. > > Looks like one occurrence was still left in: > > $ ag --no-group --ignore po --ignore release-\* --ignore wal.sgml > '\btransaction\s+log\b' > doc/src/sgml/func.sgml:18631:end of the transaction log at any instant, > while the write location is the end of > > The four other occurrences in the same paragraph were fixed, so I assume > this was just an oversight. Fixed, thanks. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] renaming "transaction log"
On 5/11/17 12:00, Tom Lane wrote: > Peter Eisentraut writes: >> Most documentation and error messages still uses the term "transaction >> log" to refer to the write-ahead log. Here is a patch to rename that, >> which I think should be done, to match the xlog -> wal renaming in APIs. > > This will need to be rebased over d10c626de, which made a few of the > same/similar changes but did not try to hit stuff that wasn't adjacent > to what it was changing anyway. I'm +1 for the idea though. I think > we should also try to standardize on the terminology "WAL location" > for LSNs, not variants such as "WAL position" or "log position". done -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cached plans and statement generalization
On Fri, May 12, 2017 at 08:35:26PM +0300, Konstantin Knizhnik wrote: > > > On 12.05.2017 18:23, Bruce Momjian wrote: > >On Fri, May 12, 2017 at 10:50:41AM +0300, Konstantin Knizhnik wrote: > >>Definitely changing session context (search_path, date/time format, ...) may > >>cause incorrect behavior of cached statements. > >I wonder if we should clear the cache whenever any SET command is > >issued. > > Well, with autoprepare cache disabled on each set variable, alter system and > any slow utility statement > only one regression test is not passed. And only because of different error > message context: Great. We have to think of how we would keep this reliable when future changes happen. Simple is often better because it limits the odds of breakage. > >>Actually you may get the same problem with explicitly prepared statements > >>(certainly, in the last case, you better understand what going on and it is > >>your choice whether to use or not to use prepared statement). > >> > >>The fact of failure of 7 regression tests means that autoprepare can really > >>change behavior of existed program. This is why my suggestion is to switch > >>off this feature by default. > >I would like to see us target something that can be enabled by default. > >Even if it only improves performance by 5%, it would be better overall > >than a feature that improves performance by 90% but is only used by 1% > >of our users. > > I have to agree with you here. Good. I know there are database systems that will try to get the most performance possible no matter how complex it is for users to configure or to maintain, but that isn't us. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cached plans and statement generalization
On May 12, 2017 12:50:41 AM PDT, Konstantin Knizhnik wrote: >Definitely changing session context (search_path, date/time format, >...) >may cause incorrect behavior of cached statements. >Actually you may get the same problem with explicitly prepared >statements (certainly, in the last case, you better understand what >going on and it is your choice whether to use or not to use prepared >statement). > >The fact of failure of 7 regression tests means that autoprepare can >really change behavior of existed program. This is why my suggestion is > >to switch off this feature by default. I can't see us agreeing with this feature unless it actually reliably works, even if disabled by default. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On 05/12/2017 10:17 AM, Robert Haas wrote: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have a architecture dependent >> representation, it seems somewhat unrealistic and expensive to have >> a hard rule to keep them architecture agnostic. And if that's not >> guaranteed, then I'm doubtful it makes sense as a soft rule >> either. > > That's a good point, but the flip side is that, if we don't have > such a rule, a pg_dump of a hash-partitioned table on one > architecture might fail to restore on another architecture. Today, I > believe that, while the actual database cluster is > architecture-dependent, a pg_dump is architecture-independent. Is it > OK to lose that property? Not from where I sit. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [HACKERS] Cached plans and statement generalization
On 12.05.2017 18:23, Bruce Momjian wrote: On Fri, May 12, 2017 at 10:50:41AM +0300, Konstantin Knizhnik wrote: Definitely changing session context (search_path, date/time format, ...) may cause incorrect behavior of cached statements. I wonder if we should clear the cache whenever any SET command is issued.

Well, with the autoprepare cache invalidated on each SET variable, ALTER SYSTEM, and any slow utility statement, only one regression test fails, and only because of a different error message context:

*** /home/knizhnik/postgresql.master/src/test/regress/expected/date.out	2017-04-11 18:07:56.497461208 +0300
--- /home/knizhnik/postgresql.master/src/test/regress/results/date.out	2017-05-12 20:21:19.767566302 +0300
***************
*** 1443,1452 ****
  --
  SELECT EXTRACT(MICROSEC FROM DATE 'infinity');  -- ERROR:  timestamp units "microsec" not recognized
  ERROR:  timestamp units "microsec" not recognized
- CONTEXT:  SQL function "date_part" statement 1
  SELECT EXTRACT(UNDEFINED FROM DATE 'infinity');  -- ERROR:  timestamp units "undefined" not supported
  ERROR:  timestamp units "undefined" not supported
- CONTEXT:  SQL function "date_part" statement 1
  -- test constructors
  select make_date(2013, 7, 15);
   make_date
--- 1443,1450 ----

======

Actually you may get the same problem with explicitly prepared statements (certainly, in the last case, you better understand what is going on and it is your choice whether or not to use a prepared statement). The fact that 7 regression tests fail means that autoprepare can really change the behavior of an existing program. This is why my suggestion is to switch off this feature by default.

I would like to see us target something that can be enabled by default. Even if it only improves performance by 5%, it would be better overall than a feature that improves performance by 90% but is only used by 1% of our users.

I have to agree with you here.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
Robert Haas writes: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have a architecture dependent representation, >> it seems somewhat unrealistic and expensive to have a hard rule to keep them >> architecture agnostic. And if that's not guaranteed, then I'm doubtful it >> makes sense as a soft rule either. > That's a good point, but the flip side is that, if we don't have such > a rule, a pg_dump of a hash-partitioned table on one architecture > might fail to restore on another architecture. Today, I believe that, > while the actual database cluster is architecture-dependent, a pg_dump > is architecture-independent. Is it OK to lose that property? I'd vote that it's not, which means that this whole approach to hash partitioning is unworkable. I agree with Andres that demanding hash functions produce architecture-independent values will not fly. Maintaining such a property for float8 (and the types that depend on it) might be possible if you believe that nobody ever uses anything but IEEE floats, but we've never allowed that as a hard assumption before. Even architecture dependence isn't the whole scope of the problem. Consider for example dumping a LATIN1-encoded database and trying to reload it into a UTF8-encoded database. People will certainly expect that to be possible, and do you want to guarantee that the hash of a text value is encoding-independent? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting error at the time of dropping subscription and few more issues
On Fri, May 12, 2017 at 9:35 PM, tushar wrote: > Hi, > > There are few more issues , found in logical replication > > (1)ERROR: tuple concurrently updated > > Publication Server - (X machine) > \\create table \ create publication \ insert rows > create table t(n int); > create publication pub for table t; > insert into t values (generate_series(1,100)); > > Subscription Server-(Y machine) > \\create table t / create subscription > create table t(n int); > create subscription sub connection 'dbname=postgres port=5000 user=centos > password=a' publication pub; > > \\drop subscription and re-create (repeat this 2-3 times) > postgres=# drop subscription sub; > NOTICE: dropped replication slot "sub" on publisher > DROP SUBSCRIPTION > postgres=# create subscription sub connection 'dbname=postgres port=5000 > user=centos password=a' publication pub; > NOTICE: synchronized table states > NOTICE: created replication slot "sub" on publisher > CREATE SUBSCRIPTION > postgres=# select count(*) from t; > count > - > 100 > (1 row) > > postgres=# drop subscription sub; > ERROR: tuple concurrently updated I wonder whether the subscriber had already behaved oddly while dropping and re-creating the subscription repeatedly. An issue related to running DROP SUBSCRIPTION repeatedly is reported on another thread and is under discussion[1]. This issue might be related to that.
> postgres=# drop subscription s2s1; > ERROR: could not drop the replication slot "s2s1" on publisher > DETAIL: The error was: ERROR: replication slot "s2s1" does not exist As the documentation mentions[2], we can drop the subscription after disabling it and setting the slot name to NONE with ALTER SUBSCRIPTION. > (3)Alter publication SET command doesn't give you NOTICE message about > tables which got removed. > > postgres=# create publication pub for table t,t1,t2 ; > CREATE PUBLICATION > > postgres=# select * from pg_publication_tables ; > pubname | schemaname | tablename > -++--- > pub | public | t > pub | public | t1 > pub | public | t2 > (3 rows) > > postgres=# alter publication pub set table t; > ALTER PUBLICATION > > postgres=# select * from pg_publication_tables ; > pubname | schemaname | tablename > -++--- > pub | public | t > (1 row) > > in subscription - (we are getting NOTICE message, about tables which got > removed) > > postgres=# alter subscription sub set publication pub refresh; > NOTICE: removed subscription for table public.t1 > NOTICE: removed subscription for table public.t2 > ALTER SUBSCRIPTION > > I think - in publication too ,we should provide NOTICE messages. > I guess one of the reasons why we emit such a NOTICE message is that the subscriber cannot control which tables the upstream server replicates. So if a table gets disassociated on the publisher, the subscriber should report that to the user. On the other hand, since the publication can control it and the changes are obvious, I'm not sure we really need to do that. BTW I think it's better for the above NOTICE message to include the subscription name.
[1] https://www.postgresql.org/message-id/CAD21AoBYpyqTSw%2B%3DES%2BxXtRGMPKh%3DpKiqjNxZKnNUae0pSt9bg%40mail.gmail.com [2] https://www.postgresql.org/docs/devel/static/logical-replication-subscription.html#logical-replication-subscription-slot Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
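For reference, the workaround Sawada-san points at in [2] looks something like the following (the commands are per the referenced documentation; the subscription name comes from tushar's example):

```sql
-- Stop the apply worker, disassociate the missing remote slot,
-- then drop the subscription without contacting the publisher.
ALTER SUBSCRIPTION s2s1 DISABLE;
ALTER SUBSCRIPTION s2s1 SET (slot_name = NONE);
DROP SUBSCRIPTION s2s1;
```

With slot_name set to NONE, DROP SUBSCRIPTION no longer tries to drop a replication slot on the publisher, so it succeeds even though "s2s1" never existed there.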
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote:
> Given that a lot of data types have an architecture-dependent representation,
> it seems somewhat unrealistic and expensive to have a hard rule to keep them
> architecture agnostic. And if that's not guaranteed, then I'm doubtful it
> makes sense as a soft rule either.

That's a good point, but the flip side is that, if we don't have such a rule, a pg_dump of a hash-partitioned table on one architecture might fail to restore on another architecture. Today, I believe that, while the actual database cluster is architecture-dependent, a pg_dump is architecture-independent. Is it OK to lose that property?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On May 12, 2017 10:05:56 AM PDT, Robert Haas wrote:
>On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote:
>> 1. The hash functions as they exist today aren't portable -- they can
>> return different results on different machines. That means using these
>> functions for hash partitioning would yield different contents for the
>> same partition on different architectures (and that's bad, considering
>> they are logical partitions and not some internal detail).
>
>Hmm, yeah, that is bad.

Given that a lot of data types have an architecture-dependent representation, it seems somewhat unrealistic and expensive to have a hard rule to keep them architecture agnostic. And if that's not guaranteed, then I'm doubtful it makes sense as a soft rule either.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
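For readers following along, the non-portability Jeff describes can be demonstrated outside PostgreSQL. This is an illustrative Python sketch (the function names are invented; this is not hash_any() itself): hashing a value's native in-memory bytes ties the result to the machine's endianness, while hashing a fixed serialization keeps it stable across architectures.

```python
import hashlib
import struct

def hash_native(value: int) -> str:
    """Hash the platform-native byte representation (endianness-dependent)."""
    return hashlib.sha256(struct.pack("@i", value)).hexdigest()

def hash_portable(value: int) -> str:
    """Hash a fixed little-endian serialization (same on every machine)."""
    return hashlib.sha256(struct.pack("<i", value)).hexdigest()

# The two byte orders really do differ for asymmetric values, so only
# hash_portable() yields stable partition assignments across architectures.
big = struct.pack(">i", 258)     # b'\x00\x00\x01\x02'
little = struct.pack("<i", 258)  # b'\x02\x01\x00\x00'
assert big != little
```

On a little-endian machine hash_native() and hash_portable() happen to agree; on a big-endian machine they diverge, which is exactly the "different contents for the same partition" problem.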
Re: [HACKERS] eval_const_expresisions and ScalarArrayOpExpr
On Fri, May 12, 2017 at 12:55 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, May 12, 2017 at 12:04 PM, Tom Lane wrote: >>> Actually, I now remember that I wrote that while at Salesforce (because >>> they'd run into the problem that constant ArrayRef expressions weren't >>> simplified). So that means it's been languishing in a git branch for >>> *two* years. I'm kind of inclined to push it before it gets forgotten >>> again ;-) > >> You will probably not be surprised to hear that I think this is a >> feature and thus subject to the feature freeze currently in effect. > > [ shrug ] Sure. I'll do what I should have done then, which is stick > it into the next CF list. Thanks. > If you intend to also object to my proposed get_attstatsslot refactoring, > please do that promptly. That one I think is bug-proofing. I wasn't planning to express an opinion on that one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] hash partitioning
On Fri, May 12, 2017 at 6:08 PM, amul sul wrote:
> Hi,
>
> Please find the following updated patches attached:
>
> 0001-Cleanup.patch : Does some cleanup and code refactoring required
> for the hash partition patch. Otherwise, there will be an unnecessary diff in
> the 0002 patch

Thanks for splitting the patch.

+    if (isnull[0])
+        cur_index = partdesc->boundinfo->null_index;

This code assumes that null_index will be set to -1 when has_null is false. Per RelationBuildPartitionDesc() this is true. But probably we should write this code as

    if (isnull[0])
    {
        if (partdesc->boundinfo->has_null)
            cur_index = partdesc->boundinfo->null_index;
    }

That way we are certain that when has_null is false, cur_index = -1, similar to the original code.

The additional argument to ComputePartitionAttrs() isn't used anywhere in this patch, so maybe it had better be part of 0002. If we do this, the only change that will remain in the patch is the refactoring of RelationBuildPartitionDesc(), so we may consider merging it into 0002, unless we find that some more refactoring is needed. But for now, having it as a separate patch helps.

Here are some more comments on 0002.

+ * In the case of hash partitioning, datums is a 2-D array, stores modulus and
+ * remainder values at datums[x][0] and datums[x][1] respectively for each
+ * partition in the ascending order.

This comment about datums should appear in a paragraph of its own and may be rephrased as in the attached patch. Maybe we could also add a line about ndatums for hash-partitioned tables, as in the attached patch.
+ * (see the above enum); NULL for has and list

Typo: s/has/hash

+    if (key->strategy == PARTITION_STRATEGY_HASH)
+    {
+        ndatums = nparts;
+        hbounds = (PartitionHashBound **) palloc(nparts *
+                                                 sizeof(PartitionHashBound *));
+        i = 0;
+        foreach (cell, boundspecs)
+        {
+            PartitionBoundSpec *spec = lfirst(cell);
+ [ clipped ]
+            hbounds[i]->index = i;
+            i++;
+        }

For list- and range-partitioned tables we order the bounds so that two partitioned tables have them in the same order irrespective of the order in which they are specified by the user and hence stored in the catalogs. The partitions then get indexes according to the order in which their bounds appear in the ordered arrays of bounds. Thus any two partitioned tables with the same partition specification always have the same PartitionBoundInfoData. This helps partition-wise join to match the partition bounds of two given tables. The above code assigns the indexes to the partitions as they appear in the catalogs. This means that two partitioned tables with the same partition specification but a different order of partition bound specifications will have different PartitionBoundInfoData representations.

If we order the bounds here too, partition_bounds_equal() would probably reduce to just matching the indexes and the last element of the datums array, i.e. the greatest modulus datum. If the ordered datums arrays of two partitioned tables do not match exactly, the mismatch can be because of missing datums or different datums. If it's a missing datum, it will change the greatest modulus or leave the corresponding entry in the indexes array as -1. If an entry differs, it will cause mismatching entries in the indexes arrays.

+ * is not a factor of 15.
+ *
+ *
+ * Get greatest bound in array boundinfo->datums which is

There is an extra blank line here.
+    if (offset < 0)
+    {
+        nmod = DatumGetInt32(datums[0][0]);
+        valid_bound = (nmod % spec->modulus) == 0;
+    }
+    else
+    {
+        pmod = DatumGetInt32(datums[offset][0]);
+        valid_bound = (spec->modulus % pmod) == 0;
+
+        if (valid_bound && (offset + 1) < ndatums)
+        {
+            nmod = DatumGetInt32(datums[offset + 1][0]);
+            valid_bound = (nmod % spec->modulus) == 0;
+        }
+    }

Maybe name the variables prev_mod(ulus) and next_mod(ulus) for better readability.

+ * for p_p1: satisfies_hash_partition(2, 1, hash_fn(a), hash_fn(b))
+ * for p_p2: satisfies_hash_partition(4, 2, hash_fn(a), hash_fn(b))
+ * for p_p3: satisfies_hash_partition(8, 0, hash_fn(a), hash_fn(b))
+ * for p_p4: satisfies_hash_partition(8, 4, hash_fn(a), hash_fn(b))

The description here may be read as if we are calling the same hash function for both a and b, but that's not true. So you may want to clarify that in hash_fn(a), hash_fn means the hash function specified for key a.

+    if (key->partattrs[i] != 0)
+    {
+        keyCol = (Node *) makeVar(1,
+                                  key->partattrs[i],
+
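The validity check in the hunk above can be restated compactly. A hypothetical Python sketch (is_valid_modulus is an invented name, not from the patch): with the existing moduli kept sorted, a new modulus is acceptable only if it is a multiple of the greatest existing modulus not larger than it, and a factor of the smallest existing modulus larger than it.

```python
import bisect

def is_valid_modulus(existing: list, new_mod: int) -> bool:
    """existing: sorted moduli of the current hash partitions."""
    if not existing:
        return True
    pos = bisect.bisect_right(existing, new_mod)
    # must be a multiple of the previous (next smaller or equal) modulus
    if pos > 0 and new_mod % existing[pos - 1] != 0:
        return False
    # must be a factor of the next larger modulus
    if pos < len(existing) and existing[pos] % new_mod != 0:
        return False
    return True

assert is_valid_modulus([2, 4, 8], 16)      # 16 % 8 == 0
assert not is_valid_modulus([2, 4, 8], 6)   # 6 % 4 != 0
assert is_valid_modulus([8], 4)             # 8 % 4 == 0
```

This mirrors the two branches of the hunk: when the new modulus is smaller than all existing ones (offset < 0) only the "factor of the next" test applies, and when it is the largest only the "multiple of the previous" test applies.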
Re: [HACKERS] [POC] hash partitioning
On Thu, May 11, 2017 at 10:38 PM, Amit Langote wrote: > So, adding keycol IS NOT NULL (like we currently do for expressions) in > the implicit partition constraint would be more future-proof than > generating an actual catalogued NOT NULL constraint on the keycol? I now > tend to think it would be better. Directly inserting into a range > partition with a NULL value for a column currently generates a "null value > in column \"%s\" violates not-null constraint" instead of perhaps more > relevant "new row for relation \"%s\" violates partition constraint". > That said, we *do* document the fact that a NOT NULL constraint is added > on range key columns, but we might as well document instead that we don't > currently support routing tuples with NULL values in the partition key > through a range-partitioned table and so NULL values cause error. > > Can we still decide to do that instead? I suggest you start a new thread on that topic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote:
> 1. The hash functions as they exist today aren't portable -- they can
> return different results on different machines. That means using these
> functions for hash partitioning would yield different contents for the
> same partition on different architectures (and that's bad, considering
> they are logical partitions and not some internal detail).

Hmm, yeah, that is bad.

> We could revert 26043592 (copied Tom because that was his patch) to
> make hash_any() go back to being portable -- do we know what that
> speedup actually was? Maybe the benefit is smaller on newer
> processors? Another option is to try to do some combination of
> byteswapping and word-at-a-time, which might be better than
> byte-at-a-time if the byteswapping is done with a native instruction.

With regard to portability, I find that in 2009, according to Tom, we had "already agreed" that it was dispensable: http://postgr.es/m/23832.1234214...@sss.pgh.pa.us

I was not able to find where that was agreed.

On performance, I found this: https://www.postgresql.org/message-id/20081104202655.gp18...@it.is.rice.edu

It says at the end: "The average time to reindex the table using our current hash_any() without the separate mix()/final() was 1696ms and 1482ms with the separate mix()/final() stages giving almost 13% better performance for this stupid metric."

> 5. For salts[2], I don't think it's too hard to support them in an
> optional way. We just allow the function to be a two-argument function
> with a default. Places that care about specifying the salt may do so
> if the function has pronargs==2, otherwise it just gets the default
> value. If we have salts, I don't think having 64-bit hashes is very
> important. If we run out of bits, we can just salt the hash function
> differently and get more hash bits. This is not urgent and I believe
> we should just implement salts when and if some algorithm needs them.
The potential problem with that is that the extra argument might slow down the hash functions enough to matter. Argument unpacking is not free, and the hashing logic itself will get more complex. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
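Jeff's salt idea, sketched for illustration (hypothetical names; this is not how PostgreSQL's hash functions work today): mixing a salt into the hashed input yields effectively independent hash streams, which is how re-salting can produce "more hash bits" without widening the hash itself.

```python
import hashlib

def salted_hash(key: bytes, salt: int = 0) -> int:
    """Illustrative two-argument hash with a default salt, as the proposal
    allows. Different salts yield effectively independent 32-bit hash
    values for the same key."""
    digest = hashlib.sha256(salt.to_bytes(4, "little") + key).digest()
    return int.from_bytes(digest[:4], "little")

h0 = salted_hash(b"some key")          # callers unaware of salts use the default
h1 = salted_hash(b"some key", salt=1)  # re-salting gives an independent stream
assert h0 != h1
assert salted_hash(b"some key") == h0  # deterministic for a given salt
```

Robert's concern above applies directly to a shape like this: the extra argument has to be unpacked and mixed in on every call, which is measurable in a hot path even when the default is used.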
Re: [HACKERS] [POC] hash partitioning
On Wed, May 10, 2017 at 11:39 PM, Robert Haas wrote:
> On Wed, May 3, 2017 at 9:09 AM, amul sul wrote:
>> Fixed in the attached version.
>
> +[ PARTITION BY { HASH | RANGE | LIST } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE
>
> In the department of severe nitpicking, I would have expected this to
> either use alphabetical order (HASH | LIST | RANGE) or to add the new
> method at the end on the theory that we probably did the important
> ones first (RANGE | LIST | HASH).

Importance is subjective, so maybe we should arrange them in alphabetical order, to keep the list in some order and be consistent everywhere in the code and documentation.

> More broadly, I wonder why we're
> cramming this into the datums arrays instead of just adding another
> field to PartitionBoundInfoData that is only used by hash
> partitioning.

It would be good if we store the datums corresponding to partition bounds in the same place, so that we don't have to handle hash partitions specially in all the places where we handle partition bound datums. We already do that for list and range partitions, so maybe we want to continue doing so for hash as well.

In my comments on Amul's latest patch, I have described a possibility that partition_bounds_equal() need not compare all entries in the datums array. It can just compare the greatest modulus and the indexes array from the given two partition bounds to check whether they are equal. If that works well, we will probably address your complaint about DatumIsEqual() in a different manner.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
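The canonical-ordering point from the review is easy to sketch (illustrative Python, not the actual C structures): if the (modulus, remainder) pairs are sorted before building the bound representation, two tables with the same partitions declared in different orders compare equal.

```python
def canonical_bounds(specs):
    """specs: iterable of (modulus, remainder) pairs as declared by the user.
    Returns them in a canonical sorted order, so two partitioned tables with
    the same partition specification always get the same representation
    regardless of declaration (i.e. catalog) order."""
    return sorted(specs)

a = canonical_bounds([(8, 0), (2, 1), (8, 4)])
b = canonical_bounds([(2, 1), (8, 4), (8, 0)])
assert a == b == [(2, 1), (8, 0), (8, 4)]
```

With such an ordering in place, comparing two bound representations can indeed reduce to comparing the index assignments plus the greatest modulus, as suggested above.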
Re: [HACKERS] eval_const_expresisions and ScalarArrayOpExpr
Robert Haas writes: > On Fri, May 12, 2017 at 12:04 PM, Tom Lane wrote: >> Actually, I now remember that I wrote that while at Salesforce (because >> they'd run into the problem that constant ArrayRef expressions weren't >> simplified). So that means it's been languishing in a git branch for >> *two* years. I'm kind of inclined to push it before it gets forgotten >> again ;-) > You will probably not be surprised to hear that I think this is a > feature and thus subject to the feature freeze currently in effect. [ shrug ] Sure. I'll do what I should have done then, which is stick it into the next CF list. If you intend to also object to my proposed get_attstatsslot refactoring, please do that promptly. That one I think is bug-proofing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] Crash observed during the start of the Postgres process
On Fri, May 12, 2017 at 12:38 PM, Tom Lane wrote: > I can't draw any other conclusion but that you've hacked something > to make FATAL act like LOG. Which is a fatal mistake. I see what you did there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] eval_const_expresisions and ScalarArrayOpExpr
On Fri, May 12, 2017 at 12:04 PM, Tom Lane wrote: > Heikki Linnakangas writes: >> On 05/11/2017 06:21 PM, Tom Lane wrote: >>> Yeah, I think it's a lack-of-round-tuits situation. Your patch reminds >>> me of a more ambitious attempt I made awhile back to reduce the amount >>> of duplicative boilerplate in eval_const_expressions. I think I wrote >>> it during last year's beta period, stashed it because I didn't feel like >>> arguing for applying it right then, and then forgot about it. > >> Hmph, now we're almost in beta period again. :-(. > > Actually, I now remember that I wrote that while at Salesforce (because > they'd run into the problem that constant ArrayRef expressions weren't > simplified). So that means it's been languishing in a git branch for > *two* years. I'm kind of inclined to push it before it gets forgotten > again ;-) You will probably not be surprised to hear that I think this is a feature and thus subject to the feature freeze currently in effect. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-column range partition constraint
On Fri, May 12, 2017 at 3:26 AM, Amit Langote wrote: > Fixed. This seems to be the same patch version as last time, so none of the things you mention as fixed are, in fact, fixed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] snapbuild woes
On 12/05/17 10:57, Petr Jelinek wrote:
> On 12/05/17 03:31, Andres Freund wrote:
>> On 2017-05-11 14:54:26 -0700, Andres Freund wrote:
>>> On 2017-05-11 14:51:55 -0700, wrote:
>>>> On 2017-05-08 00:10:12 -0700, Andres Freund wrote:
>>>>> I plan to commit the next pending patch after the back branch releases
>>>>> are cut - it's an invasive fix and the issue doesn't cause corruption,
>>>>> "just" slow slot creation. So it seems better to wait for a few days,
>>>>> rather than hurry it into the release.
>>>>
>>>> Now that that's done, here's an updated version of that patch. Note the
>>>> new logic to trigger xl_running_xacts to be logged at the right spot.
>>>> Works well in my testing. I plan to commit this fairly soon, unless
>>>> somebody wants a bit more time to look into it.
>>>>
>>>> Unless somebody protests, I'd like to slightly revise how the on-disk
>>>> snapshots are stored on master - given the issues this bug/commit showed
>>>> with the current method - but I can see one could argue that that should
>>>> rather be done next release.
>>>
>>> As usual I forgot my attachment.
>>
>> This patch also seems to offer a way to do your 0005 in a, possibly, more
>> efficient manner. We don't ever need to assume a transaction is
>> timetravelling anymore. Could you check whether the attached, hasty,
>> patch resolves the performance problems you measured? Also, do you have
>> a "reference" workload for that?
>
> Hmm, well it helps, but actually now that we don't track individual
> running transactions anymore it got much less effective (my version of
> 0005 does as well).
> The example workload I test with is:
> session 1: open transaction, do a write, keep it open
> session 2: pgbench -M simple -N -c 10 -P 1 -T 5
> session 3: run CREATE_REPLICATION_SLOT LOGICAL in walsender
> session 2: pgbench -M simple -N -c 10 -P 1 -T 20
> session 1: commit
>
> And wait for session 3 to finish slot creation; it takes about 20 mins on
> my laptop without patches, a minute and a half with your patches for 0004
> and 0005 (or with your 0004 and my 0005), and about 2s with my original
> 0004 and 0005.
>
> What makes it slow is the constant qsorting, IIRC.

Btw, I still think that we should pursue the approach you proposed. I think we can deal with the qsorting in some other way (ordered insert, perhaps?) later. What you propose fixes the correctness, makes performance less awful in the above workload, and also makes the snapbuild bit easier to read.

--
Petr Jelinek
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
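The "ordered insert" idea mentioned above, as a minimal sketch (illustrative Python, not the snapbuild code): inserting each new xid at its sorted position keeps the array ordered at all times, instead of re-sorting the whole array after every addition, which is where the constant qsorting cost comes from.

```python
import bisect

def add_sorted(xids: list, xid: int) -> None:
    """Ordered insert: O(log n) binary search plus an O(n) shift per insert,
    versus O(n log n) for re-sorting the whole array on every addition."""
    bisect.insort(xids, xid)

xids = []
for xid in [42, 7, 99, 7, 13]:
    add_sorted(xids, xid)
assert xids == [7, 7, 13, 42, 99]
```

Over k insertions this is O(k * n) worst case versus O(k * n log n) for sort-after-every-insert, and in practice it avoids repeatedly touching already-ordered data.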
Re: [HACKERS] [BUGS] Crash observed during the start of the Postgres process
"K S, Sandhya (Nokia - IN/Bangalore)" writes:
> I have filtered the logs based on PID (19825) to see if this helps to
> debug the issue further.

Is this really a stock Postgres build?

The proximate cause of the PANIC is that the startup process is seeing other processes active even though it hasn't reachedConsistency. This is bad on any number of levels, quite aside from that particular PANIC, because those other processes are presumably seeing non-consistent database state.

Looking elsewhere in the log, we see that indeed there seem to be several backend processes happily executing commands. For instance, here's the trace of one of them starting up:

[19810-58f473ff.4d62-187] 2017-04-17 07:51:28.783 GMT < > DEBUG: 0: forked new backend, pid=19850 socket=10
[19810-58f473ff.4d62-188] 2017-04-17 07:51:28.783 GMT < > LOCATION: BackendStartup, postmaster.c:3884
[19850-58f47400.4d8a-1] 2017-04-17 07:51:28.783 GMT < > LOG: 57P03: the database system is starting up
[19850-58f47400.4d8a-2] 2017-04-17 07:51:28.783 GMT < > LOCATION: ProcessStartupPacket, postmaster.c:2143
[19850-58f47400.4d8a-3] 2017-04-17 07:51:28.784 GMT < authentication> DEBUG: 0: postgres child[19850]: starting with (

Now, that LOG message proves that this backend has observed that the database is not ready to allow connections. So why did it only emit the message as LOG and keep going? The code for this in 9.3 looks like

    /*
     * If we're going to reject the connection due to database state, say so
     * now instead of wasting cycles on an authentication exchange. (This also
     * allows a pg_ping utility to be written.)
     */
    switch (port->canAcceptConnections)
    {
        case CAC_STARTUP:
            ereport(FATAL,
                    (errcode(ERRCODE_CANNOT_CONNECT_NOW),
                     errmsg("the database system is starting up")));
            break;
        ...

I can't draw any other conclusion but that you've hacked something to make FATAL act like LOG. Which is a fatal mistake.
Errors that are marked FATAL are generally ones where allowing the process to keep going is not an acceptable outcome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> OK, I agree. Patch is attached.

I added the patch to the CF.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] renaming "transaction log"
Peter Eisentraut writes: > Committed, adjusting for some of the suggestions. Looks like one occurrence was still left in: $ ag --no-group --ignore po --ignore release-\* --ignore wal.sgml '\btransaction\s+log\b' doc/src/sgml/func.sgml:18631:end of the transaction log at any instant, while the write location is the end of The four other occurrences in the same paragraph were fixed, so I assume this was just an oversight. - ilmari -- "I use RMS as a guide in the same way that a boat captain would use a lighthouse. It's good to know where it is, but you generally don't want to find yourself in the same spot." - Tollef Fog Heen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] eval_const_expresisions and ScalarArrayOpExpr
Heikki Linnakangas writes: > On 05/11/2017 06:21 PM, Tom Lane wrote: >> Yeah, I think it's a lack-of-round-tuits situation. Your patch reminds >> me of a more ambitious attempt I made awhile back to reduce the amount >> of duplicative boilerplate in eval_const_expressions. I think I wrote >> it during last year's beta period, stashed it because I didn't feel like >> arguing for applying it right then, and then forgot about it. > Hmph, now we're almost in beta period again. :-(. Actually, I now remember that I wrote that while at Salesforce (because they'd run into the problem that constant ArrayRef expressions weren't simplified). So that means it's been languishing in a git branch for *two* years. I'm kind of inclined to push it before it gets forgotten again ;-) Looking at the patch, it still seems solid, but I remember that one thing I was concerned about was whether the more generalized code was any slower. Not sure about a good test case to measure that though --- const-simplification isn't a large chunk of most workloads. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] renaming "transaction log"
On 5/3/17 09:10, David Steele wrote: > The documentation changes look good to me, but the error message changes > are a random mix of "WAL" and "write-ahead log", even when referring to > the same thing. The reason for this is that some of the error messages refer to a --waldir option or something like it, so it seems appropriate to also refer to "WAL" in the error messages. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] renaming "transaction log"
Committed, adjusting for some of the suggestions. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> I'm prepared to create a fix for that, but it'd be easier to commit the
> >> current patch first, to avoid merge conflicts.
>
> > It seems we're mostly in agreement regarding the parts I was touching.
> > Do you want to push your version of the patch?
>
> It's still almost all your work, so I figured you should push it.

OK, I'll do that shortly then.

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cached plans and statement generalization
On Fri, May 12, 2017 at 10:50:41AM +0300, Konstantin Knizhnik wrote:
> Definitely changing session context (search_path, date/time format, ...) may
> cause incorrect behavior of cached statements.

I wonder if we should clear the cache whenever any SET command is issued.

> Actually you may get the same problem with explicitly prepared statements
> (certainly, in the last case, you better understand what is going on and it is
> your choice whether to use or not to use prepared statements).
>
> The fact of failure of 7 regression tests means that autoprepare can really
> change the behavior of an existing program. This is why my suggestion is to
> switch off this feature by default.

I would like to see us target something that can be enabled by default. Even if it only improves performance by 5%, it would be better overall than a feature that improves performance by 90% but is only used by 1% of our users.

> But in 99.9% of real cases (my estimation plucked out of thin air:) there will
> be no such problems with autoprepare. And it can significantly improve
> performance of OLTP applications
> which are not able to use prepared statements (because of working through
> pgbouncer or other reasons).

Right, but we can't ship something that is 99.9% accurate when the inaccuracy is indeterminate. The bottom line is that Postgres has a very high bar, and I realize you are just prototyping at this point, but the final product is going to have to address all the intricacies of the issue for it to be added.

> Can autoprepare slow down the system?
> Yes, it can. It can happen if an application performs a large number of unique
> queries and the autoprepare cache size is not limited.
> In this case a large (and infinitely growing) number of stored plans can
> consume a lot of memory and, what is even worse, slow down cache lookups.
> This is why, by default, I limit the number of cached statements
> (the autoprepare_limit parameter) to 100.

Yes, good idea.
> I am almost sure that there will be some other issues with autoprepare which > I have not encountered yet (because I mostly tested it on pgbench and > Postgres regression tests). > But I am also sure that benefit of doubling system performance is good > motivation to continue work in this direction. Right, you are still testing to see where the edges are. > My main concern is whether to continue to improve current approach with > local (per-backend) cache of prepared statements. > Or create shared cache (as in Oracle). It is much more difficult to > implement shared cache (the same problem with session context, different > catalog snapshots, cache invalidation,...) > but it also provides more opportunities for queries optimization and tuning. I would continue in the per-backend cache direction at this point because we don't even have that solved yet. The global cache is going to be even more complex. Ultimately I think we are going to want global and local caches because the plans of the local cache are much more likely to be accurate. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
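The autoprepare_limit behavior Konstantin describes amounts to a bounded per-backend plan cache. A minimal illustrative sketch (hypothetical names, not the patch's actual data structures): evicting the least recently used entry once the limit is reached keeps both memory use and lookup cost bounded for workloads with many unique queries.

```python
from collections import OrderedDict

class PlanCache:
    """Bounded LRU plan cache, analogous in spirit to autoprepare_limit.
    Once the limit is exceeded, the least recently used plan is evicted."""

    def __init__(self, limit: int = 100):
        self.limit = limit
        self.plans = OrderedDict()  # query text -> cached plan

    def lookup(self, query: str):
        plan = self.plans.get(query)
        if plan is not None:
            self.plans.move_to_end(query)   # mark as recently used
        return plan

    def store(self, query: str, plan) -> None:
        self.plans[query] = plan
        self.plans.move_to_end(query)
        if len(self.plans) > self.limit:
            self.plans.popitem(last=False)  # evict least recently used

cache = PlanCache(limit=2)
cache.store("SELECT 1", "plan1")
cache.store("SELECT 2", "plan2")
cache.lookup("SELECT 1")                 # touch, so "SELECT 2" is evicted next
cache.store("SELECT 3", "plan3")
assert cache.lookup("SELECT 2") is None
assert cache.lookup("SELECT 1") == "plan1"
```

The shared-cache variant discussed above would need all of this plus invalidation and per-session-context keying, which is part of why the per-backend direction is the simpler first step.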
Re: [HACKERS] WITH clause in CREATE STATISTICS
Alvaro Herrera writes: > Tom Lane wrote: >> I'm prepared to create a fix for that, but it'd be easier to commit the >> current patch first, to avoid merge conflicts. > It seems we're mostly in agreement regarding the parts I was touching. > Do you want to push your version of the patch? It's still almost all your work, so I figured you should push it. >> Hm. OK, but then that example is pretty misleading, because it leads >> the reader to suppose that the planner can tell the difference between >> the selectivities of the two queries. Maybe what's lacking is an >> explanation of how you'd use this statistics type. > I think we should remove the complex example from that page, and instead > refer the reader to chapters 14 and 69. Seems reasonable. If we did add enough material there to be a standalone description, it would be duplicative probably. However, that does put the onus on the other chapters to offer a complete definition, rather than leaving details to the man page as we do in many other cases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tom Lane wrote:
> Tomas Vondra writes:
> >> Although I've not done anything about it here, I'm not happy about the
> >> handling of dependencies for stats objects.
>
> > Yeah, it's a bit frankensteinian ...
>
> I'm prepared to create a fix for that, but it'd be easier to commit the
> current patch first, to avoid merge conflicts.

It seems we're mostly in agreement regarding the parts I was touching. Do you want to push your version of the patch?

> >> Lastly, I tried the example given in the CREATE STATISTICS reference page,
> >> and it doesn't seem to work.
>
> > I assume you're talking about the functional dependencies and in that
> > case that's expected behavior, because f. dependencies do assume the
> > conditions are "consistent" with the functional dependencies.
>
> Hm. OK, but then that example is pretty misleading, because it leads
> the reader to suppose that the planner can tell the difference between
> the selectivities of the two queries. Maybe what's lacking is an
> explanation of how you'd use this statistics type.

I think we should remove the complex example from that page, and instead refer the reader to chapters 14 and 69. The CREATE STATISTICS page can just give some overview of what the syntax looks like, and all explanations of complex topics such as "what does it mean for a query to be consistent with the functional dependencies" can be given at length elsewhere.

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression
On 5/11/17 23:59, Tom Lane wrote:
> Right, but the existing code is *designed* to hold the lock till end of
> top-level transaction, regardless of what happens in any subtransaction.
> My understanding of your complaint is that you do not think that's OK
> for any lock stronger than AccessShareLock.

What he is saying (I believe) is: because of that custom logic to hold the lock until the end of the transaction across subtransactions, you will keep holding the lock even if a subtransaction aborts, which is nonstandard behavior. Previously, nobody cared, because nobody else was locking sequences. But now we're proposing to make ALTER SEQUENCE lock the sequence, so this behavior would be visible:

S1: BEGIN;
S1: SAVEPOINT s1;
S1: SELECT nextval('seq1');
S1: ROLLBACK TO SAVEPOINT s1;
S2: ALTER SEQUENCE seq1 MAXVALUE 100;  -- would now wait for S1 commit

My amendment to that is that "previously nobody cared" is not quite correct, because the same already happens currently with DROP SEQUENCE.

--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is there any way to access heap_open() from _PG_init ??
Sairam Gaddam writes: > During startup (_PG_init), I need to access some meta info of > table/relation (like PK Column Position, FK Column Positions, Index Column > Positions etc...) and load it into memory. Why not fetch that info at first use, instead? If you insist on doing it at _PG_init, you'll never be able to make the extension work as a shared_preload_libraries item, where _PG_init would be run in the postmaster. (I think local_preload_libraries would be problematic too; not sure that you're inside a transaction there.) You won't be able to retry after an error, or more generally to cope with post-load-time changes in the data you want to cache. It's probably possible to make it work as long as the library gets loaded during a transaction, ie in response to some SQL command. Without seeing your code we can't guess why it's crashing though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression
On 5/11/17 17:28, Andres Freund wrote: > Isn't that pretty much the point? The whole open_share_lock() > optimization looks like it really only can make a difference with > subtransactions? Yeah, that confused me too. That keep-the-lock-for-the-whole-transaction logic was introduced in a2597ef17958e75e7ba26507dc407249cc9e7134 around 7.3, way before subtransactions (8.0). The point there was apparently just to avoid hitting the lock manager on subsequent calls. That code has since been moved around and refactored many times. When subtransactions were introduced, the current logic of keeping the lock across subtransactions was added in order to keep the original intent. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] If subscription to foreign table valid ?
> > > > I sent my version of patch in parallel. I think we don't need to do the > relation open like you did, all the info is in syscache. > That's right. > Regards, Neha
Re: [HACKERS] Addition of pg_dump --no-publications
David Fetter writes: > While it's consistent with surrounding code, I find the use of ints to > express what is in essence a boolean condition puzzling. Any > insights? IIRC, it's forced by the getopt_long API, particularly the way that the long-options struct has to be declared. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time based lag tracking for logical replication
On Sat, May 13, 2017 at 12:04 AM, Petr Jelinek wrote: > > After this commit 024711bb544645c8b1061e9f02b261e2e336981d I get > > following error while executing CREATE SUBSCRIPTION: > > > > CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres host=localhost > > user=neha port=5432' PUBLICATION mypub; > > NOTICE: synchronized table states > > ERROR: could not create replication slot "sub1": ERROR: could not > > load library "/home/neha/postgres/PGCurrentInstall/lib/pgoutput.so": > > /home/neha/postgres/PGCurrentInstall/lib/pgoutput.so: undefined > > symbol: OutputPluginUpdateProgress > > > > Hmm, that sounds like partial rebuild/install (old postgres binary with > new pgoutput one). > That's right. Thanks. Neha
Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression
On 5/11/17 16:34, Andres Freund wrote: >>> This'd probably need to be removed, as we'd otherwise would get very >>> weird semantics around aborted subxacts. >> Can you explain in more detail what you mean by this? > Well, right now we don't do proper lock-tracking for sequences, always > assigning them to the toplevel transaction. But that doesn't seem > proper when nextval() would conflict with ALTER SEQUENCE et al, because > then locks would continue to be held by aborted savepoints. I see what you mean here. We already have this issue with DROP SEQUENCE. While it would be nice to normalize this, I think it's quite esoteric. I doubt users have any specific expectations how sequences behave in aborted subtransactions. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] If subscription to foreign table valid ?
On 12/05/17 15:55, Neha Khatri wrote: > On 5/11/17, Petr Jelinek wrote: >> Hi, >> >> On 11/05/17 14:25, tushar wrote: >>> Hi, >>> >>> I observed that -we cannot publish "foreign table" in Publication >>> >>> but same thing is not true for Subscription >>> >>> postgres=# create foreign table t (n int) server db1_server options >>> (table_name 't'); >>> CREATE FOREIGN TABLE >>> postgres=# alter subscription sub refresh publication ; >>> NOTICE: added subscription for table public.t >>> ALTER SUBSCRIPTION >>> >>> Is this an expected behavior ? if we cannot publish then how can we >>> add subscription for it. > > The above foreign table subscription succeeds only if the publisher > has published a same-named normal table, i.e. > create table t (n int); > CREATE PUBLICATION mypub FOR TABLE t; > > I think in the current implementation of LogicalRep. it is the user's > responsibility to match the table definition at publisher and > subscriber. The subscriber cannot determine by itself what the publisher has > defined. This use case does not align with this assumption. > > >> However, the foreign tables indeed can't be subscribed. > > I suspect that a user would want to subscribe a foreign table in the real world. > >> I think it does make sense to add check for this into CREATE/ALTER >> SUBSCRIBER though so that user is informed immediately about the mistake >> rather than by errors in the logs later. > > Yes, the system should prohibit such an operation though. > I tried to write a patch to achieve this. It disallows subscription > of relations other than RELKIND_RELATION. > Attached is the patch. Comments? > I sent my version of the patch in parallel. I think we don't need to do the relation open like you did, all the info is in syscache. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication syntax (was DROP SUBSCRIPTION, query cancellations and slot handling)
On 12/05/17 15:02, Peter Eisentraut wrote: > On 5/9/17 11:43, Petr Jelinek wrote: >> Here is rebased version of the other patch (the interface rework). I >> also fixed the tab completion there. > > Committed. > > One small thing I changed, to make it look more like CREATE/ALTER TABLE, > is that the CREATE commands use WITH (...) but the ALTER commands use > SET (...). > Makes sense, thanks! -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Addition of pg_dump --no-publications
On Fri, May 12, 2017 at 10:59:27AM +0900, Michael Paquier wrote: > On Thu, May 11, 2017 at 3:19 PM, Michael Paquier > wrote: > > I imagine that pg_dump -s would be the basic operation that users > > would do first before creating a subscription on a secondary node, but > > what I find surprising is that publications are dumped by default. I > > don't find confusing that those are actually included by default to be > > consistent with the way subscriptions are handled, what I find > > confusing is that there are no options to not dump them, and no > > options to bypass their restore. > > > > So, any opinions about having pg_dump/pg_restore --no-publications? > > And that's really a boring patch, giving the attached. While it's consistent with surrounding code, I find the use of ints to express what is in essence a boolean condition puzzling. Any insights? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PROVE_FLAGS
Michael Paquier writes: > On Fri, May 12, 2017 at 9:05 PM, Andrew Dunstan > wrote: >> Does anyone object to me backpatching this? It seems to me kinda crazy >> to have --verbose hardcoded on the back branches and not on the dev branch. > +1. A maximum of consistency with the test code when possible is nice to have. No objection here either. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
Tomas Vondra writes: > On 5/12/17 4:46 AM, Tom Lane wrote: >> If people agree that that reads better, we should make an effort to >> propagate that terminology elsewhere in the docs, and into error messages, >> objectaddress.c output, etc. > I'm fine with the 'statistics object' wording. I've been struggling with > this bit while working on the patch, and I agree it sounds better and > getting it consistent is definitely worthwhile. OK. We can work on that as a followon patch. >> Although I've not done anything about it here, I'm not happy about the >> handling of dependencies for stats objects. > Yeah, it's a bit frankensteinian ... I'm prepared to create a fix for that, but it'd be easier to commit the current patch first, to avoid merge conflicts. >> Lastly, I tried the example given in the CREATE STATISTICS reference page, >> and it doesn't seem to work. > I assume you're talking about the functional dependencies and in that > case that's expected behavior, because f. dependencies do assume the > conditions are "consistent" with the functional dependencies. Hm. OK, but then that example is pretty misleading, because it leads the reader to suppose that the planner can tell the difference between the selectivities of the two queries. Maybe what's lacking is an explanation of how you'd use this statistics type. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding support for Default partition in partitioning
Hello, On Fri, May 12, 2017 at 5:30 PM, Rahila Syed wrote: > Hello, > > >(1) With the new patch, we allow new partitions when there is overlapping > data with default partition. The entries in default are ignored when > running queries satisfying the new partition. > This was introduced in latest version. We are not allowing adding a > partition when entries with same key value exist in default partition. So > this scenario should not > come in picture. Please find attached an updated patch which corrects this. > Thank you for the updated patch. However, now I cannot create a partition after default. CREATE TABLE list1 ( a int, b int ) PARTITION BY LIST (a); CREATE TABLE list1_1 (LIKE list1); ALTER TABLE list1 ATTACH PARTITION list1_1 FOR VALUES IN (1); CREATE TABLE list1_def PARTITION OF list1 DEFAULT; CREATE TABLE list1_5 PARTITION OF list1 FOR VALUES IN (3); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> > > > >(2) I get the following warning: > > >partition.c: In function ‘check_new_partition_bound’: > >partition.c:882:15: warning: ‘boundinfo’ may be used uninitialized in > this function [-Wmaybe-uninitialized] > > && boundinfo->has_default) >^ > >preproc.y:3250.2-8: warning: type clash on default action: != <> > I failed to notice this warning. I will look into it. > > >This is incredibly ugly. I don't know exactly what should be done > >about it, but I think PARTITION_DEFAULT is a bad idea and has got to > >go. Maybe add a separate isDefault flag to PartitionBoundSpec > Will look at other ways to do it. > > >Doesn't it strike you as a bit strange that get_qual_for_default() > >doesn't return a qual? Functions should generally have names that > >describe what they do. > Will fix this. > > >There's an existing function that you can use to concatenate two lists > >instead of open-coding it. > Will check this. 
> > >you should really add the documentation and > >regression tests which you mentioned as a TODO. And run the code > >through pgindent > I will also update the next version with documentation and regression tests > and run pgindent > > Thank you, > Rahila Syed > > On Fri, May 12, 2017 at 4:33 PM, Beena Emerson > wrote: > >> >> >> On Thu, May 11, 2017 at 7:37 PM, Rahila Syed >> wrote: >> >>> Hello, >>> >>> Please find attached an updated patch with review comments and bugs >>> reported till date implemented. >>> >> >> Hello Rahila, >> >> Tested on "efa2c18 Doc fix: scale(numeric) returns integer, not numeric." >> >> (1) With the new patch, we allow new partitions when there is overlapping >> data with default partition. The entries in default are ignored when >> running queries satisfying the new partition. >> >> DROP TABLE list1; >> CREATE TABLE list1 ( >> a int, >> b int >> ) PARTITION BY LIST (a); >> CREATE TABLE list1_1 (LIKE list1); >> ALTER TABLE list1 ATTACH PARTITION list1_1 FOR VALUES IN (1); >> CREATE TABLE list1_def PARTITION OF list1 DEFAULT; >> INSERT INTO list1 SELECT generate_series(1,2),1; >> -- Partition overlapping with DEF >> CREATE TABLE list1_2 PARTITION OF list1 FOR VALUES IN (2); >> INSERT INTO list1 SELECT generate_series(2,3),2; >> >> postgres=# SELECT * FROM list1 ORDER BY a,b; >> a | b >> ---+--- >> 1 | 1 >> 2 | 1 >> 2 | 2 >> 3 | 2 >> (4 rows) >> >> postgres=# SELECT * FROM list1 WHERE a=2; >> a | b >> ---+--- >> 2 | 2 >> (1 row) >> >> This ignores the a=2 entries in the DEFAULT. >> >> postgres=# SELECT * FROM list1_def; >> a | b >> ---+--- >> 2 | 1 >> 3 | 2 >> (2 rows) >> >> >> (2) I get the following warning: >> >> partition.c: In function ‘check_new_partition_bound’: >> partition.c:882:15: warning: ‘boundinfo’ may be used uninitialized in >> this function [-Wmaybe-uninitialized] >>&& boundinfo->has_default) >>^ >> preproc.y:3250.2-8: warning: type clash on default action: != <> >> >> >>> >1. 
>>> >In following block, we can just do with def_index, and we do not need >>> found_def >>> >flag. We can check if def_index is -1 or not to decide if default >>> partition is >>> >present. >>> found_def is used to set boundinfo->has_default which is used at couple >>> of other places to check if default partition exists. The implementation >>> is similar >>> to has_null. >>> >>> >3. >>> >In following function isDefaultPartitionBound, first statement "return >>> false" >>> >is not needed. >>> It is needed to return false if the node is not DefElem. >>> >>> Todo: >>> Add regression tests >>> Documentation >>> >>> Thank you, >>> Rahila Syed >>> >>> >>> > -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Time based lag tracking for logical replication
On 12/05/17 15:09, Neha Khatri wrote: > On Fri, May 12, 2017 at 8:19 PM, Simon Riggs wrote: >> >> On 11 May 2017 at 18:29, Simon Riggs wrote: >>> On 11 May 2017 at 18:13, Andres Freund wrote: >>> > New patch, v3. > > Applying in 90 minutes, barring objections. Could you please wait till tomorrow? I've bigger pending fixes for related code pending/being tested that I plan to push today. I'd also like to take a look before... >>> >>> Sure. >> >> The changes I've added are very minor, so I'm not expecting debate. >> The main part of the patch is the same as Petr posted 19days ago. >> >> I'm travelling now, so after waiting till tomorrow as you requested I >> have committed the patch. >> > > Prior to this commit CREATE SUBSCRIPTION used to work smoothly. > > After this commit 024711bb544645c8b1061e9f02b261e2e336981d I get > following error while executing CREATE SUBSCRIPTION: > > CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres host=localhost > user=neha port=5432' PUBLICATION mypub; > NOTICE: synchronized table states > ERROR: could not create replication slot "sub1": ERROR: could not > load library "/home/neha/postgres/PGCurrentInstall/lib/pgoutput.so": > /home/neha/postgres/PGCurrentInstall/lib/pgoutput.so: undefined > symbol: OutputPluginUpdateProgress > Hmm, that sounds like partial rebuild/install (old postgres binary with new pgoutput one). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] If subscription to foreign table valid ?
On 11/05/17 15:43, Petr Jelinek wrote: > Hi, > > On 11/05/17 14:25, tushar wrote: >> Hi, >> >> I observed that -we cannot publish "foreign table" in Publication >> >> postgres=# create foreign table t (n int) server db1_server options >> (table_name 't1'); >> CREATE FOREIGN TABLE >> >> postgres=# create publication pub for table t; >> ERROR: "t" is not a table >> DETAIL: Only tables can be added to publications. >> postgres=# >> >> but same thing is not true for Subscription >> >> postgres=# create foreign table t (n int) server db1_server options >> (table_name 't'); >> CREATE FOREIGN TABLE >> postgres=# alter subscription sub refresh publication ; >> NOTICE: added subscription for table public.t >> ALTER SUBSCRIPTION >> >> Is this an expected behavior ? if we cannot publish then how can we >> add subscription for it. >> > > Thanks for report. What you can publish and what you can subscribe is > not necessarily same (we can write to relations which we can't capture > from wal, for example unlogged table can't be published but can be > subscribed). > > However, the foreign tables indeed can't be subscribed. I originally > planned to have foreign tables allowed on subscriber but it turned out > to be more complex to implement than I had anticipated so I ripped the > code for that from the original patch. > > We do check for this, but only during replication which we have to do > because the fact that relation 't' was foreign table during ALTER > SUBSCRIPTION does not mean that it won't be something else half an hour later. > > I think it does make sense to add check for this into CREATE/ALTER > SUBSCRIBER though so that user is informed immediately about the mistake > rather than by errors in the logs later. > > I'll look into writing patch for this. I don't think it's beta blocker > though. > So I moved the relkind check to single function and call it from all the necessary places. See the attached. I now wonder if we should do some other checks as well (columns etc).
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Check-relkind-of-tables-in-CREATE-ALTER-SUBSCRIPTION.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] If subscription to foreign table valid ?
[Correction below] On 5/12/17, Neha Khatri wrote: > On 5/11/17, Petr Jelinek wrote > >> However, the foreign tables indeed can't be subscribed. Yes, I suspect that a user would _not_ want to subscribe a foreign table in the real world. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] If subscription to foreign table valid ?
On 5/11/17, Petr Jelinek wrote: > Hi, > > On 11/05/17 14:25, tushar wrote: >> Hi, >> >> I observed that -we cannot publish "foreign table" in Publication >> >> but same thing is not true for Subscription >> >> postgres=# create foreign table t (n int) server db1_server options >> (table_name 't'); >> CREATE FOREIGN TABLE >> postgres=# alter subscription sub refresh publication ; >> NOTICE: added subscription for table public.t >> ALTER SUBSCRIPTION >> >> Is this an expected behavior ? if we cannot publish then how can we >> add subscription for it. The above foreign table subscription succeeds only if the publisher has published a same-named normal table, i.e. create table t (n int); CREATE PUBLICATION mypub FOR TABLE t; I think in the current implementation of LogicalRep. it is the user's responsibility to match the table definition at publisher and subscriber. The subscriber cannot determine by itself what the publisher has defined. This use case does not align with this assumption. > However, the foreign tables indeed can't be subscribed. I suspect that a user would want to subscribe a foreign table in the real world. > I think it does make sense to add check for this into CREATE/ALTER > SUBSCRIBER though so that user is informed immediately about the mistake > rather than by errors in the logs later. Yes, the system should prohibit such an operation though. I tried to write a patch to achieve this. It disallows subscription of relations other than RELKIND_RELATION. Attached is the patch. Comments?
Regards, Neha

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index b76cdc5..f6013da 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -440,9 +440,20 @@ CreateSubscription(CreateSubscriptionStmt *stmt, bool isTopLevel)
 	{
 		RangeVar   *rv = (RangeVar *) lfirst(lc);
 		Oid			relid;
+		Relation	relation;
 
 		relid = RangeVarGetRelid(rv, AccessShareLock, false);
+		relation = RelationIdGetRelation(relid);
+
+		if (RelationGetForm(relation)->relkind != RELKIND_RELATION)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("\"%s\" is not a normal table",
+							RelationGetRelationName(relation)),
+					 errdetail("Only normal tables can be subscribed.")));
+		RelationClose(relation);
+
 		SetSubscriptionRelState(subid, relid, table_state,
 								InvalidXLogRecPtr);
 	}
@@ -553,8 +564,21 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data)
 	{
 		RangeVar   *rv = (RangeVar *) lfirst(lc);
 		Oid			relid;
+		Relation	relation;
 
 		relid = RangeVarGetRelid(rv, AccessShareLock, false);
+
+		relation = RelationIdGetRelation(relid);
+
+		if (RelationGetForm(relation)->relkind != RELKIND_RELATION)
+			ereport(NOTICE,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("\"%s\" is not a normal table",
+							RelationGetRelationName(relation)),
+					 errdetail("Only normal tables can be subscribed.")));
+
+		RelationClose(relation);
+
 		pubrel_local_oids[off++] = relid;
 
 		if (!bsearch(&relid, subrel_local_oids,

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur
Michael Paquier writes: > On Fri, May 12, 2017 at 1:28 PM, Tsunakawa, Takayuki > wrote: >> Likewise, when the first host has already reached max_connections, libpq >> doesn't attempt the connection aginst later hosts. > It seems to me that the feature is behaving as wanted. Or in short > attempt to connect to the next host only if a connection cannot be > established. If there is a failure once the exchange with the server > has begun, just consider it as a hard failure. This is an important > property for authentication and SSL connection failures actually. I would not really expect that reconnection would retry after arbitrary failure cases. Should it retry for "wrong database name", for instance? It's not hard to imagine that leading to very confusing behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Addition of pg_dump --no-publications
On 5/11/17 21:59, Michael Paquier wrote: > On Thu, May 11, 2017 at 3:19 PM, Michael Paquier > wrote: >> I imagine that pg_dump -s would be the basic operation that users >> would do first before creating a subscription on a secondary node, but >> what I find surprising is that publications are dumped by default. I >> don't find confusing that those are actually included by default to be >> consistent with the way subscriptions are handled, what I find >> confusing is that there are no options to not dump them, and no >> options to bypass their restore. >> >> So, any opinions about having pg_dump/pg_restore --no-publications? > > And that's really a boring patch, giving the attached. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time based lag tracking for logical replication
On Fri, May 12, 2017 at 8:19 PM, Simon Riggs wrote: > > On 11 May 2017 at 18:29, Simon Riggs wrote: > > On 11 May 2017 at 18:13, Andres Freund wrote: > > > >>>New patch, v3. > >>> > >>>Applying in 90 minutes, barring objections. > >> > >> Could you please wait till tomorrow? I've bigger pending fixes for > >> related code pending/being tested that I plan to push today. I'd also > >> like to take a look before... > > > > Sure. > > The changes I've added are very minor, so I'm not expecting debate. > The main part of the patch is the same as Petr posted 19days ago. > > I'm travelling now, so after waiting till tomorrow as you requested I > have committed the patch. > Prior to this commit CREATE SUBSCRIPTION used to work smoothly. After this commit 024711bb544645c8b1061e9f02b261e2e336981d I get following error while executing CREATE SUBSCRIPTION: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres host=localhost user=neha port=5432' PUBLICATION mypub; NOTICE: synchronized table states ERROR: could not create replication slot "sub1": ERROR: could not load library "/home/neha/postgres/PGCurrentInstall/lib/pgoutput.so": /home/neha/postgres/PGCurrentInstall/lib/pgoutput.so: undefined symbol: OutputPluginUpdateProgress Regards Neha -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication syntax (was DROP SUBSCRIPTION, query cancellations and slot handling)
On 5/9/17 11:43, Petr Jelinek wrote: > Here is rebased version of the other patch (the interface rework). I > also fixed the tab completion there. Committed. One small thing I changed, to make it look more like CREATE/ALTER TABLE, is that the CREATE commands use WITH (...) but the ALTER commands use SET (...). -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PROVE_FLAGS
On Fri, May 12, 2017 at 9:05 PM, Andrew Dunstan wrote: > Does anyone object to me backpatching this? It seems to me kinda crazy > to have --verbose hardcoded on the back branches and not on the dev branch. +1. A maximum of consistency with the test code when possible is nice to have. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] hash partitioning
On Thu, May 11, 2017 at 9:32 PM, Dilip Kumar wrote: > On Wed, May 3, 2017 at 6:39 PM, amul sul wrote: >> On Thu, Apr 27, 2017 at 1:42 AM, Robert Haas wrote: >> >>>I spent some time today looking at these patches. It seems like there >>>is some more work still needed here to produce something committable >>>regardless of which way we go, but I am inclined to think that Amul's >>>patch is a better basis for work going forward than Nagata-san's >>>patch. Here are some general comments on the two patches: >> >> Thanks for your time. >> >> [...] >> >>> - Neither patch contains any documentation updates, which is bad. >> >> Fixed in the attached version. > > I have done an initial review of the patch and I have some comments. I > will continue the review > and testing and report the results soon. > > - > Patch needs to be rebased > > > > if (key->strategy == PARTITION_STRATEGY_RANGE) > { > /* Disallow nulls in the range partition key of the tuple */ > for (i = 0; i < key->partnatts; i++) > if (isnull[i]) > ereport(ERROR, > (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), > errmsg("range partition key of row contains null"))); > } > > We need to add PARTITION_STRATEGY_HASH as well, we don't support NULL > for hash also, right? > We do. > > RangeDatumContent **content;/* what's contained in each range bound datum? > * (see the above enum); NULL for list > * partitioned tables */ > > This will be NULL for hash as well, we need to change the comments. > - Fixed in previously posted patch(v3). > > bool has_null; /* Is there a null-accepting partition? false > * for range partitioned tables */ > int null_index; /* Index of the null-accepting partition; -1 > > Comments need to be changed for these two members as well > Fixed in previously posted patch(v3).
> > +/* One bound of a hash partition */ > +typedef struct PartitionHashBound > +{ > + int modulus; > + int remainder; > + int index; > +} PartitionHashBound; > > It will be good to add some comments to explain the structure members > I think we don't really need that, the variable names are enough to explain their purpose. Regards, Amul -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Getting error at the time of dropping subscription and few more issues
Hi,

There are a few more issues found in logical replication:

(1) ERROR: tuple concurrently updated

Publication Server - (X machine) \\create table \ create publication \ insert rows

create table t(n int);
create publication pub for table t;
insert into t values (generate_series(1,100));

Subscription Server - (Y machine) \\create table t \ create subscription

create table t(n int);
create subscription sub connection 'dbname=postgres port=5000 user=centos password=a' publication pub;

\\drop subscription and re-create (repeat this 2-3 times)

postgres=# drop subscription sub;
NOTICE: dropped replication slot "sub" on publisher
DROP SUBSCRIPTION
postgres=# create subscription sub connection 'dbname=postgres port=5000 user=centos password=a' publication pub;
NOTICE: synchronized table states
NOTICE: created replication slot "sub" on publisher
CREATE SUBSCRIPTION
postgres=# select count(*) from t;
 count
-------
   100
(1 row)

postgres=# drop subscription sub;
ERROR: tuple concurrently updated

(2) Not able to drop the subscription even when 'nocreate slot' is specified

postgres=# create subscription s2s1 connection 'dbname=postgres port=5000 user=t password=a' publication t with (nocreate slot,enabled,copydata,SYNCHRONOUS_COMMIT='on');
NOTICE: synchronized table states
CREATE SUBSCRIPTION

-- not able to drop the subscription; I have checked on the publication server - no such slot was created, but it is still looking for the slot.

postgres=# drop subscription s2s1;
ERROR: could not drop the replication slot "s2s1" on publisher
DETAIL: The error was: ERROR: replication slot "s2s1" does not exist

(3) ALTER PUBLICATION SET command doesn't give you a NOTICE message about tables which got removed.
postgres=# create publication pub for table t,t1,t2;
CREATE PUBLICATION
postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+-----------
 pub     | public     | t
 pub     | public     | t1
 pub     | public     | t2
(3 rows)

postgres=# alter publication pub set table t;
ALTER PUBLICATION
postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+-----------
 pub     | public     | t
(1 row)

On the subscription side we do get a NOTICE message about tables which got removed:

postgres=# alter subscription sub set publication pub refresh;
NOTICE:  removed subscription for table public.t1
NOTICE:  removed subscription for table public.t2
ALTER SUBSCRIPTION

I think the publication side should provide NOTICE messages too.

--
regards, tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
https://sites.google.com/a/enterprisedb.com/old-new-touplestores/
Re: [HACKERS] PROVE_FLAGS
On 05/04/2017 12:50 AM, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Andrew Dunstan writes:
>>> Can someone please explain to me why we have this in Makefile.global.in?
>>> (from commit e9c81b60)
>>> PROVE_FLAGS =
>> Before that commit it was like
>>
>> PROVE_FLAGS = --verbose
> right.
>
>> which had some value.  I agree that now we'd be better off to not
>> set it at all, especially since the convention now appears to be that
>> automatically-supplied prove options should be set in PG_PROVE_FLAGS.
> Good point.
>
>> I'd suggest that the comment just above be replaced by something like
>>
>> # User-supplied prove flags can be provided in PROVE_FLAGS.
> Works for me.

Does anyone object to me backpatching this? It seems to me kinda crazy to have --verbose hardcoded on the back branches and not on the dev branch.

cheers

andrew

--
Andrew Dunstan            https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Adding support for Default partition in partitioning
Hello,

> (1) With the new patch, we allow new partitions when there is overlapping
> data with default partition. The entries in default are ignored when
> running queries satisfying the new partition.

This was introduced in the latest version. We do not allow adding a partition when entries with the same key value exist in the default partition, so this scenario should not arise. Please find attached an updated patch which corrects this.

> (2) I get the following warning:
> partition.c: In function ‘check_new_partition_bound’:
> partition.c:882:15: warning: ‘boundinfo’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>     && boundinfo->has_default)
>        ^
> preproc.y:3250.2-8: warning: type clash on default action: != <>

I failed to notice this warning. I will look into it.

> This is incredibly ugly. I don't know exactly what should be done
> about it, but I think PARTITION_DEFAULT is a bad idea and has got to
> go. Maybe add a separate isDefault flag to PartitionBoundSpec

Will look at other ways to do it.

> Doesn't it strike you as a bit strange that get_qual_for_default()
> doesn't return a qual? Functions should generally have names that
> describe what they do.

Will fix this.

> There's an existing function that you can use to concatenate two lists
> instead of open-coding it.

Will check this.

> you should really add the documentation and
> regression tests which you mentioned as a TODO. And run the code
> through pgindent

I will update the next version with documentation and regression tests, and run pgindent.

Thank you,
Rahila Syed

On Fri, May 12, 2017 at 4:33 PM, Beena Emerson wrote:
>
> On Thu, May 11, 2017 at 7:37 PM, Rahila Syed wrote:
>
>> Hello,
>>
>> Please find attached an updated patch with review comments and bugs
>> reported till date implemented.
>
> Hello Rahila,
>
> Tested on "efa2c18 Doc fix: scale(numeric) returns integer, not numeric."
> > (1) With the new patch, we allow new partitions when there is overlapping > data with default partition. The entries in default are ignored when > running queries satisfying the new partition. > > DROP TABLE list1; > CREATE TABLE list1 ( > a int, > b int > ) PARTITION BY LIST (a); > CREATE TABLE list1_1 (LIKE list1); > ALTER TABLE list1 ATTACH PARTITION list1_1 FOR VALUES IN (1); > CREATE TABLE list1_def PARTITION OF list1 DEFAULT; > INSERT INTO list1 SELECT generate_series(1,2),1; > -- Partition overlapping with DEF > CREATE TABLE list1_2 PARTITION OF list1 FOR VALUES IN (2); > INSERT INTO list1 SELECT generate_series(2,3),2; > > postgres=# SELECT * FROM list1 ORDER BY a,b; > a | b > ---+--- > 1 | 1 > 2 | 1 > 2 | 2 > 3 | 2 > (4 rows) > > postgres=# SELECT * FROM list1 WHERE a=2; > a | b > ---+--- > 2 | 2 > (1 row) > > This ignores the a=2 entries in the DEFAULT. > > postgres=# SELECT * FROM list1_def; > a | b > ---+--- > 2 | 1 > 3 | 2 > (2 rows) > > > (2) I get the following warning: > > partition.c: In function ‘check_new_partition_bound’: > partition.c:882:15: warning: ‘boundinfo’ may be used uninitialized in this > function [-Wmaybe-uninitialized] >&& boundinfo->has_default) >^ > preproc.y:3250.2-8: warning: type clash on default action: != <> > > >> >1. >> >In following block, we can just do with def_index, and we do not need >> found_def >> >flag. We can check if def_index is -1 or not to decide if default >> partition is >> >present. >> found_def is used to set boundinfo->has_default which is used at couple >> of other places to check if default partition exists. The implementation >> is similar >> to has_null. >> >> >3. >> >In following function isDefaultPartitionBound, first statement "return >> false" >> >is not needed. >> It is needed to return false if the node is not DefElem. 
>> Todo:
>> Add regression tests
>> Documentation
>>
>> Thank you,
>> Rahila Syed

Attachment: default_partition_v11.patch
Re: [HACKERS] WITH clause in CREATE STATISTICS
On 5/12/17 4:46 AM, Tom Lane wrote: Alvaro Herrera writes: Hmm, yeah, makes sense. Here's a patch for this approach. ... Also, while reading the docs changes, I got rather ill from the inconsistent and not very grammatical handling of "statistics" as a noun, particularly the inability to decide from one sentence to the next if that is singular or plural. In the attached, I fixed this in the ref/*_statistics.sgml files by always saying "statistics object" instead. If people agree that that reads better, we should make an effort to propagate that terminology elsewhere in the docs, and into error messages, objectaddress.c output, etc. I'm fine with the 'statistics object' wording. I've been struggling with this bit while working on the patch, and I agree it sounds better and getting it consistent is definitely worthwhile. > Although I've not done anything about it here, I'm not happy about the handling of dependencies for stats objects. I do not think that cloning RemoveStatistics as RemoveStatisticsExt is sane at all. The former is meant to deal with cleaning up pg_statistic rows that we know will be there, so there's no real need to expend pg_depend overhead to track them. For objects that are only loosely connected, the right thing is to use the dependency system; in particular, this design has no real chance of working well with cross-table stats. Also, it's really silly to have *both* this hard-wired mechanism and a pg_depend entry; the latter is surely redundant if you have the former. IMO we should revert RemoveStatisticsExt and instead handle things by making stats objects auto-dependent on the individual column(s) they reference (not the whole table). I'm also of the opinion that having an AUTO dependency, rather than a NORMAL dependency, on the stats object's schema is the wrong semantics. There isn't any other case where you can drop a non-empty schema without saying CASCADE, and I'm mystified why this case should act that way. 
Yeah, it's a bit frankensteinian ... > Lastly, I tried the example given in the CREATE STATISTICS reference page, and it doesn't seem to work. Without the stats object, the two queries are both estimated at one matching row, whereas they really produce 100 and 0 rows respectively. With the stats object, they seem to both get estimated at ~100 rows, which is a considerable improvement for one case but a considerable disimprovement for the other. If this is not broken, I'd like to know why not. What's the point of an expensive extended- stats mechanism if it can't get this right? I assume you're talking about the functional dependencies and in that case that's expected behavior, because f. dependencies do assume the conditions are "consistent" with the functional dependencies. This is an inherent limitation of functional dependencies, and perhaps a price for the simplicity of this statistics type. If your application executes queries that are likely not consistent with this, don't use functional dependencies. The simplicity is why dependencies were implemented first, mostly to introduce all the infrastructure. The other statistics types (MCV, histograms) don't have this limitation, but those did not make it into pg10. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PoC: full merge join on comparison clause
Hi hackers, As you know, at this time Postgres cannot perform a full join on a comparison clause. For example, if we have two tables with numeric columns and run a query like 'select * from t1 full join t2 on t1.a > t2.a', we get an error: "FULL JOIN is only supported with merge-joinable or hash-joinable join conditions". Such queries are legitimate SQL and sometimes arise when porting applications from different DBMS, so it would be good to support them in Postgres. They can be rewritten as union of right and left joins, but it requires manual work and runs somewhat slower (see the benchmark at the end of the letter). This proof-of-concept patch explores the possibility of performing such queries as merge joins. Consider the following example where outer and inner relations are in ascending order, and we are asked to return outer tuples that are greater than inner. outer > inner outer tuple - 6 4 - marked tuple 7 5 8 6 - inner tuple 8 7 The main difference from normal merge join is that we do not need to advance the marked tuple. This behavior can be implemented with some simple changes to the function that compares inner and outer tuples. However, for the join clause 'outer < inner' we would have to advance the marked tuple, which would require adding a new state to the merge join executor node. We do not do this. Instead, at the path creation stage, we make sure that the particular combination of sorting order and join clause allows us to perform merge join the simple way. The optimizer requires some other changes to support these joins. Currently, it uses the same clauses to generate equivalence classes and to perform merge joins. This patch has to separate these two uses. Clauses that correspond to a btree equality operator are used to construct equivalence classes; the operator families for these clauses are recorded in the 'equivopfamilies' field of RestrictInfo struct. 
Clauses that correspond to btree equality or comparison are used to perform merge joins, and have their operator families recorded in the 'mergeopfamilies'. The optimizer also has to check whether the particular join clause list can be used for merge join, and ensure that it is compatible with inner/outer path ordering. These checks are performed by 'can_sort_for_mergejoin()' and 'outer_sort_suitable_for_mergejoin()'. There is an important unsolved problem in this patch. When generating index paths for base relations, the optimizer tries to use only one scan direction to limit the number of paths. This direction might not be suitable for a given join clause, and the input path will have to be sorted. We could generate paths for both directions, but this was specifically removed for optimization (SHA 834ddc62 by Tom Lane, 10/27/2007 09:45 AM). For inner joins one would expect the merge join to be slower than the nested loop, because it has more complex code paths, and indeed this can be seen on simple benchmarks (see the end of the letter). Costs should be revised further to reflect this difference. I would be glad to hear your opinion on this approach. 
Some benchmarks: = Full join vs union of left and right joins test1=# explain analyze select * from t4 right join t1 on t4.a < t1.a union all select * from t4 left join t1 on t4.a < t1.a where t1.a is null; QUERY PLAN - Append (cost=809.69..70703.19 rows=334 width=8) (actual time=8.336..1195.534 rows=5007546 loops=1) -> Merge Left Join (cost=809.69..34230.49 rows=333 width=8) (actual time=8.335..920.442 rows=5007537 loops=1) Merge Cond: (t1.a > t4.a) -> Index Only Scan using idx_t1_a on t1 (cost=0.28..35.27 rows=1000 width=4) (actual time=0.027..0.395 rows=1001 loops=1) Heap Fetches: 97 -> Sort (cost=809.39..834.39 rows=1 width=4) (actual time=8.300..356.821 rows=5007538 loops=1) Sort Key: t4.a Sort Method: quicksort Memory: 931kB -> Seq Scan on t4 (cost=0.00..145.00 rows=1 width=4) (actual time=0.019..2.533 rows=1 loops=1) -> Nested Loop Anti Join (cost=0.28..3072.71 rows=6667 width=8) (actual time=4.685..35.421 rows=9 loops=1) -> Seq Scan on t4 t4_1 (cost=0.00..145.00 rows=1 width=4) (actual time=0.010..0.656 rows=1 loops=1) -> Index Only Scan using idx_t1_a on t1 t1_1 (cost=0.28..6.10 rows=333 width=4) (actual time=0.003..0.003 rows=1 loops=1) Index Cond: (a > t4_1.a) Heap Fetches: 971 Planning time: 1.414 ms Execution time: 1324.985 ms (16 rows) test1=# explain analyze select * from t
Re: [HACKERS] Adding support for Default partition in partitioning
On Thu, May 11, 2017 at 7:37 PM, Rahila Syed wrote: > Hello, > > Please find attached an updated patch with review comments and bugs > reported till date implemented. > Hello Rahila, Tested on "efa2c18 Doc fix: scale(numeric) returns integer, not numeric." (1) With the new patch, we allow new partitions when there is overlapping data with default partition. The entries in default are ignored when running queries satisfying the new partition. DROP TABLE list1; CREATE TABLE list1 ( a int, b int ) PARTITION BY LIST (a); CREATE TABLE list1_1 (LIKE list1); ALTER TABLE list1 ATTACH PARTITION list1_1 FOR VALUES IN (1); CREATE TABLE list1_def PARTITION OF list1 DEFAULT; INSERT INTO list1 SELECT generate_series(1,2),1; -- Partition overlapping with DEF CREATE TABLE list1_2 PARTITION OF list1 FOR VALUES IN (2); INSERT INTO list1 SELECT generate_series(2,3),2; postgres=# SELECT * FROM list1 ORDER BY a,b; a | b ---+--- 1 | 1 2 | 1 2 | 2 3 | 2 (4 rows) postgres=# SELECT * FROM list1 WHERE a=2; a | b ---+--- 2 | 2 (1 row) This ignores the a=2 entries in the DEFAULT. postgres=# SELECT * FROM list1_def; a | b ---+--- 2 | 1 3 | 2 (2 rows) (2) I get the following warning: partition.c: In function ‘check_new_partition_bound’: partition.c:882:15: warning: ‘boundinfo’ may be used uninitialized in this function [-Wmaybe-uninitialized] && boundinfo->has_default) ^ preproc.y:3250.2-8: warning: type clash on default action: != <> > >1. > >In following block, we can just do with def_index, and we do not need > found_def > >flag. We can check if def_index is -1 or not to decide if default > partition is > >present. > found_def is used to set boundinfo->has_default which is used at couple > of other places to check if default partition exists. The implementation > is similar > to has_null. > > >3. > >In following function isDefaultPartitionBound, first statement "return > false" > >is not needed. > It is needed to return false if the node is not DefElem. 
> Todo:
> Add regression tests
> Documentation
>
> Thank you,
> Rahila Syed
Re: [HACKERS] UPDATE of partition key
On 12 May 2017 at 14:56, Amit Kapila wrote: > I think it might be better to summarize all the options discussed > including what the patch has and see what most people consider as > sensible. Yes, makes sense. Here are the options that were discussed so far for ROW triggers : Option 1 : (the patch follows this option) -- BR Update trigger for source partition. BR,AR Delete trigger for source partition. BR,AR Insert trigger for destination partition. No AR Update trigger. Rationale : BR Update trigger should be fired because that trigger can even modify the rows, and that can even result in partition key update even though the UPDATE statement is not updating the partition key. Also, fire the delete/insert triggers on respective partitions since the rows are about to be deleted/inserted. AR update trigger should not be fired because that required an actual update to have happened. Option 2 -- BR Update trigger for source partition. AR Update trigger on destination partition. No insert/delete triggers. Rationale : Since it's an UPDATE statement, only update triggers should be fired. The update ends up moving the row into another partition, so AR Update trigger should be fired on this partition rather than the original partition. Option 3 BR, AR delete triggers on source partition BR, AR insert triggers on destination partition. Rationale : Since the update is converted to delete+insert, just skip the update triggers completely. Option 4 BR-AR update triggers for source partition BR-AR insert triggers for destination partition Rationale : Since it is an update statement, both BR and AR UPDATE trigger should be fired on original partition. Since update is converted to delete+insert, the corresponding triggers should be fired, but since we already are firing UPDATE trigger on original partition, skip delete triggers, otherwise both UPDATE and DELETE triggers would get fired on the same partition. 
For statement triggers, I think it should be based on the documentation recently checked in for partitions in general. +A statement that targets a parent table in a inheritance or partitioning +hierarchy does not cause the statement-level triggers of affected child +tables to be fired; only the parent table's statement-level triggers are +fired. However, row-level triggers of any affected child tables will be +fired. Based on that, for row movement as well, the trigger should be fired only for the table referred in the UPDATE statement, and not for any child tables, or for any partitions to which the rows were moved. The doc in this row-movement patch also matches with this behaviour. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] hash partitioning
On Wed, May 10, 2017 at 6:04 PM, Ashutosh Bapat wrote: > On Wed, May 3, 2017 at 6:39 PM, amul sul wrote: >> On Thu, Apr 27, 2017 at 1:42 AM, Robert Haas wrote: >> >>> >>> This is not yet a detailed review - I may be missing things, and >>> review and commentary from others is welcome. If there is no major >>> disagreement with the idea of moving forward using Amul's patch as a >>> base, then I will do a more detailed review of that patch (or, >>> hopefully, an updated version that addresses the above comments). >> > > I agree that Amul's approach makes dump/restore feasible whereas > Nagata-san's approach makes that difficult. That is a major plus point > about Amul's patch. Also, it makes it possible to implement > Nagata-san's syntax, which is more user-friendly in future. > > Here are some review comments after my initial reading of Amul's patch: > > Hash partitioning will partition the data based on the hash value of the > partition key. Does that require collation? Should we throw an error/warning > if > collation is specified in PARTITION BY clause? > > +int *indexes;/* Partition indexes; in case of hash > + * partitioned table array length will be > + * value of largest modulus, and for others > + * one entry per member of the datums array > + * (plus one if range partitioned table) */ > This may be rewritten as "Partition indexes: For hash partitioned table the > number of indexes will be same as the largest modulus. For list partitioned > table the number of indexes will be same as the number of datums. For range > partitioned table the number of indexes will be number of datums plus one.". > You may be able to reword it to a shorter version, but essentially we will > have > separate description for each strategy. > Okay, will fix this. > I guess, we need to change the comments for the other members too. For example > "datums" does not contain tuples with key->partnatts attributes for hash > partitions. 
It contains a tuple with two attributes, modulus and remainder. We > may not want to track null_index separately since rows with NULL partition key > will fit in the partition corresponding to the hash value of NULL. OR may be > we > want to set null_index to partition which contains NULL values, if there is a > partition created for corresponding remainder, modulus pair and set has_null > accordingly. Accordingly we will need to update the comments. > > cal_hash_value() may be renamed as calc_has_value() or compute_hash_value()? > Okay, will rename to compute_hash_value(). > Should we change the if .. else if .. construct in > RelationBuildPartitionDesc() > to a switch case? There's very less chance that we will support a fourth > partitioning strategy, so if .. else if .. may be fine. > > +intmod = hbounds[i]->modulus, > +place = hbounds[i]->remainder; > Although there are places in the code where we separate variable declaration > with same type by comma, most of the code declares each variable with the data > type on separate line. Should variable "place" be renamed as "remainder" since > that's what it is ultimately? > Okay, will rename "place" to "remainder". > RelationBuildPartitionDesc() fills up mapping array but never uses it. In this Agreed, mapping array is not that much useful but not useless, it required at the end of RelationBuildPartitionDesc() while assigning OIDs to result->oids, see for-loop just before releasing mapping memory. > code the index into mapping array itself is the mapping so it doesn't need to > be maintained separately like list partiioning case. Similary next_index usage > looks unnecessary, although that probably improves readability, so may be > fine. > Anyway, will remove uses of "next_index". 
> + * for p_p1: satisfies_hash_partition(2, 1, pkey, value) > + * for p_p2: satisfies_hash_partition(4, 2, pkey, value) > + * for p_p3: satisfies_hash_partition(8, 0, pkey, value) > + * for p_p4: satisfies_hash_partition(8, 4, pkey, value) > What the function builds is satisfies_hash_partition(2, 1, pkey). I don't see > code to add value as an argument to the function. Is that correct? > Sorry for confusion, "pkey" & "value" are the column of table in the give example. Renamed those column name to "a" & "b". > +intmodulus = DatumGetInt32(datum); > May be you want to rename this variable to greatest_modulus like in the other > places. > Okay, will fix this. > +Assert(spec->modulus > 0 && spec->remainder >= 0); > I liked this assertion. Do you want to add spec->modulus > spec->reminder also > here? > Okay, will add this too. > +char *strategy;/* partitioning strategy > + ('hash', 'list' or 'range') */ > > We need the second line to
Re: [HACKERS] Time based lag tracking for logical replication
On 11 May 2017 at 18:29, Simon Riggs wrote:
> On 11 May 2017 at 18:13, Andres Freund wrote:
>
>>> New patch, v3.
>>>
>>> Applying in 90 minutes, barring objections.
>>
>> Could you please wait till tomorrow? I've bigger pending fixes for related
>> code pending/being tested that I plan to push today. I'd also like to take
>> a look before...
>
> Sure. The changes I've added are very minor, so I'm not expecting debate. The main part of the patch is the same as Petr posted 19 days ago.

I'm travelling now, so after waiting till tomorrow as you requested I have committed the patch.

Cheers

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] If subscription to foreign table valid ?
On 05/11/2017 07:13 PM, Petr Jelinek wrote:
> I think it does make sense to add a check for this into CREATE/ALTER
> SUBSCRIPTION though, so that the user is informed immediately about the
> mistake rather than by errors in the logs later.

+1. There are a few similar cases where the user does not get an error at the prompt, for instance:

-- when the publication does not exist
postgres=# create subscription sub connection 'dbname=postgres port=5000 user=centos password=a' publication nowhere;
NOTICE:  synchronized table states
NOTICE:  created replication slot "sub" on publisher
CREATE SUBSCRIPTION

-- no validation of the publication name in ALTER
postgres=# alter subscription sub set publication _ refresh;
ALTER SUBSCRIPTION

-- slot name given in ALTER
postgres=# alter subscription sub with (slot name='nowhere');
ALTER SUBSCRIPTION

-- and because of that, we are not able to drop it later
postgres=# drop subscription sub;
ERROR:  could not drop the replication slot "nowhere" on publisher
DETAIL:  The error was:  ERROR:  replication slot "nowhere" does not exist
postgres=#

--
regards, tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
Re: [HACKERS] UPDATE of partition key
On Fri, May 12, 2017 at 10:49 AM, Amit Khandekar wrote: > On 12 May 2017 at 08:30, Amit Kapila wrote: >> On Thu, May 11, 2017 at 5:41 PM, Amit Khandekar >> wrote: > >> If we try to compare it with the non-partitioned update, >> there also it is internally a delete and insert operation, but we >> don't fire triggers for those. > > For a non-partitioned table, the delete+insert is internal, whereas > for partitioned table, it is completely visible to the user. > If the user has executed an update on root table, then it is transparent. I think we can consider it user visible only in case if there is some user visible syntax like "Update ... Move Row If Constraint Not Satisfied" >> (b) It seems inconsistent to consider behavior for row and statement triggers differently >>> >>> I am not sure whether we should compare row and statement triggers. >>> Statement triggers are anyway fired only per-statement, depending upon >>> whether it is update or insert or delete. It has nothing to do with >>> how the rows are modified. >>> >> >> Okay. The broader point I was trying to convey was that the way this >> patch defines the behavior of triggers doesn't sound good to me. It >> appears to me that in this thread multiple people have raised points >> around trigger behavior and you should try to consider those. > > I understand that there is no single solution which will provide > completely intuitive trigger behaviour. Skipping BR delete trigger > should be fine. But then for consistency, we should skip BR insert > trigger as well, the theory being that the delete+insert are not fired > by the user so we should not fire them. But I feel both should be > fired to avoid any consequences unexpected to the user who has > installed those triggers. > > The only specific concern of yours is that of firing *both* update as > well as insert triggers on the same table, right ? My explanation for > this was : we have done this before for UPSERT, and we had documented > the same. 
We can do it here also. > >> Apart from the options, Robert has suggested, another option could be that >> we allow firing BR-AR update triggers for original partition and BR-AR >> insert triggers for the new partition. In this case, one can argue >> that we have not actually updated the row in the original partition, >> so there is no need to fire AR update triggers, > > Yes that's what I think. If there is no update happened, then AR > update trigger should not be executed. AR triggers are only for > scenarios where it is guaranteed that the DML operation has happened > when the trigger is being executed. > >> but I feel that is what we do for non-partitioned table update and it should >> be okay here >> as well. > > I don't think so. For e.g. if a BR trigger returns NULL, the update > does not happen, and then the AR trigger does not fire as well. Do you > see any other scenarios for non-partitioned tables, where AR triggers > do fire when the update does not happen ? > No, but here also it can be considered as an update for original partition. > > Overall, I am also open to skipping both insert+delete BR trigger, > I think it might be better to summarize all the options discussed including what the patch has and see what most people consider as sensible. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] snapbuild woes
On 12/05/17 03:31, Andres Freund wrote: > On 2017-05-11 14:54:26 -0700, Andres Freund wrote: >> On 2017-05-11 14:51:55 -0700, wrote: >>> On 2017-05-08 00:10:12 -0700, Andres Freund wrote: I plan to commit the next pending patch after the back branch releases are cut - it's an invasive fix and the issue doesn't cause corruption "just" slow slot creation. So it seems better to wait for a few days, rather than hurry it into the release. >>> >>> Now that that's done, here's an updated version of that patch. Note the >>> new logic to trigger xl_running_xact's to be logged at the right spot. >>> Works well in my testing. >>> >>> I plan to commit this fairly soon, unless somebody wants a bit more time >>> to look into it. >>> >>> Unless somebody protests, I'd like to slightly revise how the on-disk >>> snapshots are stored on master - given the issues this bug/commit showed >>> with the current method - but I can see one could argue that that should >>> rather be done next release. >> >> As usual I forgot my attachement. > > This patch also seems to offer a way to do your 0005 in, possibly, more > efficient manner. We don't ever need to assume a transaction is > timetravelling anymore. Could you check whether the attached, hasty, > patch resolves the performance problems you measured? Also, do you have > a "reference" workload for that? > Hmm, well it helps but actually now that we don't track individual running transactions anymore it got much less effective (my version of 0005 does as well). 
The example workload I test with is: session 1: open transaction, do a write, keep it open session 2: pgbench -M simple -N -c 10 -P 1 -T 5 session 3: run CREATE_REPLICATION_SLOT LOGICAL in walsender session 2: pgbench -M simple -N -c 10 -P 1 -T 20 session 1: commit And wait for session 3 to finish slot creation, takes about 20 mins on my laptop without patches, minute and half with your patches for 0004 and 0005 (or with your 0004 and my 0005) and about 2s with my original 0004 and 0005. What makes it slow is the constant qsorting IIRC. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [doc fix] PG10: wrong description on connect_timeout when multiple hosts are specified
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tsunakawa, Takayuki
> I found a wrong sentence here in the doc. I'm sorry, this is what I asked
> you to add...
>
> https://www.postgresql.org/docs/devel/static/libpq-connect.html#libpq-paramkeywords
>
> connect_timeout
> Maximum wait for connection, in seconds (write as a decimal integer
> string). Zero or not specified means wait indefinitely. It is not
> recommended to use a timeout of less than 2 seconds. This timeout applies
> separately to each connection attempt. For example, if you specify two
> hosts and both of them are unreachable, and connect_timeout is 5, the
> total time spent waiting for a connection might be up to 10 seconds.
>
> The program behavior is that libpq times out after connect_timeout
> seconds regardless of how many hosts are specified. I confirmed it like
> this:
>
> $ export PGOPTIONS="-c post_auth_delay=30"
> $ psql -d "dbname=postgres connect_timeout=5" -h localhost,localhost -p 5432,5433
> (psql errors out after 5 seconds)
>
> Could you fix the doc with something like this?
>
> "This timeout applies across all the connection attempts. For example, if
> you specify two hosts and both of them are unreachable, and
> connect_timeout is 5, the total time spent waiting for a connection is up
> to 5 seconds."
>
> Should we also change the minimum "2 seconds" part to be longer,
> according to the number of hosts?

Instead, I think we should fix the program to match the documented
behavior. Otherwise, if the first database machine is down, libpq might
wait for about 2 hours (depending on the OS's TCP keepalive setting),
during which it times out after connect_timeout and does not attempt to
connect to the other hosts. I'll add this item to the PostgreSQL 10 Open
Items.
Regards
Takayuki Tsunakawa
Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur
From: Michael Paquier [mailto:michael.paqu...@gmail.com]
> It seems to me that the feature is behaving as wanted. Or, in short,
> attempt to connect to the next host only if a connection cannot be
> established. If there is a failure once the exchange with the server has
> begun, just consider it a hard failure. This is an important property for
> authentication and SSL connection failures, actually.

But PgJDBC behaves as expected -- it attempts another connection to the
other hosts (and succeeds). I believe that's what users would naturally
expect. The current libpq implementation handles only the socket-level
connect failure.

Regards
Takayuki Tsunakawa
Re: [HACKERS] CTE inlining
> Just to see what other RDBMSs are doing with CTEs, look at slide 31 here:
> https://www.percona.com/live/17/sites/default/files/slides/Recursive%20Query%20Throwdown.pdf

That is taken from Markus Winand's post:
https://twitter.com/MarkusWinand/status/852862475699707904

"Seems like MySQL is getting the best WITH support of all tested DBs
(RECURSIVE not tested!)"
Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table
On Fri, May 12, 2017 at 11:24 AM, Masahiko Sawada wrote:
> On Thu, May 11, 2017 at 6:16 PM, Petr Jelinek wrote:
>> On 11/05/17 10:10, Masahiko Sawada wrote:
>>> On Thu, May 11, 2017 at 4:06 PM, Michael Paquier wrote:
>>>> On Wed, May 10, 2017 at 11:57 AM, Masahiko Sawada wrote:
>>>>> Barring any objections, I'll add these two issues to open item.
>>>>
>>>> It seems to me that those open items have not been added yet to the
>>>> list. If I am following correctly, they could be defined as follows:
>>>> - Dropping subscription may get stuck if done during tablesync.
>>>> -- Analyze deadlock issues with DROP SUBSCRIPTION and apply worker
>>>> process.
>>
>> I think the solution to this is to reintroduce the LWLock that was
>> removed and replaced with the exclusive lock on catalog [1]. I am afraid
>> that the correct way of handling this is to take both the LWLock and the
>> catalog lock (first the LWLock, under which we kill the workers, and
>> then the catalog lock, so that something that prevents the launcher from
>> restarting them is held till the end of the transaction).
>
> I agree to reintroduce the LWLock and to stop the logical rep worker
> first and then modify the catalog. That way we can reduce the catalog
> lock level (maybe to RowExclusiveLock) so that the apply worker can see
> it. Also, I think we need to do more in order to avoid keeping the
> LWLock until end of transaction, because holding an LWLock until end of
> transaction is not a good idea and could be a cause of deadlock. So, for
> example, we can commit the transaction in DropSubscription after having
> cleaned the pg_subscription record and all its dependencies, and then
> start a new transaction for the remaining work. Of course, we also need
> to disallow DROP SUBSCRIPTION being executed in a user transaction
> though.

Attached are two draft patches to solve these issues.

The attached 0001 patch reintroduces LogicalRepLauncherLock and makes
DROP SUBSCRIPTION keep holding it until commit.
To prevent the deadlock possibility, I disallowed DROP SUBSCRIPTION being
called in a transaction block, but there might be a more sensible solution
for this. Please give me feedback.

>>>> -- Avoid orphaned tablesync worker if apply worker exits before
>>>> changing its status.
>>
>> The behavior question I have about this is whether sync workers should
>> die when the apply worker dies (i.e., they are tied to the apply worker)
>> or whether they should be tied to the subscription.
>>
>> I guess taking down all the sync workers when the apply worker has
>> exited is easier to solve. Of course, it means that if the apply worker
>> restarts in the middle of table synchronization, the table
>> synchronization will have to start from scratch. That being said, in
>> normal operation the apply worker should only exit/restart if the
>> subscription has changed or has been dropped/disabled, and I think sync
>> workers want to exit/restart in that situation as well.
>
> I agree that sync workers are tied to the apply worker.
>
>> So, for example, having a shmem detach hook for the apply worker (or
>> reusing the existing one) that searches for all the other workers for
>> the same subscription and shuts them down as well sounds like a solution
>> to this.
>
> Seems like a reasonable solution.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

0001-Fix-a-deadlock-bug-between-DROP-SUBSCRIPTION-and-app.patch
Description: Binary data

0002-Wait-for-table-sync-worker-to-finish-when-apply-work.patch
Description: Binary data
Re: [HACKERS] [POC] hash partitioning
On 2017/05/12 14:24, Ashutosh Bapat wrote:
> On Fri, May 12, 2017 at 8:08 AM, Amit Langote wrote:
>> On 2017/05/12 11:20, Robert Haas wrote:
>>> Yeah, but I have a feeling that marking the columns NOT NULL is going
>>> to make it really hard to support that in the future when we get the
>>> syntax hammered out. If it had only affected the partition
>>> constraints, that'd be different.
>>
>> So, adding keycol IS NOT NULL (like we currently do for expressions) in
>> the implicit partition constraint would be more future-proof than
>> generating an actual catalogued NOT NULL constraint on the keycol? I now
>> tend to think it would be better. Directly inserting into a range
>> partition with a NULL value for a column currently generates a "null
>> value in column \"%s\" violates not-null constraint" instead of the
>> perhaps more relevant "new row for relation \"%s\" violates partition
>> constraint". That said, we *do* document the fact that a NOT NULL
>> constraint is added on range key columns, but we might as well document
>> instead that we don't currently support routing tuples with NULL values
>> in the partition key through a range-partitioned table, and so NULL
>> values cause an error.
>
> In get_partition_for_tuple() we have
>
>     if (key->strategy == PARTITION_STRATEGY_RANGE)
>     {
>         /* Disallow nulls in the range partition key of the tuple */
>         for (i = 0; i < key->partnatts; i++)
>             if (isnull[i])
>                 ereport(ERROR,
>                         (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
>                          errmsg("range partition key of row contains null")));
>     }
>
> Instead of throwing an error here, we should probably return -1 and
> let the error be "no partition of relation \"%s\" found for row",
> which is the real error: not having a partition which can accept NULL.
> If in future we decide to support NULL values in partition keys, we
> just need to remove the above code from get_partition_for_tuple() and
> everything will work as is.
> I am assuming that we don't add any implicit/explicit NOT NULL
> constraint right now.

We *do* actually, for real columns:

create table p (a int) partition by range (a);
\d p
                 Table "public.p"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           | not null |
Partition key: RANGE (a)

For expression keys, we emit IS NOT NULL as part of the implicit partition
constraint. The above check for NULL is really for the expressions,
because if any simple columns of the key contain NULL, they will fail the
NOT NULL constraint itself (with that error message).

As I said in my previous message, I'm thinking that emitting IS NOT NULL
as part of the implicit partition constraint might be better than adding
it as a NOT NULL constraint, that is, for the simple column keys; we
already do that for the expression keys, for which we cannot add the NOT
NULL constraint anyway. The way things are currently, the error messages
generated when a row with NULL in the range partition key is inserted
*directly* into the partition look a bit inconsistent, depending on
whether the target key is a simple column or an expression:

create table p (a int, b int) partition by range (a, abs(b));
create table p1 partition of p for values from (1, 1) to (1, 10);

insert into p1 values (NULL, NULL);
ERROR:  null value in column "a" violates not-null constraint
DETAIL:  Failing row contains (null, null).

insert into p1 values (1, NULL);
ERROR:  new row for relation "p1" violates partition constraint
DETAIL:  Failing row contains (1, null).

It would be nice if both said "violates partition constraint".

BTW, note that this is independent of your suggestion to emit the
"partition not found" message instead of the "no NULLs allowed in the
range partition key" message, which seems fine to me to implement.

Thanks,
Amit