Re: [HACKERS] On partitioning
Hi,

From: Alvaro Herrera [mailto:alvhe...@2ndquadrant.com]

Amit Langote wrote:

From: Robert Haas [mailto:robertmh...@gmail.com]

What is an overflow partition and why do we want that?

That would be a default partition. That is, where the tuples that don't belong elsewhere (other defined partitions) go. The VALUES clause of the definition for such a partition would look like:

(a range partition) ... VALUES LESS THAN MAXVALUE
(a list partition) ... VALUES DEFAULT

There has been discussion about whether there shouldn't be such a place for tuples to go. That is, it should generate an error if a tuple can't go anywhere (or support auto-creating a new one like in interval partitioning?)

In my design I initially had overflow partitions too, because I inherited the idea from Itagaki Takahiro's patch. Eventually I realized that it's a useless concept, because you can always have leftmost and rightmost partitions, which are just regular partitions (except they don't have a low key, resp. high key). If you don't define unbounded partitions at either side, it's fine, you just raise an error whenever the user tries to insert a value for which there is no partition.

I think your mention of the low key and high key of a partition has forced me into rethinking how I was going about this. For example, in Itagaki-san's patch, only the upper bound for a range partition would go into the catalog, while the CHECK expression for that partition would use the upper bound of the previous partition as the lower bound for the partition (an expression of the form lower <= key AND key < upper). I'd think that's presumptuous to a certain degree in that the arrangement does not allow holes in the range. That also means range partitions on either end are unbounded on one side. In fact, what I called an overflow partition would get (last_partitions_upper <= key) as its CHECK expression and vice versa. You suggest such unbounded partitions be disallowed?
Which would mean we do not allow either of the partition bounds to be null in the case of a range partition, or the list of values to be non-empty in the case of a LIST partition.

Not real clear to me how this applies to list partitioning, but I have the hunch that it'd be better to deal with that without overflow partitions as well.

Likewise, the CHECK expression for a LIST overflow partition would look something like NOT (key = ANY (ARRAY[values-of-all-other-partitions])). By the way, I am not saying the primary metadata of partitions is the CHECK expression anymore. I hope we can do away without them for partitioning sooner rather than later. I am looking to have bounds/values stored in the partition definition catalog not as an expression but as something readily amenable to use at places where it's useful. Suggestions are welcome!

BTW I think auto-creating partitions is a bad idea in general, because you get into a lock escalation mess and furthermore you have to waste time checking for existence beforehand, which lowers performance. Just have a very easy command that users can run ahead of time (something like CREATE PARTITION FOR VALUE now() + '30 days', whatever), and preferably one that doesn't fail if the partition already exists; that way, users can have (for instance) a daily create-30-partitions-ahead procedure which most days would only create one partition (the one for 30 days in the future) but whenever the odd case happens that the server is turned off just at that time someday, it creates two -- one belt, 29 suspenders.

Yeah, I mentioned auto-partitioning just to know if that's how people usually prefer to have overflow cases dealt with. I'd much rather focus on straightforward cases at this point. Having said that, I agree that users of partitioning should have a mechanism like the one you mention, though I'm not sure about the details.
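To make the routing behavior under discussion concrete, here is a minimal sketch (plain Python for illustration, not PostgreSQL code; all names are made up): range-partition tuple routing where either end of the range may be left unbounded, holes between ranges are allowed, and a tuple that fits nowhere raises an error instead of falling into an overflow partition.

```python
def route_to_partition(partitions, key):
    """partitions: list of (name, low, high); low/high may be None for an
    unbounded side. A key belongs to a partition when low <= key < high."""
    for name, low, high in partitions:
        if (low is None or low <= key) and (high is None or key < high):
            return name
    # no overflow partition: an unroutable tuple is simply an error
    raise ValueError(f"no partition accepts key {key}")

parts = [
    ("p_low", None, 10),   # unbounded on the low side (no low key)
    ("p_mid", 10, 20),
    ("p_high", 30, None),  # unbounded high side; note the hole [20, 30)
]

print(route_to_partition(parts, 5))    # p_low
print(route_to_partition(parts, 15))   # p_mid
# route_to_partition(parts, 25) raises ValueError: the hole [20, 30)
```

This mirrors the point that leftmost/rightmost partitions are just regular partitions missing one bound, and that without them an error is raised.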
Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq pipelining
On 12/04/2014 03:11 AM, Matt Newell wrote:

The recent discussion about pipelining in the jodbc driver prompted me to look at what it would take for libpq.

Great!

I have a proof of concept patch working. The results are even more promising than I expected. While it's true that many applications and frameworks won't easily benefit, it amazes me that this hasn't been explored before. I developed a simple test application that creates a table with a single auto increment primary key column, then runs 4 simple queries x times each: ...

I plan to write documentation, add regression testing, and do general cleanup before asking for feedback on the patch itself. Any suggestions about performance testing or API design would be nice. I haven't played with changing the sync logic yet, but I'm guessing that an API to allow manual sync instead of a sync per PQsendQuery will be needed. That could make things tricky though with multi-statement queries, because currently the only way to detect when results change from one query to the next is a ReadyForQuery message.

A good API is crucial for this. It should make it easy to write an application that does pipelining, and to handle all the error conditions in a predictable way. I'd suggest that you write the documentation first, before writing any code, so that we can discuss the API. It doesn't have to be in SGML format yet; a plain-text description of the API will do.

- Heikki
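For a rough intuition of why the results are so promising, here is a toy latency model (an illustrative assumption, not measured libpq numbers): without pipelining every query pays a full network round trip; with pipelining the client sends the whole batch before reading any result, so the round trip is paid roughly once per batch.

```python
def total_latency_ms(n_queries, rtt_ms, server_ms, pipelined):
    """Crude cost model: rtt_ms is the network round-trip time,
    server_ms the per-query server processing time."""
    if pipelined:
        # one round trip for the whole batch, queries overlap the wire
        return rtt_ms + n_queries * server_ms
    # each query waits for its own round trip before the next is sent
    return n_queries * (rtt_ms + server_ms)

# e.g. 4 simple queries x 100 iterations, 1 ms RTT, 0.1 ms server time
print(round(total_latency_ms(400, 1.0, 0.1, pipelined=False), 2))  # 440.0
print(round(total_latency_ms(400, 1.0, 0.1, pipelined=True), 2))   # 41.0
```

The gap widens with RTT, which is why applications issuing many small queries benefit most.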
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused. No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is dead code too, when a non-default node_id is not set. Please remove, or explain how all that's supposed to work.

* COMMIT_TS_SETTS. SETTS sounds like a typo (like Robert complained about committs earlier). Rename to COMMIT_TS_SET_TIMESTAMP, or just COMMIT_TS_SET.

* committsdesc.c -> commit_ts_desc.c, to be consistent with commit_ts.c

* In commit_ts_desc:

+	nsubxids = ((XLogRecGetDataLen(record) - SizeOfCommitTsSet) /
+				sizeof(TransactionId));
+	if (nsubxids > 0)
+	{
+		int			i;
+		TransactionId *subxids;
+
+		subxids = palloc(sizeof(TransactionId) * nsubxids);
+		memcpy(subxids,
+			   XLogRecGetData(record) + SizeOfCommitTsSet,
+			   sizeof(TransactionId) * nsubxids);
+		for (i = 0; i < nsubxids; i++)
+			appendStringInfo(buf, ", %u", subxids[i]);
+		pfree(subxids);
+	}

There's no need to memcpy() here. The subxid array in the WAL record is aligned.

* This seems to be a completely unrelated change.

--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -1438,7 +1438,7 @@ parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline, int line_num)
 			ereport(LOG,
 					(errcode(ERRCODE_CONFIG_FILE_ERROR),
 					 errmsg("client certificates can only be checked if a root certificate store is available"),
-					 errhint("Make sure the configuration parameter \"ssl_ca_file\" is set."),
+					 errhint("Make sure the configuration parameter \"%s\" is set.", "ssl_ca_file"),
 					 errcontext("line %d of configuration file \"%s\"", line_num, HbaFileName)));
 			return false;

* What is the definition of "latest commit", in pg_last_committed_xact()? It doesn't necessarily line up with the order of commit WAL records, nor with the order the proc array is updated (i.e. the order that the effects become visible to other backends). It seems confusing to add yet another notion of commit order. Perhaps that's the best we can do, but it needs to be documented.

- Heikki
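The memcpy() remark can be illustrated outside of C: since the subxid array sits aligned inside the record payload, it can be decoded in place rather than copied out first. A Python sketch (the header size, xid width, and record layout here are stand-ins, not the real WAL format):

```python
import struct

SIZE_OF_HEADER = 12   # stands in for SizeOfCommitTsSet
XID_SIZE = 4          # stands in for sizeof(TransactionId)

def subxids_in_place(record: bytes):
    """Read the trailing xid array straight out of the record buffer --
    no intermediate copy, the analogue of skipping the palloc/memcpy."""
    nsubxids = (len(record) - SIZE_OF_HEADER) // XID_SIZE
    return struct.unpack_from(f"<{nsubxids}I", record, SIZE_OF_HEADER)

# a fake record: 12 header bytes followed by three 32-bit xids
rec = bytes(SIZE_OF_HEADER) + struct.pack("<3I", 100, 101, 102)
print(subxids_in_place(rec))  # (100, 101, 102)
```

The C equivalent is to point at `XLogRecGetData(record) + SizeOfCommitTsSet` directly, which is safe because the array is aligned in the record.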
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote:

On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan p...@heroku.com wrote: Looks like the consensus is that we should have RETURNING project updated tuples too, then.

Attached revision, v1.5, establishes this behavior (as always, there is a variant for each approach to value locking). There is a new commit with a commit message describing the new RETURNING/command tag behavior in detail, so no need to repeat it here. The documentation has been updated in these areas, too.

It seems there isn't any way to distinguish between insert and update of a given row. Maybe a pseudo-column can be added so that it can be used in the returning statement:

insert into foobar(id, other_col) values(2, '2')
    on conflict (id) update set other_col=excluded.other_col
    returning id, pseudo.was_updated;

This would ensure that users could check for each primary key value if the row was updated or inserted. Of course, the pseudo.was_updated name should be replaced by something better.

It would be nice to be able to skip updates of rows that were not changed:

insert into foobar values(2, '2')
    on conflict (id) update set other_col=excluded.other_col
    where target is distinct from excluded;

- Anssi Kääriäinen
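A toy model of the two suggestions (hypothetical, not the patch's implementation; names like was_updated come from the email): an upsert that reports per key whether it inserted or updated, and skips the update entirely when the new row is not distinct from the existing one.

```python
def upsert(table, pk, row):
    """table is a dict standing in for the relation, keyed by primary key.
    Returns {'id', 'was_updated'} like the proposed RETURNING pseudo-column,
    or None when the 'target IS DISTINCT FROM excluded' predicate skips
    a no-op update."""
    existing = table.get(pk)
    if existing is None:
        table[pk] = row
        return {"id": pk, "was_updated": False}   # row was inserted
    if existing == row:
        return None                               # unchanged: skip update
    table[pk] = row
    return {"id": pk, "was_updated": True}        # row was updated

t = {}
print(upsert(t, 2, {"other_col": "2"}))  # inserted: was_updated False
print(upsert(t, 2, {"other_col": "3"}))  # updated:  was_updated True
print(upsert(t, 2, {"other_col": "3"}))  # unchanged: None
```

This shows why a single boolean pseudo-column is enough to disambiguate the two code paths per returned row.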
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On 12/04/2014 07:07 PM, Anssi Kääriäinen wrote:

On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote: On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan p...@heroku.com wrote: Looks like the consensus is that we should have RETURNING project updated tuples too, then.

Attached revision, v1.5, establishes this behavior (as always, there is a variant for each approach to value locking). There is a new commit with a commit message describing the new RETURNING/command tag behavior in detail, so no need to repeat it here. The documentation has been updated in these areas, too.

It seems there isn't any way to distinguish between insert and update of a given row. Maybe a pseudo-column can be added so that it can be used in the returning statement

Yes, I think that's pretty important. With a negative attno so it's treated as a hidden col that must be explicitly named to be shown and won't be confused with user columns.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused. No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is dead code too, when a non-default node_id is not set. Please remove, or explain how all that's supposed to work.

That's an API meant to be used by extensions; maybe it would be preferable if there was an SQL API exposed also? (It might also make more sense once replication identifiers are submitted.)

* What is the definition of "latest commit", in pg_last_committed_xact()? It doesn't necessarily line up with the order of commit WAL records, nor with the order the proc array is updated (i.e. the order that the effects become visible to other backends). It seems confusing to add yet another notion of commit order. Perhaps that's the best we can do, but it needs to be documented.

It's updated at CommitTransaction time (well, RecordTransactionCommit, but that's only called from CommitTransaction), so the definition of latest commit is the last CommitTransaction call.

-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 12/04/2014 01:16 PM, Petr Jelinek wrote:

On 04/12/14 10:42, Heikki Linnakangas wrote: On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused. No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is dead code too, when a non-default node_id is not set. Please remove, or explain how all that's supposed to work.

That's an API meant to be used by extensions; maybe it would be preferable if there was an SQL API exposed also? (It might also make more sense once replication identifiers are submitted.)

Maybe. Hard to tell without any documentation or comments on how it's supposed to work. It's unacceptable in its current state. I would prefer to remove it now, and add it back later together with the code that actually uses it. I don't like having unused internal functions and APIs sitting in the source tree in the hope that someone will find them useful in the future. It's more likely that it will bitrot, or not actually work as intended, when someone later tries to use it.

What would the SQL API look like, and what would it be used for?

* What is the definition of "latest commit", in pg_last_committed_xact()? It doesn't necessarily line up with the order of commit WAL records, nor with the order the proc array is updated (i.e. the order that the effects become visible to other backends). It seems confusing to add yet another notion of commit order. Perhaps that's the best we can do, but it needs to be documented.

It's updated at CommitTransaction time (well, RecordTransactionCommit, but that's only called from CommitTransaction), so the definition of latest commit is the last CommitTransaction call.

Right. It was a rhetorical question, actually. What I meant is that that needs to be documented, at least. Or changed so that it matches one of the other definitions of commit order, and then documented.

- Heikki
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 4, 2014 at 7:36 PM, Rahila Syed rahilasyed...@gmail.com wrote:

IIUC, forcibly written FPWs are not exposed to the user, so is it worthwhile to add a GUC similar to full_page_writes in order to control a feature which is unexposed to the user in the first place? If full_page_writes is set 'off' by the user, the user probably cannot afford the overhead involved in writing large pages to disk. So, if a full page write is forcibly written in such a situation, it is better to compress it before writing to alleviate the drawbacks of full page writes on servers with a heavy write load. The only scenario in which a user would not want to compress forcibly written pages is when CPU utilization is high. But according to measurements done earlier, the CPU utilization of compress='on' and 'off' are not significantly different.

Yes, they are not visible to the user, but they still exist. I'd prefer that we have a safety net though, to prevent any problems that may occur if the compression algorithm has a bug, because if we enforce compression for forcibly-written blocks, all the backups of our users would be impacted.

I pondered something that Andres mentioned upthread: we could do the compression not only for the blocks in a WAL record, but also at the record level. Hence, joining the two ideas together, I think that we should definitely have a different GUC to control the feature, consistently for all the images. Let's call it wal_compression, with the following possible values:
- on, meaning that a maximum of compression is done; for this feature that is basically the same as full_page_writes.
- full_page_writes, meaning that full page writes are compressed.
- off, default value, to disable the feature completely.

This would leave room for another mode: 'record', to completely compress a record. For now though, I think that a simple on/off switch would be fine for this patch. Let's keep things simple.
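The proposed three-valued GUC could be modeled like this (a sketch of the proposal's semantics only: the mode names are from the email, the mapping is an assumption, and 'record' is the possible future mode mentioned at the end):

```python
def compressed_payloads(wal_compression):
    """Map a proposed wal_compression setting to the set of WAL payloads
    that would be compressed under it. Purely illustrative."""
    modes = {
        "off": set(),                                    # feature disabled
        "full_page_writes": {"full_page_image"},         # FPWs only
        "on": {"full_page_image"},                       # max available today
        "record": {"full_page_image", "whole_record"},   # future mode
    }
    if wal_compression not in modes:
        raise ValueError(f"invalid wal_compression value: {wal_compression!r}")
    return modes[wal_compression]

print(sorted(compressed_payloads("full_page_writes")))  # ['full_page_image']
print(sorted(compressed_payloads("off")))               # []
```

Modeling it as a mapping makes the point in the thread visible: until record-level compression exists, 'on' and 'full_page_writes' behave identically, which is why a simple on/off switch suffices for the first patch.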
Regards,
-- Michael
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 04/12/14 12:26, Heikki Linnakangas wrote: On 12/04/2014 01:16 PM, Petr Jelinek wrote: On 04/12/14 10:42, Heikki Linnakangas wrote: On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c Sorry, I'm late to the party, but here's some random comments on this after a quick review: * The whole concept of a node ID seems to be undocumented, and unused. No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is dead code too, when a non-default node_id is not set. Please remove, or explain how all that's supposed to work. That's API meant to be used by extensions, maybe it would be preferable if there was SQL API exposed also? (It might also make more sense once replication identifiers are submitted.) Maybe. Hard to tell without any documentation or comments on how it's supposed to work. In unacceptable in its current state. I would prefer to remove it now, and add it back later together with the code that actually uses it. I don't like having unused internal functions and APIs sitting the source tree in the hope that someone will find them useful in the future. It's more likely that it will bitrot, or not actually work as intended, when someone later tries to use it. The thing is I already have extension for 9.4 where I would use this API for conflict detection if it was available so it's not about hoping for somebody to find it useful. There was discussion about this during the review process because the objection was raised already then. What would the SQL API look like, and what would it be used for? It would probably mirror the C one, from my POV it's not needed but maybe some replication solution would prefer to use this without having to write C components... 
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 12/04/2014 01:47 PM, Petr Jelinek wrote: On 04/12/14 12:26, Heikki Linnakangas wrote: On 12/04/2014 01:16 PM, Petr Jelinek wrote: On 04/12/14 10:42, Heikki Linnakangas wrote: On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c Sorry, I'm late to the party, but here's some random comments on this after a quick review: * The whole concept of a node ID seems to be undocumented, and unused. No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is dead code too, when a non-default node_id is not set. Please remove, or explain how all that's supposed to work. That's API meant to be used by extensions, maybe it would be preferable if there was SQL API exposed also? (It might also make more sense once replication identifiers are submitted.) Maybe. Hard to tell without any documentation or comments on how it's supposed to work. In unacceptable in its current state. I would prefer to remove it now, and add it back later together with the code that actually uses it. I don't like having unused internal functions and APIs sitting the source tree in the hope that someone will find them useful in the future. It's more likely that it will bitrot, or not actually work as intended, when someone later tries to use it. The thing is I already have extension for 9.4 where I would use this API for conflict detection if it was available so it's not about hoping for somebody to find it useful. There was discussion about this during the review process because the objection was raised already then. Yeah, it was raised. I don't think it was really addressed. There was lengthy discussion on whether to include LSN, node id, and/or some other information, but there was no discussion on: * What is a node ID? * How is it used? * Who assigns it? etc. None of those things are explained in the documentation nor code comments. 
The node ID sounds suspiciously like the replication identifiers that have been proposed a couple of times. I don't remember if I like replication identifiers or not, but I'd rather discuss that issue explicitly and separately. I don't want the concept of a replication/node ID to fly under the radar like this.

What would the SQL API look like, and what would it be used for?

It would probably mirror the C one, from my POV it's not needed but maybe some replication solution would prefer to use this without having to write C components...

I can't comment on that, because without any documentation, I don't even know what the C API is.

BTW, why is it OK that the node-ID of a commit is logged as a separate WAL record, after the commit record? That means that it's possible that a transaction commits, but you crash just before writing the SETTS record, so that after replay the transaction appears committed but with the default node ID. (Maybe that's OK given the intended use case, but it's hard to tell without any documentation. See a pattern here? ;-) )

- Heikki
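The crash window described in that last point can be shown with a toy replay loop (record names and shapes are stand-ins, not the real WAL format): because the commit record and the SETTS record are separate, a crash between them replays the commit with the default node ID.

```python
DEFAULT_NODE_ID = 0

def replay(wal):
    """Replay a toy WAL: ('COMMIT', xid) marks a commit with the default
    node id; a later ('SETTS', xid, node_id) overrides it. Returns the
    node id each committed xid ends up with after replay."""
    committed = {}
    for rec in wal:
        if rec[0] == "COMMIT":
            committed[rec[1]] = DEFAULT_NODE_ID
        elif rec[0] == "SETTS":
            committed[rec[1]] = rec[2]
    return committed

# crash happened after the commit record but before SETTS reached disk:
print(replay([("COMMIT", 42)]))                    # {42: 0} -- default node id
print(replay([("COMMIT", 42), ("SETTS", 42, 7)]))  # {42: 7}
```

The transaction is durably committed either way; only the node attribution is lost, which is exactly the non-atomicity being questioned.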
Re: [HACKERS] SSL regression test suite
On 10/06/2014 04:21 PM, Heikki Linnakangas wrote:

Here's a new version of the SSL regression suite I wrote earlier. It now specifies both host and hostaddr in the connection string as Andres suggested, so it no longer requires changes to network configuration. I added a bunch of tests for the SAN feature that Alexey Klyukin wrote and was committed earlier. Plus a lot of miscellaneous cleanup.

And here's another version. It now includes tests for CRLs, and uses a root CA that's used to sign the server and client CA's certificates, to test that using intermediary CAs works.

This probably needs some further cleanup before it's ready for committing. One issue is that it creates a temporary cluster that listens for TCP connections on localhost, which isn't safe on a multi-user system.

This issue remains. There isn't much we can do about it; SSL doesn't work over Unix domain sockets. We could make it work, but that's a whole different feature.

How do people feel about including this test suite in the source tree? It's probably not suitable for running as part of make check-world, but it's extremely handy if you're working on a patch related to SSL. I'd like to commit this, even if it has some rough edges. That way we can improve it later, rather than have it fall into oblivion. Any objections?

- Heikki

diff --git a/src/test/Makefile b/src/test/Makefile
index 9238860..1d6f789 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules ssl
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory modules.
diff --git a/src/test/ssl/Makefile b/src/test/ssl/Makefile
new file mode 100644
index 000..194267b
--- /dev/null
+++ b/src/test/ssl/Makefile
@@ -0,0 +1,126 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/ssl
+#
+# Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/ssl/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/ssl
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+CERTIFICATES := server_ca server-cn-and-alt-names \
+	server-cn-only server-single-alt-name server-multiple-alt-names \
+	server-no-names server-revoked server-ss \
+	client_ca client client-revoked \
+	root_ca
+
+SSLFILES := $(CERTIFICATES:%=ssl/%.key) $(CERTIFICATES:%=ssl/%.crt) \
+	ssl/client.crl ssl/server.crl ssl/root.crl \
+	ssl/both-cas-1.crt ssl/both-cas-2.crt \
+	ssl/root+server_ca.crt ssl/root+server.crl \
+	ssl/root+client_ca.crt ssl/root+client.crl
+
+sslfiles: $(SSLFILES)
+
+ssl/new_certs_dir:
+	mkdir ssl/new_certs_dir
+
+# Rule for creating private/public key pairs
+ssl/%.key:
+	openssl genrsa -out $@ 1024
+	chmod 0600 $@
+
+# Rules for creating root CA certificates
+ssl/root_ca.crt: ssl/root_ca.key cas.config
+	touch ssl/root_ca-certindex
+	openssl req -new -out ssl/root_ca.crt -x509 -config cas.config -config root_ca.config -key ssl/root_ca.key
+	echo 01 > ssl/root_ca.srl
+
+# for client and server CAs
+ssl/%_ca.crt: ssl/%_ca.key %_ca.config ssl/root_ca.crt ssl/new_certs_dir
+	touch ssl/$*_ca-certindex
+	openssl req -new -out ssl/temp_ca.crt -config cas.config -config $*_ca.config -key ssl/$*_ca.key
+# Sign the certificate with the root CA
+	openssl ca -name root_ca -batch -config cas.config -in ssl/temp_ca.crt -out ssl/temp_ca_signed.crt
+	openssl x509 -in ssl/temp_ca_signed.crt -out ssl/$*_ca.crt # to keep just the PEM cert
+	rm ssl/temp_ca.crt ssl/temp_ca_signed.crt
+	echo 01 > ssl/$*_ca.srl
+
+# Server certificates, signed by server CA:
+ssl/server-%.crt: ssl/server-%.key ssl/server_ca.crt server-%.config
+	openssl req -new -key ssl/server-$*.key -out ssl/server-$*.csr -config server-$*.config
+	openssl ca -name server_ca -batch -config cas.config -in ssl/server-$*.csr -out ssl/temp.crt -extensions v3_req -extfile server-$*.config
+	openssl x509 -in ssl/temp.crt -out ssl/server-$*.crt # to keep just the PEM cert
+	rm ssl/server-$*.csr
+
+# Self-signed version of server-cn-only.crt
+ssl/server-ss.crt: ssl/server-cn-only.key ssl/server-cn-only.crt server-cn-only.config
+	openssl req -new -key ssl/server-cn-only.key -out ssl/server-ss.csr -config server-cn-only.config
+	openssl x509 -req -days 1 -in ssl/server-ss.csr -signkey ssl/server-cn-only.key -out ssl/server-ss.crt -extensions v3_req -extfile server-cn-only.config
+	rm ssl/server-ss.csr
+
+# Client certificate, signed by the client CA:
+ssl/client.crt: ssl/client.key ssl/client_ca.crt
+	openssl req -new -key ssl/client.key -out ssl/client.csr -config client.config
+	openssl ca -name client_ca -batch -out ssl/temp.crt -config cas.config -infiles
Re: [HACKERS] Add shutdown_at_recovery_target option to recovery.conf
On 02/12/14 18:59, Robert Haas wrote:

On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek p...@2ndquadrant.com wrote: I'm a bit late to the party, but wouldn't recovery_target_action = ... have been a better name for this? It'd be in line with the other recovery_target_* parameters, and also a bit shorter than the imho somewhat ugly action_at_recovery_target. FWIW, I too think that recovery_target_action is a better name.

I agree.

+1. Here is a patch which renames action_at_recovery_target to recovery_target_action everywhere.

-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index b4959ac..06d66d2 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -289,9 +289,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </term>
       <listitem>
       <para>
-        Alias for action_at_recovery_target, <literal>true</literal> is same as
-        action_at_recovery_target = <literal>pause</literal> and <literal>false</literal>
-        is same as action_at_recovery_target = <literal>promote</literal>.
+        Alias for recovery_target_action, <literal>true</literal> is same as
+        recovery_target_action = <literal>pause</literal> and <literal>false</literal>
+        is same as recovery_target_action = <literal>promote</literal>.
       </para>
       <para>
        This setting has no effect if <xref linkend="guc-hot-standby"> is not
@@ -300,11 +300,11 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </listitem>
      </varlistentry>
 
-     <varlistentry id="action-at-recovery-target"
-                   xreflabel="action_at_recovery_target">
-      <term><varname>action_at_recovery_target</varname> (<type>enum</type>)
+     <varlistentry id="recovery-target-action"
+                   xreflabel="recovery_target_action">
+      <term><varname>recovery_target_action</varname> (<type>enum</type>)
       <indexterm>
-        <primary><varname>action_at_recovery_target</varname> recovery parameter</primary>
+        <primary><varname>recovery_target_action</varname> recovery parameter</primary>
       </indexterm>
       </term>
       <listitem>
@@ -336,7 +336,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </para>
       <para>
        Note that because <filename>recovery.conf</filename> will not be renamed when
-       <varname>action_at_recovery_target</varname> is set to <literal>shutdown</literal>,
+       <varname>recovery_target_action</varname> is set to <literal>shutdown</literal>,
        any subsequent start will end with immediate shutdown unless the
        configuration is changed or the <filename>recovery.conf</filename> is
        removed manually.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index da28de9..e8e5325 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -229,7 +229,7 @@ static char *recoveryEndCommand = NULL;
 static char *archiveCleanupCommand = NULL;
 static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET;
 static bool recoveryTargetInclusive = true;
-static RecoveryTargetAction actionAtRecoveryTarget = RECOVERY_TARGET_ACTION_PAUSE;
+static RecoveryTargetAction recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE;
 static TransactionId recoveryTargetXid;
 static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
@@ -4654,7 +4654,7 @@ readRecoveryCommandFile(void)
 				*head = NULL,
 				*tail = NULL;
 	bool		recoveryPauseAtTargetSet = false;
-	bool		actionAtRecoveryTargetSet = false;
+	bool		recoveryTargetActionSet = false;
 
 	fd = AllocateFile(RECOVERY_COMMAND_FILE, "r");
@@ -4712,32 +4712,32 @@ readRecoveryCommandFile(void)
 					(errmsg_internal("pause_at_recovery_target = '%s'",
 									 item->value)));
 
-			actionAtRecoveryTarget = recoveryPauseAtTarget ?
-									 RECOVERY_TARGET_ACTION_PAUSE :
-									 RECOVERY_TARGET_ACTION_PROMOTE;
+			recoveryTargetAction = recoveryPauseAtTarget ?
+								   RECOVERY_TARGET_ACTION_PAUSE :
+								   RECOVERY_TARGET_ACTION_PROMOTE;
 
 			recoveryPauseAtTargetSet = true;
 		}
-		else if (strcmp(item->name, "action_at_recovery_target") == 0)
+		else if (strcmp(item->name, "recovery_target_action") == 0)
 		{
 			if (strcmp(item->value, "pause") == 0)
-				actionAtRecoveryTarget = RECOVERY_TARGET_ACTION_PAUSE;
+				recoveryTargetAction = RECOVERY_TARGET_ACTION_PAUSE;
 			else if (strcmp(item->value, "promote") == 0)
-				actionAtRecoveryTarget = RECOVERY_TARGET_ACTION_PROMOTE;
+				recoveryTargetAction = RECOVERY_TARGET_ACTION_PROMOTE;
 			else if (strcmp(item->value, "shutdown") == 0)
-				actionAtRecoveryTarget = RECOVERY_TARGET_ACTION_SHUTDOWN;
+				recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN;
 			else
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid value for recovery parameter \"%s\"",
-								"action_at_recovery_target"),
+								"recovery_target_action"),
 						 errhint("The allowed values are \"pause\", \"promote\" and \"shutdown\".")));
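The value validation performed in readRecoveryCommandFile() amounts to the following sketch (Python for illustration, not the xlog.c code): exactly three values are accepted, and anything else fails with the same style of error.

```python
VALID_ACTIONS = {"pause", "promote", "shutdown"}

def parse_recovery_target_action(value):
    """Accept only the three documented values, mirroring the ereport()
    error path in the patch above."""
    if value not in VALID_ACTIONS:
        raise ValueError(
            f'invalid value for recovery parameter "recovery_target_action": {value!r}; '
            'the allowed values are "pause", "promote" and "shutdown"')
    return value

print(parse_recovery_target_action("shutdown"))  # shutdown
```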
Re: [HACKERS] Add shutdown_at_recovery_target option to recovery.conf
On 4 December 2014 at 22:13, Petr Jelinek p...@2ndquadrant.com wrote: On 02/12/14 18:59, Robert Haas wrote: On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek p...@2ndquadrant.com wrote: I'm a bit late to the party, but wouldn't recovery_target_action = ... have been a better name for this? It'd be in line with the other recovery_target_* parameters, and also a bit shorter than the imho somewhat ugly action_at_recovery_target. FWIW, I too think that recovery_target_action is a better name. I agree. +1. Here is patch which renames action_at_recovery_target to recovery_target_action everywhere. Thanks; I'll fix it up on Monday. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Add shutdown_at_recovery_target option to recovery.conf
On Thu, Dec 4, 2014 at 10:13 PM, Petr Jelinek p...@2ndquadrant.com wrote: On 02/12/14 18:59, Robert Haas wrote: On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek p...@2ndquadrant.com wrote: I'm a bit late to the party, but wouldn't recovery_target_action = ... have been a better name for this? It'd be in line with the other recovery_target_* parameters, and also a bit shorter than the imho somewhat ugly action_at_recovery_target. FWIW, I too think that recovery_target_action is a better name. I agree. +1. Here is a patch which renames action_at_recovery_target to recovery_target_action everywhere.

Thanks, looks good to me. A couple of things that would be good to document as well in recovery-config.sgml:
- the fact that pause_at_recovery_target is deprecated.
- the fact that both parameters cannot be used at the same time.

Let's not surprise the users with behaviors they may not expect or have to guess at; let's document this stuff precisely. This would make the docs consistent with this block of code in xlog.c:

	if (recoveryPauseAtTargetSet && actionAtRecoveryTargetSet)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("cannot set both \"%s\" and \"%s\" recovery parameters",
						"pause_at_recovery_target",
						"action_at_recovery_target"),
				 errhint("The \"pause_at_recovery_target\" is deprecated.")));

Regards,
--
Michael
Re: [HACKERS] Add shutdown_at_recovery_target option to recovery.conf
On Thu, Dec 4, 2014 at 10:45 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Dec 4, 2014 at 10:13 PM, Petr Jelinek p...@2ndquadrant.com wrote: On 02/12/14 18:59, Robert Haas wrote: On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek p...@2ndquadrant.com wrote: I'm a bit late to the party, but wouldn't recovery_target_action = ... have been a better name for this? It'd be in line with the other recovery_target_* parameters, and also a bit shorter than the imho somewhat ugly action_at_recovery_target. FWIW, I too think that recovery_target_action is a better name. I agree. +1. Here is patch which renames action_at_recovery_target to recovery_target_action everywhere. Thanks, Looks good to me. A couple of things that would be good to document as well in recovery-config.sgml: - the fact that pause_at_recovery_target is deprecated. - the fact that both parameters cannot be used at the same time. Let's not surprise the users with behaviors they may expect or guess and document this stuff precisely.. Btw, you are missing as well the addition of this parameter in recovery.conf.sample (mentioned by Fujii-san upthread). It would be nice to have a description paragraph as well similarly to what is written for pause_at_recovery_target. -- Michael
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Heikki Linnakangas hlinnakan...@vmware.com writes:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote: ir commit timestamp directly as they commit, or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this after a quick review:

Also this commit breaks initdb of `make check' for me:

creating template1 database in /home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1 ...
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New wal format distorts pg_xlogdump --stats
On 11/25/2014 05:36 AM, Andres Freund wrote: Hi, The new WAL format calculates FPI vs plain record data like:

	rec_len = XLogRecGetDataLen(record) + SizeOfXLogRecord;
	fpi_len = record->decoded_record->xl_tot_len - rec_len;

Due to the amount of data now handled outside the main data portion, that doesn't seem correct to me. As an example, a couple thousand inserts with full_page_writes=off now yields:

Type                N      (%)  Record size      (%)  FPI size      (%)  Combined size      (%)
----                -      ---  -----------      ---  --------      ---  -------------      ---
Heap/INSERT     30167 ( 99.53)       814509 ( 99.50)    965856 ( 99.54)        1780365 ( 99.52)

I think fpi_len now needs to only count the sum of the actual length of block images, and all the rest needs to be rec_len.

Yeah, that's broken. I propose the attached. Or does anyone want to argue for adding an XLogRecGetFPILen() accessor macro for the hole_length in xlogreader.h? It's not something a redo routine would need, nor most XLOG-reading applications, so I thought it's better to just have pg_xlogdump peek into the DecodedBkpBlock struct directly.

- Heikki

diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
index 26556dc..9f05e25 100644
--- a/contrib/pg_xlogdump/pg_xlogdump.c
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -351,14 +351,29 @@ XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats,
 	uint8		recid;
 	uint32		rec_len;
 	uint32		fpi_len;
+	int			block_id;

 	stats->count++;

-	/* Update per-rmgr statistics */
-
-	rmid = XLogRecGetRmid(record);
 	rec_len = XLogRecGetDataLen(record) + SizeOfXLogRecord;
-	fpi_len = record->decoded_record->xl_tot_len - rec_len;
+
+	/*
+	 * Calculate the amount of FPI data in the record. Each backup block
+	 * takes up BLCKSZ bytes, minus the "hole" length.
+	 *
+	 * XXX: We peek into xlogreader's private decoded backup blocks for the
+	 * hole_length. It doesn't seem worth it to add an accessor macro for
+	 * this.
+	 */
+	fpi_len = 0;
+	for (block_id = 0; block_id <= record->max_block_id; block_id++)
+	{
+		if (XLogRecHasBlockImage(record, block_id))
+			fpi_len += BLCKSZ - record->blocks[block_id].hole_length;
+	}
+
+	/* Update per-rmgr statistics */
+
+	rmid = XLogRecGetRmid(record);
 	stats->rmgr_stats[rmid].count++;
 	stats->rmgr_stats[rmid].rec_len += rec_len;

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq pipelining
On 12/04/2014 05:08 PM, Heikki Linnakangas wrote: A good API is crucial for this. It should make it easy to write an application that does pipelining, and to handle all the error conditions in a predictable way. I'd suggest that you write the documentation first, before writing any code, so that we can discuss the API. It doesn't have to be in SGML format yet, a plain-text description of the API will do. I strongly agree. Applications need to be able to reliably predict what will happen if there's an error in the middle of a pipeline. Consideration of implicit transactions (autocommit), the whole pipeline being one transaction, or multiple transactions is needed. Apps need to be able to wait for the result of a query partway through a pipeline, e.g. scheduling four queries, then waiting for the result of the 2nd. There are probably plenty of other wrinkly bits to think about. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alex Shulgin wrote: Also this commit breaks initdb of `make check' for me:

creating template1 database in /home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1 ...
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

Uh, that's odd. Can you please get a stack trace? Do you have unusual settings or a patched build?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make check-world regress failed
On 11/23/2014 08:37 PM, Vladimir Koković wrote: PostgreSQL check-world regress failed with current GIT HEAD on my Kubuntu 14.10. uname -a Linux vlD-kuci 3.16.0-24-generic #32-Ubuntu SMP Tue Oct 28 13:13:18 UTC 2014 i686 athlon i686 GNU/Linux gdb -d /home/src/postgresql-devel/postgresql-git/postgresql/src -c core ... Loaded symbols for /home/src/postgresql-devel/dev-build/src/test/regress/regress.so (gdb) bt #0 0xb76ecc7c in __kernel_vsyscall () #1 0xb7075577 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #2 0xb7076cf3 in __GI_abort () at abort.c:89 #3 0x084c326a in ?? () #4 0x0a56c3b8 in ?? () #5 0xb76d232f in pg_atomic_init_u64 (ptr=0xbfa16fd4, val=0) at /home/src/postgresql-devel/postgresql-git/postgresql/src/include/port/atomics.h:445 #6 0xb76d50e4 in test_atomic_uint64 () at /home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/regress.c:1022 #7 0xb76d5756 in test_atomic_ops (fcinfo=0xa57c76c) at /home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/regress.c:1114 #8 0x0825bfee in ?? () ... Andres, have you had a chance to look at this? On 32-bit x86, arch-x86.h leaves PG_HAVE_ATOMIC_U64_SUPPORT undefined. But generic-gcc.h, which is included later, then defines it. pg_atomic_init_u64 does AssertPointerAlignment(ptr, 8) on the variable, but there is no guarantee that it is 8-bytes aligned on x86. My patch solved check-world regress fail. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alvaro Herrera alvhe...@2ndquadrant.com writes:

Alex Shulgin wrote: Also this commit breaks initdb of `make check' for me:

creating template1 database in /home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1 ...
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

Uh, that's odd. Can you please get a stack trace? Do you have unusual settings or a patched build?

Not really, and I would mention that. Will get a trace.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alvaro Herrera alvhe...@2ndquadrant.com writes: Uh, that's odd. Can you please get a stack trace? Do you have unusual settings or a patched build? Is there a way to pause the bootstrap process so I can attach gdb to it? -- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
On 12/04/2014 10:50 PM, Alex Shulgin wrote: Is there a way to pause the bootstrap process so I can attach gdb to it? With a newer gdb, you can instead tell gdb to follow all forks. I wrote some notes on it recently. http://blog.2ndquadrant.com/processes-breakpoints-watchpoints-postgresql/ I've found it really handy when debugging crashes in regression tests. However, it's often simpler to just: ulimit -c unlimited make check (or whatever your crashing test is) then look at the core files that are produced. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Craig Ringer cr...@2ndquadrant.com writes:

On 12/04/2014 10:50 PM, Alex Shulgin wrote: Is there a way to pause the bootstrap process so I can attach gdb to it?

With a newer gdb, you can instead tell gdb to follow all forks. I wrote some notes on it recently: http://blog.2ndquadrant.com/processes-breakpoints-watchpoints-postgresql/ I've found it really handy when debugging crashes in regression tests. However, it's often simpler to just:

ulimit -c unlimited
make check

(or whatever your crashing test is) then look at the core files that are produced.

Good one, it didn't occur to me that assert makes core files. Figured it out with a pg_usleep in bootstrap.c anyway. Does this look sane?

DEBUG: inserting column 6 value 0
DEBUG: inserted -> 0
DEBUG: inserting column 7 value varchar_transform
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)

Program received signal SIGABRT, Aborted.
0x7f2757128d27 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7f2757128d27 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f275712a418 in __GI_abort () at abort.c:89
#2  0x009088b2 in ExceptionalCondition (
    conditionName=0xac0710 "!(((xmax) >= ((TransactionId) 3)))",
    errorType=0xac01d8 "FailedAssertion",
    fileName=0xac0178 "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c",
    lineNumber=1414)
    at /home/ash/src/postgresql/src/backend/utils/error/assert.c:54
#3  0x0079e125 in GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
    at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1414
#4  0x0094e02d in GetNonHistoricCatalogSnapshot (relid=1255)
    at /home/ash/src/postgresql/src/backend/utils/time/snapmgr.c:298
#5  0x0094dfdd in GetCatalogSnapshot (relid=1255)
    at /home/ash/src/postgresql/src/backend/utils/time/snapmgr.c:272
#6  0x004c8f0d in systable_beginscan (heapRelation=0x1d0e5c0, indexId=2691,
    indexOK=1 '\001', snapshot=0x0, nkeys=1, key=0x7fff201bbc40)
    at /home/ash/src/postgresql/src/backend/access/index/genam.c:275
#7  0x00885070 in regprocin (fcinfo=0x7fff201bbce0)
    at /home/ash/src/postgresql/src/backend/utils/adt/regproc.c:106
#8  0x00914fe7 in InputFunctionCall (flinfo=0x7fff201bc0c0,
    str=0x1d349b8 "varchar_transform", typioparam=24, typmod=-1)
    at /home/ash/src/postgresql/src/backend/utils/fmgr/fmgr.c:1914
#9  0x0091533e in OidInputFunctionCall (functionId=44,
    str=0x1d349b8 "varchar_transform", typioparam=24, typmod=-1)
    at /home/ash/src/postgresql/src/backend/utils/fmgr/fmgr.c:2045
#10 0x0052af91 in InsertOneValue (value=0x1d349b8 "varchar_transform", i=7)
    at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:840
#11 0x00527409 in boot_yyparse ()
    at /home/ash/src/postgresql/src/backend/bootstrap/bootparse.y:455
#12 0x0052a26b in BootstrapModeMain ()
    at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:494
#13 0x0052a177 in AuxiliaryProcessMain (argc=6, argv=0x1cc8378)
    at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:414
#14 0x006a327c in main (argc=7, argv=0x1cc8370)
    at /home/ash/src/postgresql/src/backend/main/main.c:212
(gdb)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Postgres 9.4.0 to be released week after next
After due consideration, the core committee has agreed it's time to push this puppy out the door. We will wrap 9.4.0 on Monday Dec 15 for public announcement Thursday 18th. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSL regression test suite
On Thu, Dec 04, 2014 at 02:42:41PM +0200, Heikki Linnakangas wrote: On 10/06/2014 04:21 PM, Heikki Linnakangas wrote: This probably needs some further cleanup before it's ready for committing. One issues is that it creates a temporary cluster that listens for TCP connections on localhost, which isn't safe on a multi-user system. This issue remains. There isn't much we can do about it; SSL doesn't work over Unix domain sockets. We could make it work, but that's a whole different feature. How do people feel about including this test suite in the source tree? It's probably not suitable for running as part of make check-world, What makes it unsuitable? but it's extremely handy if you're working on a patch related to SSL. I'd like to commit this, even if it has some rough edges. That way we can improve it later, rather than have it fall into oblivion. Any objections? Not from me :) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSL regression test suite
Heikki Linnakangas hlinnakan...@vmware.com writes: On 10/06/2014 04:21 PM, Heikki Linnakangas wrote: This probably needs some further cleanup before it's ready for committing. One issues is that it creates a temporary cluster that listens for TCP connections on localhost, which isn't safe on a multi-user system. This issue remains. There isn't much we can do about it; SSL doesn't work over Unix domain sockets. We could make it work, but that's a whole different feature. How do people feel about including this test suite in the source tree? It's probably not suitable for running as part of make check-world, but it's extremely handy if you're working on a patch related to SSL. I'd like to commit this, even if it has some rough edges. That way we can improve it later, rather than have it fall into oblivion. Any objections? As long as it's not run by any standard target, and there's some documentation explaining why not, I see no reason it can't be in the tree. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sequence Access Method WIP
On 12/3/14, 8:50 AM, José Luis Tallón wrote: May I possibly suggest a file-per-schema model instead? This approach would certainly solve the excessive i-node consumption problem that --I guess-- Andres is trying to address here. I don't think that really has any advantages. Just spreading the I/O load, nothing more, it seems: Just to elaborate a bit on the reasoning, for completeness' sake: Given that a relation's segment maximum size is 1GB, we'd have (1048576/8)=128k sequences per relation segment. Arguably, not many real use cases will have that many sequences save for *massively* multi-tenant databases. The downside being that all that random I/O --- in general, it can't really be sequential unless there are very very few sequences--- can't be spread to other spindles. Create a sequence_default_tablespace GUC + ALTER SEQUENCE SET TABLESPACE, to use an SSD for this purpose maybe? Why not? RAID arrays typically use stripe sizes in the 128-256k range, which means only 16 or 32 sequences per stripe. It still might make sense to allow controlling what tablespace a sequence is in, but IMHO the default should just be pg_default. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] why is PG_AUTOCONF_FILENAME is pg_config_manual.h?
On 11/27/14 9:58 AM, Peter Eisentraut wrote: Surely that's not a value that we expect users to be able to edit. Is pg_config_manual.h just abused as a place that's included everywhere? (I suggest utils/guc.h as a better place.) This has been fixed. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugfix and new feature for PGXS
On 11/19/14 11:11 PM, Peter Eisentraut wrote: I noticed this item was still in the 9.4 code. Looking at the original submission (http://www.postgresql.org/message-id/201306181552.36673.ced...@2ndquadrant.com, patch 0001), I think the original reason for adding this was wrong to begin with. I have applied all three patches to 9.4 and 9.5. So this issue is now resolved. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alex Shulgin wrote:

DEBUG: inserting column 7 value varchar_transform

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
    at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413            xmax = ShmemVariableCache->latestCompletedXid;
(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to make distclean, then recompile. If you're not passing --enable-depend to configure, I suggest you do so; that greatly reduces such problems.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alvaro Herrera alvhe...@2ndquadrant.com writes:

Alex Shulgin wrote:

DEBUG: inserting column 7 value varchar_transform

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
    at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413            xmax = ShmemVariableCache->latestCompletedXid;
(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to make distclean, then recompile. If you're not passing --enable-depend to configure, I suggest you do so; that greatly reduces such problems.

Well I tried that before posting the complaint, let me try again though.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alex Shulgin a...@commandprompt.com writes:

Figured it out with a pg_usleep in bootstrap.c anyway. Does this look sane?

DEBUG: inserting column 6 value 0
DEBUG: inserted -> 0
DEBUG: inserting column 7 value varchar_transform
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)

I've tried to debug this and I feel really dumbfounded...

DEBUG: inserting column 7 value varchar_transform

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
    at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413            xmax = ShmemVariableCache->latestCompletedXid;
(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295
(gdb) p *ShmemVariableCache
$2 = {nextOid = 1, oidCount = 0, nextXid = 3, oldestXid = 3,
  xidVacLimit = 20003, xidWarnLimit = 2136483650, xidStopLimit = 2146483650,
  xidWrapLimit = 2147483650, oldestXidDB = 1, oldestCommitTs = 1,
  newestCommitTs = 0, latestCompletedXid = 4294967295}
(gdb) p xmax
$3 = 0
(gdb) n
1414            Assert(TransactionIdIsNormal(xmax));
(gdb) p xmax
$4 = 1
(gdb) p *ShmemVariableCache
$5 = {nextOid = 1, oidCount = 0, nextXid = 3, oldestXid = 3,
  xidVacLimit = 20003, xidWarnLimit = 2136483650, xidStopLimit = 2146483650,
  xidWrapLimit = 2147483650, oldestXidDB = 1, oldestCommitTs = 1,
  newestCommitTs = 0, latestCompletedXid = 4294967295}
(gdb) p ShmemVariableCache->latestCompletedXid
$6 = 4294967295
(gdb)

How? Is there another concurrent process with an old view of the VariableCacheData struct, where latestCompletedXid still points to oldestCommitTs? This only happens with the CommitTs commit in effect.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Alvaro Herrera alvhe...@2ndquadrant.com writes:

Alex Shulgin wrote:

DEBUG: inserting column 7 value varchar_transform

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
    at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413            xmax = ShmemVariableCache->latestCompletedXid;
(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to make distclean, then recompile. If you're not passing --enable-depend to configure, I suggest you do so; that greatly reduces such problems.

Yeah, that did it. Sorry for the noise, I must have messed it up with the recompilations (note to self to always use --enable-depend).

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sequence Access Method WIP
May I possibly suggest a file-per-schema model instead? This approach would certainly solve the excessive i-node consumption problem that --I guess-- Andres is trying to address here. I don't think that really has any advantages. Just spreading the I/O load, nothing more, it seems: Just to elaborate a bit on the reasoning, for completeness' sake: Given that a relation's segment maximum size is 1GB, we'd have (1048576/8)=128k sequences per relation segment. Arguably, not many real use cases will have that many sequences save for *massively* multi-tenant databases. The downside being that all that random I/O --- in general, it can't really be sequential unless there are very very few sequences--- can't be spread to other spindles. Create a sequence_default_tablespace GUC + ALTER SEQUENCE SET TABLESPACE, to use an SSD for this purpose maybe? (I could take a shot at the patch, if deemed worthwhile) I think that's just so far outside the sane use cases that I really don't see us adding complexity to rein it in. If your frequently used sequences get flushed out to disk by anything but the checkpointer the schema is just badly designed... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Dec 4, 2014 at 3:04 AM, Craig Ringer cr...@2ndquadrant.com wrote: Yes, I think that's pretty important. With a negative attno so it's treated as a hidden col that must be explicitly named to be shown and won't be confused with user columns. I think that the standard for adding a new system attribute ought to be enormous. The only case where a new one was added post-Postgres95 was tableoid. I'm pretty sure that others aren't going to want to do it that way. Besides, I'm not entirely convinced that this is actually an important distinction to expose. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSL regression test suite
Heikki Linnakangas wrote: How do people feel about including this test suite in the source tree? +1 It's probably not suitable for running as part of make check-world, but it's extremely handy if you're working on a patch related to SSL. I'd like to commit this, even if it has some rough edges. That way we can improve it later, rather than have it fall into oblivion. Any objections? To prevent it from breaking, one idea is to have one or more buildfarm animals that run this test as a separate module. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq pipelining
On Thursday, December 04, 2014 10:30:46 PM Craig Ringer wrote: On 12/04/2014 05:08 PM, Heikki Linnakangas wrote: A good API is crucial for this. It should make it easy to write an application that does pipelining, and to handle all the error conditions in a predictable way. I'd suggest that you write the documentation first, before writing any code, so that we can discuss the API. It doesn't have to be in SGML format yet, a plain-text description of the API will do. I strongly agree. First pass at the documentation changes attached, along with a new example that demonstrates pipelining 3 queries, with the middle one resulting in a PGRES_FATAL_ERROR response. With the API i am proposing, only 2 new functions (PQgetFirstQuery, PQgetLastQuery) are required to be able to match each result to the query that caused it. Another function, PQgetNextQuery allows iterating through the pending queries, and PQgetQueryCommand permits getting the original query text. Adding the ability to set a user supplied pointer on the PGquery struct might make it much easier for some frameworks, and other users might want a callback, but I don't think either are required. Applications need to be able to reliably predict what will happen if there's an error in the middle of a pipeline. Yes, the API i am proposing makes it easy to get results for each submitted query independently of the success or failure of previous queries in the pipeline. Consideration of implicit transactions (autocommit), the whole pipeline being one transaction, or multiple transactions is needed. The more I think about this the more confident I am that no extra work is needed. Unless we start doing some preliminary processing of the query inside of libpq, our hands are tied wrt sending a sync at the end of each query. 
The reason for this is that we rely on the ReadyForQuery message to indicate the end of a query, so without the sync there is no way to tell if the next result is from another statement in the current query, or the first statement in the next query. I also don't see a reason to need multiple queries without a sync statement. If the user wants all queries to succeed or fail together it should be no problem to start the pipeline with begin and complete it with commit. But I may be missing some detail...

Apps need to be able to wait for the result of a query partway through a pipeline, e.g. scheduling four queries, then waiting for the result of the 2nd.

Right. With the API I am proposing the user does have to process each result until it gets to the one it wants, but it's no problem doing that. It would also be trivial to add a function

	PGresult *PQgetNextQueryResult(PGquery *query);

that discards all results from previous queries. Very similar to how a PQexec disregards all results from previous async queries. It would also be possible to queue the results and be able to retrieve them out of order, but I think that adds unnecessary complexity and might also make it easy for users to never retrieve and free some results.

There are probably plenty of other wrinkly bits to think about.

Yup, I'm sure I'm still missing some significant things at this point...

Matt Newell

/*
 * src/test/examples/testlibpqpipeline2.c
 *
 * testlibpqpipeline.c
 *		this test program tests query pipelining.  It shows how to issue
 *		multiple pipelined queries, and identify from which query a result
 *		originated.  It also demonstrates how failure of one query does not
 *		impact subsequent queries when they are not part of the same
 *		transaction.
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#include "libpq-fe.h"

static void
checkResult(PGconn *conn, PGresult *result, PGquery *query,
			int expectedResultStatus)
{
	if (PQresultStatus(result) != expectedResultStatus)
	{
		printf("Got unexpected result status '%s', expected '%s'\nQuery:%s\n",
			   PQresStatus(PQresultStatus(result)),
			   PQresStatus(expectedResultStatus),
			   PQgetQueryCommand(query));
		PQclear(result);
		PQclear(PQexec(conn, "DROP TABLE test"));
		PQfinish(conn);
		exit(1);
	}
	PQclear(result);
}

int
main(int argc, char **argv)
{
	PGconn	   *conn;
	PGquery    *query1;
	PGquery    *query2;
	PGquery    *query3;
	PGquery    *curQuery;
	PGresult   *result;

	conn = NULL;
	query1 = query2 = query3 = curQuery = NULL;
	result = NULL;

	/* make a connection to the database */
	conn = PQsetdb(NULL, NULL, NULL, NULL, NULL);

	/* check to see that the backend connection was successfully made */
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "Connection to database failed: %s",
				PQerrorMessage(conn));
		exit(1);
	}

	checkResult(conn, PQexec(conn, "DROP TABLE IF EXISTS test"), NULL,
				PGRES_COMMAND_OK);
	checkResult(conn, PQexec(conn, "CREATE TABLE test ( id SERIAL PRIMARY KEY )"),
				NULL, PGRES_COMMAND_OK);

	PQsendQuery(conn,
				"INSERT INTO test(id) VALUES (DEFAULT),(DEFAULT) RETURNING id");
	query1 = PQgetLastQuery(conn);

	/*
Re: [HACKERS] libpq pipelining
On Thu, Dec 4, 2014 at 4:11 PM, Matt Newell newe...@blur.com wrote: With the API I am proposing, only 2 new functions (PQgetFirstQuery, PQgetLastQuery) are required to be able to match each result to the query that caused it. Another function, PQgetNextQuery allows iterating through the pending queries, and PQgetQueryCommand permits getting the original query text. Adding the ability to set a user-supplied pointer on the PGquery struct might make it much easier for some frameworks, and other users might want a callback, but I don't think either is required.

With a pointer on PGquery you wouldn't need any of the above. Whoever wants the query text sets it as the pointer; whoever wants some other struct sets that as the pointer. You would only need to be careful about the lifetime of the pointed struct, but that onus is on the application I'd say. The API only needs to provide some guarantees about how long or short it holds onto that pointer.

I'm thinking this would be somewhat necessary for a python wrapper, like psycopg2 (the wrapper could build a dictionary based on query text, but there's no guarantee that query text will be unique so it'd be very tricky).

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugfix and new feature for PGXS
On 12/4/14 11:38 AM, Peter Eisentraut wrote: On 11/19/14 11:11 PM, Peter Eisentraut wrote: I noticed this item was still in the 9.4 code. Looking at the original submission (http://www.postgresql.org/message-id/201306181552.36673.ced...@2ndquadrant.com, patch 0001), I think the original reason for adding this was wrong to begin with. I have applied all three patches to 9.4 and 9.5. So this issue is now resolved. Apparently, some buildfarm module is unhappy with this: http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sittelladt=2014-12-04%2017%3A16%3A29stg=RedisFDW-build Is there some custom code running there? I don't know how that error would happen otherwise?
Re: [HACKERS] libpq pipelining
On Thursday, December 04, 2014 04:30:27 PM Claudio Freire wrote: On Thu, Dec 4, 2014 at 4:11 PM, Matt Newell newe...@blur.com wrote: With the API I am proposing, only 2 new functions (PQgetFirstQuery, PQgetLastQuery) are required to be able to match each result to the query that caused it. Another function, PQgetNextQuery allows iterating through the pending queries, and PQgetQueryCommand permits getting the original query text. Adding the ability to set a user-supplied pointer on the PGquery struct might make it much easier for some frameworks, and other users might want a callback, but I don't think either is required.

With a pointer on PGquery you wouldn't need any of the above. Whoever wants the query text sets it as the pointer; whoever wants some other struct sets that as the pointer.

libpq already stores the (current) query text as it's used in some error cases, so that's not really optional without breaking backwards compatibility. Adding another pointer for the user to optionally utilize should be no big deal though if everyone agrees it's a good thing.

You would only need to be careful about the lifetime of the pointed struct, but that onus is on the application I'd say. The API only needs to provide some guarantees about how long or short it holds onto that pointer.

Agreed.

I'm thinking this would be somewhat necessary for a python wrapper, like psycopg2 (the wrapper could build a dictionary based on query text, but there's no guarantee that query text will be unique so it'd be very tricky).

While it might make some things simpler, I really don't think it's absolutely necessary since the wrapper can maintain a queue that corresponds to libpq's internal queue of PGquery's. I.e., each time you call a PQsendQuery* function you push your required state, and each time the return value of PQgetFirstQuery changes you pop from the queue.
Re: [HACKERS] superuser() shortcuts
On 11/21/14 10:28 AM, Stephen Frost wrote: Specifically: --- One curious item to note is that the current if(!superuser()) {} block approach has masked an inconsistency in at least the FDW code which required a change to the regression tests- namely, we normally force FDW owners to have USAGE rights on the FDW, but that was being bypassed when a superuser makes the call. I seriously doubt that was intentional. I won't push back if someone really wants it to stay that way though. ---

I think there are some inconsistencies here. To rephrase the problem:

1. To run CREATE SERVER, you need USAGE rights on the FDW.
2. To run ALTER SERVER OWNER as an unprivileged user, the new owner also needs USAGE rights on the FDW.
3. When you run ALTER SERVER OWNER as superuser, the new owner does not need USAGE rights on the FDW.

The proposal is to change #3 to require the new owner to have USAGE privileges even when running as superuser. Let's look at an analogous case:

1. To run CREATE DOMAIN, you need USAGE rights on the underlying type.
2. To run ALTER DOMAIN OWNER, the receiving user does not need USAGE rights on the underlying type.

These two cases are inconsistent. There might be more analogous cases involving other object classes. Solution, either:

A. We allow unprivileged users to give away objects to other unprivileged users who do not have the privileges they would have needed to create the object themselves. This could be a marginally useful feature. But it might also violate the privilege system, because it effectively allows you to bypass the grant options.

or

B. For unprivileged users to give away an object to another unprivileged user, the receiving user must have sufficient privileges that they could have created the object themselves.

B.i. Superusers can bypass that restriction.
B.ii. Superusers cannot bypass that restriction.

Which one shall it be?
Re: [HACKERS] superuser() shortcuts
On Tue, Dec 2, 2014 at 4:11 PM, Stephen Frost sfr...@snowman.net wrote: It's pretty clear that the current message is complaining about a permissions problem, so the fact that it doesn't specifically include the words "permission" and "denied" doesn't bother me. Let me ask the question again: what information do you think is conveyed by your proposed rewrite that is not conveyed by the current message?

I do like that it's explicitly stating that the problem is with permissions and not because of some lack-of-capability in the software, and I like the consistency of messaging (to the point where I was suggesting somewhere along the way that we update the error messaging documentation to be explicit that the errmsg and the errcode should make sense and match up- NOT that we should use some simple mapping from one to the other as we'd then be reducing the information provided, but I've seen a number of cases where people have the SQL error-code also returned in their psql session; I've never been a fan of that as I much prefer words over error-codes, but I wonder if they've done that because, in some cases, it *isn't* clear what the errcode is from the errmsg..).

I think you're dodging my question. The answer is that there really *isn't* any additional information conveyed by your proposed rewrite; the issue is simply that you prefer your version to the current version. Which is fair enough, but my preference is different (as is that of Andres), which is also fair.

It provides a consistency of messaging that I feel is an improvement. That they aren't related in a number of existing cases strikes me as an opportunity to improve rather than cases where we really know better.

Let's consider this case, which perhaps you haven't examined:

    if (es.buffers && !es.analyze)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("EXPLAIN option BUFFERS requires ANALYZE")));

This is basically analogous to the case you're complaining about.
The error message is short and to the point and does not include the words that appear in the error code. Now, you could propose to change this, but any rewording of it is going to be longer than what we have now for no real improvement in clarity, and it's going to force translators to do the extra work of retranslating it for no particular gain. The burden for changing existing messages is not high, but your desire for a different style is not sufficient to go change half the error messages in the backend, or even 20% of them, and I think that 20% is a conservative estimate of how many we'd have to change to actually achieve what you're talking about here.

Perhaps in some of those cases the problem is that there isn't a good errcode, and maybe we should actually add an error code instead of changing the message.

The error codes are mostly taken from the SQL standard. It is of course possible that we should add our own in some cases, but it will have the disadvantage of reducing the ability of applications written for other systems to understand our error codes.

If we want to say that role-attribute-related error messages should be "You need X to do Y", then I'll go make those changes, removing the "permission denied" where those are used, and be happier that we're at least consistent, but it'll be annoying if we ever make those role-attributes be GRANT'able rights (GRANT .. ON CLUSTER?) as those errmsg's will then change from "you need X" to "permission denied to do Y". Having the errdetail change bothers me a lot less.

You can't really put "you need X to do Y" in the error message, I think. That style is appropriate for an error detail, but not an error message. I just don't understand why you want to pointlessly tinker with this. If this is integrally related to the introduction of finer-grained superuser permissions, then perhaps it is worth doing, but if that's the motivation, you haven't made an argument that I can understand as to why it's necessary.
What I think is that the current messages - or at least the ones you are complaining about - are fine, and we can change them on a case-by-case basis as the evolution of the system demands, but that we shouldn't go whack them around just because you would have chosen differently from among sane alternatives than what the original author chose to do.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] superuser() shortcuts
* Peter Eisentraut (pete...@gmx.net) wrote: On 11/21/14 10:28 AM, Stephen Frost wrote: Specifically: --- One curious item to note is that the current if(!superuser()) {} block approach has masked an inconsistency in at least the FDW code which required a change to the regression tests- namely, we normally force FDW owners to have USAGE rights on the FDW, but that was being bypassed when a superuser makes the call. I seriously doubt that was intentional. I won't push back if someone really wants it to stay that way though. --- I think there are some inconsistencies here. Agreed.. We certainly have a number of inconsistencies and I'd love to reduce them. :) To rephrase the problem: 1. To run CREATE SERVER, you need USAGE rights on FDW. 2. To run ALTER SERVER OWNER as unprivileged user, the new owner also needs USAGE rights on FDW. 3. When you run ALTER SERVER OWNER as superuser, the new owner does not need USAGE rights on FDW. - The proposal is to change #3 to require the new owner to have USAGE privileges even when running as superuser. On reflection, this seemed odd because of how the code was written but perhaps it was intentional after all. In general, superuser should be able to bypass permissions restrictions and I don't see why this case should be different. Let's look at an analogous case: 1. To run CREATE DOMAIN, you need USAGE rights on the underlying type. 2. To run ALTER DOMAIN OWNER, the receiving user does not need USAGE rights on the underlying type. These two cases are inconsistent. There might be more analogous cases involving other object classes. Solution, either: A. We allow unprivileged users to give away objects to other unprivileged users who do not have the privileges that they could have created the object themselves. This could be a marginally useful feature. But it might also violate the privilege system, because it effectively allows you to bypass the grant options. or B. 
For unprivileged users to give away an object to another unprivileged user, the receiving user must have the privileges so that they could have created the object themselves.

In general, I don't think we want to allow giving away of objects by unprivileged users. We don't allow that to be done for tables and I'm surprised to hear that it's possible to give domains away.

B.i. Superuser can bypass that restriction. B.ii. Superusers cannot bypass that restriction.

Superuser should be able to bypass the restriction, BUT the object given away by the superuser to an unprivileged user should NOT be able to be further given away by that unprivileged user.

Thanks! Stephen
Re: [HACKERS] superuser() shortcuts
On 11/26/14 10:24 AM, Stephen Frost wrote: The implementation detail is that it's not part of the normal GRANT/REVOKE privilege system, which is why it's useful to note it in the detail and why we don't need to add an errdetail along the lines of 'You must have SELECT rights on relation X to SELECT from it'.

I don't agree with this argument, but I might agree with the conclusion. ;-)

I think in the past, error messages for permission problems were effectively written according to the criterion: If I can explain the reason for the lack of permission in one short line, then I will, otherwise I will just produce a generic 'permission denied' error and have the user read the manual for the details. The proposed change is effectively: I will produce a generic 'permission denied' error, and if the reason for the lack of permission is anything other than GRANT/REVOKE, then I will add it to the detail message. That's not necessarily an invalid change, but it implies that there is something special (or less special) about GRANT/REVOKE, and there is no consensus on that.

Seeing that we are planning to add more permissions systems of various kinds, I don't think it would be bad to uniformly add 'You must have SELECT rights on relation X to SELECT from it' detail messages. The proposed changes would then be a subset of that.
Re: [HACKERS] Bugfix and new feature for PGXS
On 12/04/2014 02:44 PM, Peter Eisentraut wrote: On 12/4/14 11:38 AM, Peter Eisentraut wrote: On 11/19/14 11:11 PM, Peter Eisentraut wrote: I noticed this item was still in the 9.4 code. Looking at the original submission (http://www.postgresql.org/message-id/201306181552.36673.ced...@2ndquadrant.com, patch 0001), I think the original reason for adding this was wrong to begin with. I have applied all three patches to 9.4 and 9.5. So this issue is now resolved. Apparently, some buildfarm module is unhappy with this: http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sittelladt=2014-12-04%2017%3A16%3A29stg=RedisFDW-build Is there some custom code running there? I don't know how that error would happen otherwise?

You have broken two buildfarm instances that build and test external modules - in one case the Redis FDW module and in the other the File Text Array FDW. I will see what can be retrieved.

cheers andrew
Re: [HACKERS] superuser() shortcuts
* Robert Haas (robertmh...@gmail.com) wrote: On Tue, Dec 2, 2014 at 4:11 PM, Stephen Frost sfr...@snowman.net wrote: It's pretty clear that the current message is complaining about a permissions problem, so the fact that it doesn't specifically include the words "permission" and "denied" doesn't bother me. Let me ask the question again: what information do you think is conveyed by your proposed rewrite that is not conveyed by the current message?

I do like that it's explicitly stating that the problem is with permissions and not because of some lack-of-capability in the software, and I like the consistency of messaging (to the point where I was suggesting somewhere along the way that we update the error messaging documentation to be explicit that the errmsg and the errcode should make sense and match up- NOT that we should use some simple mapping from one to the other as we'd then be reducing the information provided, but I've seen a number of cases where people have the SQL error-code also returned in their psql session; I've never been a fan of that as I much prefer words over error-codes, but I wonder if they've done that because, in some cases, it *isn't* clear what the errcode is from the errmsg..).

I think you're dodging my question.

My answer included "not because of some lack-of-capability", which was intended to answer your question of "what information do you think is conveyed by your proposed rewrite that is not conveyed by the current message?" I have a hard time wrapping my head around what a *lot* of our users ask when they see a given error message, but if our error message is 'you must have a pear-shaped object to run this command' then I imagine that a new-to-PG user might think "well, I can't create pear-shaped objects in PG, so I guess this just isn't supported". That's not necessarily any fault of ours, but I do think "permission denied" reduces the chance that there's any confusion about the situation.
The answer is that there really *isn't* any additional information conveyed by your proposed rewrite;

To be sure it's clear, I *don't* agree with this. You and I don't see any additional information in it because we're familiar with the system and know all about role attributes, the GRANT system, and everything else. I'm not looking to change the error message because it's going to make it clearer to you or to me or to anyone else on this list though.

the issue is simply that you prefer your version to the current version. Which is fair enough, but my preference is different (as is that of Andres), which is also fair.

I do prefer my version but it's not an unfounded preference.

It provides a consistency of messaging that I feel is an improvement. That they aren't related in a number of existing cases strikes me as an opportunity to improve rather than cases where we really know better.

Let's consider this case, which perhaps you haven't examined:

    if (es.buffers && !es.analyze)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("EXPLAIN option BUFFERS requires ANALYZE")));

This is basically analogous to the case you're complaining about. The error message is short and to the point and does not include the words that appear in the error code. Now, you could propose to change this, but any rewording of it is going to be longer than what we have now for no real improvement in clarity, and it's going to force translators to do the extra work of retranslating it for no particular gain.

I'm on the fence about whether this is really the same case. In this case, changing 'option' to 'parameter' would make it pretty darn close to the error code. I could also see making it longer and more explicit- 'invalid parameter (BUFFERS) for EXPLAIN' with an errdetail of what's above.
The burden for changing existing messages is not high, but your desire for a different style is not sufficient to go change half the error messages in the backend, or even 20% of them, and I think that 20% is a conservative estimate of how many we'd have to change to actually achieve what you're talking about here. The different style is what's in the error style guidelines. I agree that it could lead to a lot of changes and I don't think we'd need to change them all in one shot or in one release, in part because of the burden it would put on translators. Perhaps in some of those cases the problem is that there isn't a good errcode, and maybe we should actually add an error code instead of changing the message. The error codes are mostly taken from the SQL standard. It is of course possible that we should add our own in some cases, but it will have the disadvantage of reducing the ability of applications written for other systems to understand our error codes. This, in particular, doesn't bother me terribly much- if there's reason enough for
Re: [HACKERS] superuser() shortcuts
* Peter Eisentraut (pete...@gmx.net) wrote: I will produce a generic 'permission denied' error, and if the reason for the lack of permission is anything other than GRANT/REVOKE, then I will add it to the detail message.

That's what I had been thinking, on the assumption that individuals with SQL-spec-type systems would be familiar with the GRANT/REVOKE system, but..

Seeing that we are planning to add more permissions systems of various kinds, I don't think it would be bad to uniformly add 'You must have SELECT rights on relation X to SELECT from it' detail messages. The proposed changes would then be a subset of that.

I'd be fine with that. It would mean an extra line of output in many cases but we could at least be consistent across the backend with regard to how these cases are handled...

Thanks! Stephen
Re: [HACKERS] Bugfix and new feature for PGXS
On 12/04/2014 03:47 PM, Andrew Dunstan wrote: On 12/04/2014 02:44 PM, Peter Eisentraut wrote: On 12/4/14 11:38 AM, Peter Eisentraut wrote: On 11/19/14 11:11 PM, Peter Eisentraut wrote: I noticed this item was still in the 9.4 code. Looking at the original submission (http://www.postgresql.org/message-id/201306181552.36673.ced...@2ndquadrant.com, patch 0001), I think the original reason for adding this was wrong to begin with. I have applied all three patches to 9.4 and 9.5. So this issue is now resolved. Apparently, some buildfarm module is unhappy with this: http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sittelladt=2014-12-04%2017%3A16%3A29stg=RedisFDW-build Is there some custom code running there? I don't know how that error would happen otherwise? You have broken two buildfarm instances that build and test external modules - in one case the Redis FDW module and in the other the File Text Array FDW. I will see what can be retrieved.

I think this needs to be reverted. This has broken two modules that are not even using vpath builds. You can see the relevant Makefiles at: https://github.com/adunstan/file_text_array_fdw/blob/master/Makefile and https://github.com/pg-redis-fdw/redis_fdw/blob/master/Makefile. IIRC, the code you found convoluted and removed was required precisely to prevent this sort of error.

cheers andrew
[HACKERS] Re: [COMMITTERS] pgsql: Support frontend-backend protocol communication using a shm_mq.
Robert Haas wrote: Support frontend-backend protocol communication using a shm_mq.

I just noticed that this patch broke the case where a standalone backend is sent a query that throws an error -- instead, we get a segmentation fault. Example:

    echo foobar | postgres --single
    PostgreSQL stand-alone backend 9.5devel
    backend> Segmentation fault

I guess we could have a src/test/modules subdir that tests simple stuff on standalone backends ...

-- Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] duplicated partial-column assigns allowed by checkInsertTargets
While perusing checkInsertTargets I noticed that it allows duplicated assignments to the same member of a composite:

    alvherre=# create type f as (a int, b int);
    CREATE TYPE
    alvherre=# create table t (col f);
    CREATE TABLE
    alvherre=# insert into t (col.a, col.b, col.a) values (42, 43, 44);
    INSERT 0 1
    alvherre=# select * from t;
       col
    ---------
     (44,43)
    (1 fila)

If you instead try a duplicate col, it is rightfully rejected:

    alvherre=# insert into t (col, col) values ((42, 43), (44, 43));
    ERROR:  column "col" specified more than once
    LÍNEA 1: insert into t (col, col) values ((42, 43), (44, 43));
                                 ^

Isn't this a bit odd?

-- Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] controlling psql's use of the pager a bit more
Andrew Dunstan wrote: However, there is more work to do. As Tom noted upthread, psql's calculation of the number of lines is pretty bad. For example, if I do:

    \pset pager_min_lines 100
    select * from generate_series(1,50);

the pager still gets invoked, which is unfortunate to say the least. So I'm going to take a peek at that.

Any luck there?

-- Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] superuser() shortcuts
On 2014-12-04 15:59:17 -0500, Stephen Frost wrote: I have a hard time wrapping my head around what a *lot* of our users ask when they see a given error message, but if our error message is 'you must have a pear-shaped object to run this command' then I imagine that a new-to-PG user might think "well, I can't create pear-shaped objects in PG, so I guess this just isn't supported". That's not necessarily any fault of ours, but I do think "permission denied" reduces the chance that there's any confusion about the situation.

I've a hard time taking this comment seriously. I can't believe that you think that comment bears relation to the error message we're discussing.

The answer is that there really *isn't* any additional information conveyed by your proposed rewrite;

To be sure it's clear, I *don't* agree with this. You and I don't see any additional information in it because we're familiar with the system and know all about role attributes, the GRANT system, and everything else. I'm not looking to change the error message because it's going to make it clearer to you or to me or to anyone else on this list though.

The different style is what's in the error style guidelines.

I think you're vastly over-interpreting the guidelines because that happens to suit your position. None of the current error messages violates any of: The primary message should be short, factual, and avoid reference to implementation details such as specific function names. Short means should fit on one line under normal conditions. Use a detail message if needed to keep the primary message short, or if you feel a need to mention implementation details such as the particular system call that failed. Both primary and detail messages should be factual. Use a hint message for suggestions about what to do to fix the problem, especially if the suggestion might not always be applicable.

And I don't for a second buy your argument that the permissions involved are an implementation detail.
If you say that you like the new message better because it's more consistent or more beautiful I can buy that. But don't try to find justifications in guidelines when they don't contain that.

I just don't understand why you want to pointlessly tinker with this.

Because I don't feel it's pointless to improve the consistency of the error messaging and I don't like that it's inconsistent today.

Then please do so outside of patches/threads that do something pretty much unrelated.

Greetings, Andres Freund
Re: [HACKERS] libpq pipelining
On 12/04/2014 09:11 PM, Matt Newell wrote: With the API I am proposing, only 2 new functions (PQgetFirstQuery, PQgetLastQuery) are required to be able to match each result to the query that caused it. Another function, PQgetNextQuery allows iterating through the pending queries, and PQgetQueryCommand permits getting the original query text. Adding the ability to set a user-supplied pointer on the PGquery struct might make it much easier for some frameworks, and other users might want a callback, but I don't think either is required.

I don't like exposing the PGquery struct to the application like that. Access to all other libpq objects is done via functions. The application can't (or shouldn't, anyway) directly access the fields of PGresult, for example. It has to call PQnfields(), PQntuples() etc.

The user-supplied pointer seems quite pointless. It would make sense if the pointer was passed to PQsendQuery(), and you'd get it back in PGquery. You could then use it to tag the query when you send it with whatever makes sense for the application, and use the tag in the result to match it with the original query. But as it stands, I don't see the point.

The original query string might be handy for some things, but for others it's useless. It's not enough as a general method to identify the query the result belongs to. A common use case for this is to execute the same query many times with different parameters. So I don't think you've quite nailed the problem of how to match the results to the commands that originated them, yet.

One idea is to add a function that can be called after PQgetResult(), to get some identifier of the original command. But there needs to be a mechanism to tag the PQsendQuery() calls. Or you can assign each call a unique ID automatically, and have a way to ask for that ID after calling PQsendQuery(). The explanation of PQgetFirstQuery makes it sound pretty hard to match up the result with the query. You have to pay attention to PQisBusy.
It would be good to make it explicit when you start a pipelined operation. Currently, you get an error if you call PQsendQuery() twice in a row, without reading the result in between. That's a good thing, to catch application errors, when you're not trying to do pipelining. Otherwise, if you forget to get the result of a query you've sent, and then send another query, you'll merrily read the result of the first query and think that it belongs to the second.

Are you trying to support continuous pipelining, where you send new queries all the time, and read results as they arrive, without ever draining the pipe? Or are you just trying to do batches, where you send a bunch of queries, and wait for all the results to arrive, before sending more? A batched API would be easier to understand and work with, although a continuous pipeline could be more efficient for an application that can take advantage of it. Consideration of implicit transactions (autocommit), the whole pipeline being one transaction, or multiple transactions is needed.

The more I think about this the more confident I am that no extra work is needed. Unless we start doing some preliminary processing of the query inside of libpq, our hands are tied wrt sending a sync at the end of each query. The reason for this is that we rely on the ReadyForQuery message to indicate the end of a query, so without the sync there is no way to tell if the next result is from another statement in the current query, or the first statement in the next query. I also don't see a reason to need multiple queries without a sync statement. If the user wants all queries to succeed or fail together it should be no problem to start the pipeline with begin and complete it with commit. But I may be missing some detail...

True. It makes me a bit uneasy, though, to not be sure that the whole batch is committed or rolled back as one unit. There are many ways the user can shoot himself in the foot with that.
Error handling would be a lot simpler if you would only send one Sync for the whole batch. Tom explained it better on this recent thread: http://www.postgresql.org/message-id/32086.1415063...@sss.pgh.pa.us.

Another thought is that for many applications, it would actually be OK to not know which query each result belongs to. For example, if you execute a bunch of inserts, you often just want to get back the total number of rows inserted, or maybe not even that. Or if you execute a CREATE TEMPORARY TABLE ... ON COMMIT DROP, followed by some insertions to it, some more data manipulations, and finally a SELECT to get the results back. All you want is the last result set.

If we could modify the wire protocol, we'd want to have a MiniSync message that is like Sync except that it wouldn't close the current transaction. The server would respond to it with a ReadyForQuery message (which could carry an ID number, to match it up with the MiniSync command). But I really
Re: [HACKERS] controlling psql's use of the pager a bit more
On 12/04/2014 03:53 PM, Alvaro Herrera wrote: Andrew Dunstan wrote: However, there is more work to do. As Tom noted upthread, psql's calculation of the number of lines is pretty bad. For example, if I do:

\pset pager_min_lines 100
select * from generate_series(1,50);

the pager still gets invoked, which is unfortunate to say the least. So I'm going to take a peek at that.

Any luck there?

Some, yes. See http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4077fb4d1d34ad04dfb95ba676c2b43ea1f1da53

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] superuser() shortcuts
* Andres Freund (and...@2ndquadrant.com) wrote: On 2014-12-04 15:59:17 -0500, Stephen Frost wrote: I have a hard time wrapping my head around what a *lot* of our users ask when they see a given error message, but if our error message is 'you must have a pear-shaped object to run this command' then I imagine that a new-to-PG user might think well, I can't create pear shaped objects in PG, so I guess this just isn't supported. That's not necessarily any fault of ours, but I do think permission denied reduces the chance that there's any confusion about the situation.

I've a hard time taking this comment seriously. I can't believe that you think that comment bears relation to the error message we're discussing.

It's reasonable to think that new users won't understand PG-specific things such as role attributes. A new user might not recognize or understand what replication role is but they're more than likely to get the gist behind permission denied.

The answer is that there really *isn't* any additional information conveyed by your proposed rewrite;

To be sure it's clear, I *don't* agree with this. You and I don't see any additional information in it because we're familiar with the system and know all about role attributes, the GRANT system, and everything else. I'm not looking to change the error message because it's going to make it clearer to you or to me or to anyone else on this list though.

The different style is what's in the error style guidelines.

I think you're vastly over-interpreting the guidelines because that happens to suit your position.

So I shouldn't consider what the guidelines have because they agree with my position? Perhaps we should *change* the guidelines if they aren't what we should be following.

None of the current error message violates any of: The primary message should be short, factual, and avoid reference to implementation details such as specific function names. Short means should fit on one line under normal conditions.
Use a detail message if needed to keep the primary message short, or if you feel a need to mention implementation details such as the particular system call that failed. Both primary and detail messages should be factual. Use a hint message for suggestions about what to do to fix the problem, especially if the suggestion might not always be applicable.

I agree that the text doesn't say the error message should be related to what the errcode used is, but it's certainly the implication of the examples provided and, when the text doesn't make any reference, going by what is in the example seems perfectly reasonable to me. And I don't for a second buy your argument that the permissions involved are an implementation detail.

Would you agree with Peter that, rather than focus on if the errdetail() involves an implementation detail or not, we should go ahead and add the You must have SELECT rights ... to the existing cases where we just say permission denied?

If you say that you like the new message better because it's more consistent or more beautiful I can buy that. But don't try to find justifications in guidelines when they don't contain that.

I've said that I like the new messaging because it's more consistent time and time again. You suggest that I argue on that merit alone rather than pointing out where we've (intentionally or not) used examples which agree with me? That strikes me as rather silly.

I just don't understand why you want to pointlessly tinker with this.

Because I don't feel it's pointless to improve the consistency of the error messaging and I don't like that it's inconsistent today.

Then please do so outside of patches/threads that do something pretty much unrelated.
Alright, I bothered to go back and find the relevant message- it's this one: 20141022231834.ga1...@alvin.alvh.no-ip.org where Alvaro pointed out that the messaging is inconsistent and asked that this patch should include the correct wording (whatever we decide it to be), and further commentary in 20141023232302.gh1...@alvin.alvh.no-ip.org matches that. I didn't bring this up and I'm getting a bit put-out by the constant implications that it's just all me and my crazy ideas for consistency. Perhaps Alvaro doesn't care to argue the position further, which is fine, but at least acknowledge that there are others who agree that we're currently inconsistent (Alvaro and, I believe, Peter).

Thanks,

Stephen
Re: [HACKERS] superuser() shortcuts
Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2014-12-04 15:59:17 -0500, Stephen Frost wrote: I just don't understand why you want to pointlessly tinker with this.

Because I don't feel it's pointless to improve the consistency of the error messaging and I don't like that it's inconsistent today.

Then please do so outside of patches/threads that do something pretty much unrelated.

Alright, I bothered to go back and find the relevant message- it's this one: 20141022231834.ga1...@alvin.alvh.no-ip.org where Alvaro pointed out that the messaging is inconsistent and asked that this patch should include the correct wording (whatever we decide it to be) and further commentary in 20141023232302.gh1...@alvin.alvh.no-ip.org matches that. I didn't bring this up and I'm getting a bit put-out by the constant implications that it's just all me and my crazy ideas for consistency. Perhaps Alvaro doesn't care to argue the position further, which is fine, but at least acknowledge that there are others who agree that we're currently inconsistent (Alvaro and, I believe, Peter).

Several dozen messages ago in this thread I would have dropped this item, TBH. *Maybe* I would consider changing the specific messages around the specific code that's being tinkered with, but probably not other ones. I haven't looked at this patch recently, but if it's changing 1% of the error messages in the backend, I doubt it's a good idea.

More in general, if I see strong, well-rationalized opposition to some non-essential idea of mine, I tend to drop it because I don't see the value in arguing endlessly. Life's already way too short.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] superuser() shortcuts
On 2014-12-04 17:06:02 -0500, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2014-12-04 15:59:17 -0500, Stephen Frost wrote: I have a hard time wrapping my head around what a *lot* of our users ask when they see a given error message, but if our error message is 'you must have a pear-shaped object to run this command' then I imagine that a new-to-PG user might think well, I can't create pear shaped objects in PG, so I guess this just isn't supported. That's not necessarily any fault of ours, but I do think permission denied reduces the chance that there's any confusion about the situation.

I've a hard time taking this comment seriously. I can't believe that you think that comment bears relation to the error message we're discussing.

It's reasonable to think that new users won't understand PG-specific things such as role attributes. A new user might not recognize or understand what replication role is but they're more than likely to get the gist behind permission denied.

Anyone that's using any of these facilities better know at least halfway what a permission is.

The answer is that there really *isn't* any additional information conveyed by your proposed rewrite;

To be sure it's clear, I *don't* agree with this. You and I don't see any additional information in it because we're familiar with the system and know all about role attributes, the GRANT system, and everything else. I'm not looking to change the error message because it's going to make it clearer to you or to me or to anyone else on this list though.

The different style is what's in the error style guidelines.

I think you're vastly over-interpreting the guidelines because that happens to suit your position.

So I shouldn't consider what the guidelines have because they agree with my position?

Not if there's not actually anything in there.

Perhaps we should *change* the guidelines if they aren't what we should be following.

So, now you've built the full circle.
None of the current error message violates any of: The primary message should be short, factual, and avoid reference to implementation details such as specific function names. Short means should fit on one line under normal conditions. Use a detail message if needed to keep the primary message short, or if you feel a need to mention implementation details such as the particular system call that failed. Both primary and detail messages should be factual. Use a hint message for suggestions about what to do to fix the problem, especially if the suggestion might not always be applicable.

I agree that the text doesn't say the error message should be related to what the errcode used is, but it's certainly the implication of the examples provided and, when the text doesn't make any reference, going by what is in the example seems perfectly reasonable to me.

If anything it's the absolute contrary. Messages should always state the reason why an error occurred. For example:

BAD:    could not open file %s
BETTER: could not open file %s (I/O failure)

Avoid mentioning called function names, either; instead say what the code was trying to do:

BAD:    open() failed: %m
BETTER: could not open file %s: %m
BAD:    unknown node type
BETTER: unrecognized node type: 42

There's simply not a single example giving credence to your position in the guidelines I'm looking at. That's *not* to say that your position is wrong, but holding up the guidelines as a reason for it is absurd.

And I don't for a second buy your argument that the permissions involved are an implementation detail.

Would you agree with Peter that, rather than focus on if the errdetail() involves an implementation detail or not, we should go ahead and add the You must have SELECT rights ... to the existing cases where we just say permission denied?

I think adding helpful information isn't a bad idea. It's not always obvious which permission is required to do something.
If you say that you like the new message better because it's more consistent or more beautiful I can buy that. But don't try to find justifications in guidelines when they don't contain that.

I've said that I like the new messaging because it's more consistent time and time again. You suggest that I argue on that merit alone rather than pointing out where we've (intentionally or not) used examples which agree with me? That strikes me as rather silly.

Since there's nothing in the guidelines...

I didn't bring this up and I'm getting a bit put-out by the constant implications that it's just all me and my crazy ideas for consistency.

I don't mind much being outvoted or convinced by better reasoning. But you've shown so far no recognition of the fact that the new message is much longer without much of additional detail. And you've largely argued using arguments that I found
Re: [HACKERS] superuser() shortcuts
* Andres Freund (and...@2ndquadrant.com) wrote: On 2014-12-04 17:06:02 -0500, Stephen Frost wrote: Would you agree with Peter that, rather than focus on if the errdetail() involves an implementation detail or not, we should go ahead and add the You must have SELECT rights ... to the existing cases where we just say permission denied?

I think adding helpful information isn't a bad idea. It's not always obvious which permission is required to do something.

[...]

I didn't bring this up and I'm getting a bit put-out by the constant implications that it's just all me and my crazy ideas for consistency.

I don't mind much being outvoted or convinced by better reasoning. But you've shown so far no recognition of the fact that the new message is much longer without much of additional detail. And you've largely argued using arguments that I found absolutely unconvincing.

I agree that the new message is longer. I don't much care, no.

I'm confused by the comment above and then this one- surely error messages from lack of SELECT or similar rights are way more commonly emitted than ones about the replication role attribute? You are, seemingly, agreeing with making the error message emitted when SELECT is used longer while saying that we shouldn't make the error message about the replication role attribute longer? Is there something special about the replication role attribute case that makes it different from the SELECT rights?

Thanks,

Stephen
[HACKERS] Minor documentation tweak to pg_stat_all_tables view description
Attached patch makes minor modification to the pg_stat_all_tables documentation. This clarifies that pg_stat_*_tables.n_tup_upd includes HOT updates.

-- Peter Geoghegan

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b29e5e6..3ce7e80 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1226,7 +1226,7 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
     <row>
      <entry><structfield>n_tup_upd</structfield></entry>
      <entry><type>bigint</type></entry>
-     <entry>Number of rows updated</entry>
+     <entry>Number of rows updated (includes HOT updated rows)</entry>
     </row>
     <row>
      <entry><structfield>n_tup_del</structfield></entry>
Re: [HACKERS] superuser() shortcuts
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: Several dozen messages ago in this thread I would have dropped this item, TBH.

I'm getting to that point, though it's quite frustrating. I didn't much care initially but it's gotten to the point where the current situation just strikes me as quite 'wrong'.

*Maybe* I would consider changing the specific messages around the specific code that's being tinkered with, but probably not other ones.

If I thought that was a way to move forward, then I'd be happy with it. I have no problem with this being a policy of make it match the policy when you change it, but don't just go changing everything right away as it'll cause too much disruption. I'm mostly arguing about what the policy should be here and less about the specific minor changes in this patch. I'm pretty sure that's not acceptable to Robert or Andres though, as I think they're also arguing policy. I can certainly understand a position which is it's not wrong and we shouldn't change it for the sake of changing it.

I haven't looked at this patch recently, but if it's changing 1% of the error messages in the backend, I doubt it's a good idea.

This just confuses me- you said above that you'd consider changing just the specific messages around the specific code being tinkered with (which would be less than 1%..) and then say it's not a good idea to change just 1%? The actual patch is a mostly unrelated (and, seemingly, not terribly controversial) change, as compared to this discussion about error messages.

More in general, if I see strong, well-rationalized opposition to some non-essential idea of mine, I tend to drop it because I don't see the value in arguing endlessly. Life's already way too short.

I'm not sure that I see it as well-rationalized, but it certainly seems to be consistent and the error message changes are certainly non-essential. Perhaps it'd be better discussed in Ottawa.

Thanks,

Stephen
Re: [HACKERS] Minor documentation tweak to pg_stat_all_tables view description
On 2014-12-04 14:43:43 -0800, Peter Geoghegan wrote: Attached patch makes minor modification to the pg_stat_all_tables documentation. This clarifies that pg_stat_*_tables.n_tup_upd includes HOT updates.

Ah. Good idea. I've been asked that by others and myself quite a number of times and had to check the code every time.

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b29e5e6..3ce7e80 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1226,7 +1226,7 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
     <row>
      <entry><structfield>n_tup_upd</structfield></entry>
      <entry><type>bigint</type></entry>
-     <entry>Number of rows updated</entry>
+     <entry>Number of rows updated (includes HOT updated rows)</entry>
     </row>
     <row>
      <entry><structfield>n_tup_del</structfield></entry>

I wondered for a sec whether it'd be better to refer to n_tup_hot_upd here, but decided you were right. Pushed.

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Minor documentation tweak to pg_stat_all_tables view description
On Thu, Dec 4, 2014 at 2:56 PM, Andres Freund and...@2ndquadrant.com wrote: I wondered for a sec whether it'd be better to refer to n_tup_hot_upd here, but decided you were right. Pushed.

Thanks.

-- Peter Geoghegan
Re: [HACKERS] New wal format distorts pg_xlogdump --stats
On 2014-12-04 16:26:02 +0200, Heikki Linnakangas wrote: Yeah, that's broken. I propose the attached. Or does anyone want to argue for adding an XLogRecGetFPILen() accessor macro for the hole_length in xlogreader.h. It's not something a redo routine would need, nor most XLOG-reading applications, so I thought it's better to just have pg_xlogdump peek into the DecodedBkpBlock struct directly.

I think both would be justifiable, so I don't really care for now. One slight reason for wrapping it in xlogreader.h is that we might add compression of some form or another soon and it'd possibly be better to have it in xlogreader.h so pg_xlogdump doesn't have to be changed. But that's really rather minor.

Thanks,

Andres

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] libpq pipelining
On Thursday, December 04, 2014 11:39:02 PM Heikki Linnakangas wrote: Adding the ability to set a user supplied pointer on the PGquery struct might make it much easier for some frameworks, and other users might want a callback, but I don't think either are required.

I don't like exposing the PGquery struct to the application like that. Access to all other libpq objects is done via functions. The application can't (or shouldn't, anyway) directly access the fields of PGresult, for example. It has to call PQnfields(), PQntuples() etc.

Right, my patch doesn't expose it. I was thinking of adding two new functions to get/set the user tag/pointer.

The user-supplied pointer seems quite pointless. It would make sense if the pointer was passed to PQsendQuery(), and you'd get it back in PGquery. You could then use it to tag the query when you send it with whatever makes sense for the application, and use the tag in the result to match it with the original query.

That's exactly what I envisioned, but with a separate call to avoid having to modify/duplicate the PQsendQuery functions:

PQsendQuery(conn, ...);
query = PQgetLastQuery(conn);
PQquerySetUserPointer(query, userPtr);
...
result = PQgetResult(conn);
query = PQgetFirstQuery(conn);
userPtr = PQqueryGetUserPointer(query);

But as it stands, I don't see the point.

I don't need it since it should be easy to keep track without it. It was just an idea.

The original query string might be handy for some things, but for others it's useless. It's not enough as a general method to identify the query the result belongs to. A common use case for this is to execute the same query many times with different parameters.

Right, I'm only saving the query text because that's how things were done already. Since it's already there I didn't see a reason not to expose it.

So I don't think you've quite nailed the problem of how to match the results to the commands that originated them, yet.
One idea is to add a function that can be called after PQgetResult(), to get some identifier of the original command. But there needs to be a mechanism to tag the PQsendQuery() calls. Or you can assign each call a unique ID automatically, and have a way to ask for that ID after calling PQsendQuery().

PGquery IS the unique ID, and it is available after calling PQsendQuery by calling PQgetLastQuery.

The explanation of PQgetFirstQuery makes it sound pretty hard to match up the result with the query. You have to pay attention to PQisBusy.

It's not hard at all and is very natural to use since the whole point of an async api is to avoid blocking, so it's natural to only call PQgetResult when it's not going to block. PQgetFirstQuery should also be valid after calling PQgetResult and then you don't have to worry about PQisBusy, so I should probably change the documentation to indicate that is the preferred usage, or maybe make that the only guaranteed usage, and say the results are undefined if you call it before calling PQgetResult. That usage also makes it consistent with PQgetLastQuery being called immediately after PQsendQuery.

Another option would be a function to get the PGquery for any PGresult. This would make things a bit more straightforward for the user, but more complicated in the implementation since multiple PGresults will share the same PGquery. However it's nothing that a reference count wouldn't solve.

It would be good to make it explicit when you start a pipelined operation. Currently, you get an error if you call PQsendQuery() twice in a row, without reading the result in between. That's a good thing, to catch application errors, when you're not trying to do pipelining. Otherwise, if you forget to get the result of a query you've sent, and then send another query, you'll merrily read the result of the first query and think that it belongs to the second.

Agreed, and I think this is the only behavior change currently.
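The reference-count idea mentioned above (several PGresults sharing one PGquery, freed only when the last holder releases it) can be sketched in a few lines. This is an illustrative model only, not the proposed libpq code; all names here are made up:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted query object: each result that points at the query
 * takes a reference; the query is freed when the count hits zero. */
typedef struct {
    int refcount;
} Query;

/* Create with one reference held by the creator. */
static Query *query_create(void)
{
    Query *q = malloc(sizeof(Query));
    q->refcount = 1;
    return q;
}

/* Another result starts sharing the query. */
static Query *query_ref(Query *q)
{
    q->refcount++;
    return q;
}

/* Release one reference; returns 1 if the object was actually freed. */
static int query_unref(Query *q)
{
    if (--q->refcount == 0)
    {
        free(q);
        return 1;
    }
    return 0;
}
```

With this scheme, clearing a PGresult would just unref its query, and the query object would stay valid exactly as long as any result (or the application) still holds it.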
An easy fix to restore existing behavior by default: PQsetPipelining(PGconn *conn, int arg); should work.

Are you trying to support continuous pipelining, where you send new queries all the time, and read results as they arrive, without ever draining the pipe? Or are you just trying to do batches, where you send a bunch of queries, and wait for all the results to arrive, before sending more? A batched API would be easier to understand and work with, although a continuous pipeline could be more efficient for an application that can take advantage of it.

I don't see any reason to limit it to batches, though it can certainly be used that way. My first test case does continuous pipelining and it provides a huge throughput gain when there's any latency in the connection. I can envision a lot of uses for the continuous approach.

Consideration of implicit transactions (autocommit), the whole pipeline being one transaction, or multiple transactions is needed.

The more I think about
Re: [HACKERS] duplicated partial-column assigns allowed by checkInsertTargets
On 12/4/14, 2:23 PM, Alvaro Herrera wrote: While perusing checkInsertTargets I noticed that it allows duplicated assignments to the same member of a composite:

alvherre=# create type f as (a int, b int);
CREATE TYPE
alvherre=# create table t (col f);
CREATE TABLE
alvherre=# insert into t (col.a, col.b, col.a) values (42, 43, 44);
INSERT 0 1
alvherre=# select * from t;
   col
---------
 (44,43)
(1 fila)

If you instead try a duplicate col, it is rightfully rejected:

alvherre=# insert into t (col, col) values ((42, 43), (44, 43));
ERROR:  column "col" specified more than once
LÍNEA 1: insert into t (col, col) values ((42, 43), (44, 43));
                             ^

Isn't this a bit odd?

Yes, and sounds like a good way to create bugs... my vote would be to fix this.

-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] New wal format distorts pg_xlogdump --stats
On Fri, Dec 5, 2014 at 8:09 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-12-04 16:26:02 +0200, Heikki Linnakangas wrote: Yeah, that's broken. I propose the attached. Or does anyone want to argue for adding an XLogRecGetFPILen() accessor macro for the hole_length in xlogreader.h. It's not something a redo routine would need, nor most XLOG-reading applications, so I thought it's better to just have pg_xlogdump peek into the DecodedBkpBlock struct directly.

I think both would be justifiable, so I don't really care for now. One slight reason for wrapping it in xlogreader.h is that we might add compression of some form or another soon and it'd possibly be better to have it in xlogreader.h so pg_xlogdump doesn't have to be changed. But that's really rather minor.

If we go this road and want to be complete, you may as well add access macros for the image offset, the block image and for the block data stuff. That would be handy and consistent with the rest; both approaches are fine as long as DecodedBkpBlock is in xlogreader.h.

Regards,

-- Michael
[HACKERS] pg_basebackup -x/X doesn't play well with archive_mode & wal_keep_segments
Hi,

We've recently observed a case where, after a promotion, a postgres server suddenly started to archive a large amount of old WAL. After some digging the problem is this:

pg_basebackup -X creates files in pg_xlog/ without creating the corresponding .done file. Note that walreceiver *does* create them. The standby in this case, just like the master, had a significant wal_keep_segments. RemoveOldXlogFiles() then, during recovery restart points, calls XLogArchiveCheckDone() which in turn does:

/* Retry creation of the .ready file */
XLogArchiveNotify(xlog);
return false;

if there's neither a .done nor a .ready file present and archive_mode is enabled. These segments then aren't removed because there's a .ready present and they're never archived as long as the node is a standby because we don't do archiving on standbys. Once the node is promoted the archiver will be started and suddenly archive all these files - which might be months old.

An additional, at first strange, detail is that a lot of the .ready files had nearly the same timestamps. Turns out that's due to wal_keep_segments. Initially RemoveOldXlogFiles() doesn't process the files because they're newer than allowed due to wal_keep_segments. Then every checkpoint a couple segments would be old enough to reach XLogArchiveCheckDone() which would then create the .ready marker... But not all at once :)

So I think we just need to make pg_basebackup create the .done files. Given that the walreceiver and restore_command already unconditionally do XLogArchiveForceDone() I think we'd follow the established precedent.

Arguably it could make sense to archive files again on the standby after a promotion as they aren't guaranteed to have been on the then primary. But we don't have any infrastructure anyway for that and walsender doesn't do so, so it doesn't seem to make any sense to do that for pg_basebackup.
Independent from this bug, there's also some debatable behaviour about what happens if a node with a high wal_keep_segments turns on archive_mode. Suddenly all those old files are archived... I think it might be a good idea to simply always create .done files when archive_mode is disabled while a wal segment is finished.

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
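The recycling decision described above can be reduced to a rough model (illustrative only; the real logic lives in XLogArchiveCheckDone() and is more involved): a finished segment may be removed if archiving is off or its .done marker exists; otherwise a .ready marker is (re)created and the segment is kept for the archiver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the decision sketched above.  The flags say which
 * status markers exist for a segment; the return value says whether
 * it may be removed/recycled.  The "retry creation of the .ready
 * file" branch is modeled by setting *have_ready. */
static bool
can_remove_segment(bool archive_mode, bool *have_done, bool *have_ready)
{
    if (!archive_mode)
        return true;            /* no archiving: always removable */
    if (*have_done)
        return true;            /* already archived */
    if (!*have_ready)
        *have_ready = true;     /* retry creation of the .ready file */
    return false;               /* keep until the archiver is done */
}
```

In this model a pg_basebackup-created segment on a standby (no .done, archiving enabled but never run) is kept forever and accumulates a .ready marker, which is exactly the pile-up the bug report describes.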
Re: [HACKERS] libpq pipelining
The explanation of PQgetFirstQuery makes it sound pretty hard to match up the result with the query. You have to pay attention to PQisBusy.

PQgetFirstQuery should also be valid after calling PQgetResult and then you don't have to worry about PQisBusy, so I should probably change the documentation to indicate that is the preferred usage, or maybe make that the only guaranteed usage, and say the results are undefined if you call it before calling PQgetResult. That usage also makes it consistent with PQgetLastQuery being called immediately after PQsendQuery.

I changed my second example to call PQgetFirstQuery after PQgetResult instead of before, and that removes the need to call PQconsumeInput and PQisBusy when you don't mind blocking. It makes the example super simple:

PQsendQuery(conn, "INSERT INTO test(id) VALUES (DEFAULT),(DEFAULT) RETURNING id");
query1 = PQgetLastQuery(conn);

/* Duplicate primary key error */
PQsendQuery(conn, "UPDATE test SET id=2 WHERE id=1");
query2 = PQgetLastQuery(conn);

PQsendQuery(conn, "SELECT * FROM test");
query3 = PQgetLastQuery(conn);

while ((result = PQgetResult(conn)) != NULL)
{
    curQuery = PQgetFirstQuery(conn);

    if (curQuery == query1)
        checkResult(conn, result, curQuery, PGRES_TUPLES_OK);
    if (curQuery == query2)
        checkResult(conn, result, curQuery, PGRES_FATAL_ERROR);
    if (curQuery == query3)
        checkResult(conn, result, curQuery, PGRES_TUPLES_OK);
}

Note that the curQuery == queryX check will work no matter how many results a query produces.

Matt Newell
Re: [HACKERS] New wal format distorts pg_xlogdump --stats
On 2014-12-05 08:58:33 +0900, Michael Paquier wrote: On Fri, Dec 5, 2014 at 8:09 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-12-04 16:26:02 +0200, Heikki Linnakangas wrote: Yeah, that's broken. I propose the attached. Or does anyone want to argue for adding an XLogRecGetFPILen() accessor macro for the hole_length in xlogreader.h. It's not something a redo routine would need, nor most XLOG-reading applications, so I thought it's better to just have pg_xlogdump peek into the DecodedBkpBlock struct directly.

I think both would be justifiable, so I don't really care for now. One slight reason for wrapping it in xlogreader.h is that we might add compression of some form or another soon and it'd possibly be better to have it in xlogreader.h so pg_xlogdump doesn't have to be changed. But that's really rather minor.

If we go this road and want to be complete, you may as well add access macros for the image offset, the block image and for the block data stuff. That would be handy and consistent with the rest, now both approaches are fine as long as DecodedBkpBlock is in xlogreader.h.

I don't see the point. Let's introduce that if (which I doubt a bit) there's a user.

Greetings,

Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Re: [COMMITTERS] pgsql: Support frontend-backend protocol communication using a shm_mq.
On Thu, Dec 4, 2014 at 4:15 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: Support frontend-backend protocol communication using a shm_mq. I just noticed that this patch broke the case where a standalone backend is sent a query that throws an error -- instead, we get a segmentation fault. Ugh, sorry. Fixed. Example:

$ echo foobar | postgres --single
PostgreSQL stand-alone backend 9.5devel
backend> Segmentation fault

I guess we could have a src/test/modules subdir that tests simple stuff on standalone backends ... Wouldn't hurt. Although it's not something that's likely to get broken very often. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Bugfix and new feature for PGXS
On 12/4/14 3:47 PM, Andrew Dunstan wrote: You have broken two buildfarm instances that build and test external modules - in one case the Redis FDW module and in the other the File Text Array FDW. I will see what can be retrieved. This has been fixed. One thing I'll look into sometime is splitting up Makefile.global into parts that apply to PGXS and those that don't (and possibly common parts), because there are too many ifdef PGXS's popping up all over the place in an unorganized fashion.
Re: [HACKERS] superuser() shortcuts
On Thu, Dec 4, 2014 at 3:59 PM, Stephen Frost sfr...@snowman.net wrote: I have a hard time wrapping my head around what a *lot* of our users ask when they see a given error message, but if our error message is 'you must have a pear-shaped object to run this command' then I imagine that a new-to-PG user might think "well, I can't create pear-shaped objects in PG, so I guess this just isn't supported." That's not necessarily any fault of ours, but I do think 'permission denied' reduces the chance that there's any confusion about the situation. This is just ridiculous. You're postulating the existence of a person who thinks that a replication role is something that they buy at Wendy's rather than something granted by the database administrator. Give me a break. I do prefer my version but it's not an unfounded preference. I never said that your preference was unfounded. I said that I disagreed with it. The burden for changing existing messages is not high, but your desire for a different style is not sufficient to go change half the error messages in the backend, or even 20% of them, and I think that 20% is a conservative estimate of how many we'd have to change to actually achieve what you're talking about here. The different style is what's in the error style guidelines. I agree that it could lead to a lot of changes and I don't think we'd need to change them all in one shot or in one release, in part because of the burden it would put on translators. As Andres says, that's just plain not true. There is nothing whatsoever in the error guidelines that supports your position on this issue. Reasoning logically, you're saying that this could lead to a lot of changes. You're also claiming that it is supported by the message style guidelines. Taken together, these two statements imply that you think that a lot of our current messages are not in conformance with the message style guidelines.
But how did all of those non-conformant messages get there, and how is it that no one has ever felt the urge to fix it up until now? After all, it is not as if message style is a topic that never gets discussed here; it does, quite regularly. A more reasonable explanation is that your interpretation of the message style guidelines disagrees with that of other people on this list. This, in particular, doesn't bother me terribly much -- if there's reason enough for a new code then it's probably because it's something PG-specific enough to justify it. Fair enough. I don't know enough about the merits or demerits of inventing new error codes to have a clear feeling for whether it would be a good idea or not. But that's not the ostensible topic of this thread. If we want to say that role-attribute-related error messages should be 'You need X to do Y', then I'll go make those changes, removing the 'permission denied' where those are used, and be happier that we're at least consistent; but it'll be annoying if we ever make those role attributes be GRANT'able rights (GRANT .. ON CLUSTER?) as those errmsg's will then change from 'you need X' to 'permission denied to do Y'. Having the errdetail change bothers me a lot less. You can't really put 'you need X to do Y' in the error message, I think. That style is appropriate for an error detail, but not an error message. I'm utterly confused by this comment. The existing error messages that I want to change are of the 'you need X to do Y' style (e.g.: must be superuser or have the same role to cancel queries running in other server processes). If you're just referring to the 'you' word, then ok, I agree that it goes against the error style guidelines and it would really be 'need X to do Y', but that wasn't what I was trying to get at with the above comment. Yes, I was talking about the 'you'.
I was saying *let's make it consistent* and go change the existing cases which say permission denied to create role and similar to, instead, say must have CREATEROLE to create roles. Today, we're utterly inconsistent. Consistency is a virtue, but not the only one or the highest one. I think that when the rule is something simple, it makes sense to articulate it in the error message, but when the rule gets too complicated to be articulated in brief, then we must give a generic permission denied message and expect the user to go read the manual for more information. Suppose we consider two hypothetical operations, fire-united-states-nukes and punish-recalcitrant-child. The first one is a highly restricted operation, but the criteria are simple enough that they can be stated succinctly: ERROR: must be POTUS to launch US nukes The second operation is far less restricted in terms of who can carry it out, but whether it's allowable in a particular instance is complicated, depending as it does on your relationship to the child in question and the proposed punishment. Parents can probably safely assume they can send their child to their
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 4, 2014 at 5:36 AM, Rahila Syed rahilasyed...@gmail.com wrote: The only scenario in which a user would not want to compress forcibly written pages is when CPU utilization is high. Or if they think the code to compress full pages is buggy. But according to measurements done earlier the CPU utilization of compress='on' and 'off' are not significantly different. If that's really true, we could consider having no configuration any time, and just compressing always. But I'm skeptical that it's actually true. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] XLOG_PARAMETER_CHANGE (WAL record) missing two params in its desc routine
Hi all, While reading the code in this area this morning, I noticed that wal_log_hints and track_commit_timestamp are not mentioned in the desc routine of XLOG_PARAMETER_CHANGE. Also, it is not mentioned in postgresql.conf.sample that a value update of wal_log_hints requires a system restart. In order to fix those things, attached are two patches:
- patch 1 should be applied back to 9.4 where wal_log_hints has been added
- patch 2 is master-only, and needs to be applied on top of patch 1.
Regards, -- Michael

From 5237f43c4fa8f97a67a98421c1a72a1377b2d5d0 Mon Sep 17 00:00:00 2001
From: Michael Paquier michael@otacoo.com
Date: Fri, 5 Dec 2014 11:45:25 +0900
Subject: [PATCH 1/2] Fix wal_log_hints management in xlog resource manager

A couple of things are fixed here:
- In postgresql.conf, it was not mentioned that wal_log_hints requires a restart when updated
- When issuing WAL record XLOG_PARAMETER_CHANGE, description function did not log out value update of this parameter.
Backpatch to 9.4.
---
 src/backend/access/rmgrdesc/xlogdesc.c        | 5 ++++-
 src/backend/utils/misc/postgresql.conf.sample | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 6b5fea9..645eec1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -108,11 +108,14 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		}
 	}

-	appendStringInfo(buf, "max_connections=%d max_worker_processes=%d max_prepared_xacts=%d max_locks_per_xact=%d wal_level=%s",
+	appendStringInfo(buf, "max_connections=%d max_worker_processes=%d "
+					 "max_prepared_xacts=%d max_locks_per_xact=%d "
+					 "wal_level=%s wal_log_hints=%s",
 					 xlrec.MaxConnections,
 					 xlrec.max_worker_processes,
 					 xlrec.max_prepared_xacts,
 					 xlrec.max_locks_per_xact,
+					 xlrec.wal_log_hints ? "true" : "false",
 					 wal_level_str);
 	}
 	else if (info == XLOG_FPW_CHANGE)
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c4b546e..b053659 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -187,6 +187,7 @@
 					# open_sync
 #full_page_writes = on			# recover from partial page writes
 #wal_log_hints = off			# also do full page writes of non-critical updates
+					# (change requires restart)
 #wal_buffers = -1			# min 32kB, -1 sets based on shared_buffers
 					# (change requires restart)
 #wal_writer_delay = 200ms		# 1-10000 milliseconds
--
2.2.0

From 475d38c557fc2361aef5395652e26dc3c10072fc Mon Sep 17 00:00:00 2001
From: Michael Paquier michael@otacoo.com
Date: Fri, 5 Dec 2014 11:49:41 +0900
Subject: [PATCH 2/2] Mention update of track_commit_timestamps in XLOG_PARAMETER_CHANGE

This parameter update was not mentioned in the description of this WAL record.
---
 src/backend/access/rmgrdesc/xlogdesc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 645eec1..7a2a842 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -110,12 +110,13 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 	appendStringInfo(buf, "max_connections=%d max_worker_processes=%d "
 					 "max_prepared_xacts=%d max_locks_per_xact=%d "
-					 "wal_level=%s wal_log_hints=%s",
+					 "wal_level=%s wal_log_hints=%s track_commit_timestamp=%s",
 					 xlrec.MaxConnections,
 					 xlrec.max_worker_processes,
 					 xlrec.max_prepared_xacts,
 					 xlrec.max_locks_per_xact,
 					 xlrec.wal_log_hints ? "true" : "false",
+					 xlrec.track_commit_timestamp ? "true" : "false",
 					 wal_level_str);
 	}
 	else if (info == XLOG_FPW_CHANGE)
--
2.2.0
[HACKERS] check-world failure: dummy_seclabel
All, I've noticed that 'check-world' fails for dummy_seclabel after a 'clean'. I believe that in commit da34731, the EXTRA_CLEAN statement should have been removed from 'src/test/modules/dummy_seclabel/Makefile' as well. Attached is a proposed patch that addresses this issue. Thanks, Adam -- Adam Brightwell - adam.brightw...@crunchydatasolutions.com Database Engineer - www.crunchydatasolutions.com

diff --git a/src/test/modules/dummy_seclabel/Makefile b/src/test/modules/dummy_seclabel/Makefile
index 72049d4..d93c964
*** a/src/test/modules/dummy_seclabel/Makefile
--- b/src/test/modules/dummy_seclabel/Makefile
*** EXTENSION = dummy_seclabel
*** 7,13 ****
  DATA = dummy_seclabel--1.0.sql
  REGRESS = dummy_seclabel
- EXTRA_CLEAN = sql/dummy_seclabel.sql expected/dummy_seclabel.out

  ifdef USE_PGXS
  PG_CONFIG = pg_config
--- 7,12 ----
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, 2014-12-04 at 10:27 -0800, Peter Geoghegan wrote: I think that the standard for adding a new system attribute ought to be enormous. The only case where a new one was added post-Postgres95 was tableoid. I'm pretty sure that others aren't going to want to do it that way. Besides, I'm not entirely convinced that this is actually an important distinction to expose. For Django's use case this is a requirement. We must inform the user if the save() action created a new row or if it modified an existing one. Another way to do this would be to expose the excluded alias in the returning clause. All columns of the excluded alias would be null in the case of insert (especially the primary key column), and thus if a query insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col returning excluded.id returns a non-null value, then it was an update. - Anssi
Re: [HACKERS] On partitioning
On Thu, Dec 4, 2014 at 10:46 AM, Amit Langote langote_amit...@lab.ntt.co.jp wrote: Hi, From: Jim Nasby [mailto:jim.na...@bluetreble.com] On 12/2/14, 9:43 PM, Amit Langote wrote: What are you going to do if the partitioning key has two columns of different data types? Sorry, this totally eluded me. Perhaps, the 'values' needs some more thought. They are one of the most crucial elements of the scheme. I wonder if your suggestion of pg_node_tree plays well here. This then could be a list of CONSTs or some such... And I am thinking it's a concern only for range partitions, no? (that is, a multicolumn partition key) I think partkind switches the interpretation of the field as appropriate. Am I missing something? By the way, I had mentioned we could have two values fields each for range and list partition kind. The more SQL way would be records (composite types). That would make catalog inspection a LOT easier and presumably make it easier to change the partitioning key (I'm assuming ALTER TYPE cascades to stored data). Records are stored internally as tuples; not sure if that would be faster than a List of Consts or a pg_node_tree. Nodes would theoretically allow using things other than Consts, but I suspect that would be a bad idea. While I couldn't find an example in system catalogs where a record/composite type is used, there are instances of pg_node_tree at a number of places like in pg_attrdef and others. Could you please point me to such a usage for reference? I think you can check the same by manually creating table with a user-defined type. Create type typ as (f1 int, f2 text); Create table part_tab(c1 int, c2 typ); With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
If that's really true, we could consider having no configuration any time, and just compressing always. But I'm skeptical that it's actually true. I was referring to this for CPU utilization: http://www.postgresql.org/message-id/1410414381339-5818552.p...@n5.nabble.com The above tests were performed on a machine with configuration as follows.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Thank you, Rahila Syed -- View this message in context: http://postgresql.nabble.com/Compression-of-full-page-writes-tp5769039p5829339.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] On partitioning
On Wed, Dec 3, 2014 at 6:30 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Amit Langote wrote: From: Robert Haas [mailto:robertmh...@gmail.com] What is an overflow partition and why do we want that? That would be a default partition. That is, where the tuples that don't belong elsewhere (other defined partitions) go. VALUES clause of the definition for such a partition would look like: (a range partition) ... VALUES LESS THAN MAXVALUE (a list partition) ... VALUES DEFAULT There has been discussion about whether there shouldn't be such a place for tuples to go. That is, it should generate an error if a tuple can't go anywhere (or support auto-creating a new one like in interval partitioning?) In my design I initially had overflow partitions too, because I inherited the idea from Itagaki Takahiro's patch. Eventually I realized that it's a useless concept, because you can always have leftmost and rightmost partitions, which are just regular partitions (except they don't have a low key, resp. high key). If you don't define unbounded partitions at either side, it's fine, you just raise an error whenever the user tries to insert a value for which there is no partition. Not real clear to me how this applies to list partitioning, but I have the hunch that it'd be better to deal with that without overflow partitions as well. Well, overflow partitions might not sound like a nice idea and we might not want to do them, or at least not in the first version. However, I think they could be useful in certain cases: for example, if in a long-running transaction a user is able to insert many rows into the appropriate partitions but one row falls out of the defined partitions' ranges, an error in such a case can annoy the user. I think a similar situation could occur for bulk insert (COPY). With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
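To make the two behaviors under discussion concrete, here is a small Python sketch (purely illustrative, not PostgreSQL source; the function and partition names are invented) of range-partition tuple routing with and without a default/overflow partition:

```python
# Route a key to a range partition.  Bounds are half-open
# (lower <= key < upper), with None meaning unbounded on that side,
# which mirrors "VALUES LESS THAN MAXVALUE" for the last partition.
def route(key, partitions, default=None):
    """partitions: list of (lower, upper, name) tuples."""
    for lower, upper, name in partitions:
        if (lower is None or lower <= key) and (upper is None or key < upper):
            return name
    if default is not None:
        return default          # tuple absorbed by the overflow partition
    raise ValueError("no partition found for key %r" % (key,))

parts = [(0, 10, "p0"), (10, 20, "p1")]
assert route(5, parts) == "p0"
assert route(25, parts, default="pdef") == "pdef"

try:
    route(25, parts)
except ValueError:
    # without a default partition, an out-of-range key is an error,
    # which is the behavior Alvaro argues for above
    pass
```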
Re: [HACKERS] On partitioning
From: Amit Kapila [mailto:amit.kapil...@gmail.com] On Thu, Dec 4, 2014 at 10:46 AM, Amit Langote langote_amit...@lab.ntt.co.jp wrote: The more SQL way would be records (composite types). That would make catalog inspection a LOT easier and presumably make it easier to change the partitioning key (I'm assuming ALTER TYPE cascades to stored data). Records are stored internally as tuples; not sure if that would be faster than a List of Consts or a pg_node_tree. Nodes would theoretically allow using things other than Consts, but I suspect that would be a bad idea. While I couldn't find an example in system catalogs where a record/composite type is used, there are instances of pg_node_tree at a number of places like in pg_attrdef and others. Could you please point me to such a usage for reference? I think you can check the same by manually creating table with a user-defined type. Create type typ as (f1 int, f2 text); Create table part_tab(c1 int, c2 typ); Is there such a custom-defined type used in some system catalog? Just not sure how one would put together a custom type to use in a system catalog given the way a system catalog is created. That's my concern but it may not be valid. Thanks, Amit
Re: [HACKERS] On partitioning
On Tue, Dec 2, 2014 at 8:53 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 25, 2014 at 8:20 PM, Amit Langote langote_amit...@lab.ntt.co.jp wrote: Before going too much further with this I'd mock up schemas for your proposed catalogs and a list of DDL operations to be supported, with the corresponding syntax, and float that here for comment. More people should really comment on this. This is a pretty big deal if it goes forward, so it shouldn't be based on what one or two people think. * Catalog schema:

CREATE TABLE pg_catalog.pg_partitioned_rel
(
   partrelid    oid         NOT NULL,
   partkind     oid         NOT NULL,
   partissub    bool        NOT NULL,
   partkey      int2vector  NOT NULL, -- partitioning attributes
   partopclass  oidvector,

   PRIMARY KEY (partrelid, partissub),
   FOREIGN KEY (partrelid) REFERENCES pg_class (oid),
   FOREIGN KEY (partopclass) REFERENCES pg_opclass (oid)
)
WITHOUT OIDS;

So, we're going to support exactly two levels of partitioning? partitions with partissub=false and subpartitions with partissub=true? Why not support only one level of partitioning here but then let the children have their own pg_partitioned_rel entries if they are subpartitioned? That seems like a cleaner design and lets us support an arbitrary number of partitioning levels if we ever need them.

CREATE TABLE pg_catalog.pg_partition_def
(
   partitionid          oid   NOT NULL,
   partitionparentrel   oid   NOT NULL,
   partitionisoverflow  bool  NOT NULL,
   partitionvalues      anyarray,

   PRIMARY KEY (partitionid),
   FOREIGN KEY (partitionid) REFERENCES pg_class(oid)
)
WITHOUT OIDS;

ALTER TABLE pg_catalog.pg_class ADD COLUMN relispartitioned;

What is an overflow partition and why do we want that? What are you going to do if the partitioning key has two columns of different data types? * DDL syntax (no multi-column partitioning, sub-partitioning support as yet): -- create partitioned table and child partitions at once.

CREATE TABLE parent (...)
PARTITION BY [ RANGE | LIST ] (key_column) [ opclass ]
[ (
     PARTITION child
       {
           VALUES LESS THAN { ... | MAXVALUE }    -- for RANGE
         | VALUES [ IN ] ( { ... | DEFAULT } )    -- for LIST
       }
       [ WITH ( ... ) ] [ TABLESPACE tbs ]
   [, ...]
) ] ;

How are you going to dump and restore this, bearing in mind that you have to preserve a bunch of OIDs across pg_upgrade? What if somebody wants to do pg_dump --table name_of_a_partition? Do we really need to support dml or pg_dump for individual partitions? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
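On the question of what to store in partitionvalues: one argument raised in this thread for storing plain bound values (rather than per-partition CHECK expressions) is that tuple routing can then be a binary search over the sorted bounds instead of evaluating one expression per partition. A hypothetical Python sketch, with invented names, assuming half-open ranges in the style of VALUES LESS THAN:

```python
# Hypothetical illustration (not PostgreSQL code): if the catalog keeps
# each range partition's exclusive upper bound as an ordered value,
# finding the target partition is a binary search.
import bisect

def find_partition(key, upper_bounds, names):
    """upper_bounds: sorted exclusive upper bounds, one per partition;
    the last may be None, standing in for MAXVALUE (unbounded)."""
    bounds = [b for b in upper_bounds if b is not None]
    idx = bisect.bisect_right(bounds, key)
    if idx < len(upper_bounds):
        return names[idx]
    raise ValueError("no partition for key %r" % (key,))

names = ["p0", "p1", "p2"]
uppers = [10, 20, None]     # p2 is VALUES LESS THAN MAXVALUE
find_partition(3, uppers, names)    # "p0"
find_partition(10, uppers, names)   # "p1" (upper bounds are exclusive)
find_partition(999, uppers, names)  # "p2"
```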
Re: [HACKERS] pg_basebackup -x/X doesn't play well with archive_mode wal_keep_segments
On Fri, Dec 5, 2014 at 9:28 AM, Andres Freund and...@2ndquadrant.com wrote: Hi, We've recently observed a case where, after a promotion, a postgres server suddenly started to archive a large amount of old WAL. After some digging the problem is this: pg_basebackup -X creates files in pg_xlog/ without creating the corresponding .done file. Note that walreceiver *does* create them. The standby in this case, just like the master, had a significant wal_keep_segments. RemoveOldXlogFiles() then, during recovery restart points, calls XLogArchiveCheckDone() which in turn does:

/* Retry creation of the .ready file */
XLogArchiveNotify(xlog);
return false;

if there's neither a .done nor a .ready file present and archive_mode is enabled. These segments then aren't removed because there's a .ready present and they're never archived as long as the node is a standby because we don't do archiving on standbys. Once the node is promoted the archiver will be started and suddenly archive all these files -- which might be months old. An additional, at first strange, nice detail is that a lot of the .ready files had nearly the same timestamps. Turns out that's due to wal_keep_segments. Initially RemoveOldXlogFiles() doesn't process the files because they're newer than allowed due to wal_keep_segments. Then every checkpoint a couple segments would be old enough to reach XLogArchiveCheckDone() which then'd create the .ready marker... But not all at once :) So I think we just need to make pg_basebackup create the .ready files. s/.ready/.done? If yes, +1. Given that the walreceiver and restore_command already unconditionally do XLogArchiveForceDone() I think we'd follow the established precedent. Arguably it could make sense to archive files again on the standby after a promotion as they aren't guaranteed to have been on the then primary. But we don't have any infrastructure anyway for that and walsender doesn't do so, so it doesn't seem to make any sense to do that for pg_basebackup.
Independent from this bug, there's also some debatable behaviour about what happens if a node with a high wal_keep_segments turns on archive_mode. Suddenly all those old files are archived... I think it might be a good idea to simply always create .done files when archive_mode is disabled while a wal segment is finished. +1 Regards, -- Fujii Masao
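The marker-file behavior described in this exchange can be summed up in a toy model (Python, invented names, not PostgreSQL code): a segment lacking both markers gets a .ready recreated at each restartpoint, so creating .done eagerly is what keeps old segments from being archived after promotion.

```python
# Toy model of WAL archive-status markers.  status maps a segment name
# to "done" or "ready"; a segment absent from the map has no marker.
def check_done(status, segment):
    """Mimics the XLogArchiveCheckDone logic quoted above: True means
    the segment may be recycled; otherwise a .ready marker is
    (re)created so the archiver will pick the segment up later."""
    if status.get(segment) == "done":
        return True
    if status.get(segment) != "ready":
        status[segment] = "ready"   # retry creation of the .ready file
    return False

status = {}
# segment streamed by walreceiver: it creates .done, so it is safe
status["000000010000000000000001"] = "done"
# segment copied by pg_basebackup -X, which creates no marker:
check_done(status, "000000010000000000000002")   # False; .ready created

# after promotion, the archiver would archive every .ready segment,
# however old -- the surprise described in the bug report
to_archive = [seg for seg, st in status.items() if st == "ready"]
```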
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 4, 2014 at 8:37 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Dec 4, 2014 at 7:36 PM, Rahila Syed rahilasyed...@gmail.com wrote: IIUC, forcibly written fpws are not exposed to user, so is it worthwhile to add a GUC similar to full_page_writes in order to control a feature which is unexposed to user in the first place? If full page writes is set 'off' by user, user probably cannot afford the overhead involved in writing large pages to disk. So, if a full page write is forcibly written in such a situation it is better to compress it before writing to alleviate the drawbacks of writing full_page_writes in servers with heavy write load. The only scenario in which a user would not want to compress forcibly written pages is when CPU utilization is high. But according to measurements done earlier the CPU utilization of compress='on' and 'off' are not significantly different. Yes they are not visible to the user, still they exist. I'd prefer that we have a safety net though to prevent any problems that may occur if the compression algorithm has a bug, as if we enforce compression for forcibly-written blocks all the backups of our users would be impacted. I pondered something that Andres mentioned upthread: we may not do the compression in WAL record only for blocks, but also at record level. Hence, joining the two ideas together, I think that we should definitely have a different GUC to control the feature, consistently for all the images. Let's call it wal_compression, with the following possible values:
- on, meaning that a maximum of compression is done, for this feature basically full_page_writes = on.
- full_page_writes, meaning that full page writes are compressed
- off, default value, to disable completely the feature.
This would leave room for another mode: 'record', to completely compress a record. For now though, I think that a simple on/off switch would be fine for this patch. Let's keep things simple.
+1 Regards, -- Fujii Masao
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Dec 5, 2014 at 10:53 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Dec 4, 2014 at 5:36 AM, Rahila Syed rahilasyed...@gmail.com wrote: The only scenario in which a user would not want to compress forcibly written pages is when CPU utilization is high. Or if they think the code to compress full pages is buggy. Yeah, especially if in the future we begin to add support for other compression algorithm. But according to measurements done earlier the CPU utilization of compress='on' and 'off' are not significantly different. If that's really true, we could consider having no configuration any time, and just compressing always. But I'm skeptical that it's actually true. So am I. Data is the thing that matters for us. Speaking of which, I have been working more on the set of patches to add support for this feature and attached are updated patches, with the following changes: - Addition of a new GUC parameter wal_compression, being a complete switch to control compression of WAL. Default is off. We could extend this parameter later if we decide to add support for new algorithms or new modes, let's say a record-level compression. Parameter is PGC_POSTMASTER. We could make it PGC_SIGHUP but that would be better as a future optimization, and would need a new WAL record type similar to full_page_writes. (Actually, I see no urgency in making it SIGHUP..) - full_page_writes is moved back to its original state - Correction of a couple of typos and comments. Regards, -- Michael From 1887d6cb3ef92d3a5defb1cef5be1d8bf6dbdb9e Mon Sep 17 00:00:00 2001 From: Michael Paquier michael@otacoo.com Date: Tue, 25 Nov 2014 14:05:59 +0900 Subject: [PATCH 1/2] Move pg_lzcompress.c to src/common Exposing compression and decompression APIs of pglz makes possible its use by extensions and contrib modules. pglz_decompress contained a call to elog to emit an error message in case of corrupted data. 
This function is changed to return a status code to let its callers return an error instead. Compression function is changed similarly to make the whole set consistent.
---
 src/backend/access/heap/tuptoaster.c  |  11 +-
 src/backend/utils/adt/Makefile        |   4 +-
 src/backend/utils/adt/pg_lzcompress.c | 779 -
 src/common/Makefile                   |   3 +-
 src/common/pg_lzcompress.c            | 784 ++
 src/include/utils/pg_lzcompress.h     |  19 +-
 src/tools/msvc/Mkvcbuild.pm           |   3 +-
 7 files changed, 813 insertions(+), 790 deletions(-)
 delete mode 100644 src/backend/utils/adt/pg_lzcompress.c
 create mode 100644 src/common/pg_lzcompress.c

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index ce44bbd..9af456f 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -142,7 +142,8 @@ heap_tuple_untoast_attr(struct varlena * attr)
 			attr = (struct varlena *) palloc(PGLZ_RAW_SIZE(tmp) + VARHDRSZ);
 			SET_VARSIZE(attr, PGLZ_RAW_SIZE(tmp) + VARHDRSZ);
-			pglz_decompress(tmp, VARDATA(attr));
+			if (pglz_decompress(tmp, VARDATA(attr)) != PGLZ_OK)
+				elog(ERROR, "compressed data is corrupted");
 			pfree(tmp);
 		}
 	}
@@ -167,7 +168,8 @@ heap_tuple_untoast_attr(struct varlena * attr)
 		attr = (struct varlena *) palloc(PGLZ_RAW_SIZE(tmp) + VARHDRSZ);
 		SET_VARSIZE(attr, PGLZ_RAW_SIZE(tmp) + VARHDRSZ);
-		pglz_decompress(tmp, VARDATA(attr));
+		if (pglz_decompress(tmp, VARDATA(attr)) != PGLZ_OK)
+			elog(ERROR, "compressed data is corrupted");
 	}
 	else if (VARATT_IS_SHORT(attr))
 	{
@@ -239,7 +241,8 @@ heap_tuple_untoast_attr_slice(struct varlena * attr,
 		preslice = (struct varlena *) palloc(size);
 		SET_VARSIZE(preslice, size);
-		pglz_decompress(tmp, VARDATA(preslice));
+		if (pglz_decompress(tmp, VARDATA(preslice)) != PGLZ_OK)
+			elog(ERROR, "compressed data is corrupted");
 		if (tmp != (PGLZ_Header *) attr)
 			pfree(tmp);
@@ -1253,7 +1256,7 @@ toast_compress_datum(Datum value)
 	 * we insist on a savings of more than 2 bytes to ensure we have a gain.
 	 */
 	if (pglz_compress(VARDATA_ANY(DatumGetPointer(value)), valsize,
-					  (PGLZ_Header *) tmp, PGLZ_strategy_default) &&
+					  (PGLZ_Header *) tmp, PGLZ_strategy_default) == PGLZ_OK &&
 		VARSIZE(tmp) < valsize - 2)
 	{
 		/* successful compression */
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 3ea9bf4..20e5ff1 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -25,8 +25,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
 	jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
 	network.o network_gist.o network_selfuncs.o \
 	numeric.o numutils.o oid.o oracle_compat.o \
-	orderedsetaggs.o pg_lzcompress.o pg_locale.o pg_lsn.o \
-	pgstatfuncs.o pseudotypes.o quote.o rangetypes.o rangetypes_gist.o \
+	orderedsetaggs.o pg_locale.o pg_lsn.o pgstatfuncs.o \
+	pseudotypes.o quote.o