Re: [HACKERS] [PATCH] Support for foreign keys with arrays
On 6 April 2012 07:21, Peter Eisentraut pete...@gmx.net wrote:
On lör, 2012-03-24 at 10:01 +, Gianni Ciolli wrote:

ON (DELETE | UPDATE) actions for EACH foreign keys
===================================================

                    |    ON     |    ON     |
  Action            |  DELETE   |  UPDATE   |
  ------------------+-----------+-----------+
  CASCADE           | Row       | Forbidden |
  SET NULL          | Row       | Row       |
  SET DEFAULT       | Row       | Row       |
  EACH CASCADE      | Element   | Element   |
  EACH SET NULL     | Element   | Element   |
  EACH SET DEFAULT  | Forbidden | Forbidden |
  NO ACTION         | -         | -         |
  RESTRICT          | -         | -         |

I took another fresh look at this feature after not having looked for a month or two. I think the functionality is probably OK, but I find the interfaces somewhat poorly named. Consider, "PostgreSQL adds EACH foreign keys" -- huh? I think the key word ELEMENT would be more descriptive and precise, and it also leaves the door open to other kinds of non-atomic foreign key relationships outside of arrays. EACH has no relationship with arrays. It might as well refer to each row.

On the matter of the above chart, there has been a long back and forth about whether the row or the element case should be the default. Both cases are probably useful, but unfortunately you have now settled on making maximum destruction the default. Additionally, we would now have the case that sometimes, depending on some configuration elsewhere, an ON DELETE CASCADE deletes more than what was actually involved in the foreign key. What I'd suggest is to make both cases explicit. That is, forbid ON DELETE CASCADE altogether and make people write ON DELETE CASCADE ROW or ON DELETE CASCADE ELEMENT. In addition to making things more explicit and safer, it would again leave the door open to other kinds of relationships later.

+1

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
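A rough sketch of the kind of schema this feature is aiming at may help readers follow the discussion. The table and column names are made up, and the exact grammar (EACH vs. ELEMENT, and whether the action is spelled EACH CASCADE or CASCADE ELEMENT) is precisely what is being debated above, so treat the syntax as illustrative only:

    CREATE TABLE drivers (
        id   int PRIMARY KEY,
        name text
    );

    CREATE TABLE races (
        id        int PRIMARY KEY,
        finishers int[],
        -- an "array element" foreign key: every element of finishers must exist in drivers.id
        FOREIGN KEY (EACH ELEMENT OF finishers) REFERENCES drivers (id)
            ON DELETE EACH CASCADE  -- per the chart above: delete just the matching element, not the whole row
    );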
Re: [HACKERS] temporal support patch
On Wed, 2012-06-13 at 23:10 +0200, Miroslav Šimulčík wrote: I have working patch for postgresql version 9.0.4, but it needs refactoring before i can submit it, because some parts don't meet formatting requirements yet. And yes, changes are large, so it will be better to discuss design first and then deal with code. Do you insist on compatibility with standard SQL 2011 as Pavel wrote? Try to work on solving the problem and identify the use cases. I don't think the standard will cause a major problem, we should be able to make the relevant parts of your patch match the standard. That's one reason to work on it as an extension first: we can get a better sense of the problem space and various use cases without worrying about violating any standard. Then, as you need specific backend support (e.g. special syntax), we can take the standards more seriously. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup docs
On Thu, Jun 14, 2012 at 10:37 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 13, 2012 at 3:20 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Please let's apply that documentation patch to 9.2 too. Agreed. Here's a patch that does the first two things. Does not attempt a tl;tr section yet. Also adds a subheader for the notes about compressing archive logs that seems to have been missing for a long time - that's definitely valid for things that aren't standalone backups, and is arguably a lot more *useful* in cases that aren't standalone backups (since standalone backups won't have very much log). No removed text, just moved around and added some. Unless there are objections to this one specifically, I'll go ahead and commit it soon. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ backup_docs.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Jun15, 2012, at 07:50 , Magnus Hagander wrote: So I've got very little patience with the idea of let's put in some hooks and then great things will happen. It would be far better all around if we supported exactly one, well-chosen, method. But really I still don't see a reason not to let openssl do it for us. Do we just need to document SSL's NULL encryption option? Does the SSL NULL encryption+compression thing work if you're not using openssl? The compression support is defined in RFC 3749, and according to http://en.wikipedia.org/wiki/Comparison_of_TLS_Implementations it's supported in openssl and gnutls. gnutls also seems to support a NULL cipher - gnutls-cli on my Ubuntu 10.04 box prints Ciphers: AES-256-CBC, AES-128-CBC, 3DES-CBC, DES-CBC, ARCFOUR-128, ARCFOUR-40, RC2-40, CAMELLIA-256-CBC, CAMELLIA-128-CBC, NULL. For one thing, some of us still hold a hope to support non-openssl libraries in both libpq and server side, so it's something that would need to be supported by the standard and thus available in most libraries not to invalidate that. Well, it's a standard a least, and both openssl and gnutls seem to support it. Are there any other ssl implementations beside gnutls and openssl that we need to worry about? Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
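To make the SSL-based option concrete: with OpenSSL on both ends, compression is negotiated inside the TLS layer itself, so the client mostly just needs to connect over SSL and ask for it. A hedged illustration using libpq's sslcompression connection option from 9.2 (the host name is made up, and compression only actually happens if both libraries implement RFC 3749):

    psql "host=db.example.com dbname=postgres sslmode=require sslcompression=1"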
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 5:52 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 07:50 , Magnus Hagander wrote: So I've got very little patience with the idea of let's put in some hooks and then great things will happen. It would be far better all around if we supported exactly one, well-chosen, method. But really I still don't see a reason not to let openssl do it for us. Do we just need to document SSL's NULL encryption option? Does the SSL NULL encryption+compression thing work if you're not using openssl? The compression support is defined in RFC 3749, and according to http://en.wikipedia.org/wiki/Comparison_of_TLS_Implementations it's supported in openssl and gnutls. gnutls also seems to support a NULL cipher - gnutls-cli on my Ubuntu 10.04 box prints Ciphers: AES-256-CBC, AES-128-CBC, 3DES-CBC, DES-CBC, ARCFOUR-128, ARCFOUR-40, RC2-40, CAMELLIA-256-CBC, CAMELLIA-128-CBC, NULL. ah, thanks for looking that up for me! The other big one to consider would be GNUTLS - which also has support for compression, I see. I guess a related question is if they all alow us to turn it *off*, which we now do support on openssl :) gnutls does, I didn't look into nss. For one thing, some of us still hold a hope to support non-openssl libraries in both libpq and server side, so it's something that would need to be supported by the standard and thus available in most libraries not to invalidate that. Well, it's a standard a least, and both openssl and gnutls seem to support it. Are there any other ssl implementations beside gnutls and openssl that we need to worry about? NSS would be the big one, an din theory microsoft schannel if we were to go there (that would give us access to easy use of the windows certificate store so ther emight be a reason - but not a very big one, to support that). Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Jun15, 2012, at 12:09 , Magnus Hagander wrote: On Fri, Jun 15, 2012 at 5:52 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 07:50 , Magnus Hagander wrote: Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource… Hm, but things get even harder for the JDBC and .NET folks if we go with a third-party compression method. Or would we require that the existence of a free Java (and maybe .NET) implementation of such a method would be an absolute must? The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is needed
Hi, I would like to ask a question before looking into the patch. At 21:56 12/03/30 -0400, Jay Levitt wrote: Tom Lane wrote: Ants Aasmaa...@cybertec.at writes: A user complained on pgsql-performance that SELECT col FROM table GROUP BY col LIMIT 2; performs a full table scan. ISTM that it's safe to return tuples from hash-aggregate as they are found when no aggregate functions are in use. Attached is a first shot at that. As I commented in the other thread, the user would be a lot better off if he'd had an index on the column in question. I'm not sure it's worth complicating the hashagg logic when an indexscan + groupagg would address the case better. Would this patch help in the case where table is actually a set-returning function, and thus can't have an index? ISTM that in many cases, the result size of a set-returning function is not so large compared with that of a full plain table scan. So, in such a case a full hash aggregation is not so time consuming. Am I wrong? Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ability to listen on two unix sockets
On 06/13/2012 03:25 PM, Honza Horak wrote: Going through the thread, I'd like to sum it up choosing approach with less potential issues and would like to find a consensus if possible. It seems unix_socket_directory could be turned into list and probably renamed to unix_socket_directories, since it would be confusing if a list value is in singular. On the other hand, we probably don't want to specify listening ports together with additional unix sockets in one configuration option, so it seems better to add a new configuration option to distinguish the primary listening port from additional ports. Regards, Honza A draft patch is attached. It renames unix_socket_directory to unix_socket_directories and allows to use directory:port to be able to create more sockets in one directory with different port number in the socket name. Regards, Honza diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index cfdb33a..679c40a 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -838,7 +838,7 @@ omicron bryanh guest1 varnameunix_socket_permissions/varname (and possibly varnameunix_socket_group/varname) configuration parameters as described in xref linkend=runtime-config-connection. Or you -could set the varnameunix_socket_directory/varname +could set the varnameunix_socket_directories/varname configuration parameter to place the socket file in a suitably restricted directory. /para diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 074afee..7634682 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -453,17 +453,23 @@ SET ENABLE_SEQSCAN TO OFF; /listitem /varlistentry - varlistentry id=guc-unix-socket-directory xreflabel=unix_socket_directory - termvarnameunix_socket_directory/varname (typestring/type)/term + varlistentry id=guc-unix-socket-directories xreflabel=unix_socket_directories + termvarnameunix_socket_directories/varname (typestring/type)/term indexterm - primaryvarnameunix_socket_directory/ configuration parameter/primary + primaryvarnameunix_socket_directories/ configuration parameter/primary /indexterm listitem para -Specifies the directory of the Unix-domain socket on which the +Specifies the directories of the Unix-domain sockets on which the server is to listen for connections from client applications. The default is normally filename/tmp/filename, but can be changed at build time. +Directories are separated by ',' and additional replaceableport/ +number can be set, separated from directory by ':'. Port number will +only be used as a part of the socket file name. For example, +literal'/var/run, /tmp:5431'/literal would create socket files +literal/var/run/.s.PGSQL.5432/literal and +literal/tmp/.s.PGSQL.5431/literal. This parameter can only be set at server start. /para @@ -472,7 +478,7 @@ SET ENABLE_SEQSCAN TO OFF; literal.s.PGSQL.replaceable//literal where replaceable/ is the server's port number, an ordinary file named literal.s.PGSQL.replaceable/.lock/literal will be -created in the varnameunix_socket_directory/ directory. Neither +created in the varnameunix_socket_directories/ directories. Neither file should ever be removed manually. 
/para @@ -6593,7 +6599,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) /row row entryoption-k replaceablex/replaceable/option/entry -entryliteralunix_socket_directory = replaceablex/replaceable//entry +entryliteralunix_socket_directories = replaceablex/replaceable//entry /row row entryoption-l/option/entry diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 7ba18f0..6c74844 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -1784,7 +1784,7 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433 para The simplest way to prevent spoofing for literallocal/ connections is to use a Unix domain socket directory (xref - linkend=guc-unix-socket-directory) that has write permission only + linkend=guc-unix-socket-directories) that has write permission only for a trusted local user. This prevents a malicious user from creating their own socket file in that directory. If you are concerned that some applications might still reference filename/tmp/ for the diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c index e3ae92d..72505e3 100644 --- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -349,7 +349,7 @@ AuxiliaryProcessMain(int argc, char *argv[]) /* If standalone, create lockfile for data directory */ if (!IsUnderPostmaster) -
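To make the proposed setting concrete, here is a postgresql.conf fragment following the example in the patch's own documentation (the second directory gets an explicit port that is used only in the socket file name):

    unix_socket_directories = '/var/run, /tmp:5431'
    # creates /var/run/.s.PGSQL.5432 and /tmp/.s.PGSQL.5431
    # (5432 being the ordinary port setting; 5431 only affects the second socket's file name)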
Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is needed
On Fri, Jun 15, 2012 at 6:55 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: A user complained on pgsql-performance that SELECT col FROM table GROUP BY col LIMIT 2; performs a full table scan. ISTM that it's safe to return tuples from hash-aggregate as they are found when no aggregate functions are in use. Attached is a first shot at that. As I commented in the other thread, the user would be a lot better off if he'd had an index on the column in question. I'm not sure it's worth complicating the hashagg logic when an indexscan + groupagg would address the case better. Would this patch help in the case where table is actually a set-returning function, and thus can't have an index? ISTM that in many cases, the result size of a set-returning function is not so large compared with that of a full plain table scan. So, in such a case a full hash aggregation is not so time consuming. Am I wrong? This query is a little unusual in that it involves both an aggregate and a limit. Now, sorted aggregates work pretty well with limit, because you can be sure upon seeing the beginning of the next group that you are done with the previous group. But in a hash aggregate, you normally can't start returning results until you've read the entire input, so it doesn't work so well with limit. However, as Ants points out, we could make it work better for the special case where we're not actually doing any aggregation, because in that case we can emit the row for each group when the group is created, rather than waiting until end-of-input. This is only going to help when there is a LIMIT, though. Moreover, if there happens to be an ORDER BY, then the data will have to be pre-sorted, in which case you may as well use a sorted aggregate. So the use case for this optimization is basically DISTINCT plus LIMIT but not ORDER BY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
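A sketch of the query shapes involved (table and column names are made up):

    -- the case the patch targets: no aggregates, no ORDER BY, plus a LIMIT, so a hash
    -- aggregate could emit each group as soon as it is first created
    SELECT DISTINCT col FROM big_table LIMIT 2;
    SELECT col FROM big_table GROUP BY col LIMIT 2;  -- the form from the original report

    -- the alternative Tom suggested: with an index, an index scan + group aggregate
    -- already stops early without any hashagg changes
    CREATE INDEX ON big_table (col);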
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
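For reference, a short sketch of the committed functions and the behaviour under discussion (the backup label is arbitrary):

    SELECT pg_start_backup('nightly', true);  -- starts an exclusive, pg_start_backup()-style backup
    SELECT pg_backup_in_progress();           -- true now, but stays false during a streaming base backup
    SELECT pg_backup_start_time();
    SELECT pg_stop_backup();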
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 5:48 AM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 12:09 , Magnus Hagander wrote: On Fri, Jun 15, 2012 at 5:52 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 07:50 , Magnus Hagander wrote: Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource… Hm, but things get even harder for the JDBC and .NET folks if we go with a third-party compression method. Or would we require that the existence of a free Java (and maybe .NET) implementation of such a method would be an absolute must? The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. hm, that's a really excellent point. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is needed
On Fri, Jun 15, 2012 at 3:13 PM, Robert Haas robertmh...@gmail.com wrote: However, as Ants points out, we could make it work better for the special case where we're not actually doing any aggregation, because in that case we can emit the row for each group when the group is created, rather than waiting until end-of-input. This is only going to help when there is a LIMIT, though. Moreover, if there happens to be an ORDER BY, then the data will have to be pre-sorted, in which case you may as well use a sorted aggregate. So the use case for this optimization is basically DISTINCT plus LIMIT but not ORDER BY. Exactly. I think the first question for this patch should be whether this use-case is worth the complexity of the patch. I can't imagine any really compelling use cases that need an arbitrary distinct subset of results. The original complaint on -performance [1], didn't specify a real world use case, but it seemed to be a case of an ORM generating suboptimal queries. On the other hand, the patch itself is in my opinion rather simple, so it might be worth it. It has one outstanding issue, query_planner chooses the cheapest path based on total cost. This can be suboptimal when that path happens to have high startup cost. It seems to me that enabling the query_planner to find the cheapest unsorted path returning a limited amount of tuples would require some major surgery to the planner. To be clear, this is only a case of missed optimization, not a regression. It won't help set returning functions because the tuplestore for those is fully materialized when the first row is fetched. [1] http://archives.postgresql.org/message-id/16737833.463.1332881676120.JavaMail.geo-discussion-forums%40pbcpw7 Ants Aasma -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] measuring spinning
On Thu, Jun 14, 2012 at 4:39 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 11, 2012 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote: I've had cause, a few times this development cycle, to want to measure the amount of spinning on each lwlock in the system. To that end, I've found the attached patch useful. Note that if you don't define LWLOCK_STATS, this changes nothing except that the return value from s_lock becomes int rather than void. If you do define LWLOCK_STATS, then LWLockAcquire() counts the number of pg_usleep() calls that are required to acquire each LWLock, in addition to the other statistics. Since this has come up for me a few times now, I'd like to proposing including it in core. Well, this fell through the cracks, because I forgot to add it to the January CommitFest. Here it is again, rebased. +1. It might be too awkward to add, but it would be nice to be able to fetch the number of spins as well as number of delays (aside, it's a bit confusing that in s_lock.c 'delay' is used both for the hardware sleep as well as the yielding sleep). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 07:18:34AM -0500, Merlin Moncure wrote: On Fri, Jun 15, 2012 at 5:48 AM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 12:09 , Magnus Hagander wrote: On Fri, Jun 15, 2012 at 5:52 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 07:50 , Magnus Hagander wrote: Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource… Hm, but things get even harder for the JDBC and .NET folks if we go with a third-party compression method. Or would we require that the existence of a free Java (and maybe .NET) implementation of such a method would be an absolute must? The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. hm, that's a really excellent point. merlin I agree and think that the SSL-based compression is an excellent default compression scheme. The plugable compression approach allows for the choice of the most appropriate compression implementation based on the application needs. It really addresses corner cases such as high- performance system. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 12:09 , Magnus Hagander wrote: On Fri, Jun 15, 2012 at 5:52 PM, Florian Pflug f...@phlo.org wrote: On Jun15, 2012, at 07:50 , Magnus Hagander wrote: Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. the JDBC driver uses the native java ssl support, AFAIK. Does that one support the compression, and does it support controlling it? Java uses pluggable providers with standardized interfaces for most things related to encryption. SSL support is provided by JSSE (Java Secure Socket Extension). The JSSE implementation included with the oracle JRE doesn't seem to support compression according to the wikipedia page quoted above. But chances are that there exists an alternative implementation which does. Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource… Hm, but things get even harder for the JDBC and .NET folks if we go with a third-party compression method. Or would we require that the existence of a free Java (and maybe .NET) implementation of such a method would be an absolute must? As long as a free implementation exists, it can be ported to Java/.Net. Sure, it takes more work, but it *can be done*. The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. Basically if they have to do that, then they're stuck *never* being able to fix the problem. If we can prove such a third party library *exists*, that makes it different. But from what I can tell so far, I haven't seen a single one - let alone one that supports compression. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] measuring spinning
On Fri, Jun 15, 2012 at 9:43 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jun 14, 2012 at 4:39 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 11, 2012 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote: I've had cause, a few times this development cycle, to want to measure the amount of spinning on each lwlock in the system. To that end, I've found the attached patch useful. Note that if you don't define LWLOCK_STATS, this changes nothing except that the return value from s_lock becomes int rather than void. If you do define LWLOCK_STATS, then LWLockAcquire() counts the number of pg_usleep() calls that are required to acquire each LWLock, in addition to the other statistics. Since this has come up for me a few times now, I'd like to proposing including it in core. Well, this fell through the cracks, because I forgot to add it to the January CommitFest. Here it is again, rebased. +1. It might be too awkward to add, but it would be nice to be able to fetch the number of spins as well as number of delays (aside, it's a bit confusing that in s_lock.c 'delay' is used both for the hardware sleep as well as the yielding sleep). Yeah, I'm inclined to keep it simple for now. We can always change it again later if someone comes up with something better. I suspect that delays probably tracks contention bad enough that you should be worried pretty well, but of course I might be wrong. The only thing I know for sure is that I've found this useful in my own testing, and therefore others testing with LWLOCK_STATS might also find it useful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
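For anyone who wants to try the instrumentation, one way to build with it is sketched below; the define could also go into a local header, and the exact output is whatever lwlock.c prints under LWLOCK_STATS:

    ./configure CPPFLAGS="-DLWLOCK_STATS"
    make && make install
    # each backend prints per-lwlock counts (shared/exclusive acquisitions, blocks, and with
    # this patch the number of spin-delays) to stderr when it exits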
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On Fri, Jun 15, 2012 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existance of the term, and thus help those who didn't actually read the documentation. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question is a backup in progress. And it doesn't answer that question. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 15.06.2012 17:39, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflugf...@phlo.org wrote: The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:39, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflugf...@phlo.org wrote: The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On 15 June 2012 15:54, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existance of the term, and thus help those who didn't actually read the documentation. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question is a backup in progress. And it doesn't answer that question. Maybe pg_is_in_backup_mode, which would match the naming convention of pg_is_in_recovery, and would claim that a backup is actually underway. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On Fri, Jun 15, 2012 at 11:08 PM, Thom Brown t...@linux.com wrote: On 15 June 2012 15:54, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existance of the term, and thus help those who didn't actually read the documentation. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question is a backup in progress. And it doesn't answer that question. Maybe pg_is_in_backup_mode, which would match the naming convention of pg_is_in_recovery, and would claim that a backup is actually underway. Wouldn't that make it even more wrong since it doesn't include backups taken using streaming backups? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On 15 June 2012 16:09, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 11:08 PM, Thom Brown t...@linux.com wrote: On 15 June 2012 15:54, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existance of the term, and thus help those who didn't actually read the documentation. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question is a backup in progress. And it doesn't answer that question. Maybe pg_is_in_backup_mode, which would match the naming convention of pg_is_in_recovery, and would claim that a backup is actually underway. Wouldn't that make it even more wrong since it doesn't include backups taken using streaming backups? Sorry I mean *wouldn't* claim that a backup is underway -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On 15.06.2012 17:54, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 8:16 PM, Robert Haasrobertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagandermag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haasrh...@postgresql.org wrote: New SQL functons pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term on-line exclusive backup really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term exclusive backup is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existance of the term, and thus help those who didn't actually read the documentation. I'm not sure we want to expose the exclusive backup term to users. It's a bit confusing. It makes sense in the limited scope in the code in xlog.c where it's currently used, but if I wanted to explain what it is to users, I don't think I'd choose that term. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question is a backup in progress. And it doesn't answer that question. I agree that pg_backup_in_progress() is confusing, if it returns false while you're running pg_basebackup. In the doc changes you proposed, you call the pg_start/stop_backup() a low level API for taking backups. That's not suitable for a function name, but I think we should work on that, and find a better term that works. Backup mode? Filesystem backup mode? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 15.06.2012 17:58, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:39, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflugf...@phlo.orgwrote: The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? Umm, then you use a real SSL libray, not the dummy one? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 11:24 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:58, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:39, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflugf...@phlo.org wrote: The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? Umm, then you use a real SSL libray, not the dummy one? But (in this scenario, and so far nobody has proven it to be wrong) there exists no real SSL library that does support compression. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Ability to listen on two unix sockets
On 06/13/2012 03:25 PM, Honza Horak wrote: A draft patch is attached. It renames unix_socket_directory to unix_socket_directories and allows to use directory:port to be able to create more sockets in one directory with different port number in the socket name. I realized the patch has some difficulties -- namely the socket path in the data dir lock file, which currently uses one port for the socket and the same one for the TCP interface. So to allow users to use an arbitrary port for all unix sockets, we'd need to add another line just for the unix socket, which doesn't apply for other platforms. Or we could just say that the first socket will always use the default port (PostPortNumber), which is the solution I currently prefer, but I will be glad to hear any other opinion. This is also why there is still unnecessary string splitting in pg_ctl.c, which will be removed after the issue above is solved. Regards, Honza -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
Magnus Hagander mag...@hagander.net writes: On Fri, Jun 15, 2012 at 11:24 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Umm, then you use a real SSL libray, not the dummy one? But (in this scenario, and so far nobody has proven it to be wrong) there exists no real SSL library that does support compression. I do not think it is incumbent on this project to rectify that problem ... especially when nobody has proven that such a library exists (and is not obsolete, nor are its authors busy fixing the lack so as to be interoperable with openssl). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 11:28:48PM +0800, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 11:24 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:58, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:39, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 6:48 PM, Florian Pflugf...@phlo.org wrote: The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? Umm, then you use a real SSL libray, not the dummy one? But (in this scenario, and so far nobody has proven it to be wrong) there exists no real SSL library that does support compression. gnutls and openssl both support compression: http://www.gnu.org/software/gnutls/manual/html_node/Compression-algorithms-used-in-the-record-layer.html http://www.openssl.org/docs/apps/enc.html -Ryan Kelly -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 15.06.2012 18:28, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 11:24 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:58, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.comwrote: You could write a dummy SSL implementation that only does compression, not encryption. Ie. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? Umm, then you use a real SSL libray, not the dummy one? But (in this scenario, and so far nobody has proven it to be wrong) there exists no real SSL library that does support compression. Oh, I see. Then you're screwed. But I think the right solution to that is to write/extend a Java SSL implementation to support compression, not to invent our own in PostgreSQL. The JDK is open source nowadays. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COMMENT on function's arguments
Robert Haas robertmh...@gmail.com writes: On Tue, Jun 12, 2012 at 10:59 PM, Vlad Arkhipov arhi...@dc.baikal.ru wrote: Does it make sense to have a comment on function's arguments? This would be somewhat tricky, because our COMMENT support assumes that the object upon which we're commenting has an ObjectAddress, and individual arguments to a function don't, although perhaps the sub-object-id stuff that we currently use to handle comments on table columns could be extended to handle this case. I guess I wouldn't object to a well-done patch that made this work, but creating such a patch seems likely to be tricky, owing to the fact that there's nothing in the system that thinks of the individual arguments to a function as separate objects at present. Also, once you'd created the infrastructure needed to *store* such comments, what would you actually *do* with them? I find it hard to imagine squeezing them into \df+ displays, for instance, without impossible clutter. Like Robert, I stand ready to be proven wrong by a well-designed patch; but this seems like something that would take a lot more work than it's really worth. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
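For comparison, what works today versus the per-argument form being asked about (the function name is made up, and the ARGUMENT spelling is purely hypothetical, not implemented syntax):

    -- supported today: one comment covering the whole function
    COMMENT ON FUNCTION add_em(integer, integer) IS 'Adds two integers.';

    -- the kind of thing being requested (hypothetical; it would need the ObjectAddress
    -- sub-object work described above):
    -- COMMENT ON FUNCTION add_em(integer, integer) ARGUMENT 2 IS 'the increment';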
Re: [HACKERS] sortsupport for text
Peter Geoghegan pe...@2ndquadrant.com writes: On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote: I thought that doubling repeatedly would be overly aggressive in terms of memory usage. I fail to understand how this sortsupport buffer fundamentally differs from a generic dynamic array abstraction built to contain chars. That being the case, I see no reason not to just do what everyone else does when expanding dynamic arrays, and no reason why we shouldn't make essentially the same time-space trade-off here as others do elsewhere. I agree with Peter on this one; not only is double-each-time the most widespread plan, but it is what we do in just about every other place in Postgres that needs a dynamically expansible buffer. If you do it randomly differently here, readers of the code will be constantly stopping to wonder why it's different here and if that's a bug or not. (And from a performance standpoint, I'm not entirely convinced it's not a bug, anyway. Worst-case behavior could be pretty bad.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan pe...@2ndquadrant.com writes: On 14 June 2012 19:28, Robert Haas robertmh...@gmail.com wrote: I thought that doubling repeatedly would be overly aggressive in terms of memory usage. I fail to understand how this sortsupport buffer fundamentally differs from a generic dynamic array abstraction built to contain chars. That being the case, I see no reason not to just do what everyone else does when expanding dynamic arrays, and no reason why we shouldn't make essentially the same time-space trade-off here as others do elsewhere. I agree with Peter on this one; not only is double-each-time the most widespread plan, but it is what we do in just about every other place in Postgres that needs a dynamically expansible buffer. If you do it randomly differently here, readers of the code will be constantly stopping to wonder why it's different here and if that's a bug or not. That could, of course, be addressed by adding a comment. (And from a performance standpoint, I'm not entirely convinced it's not a bug, anyway. Worst-case behavior could be pretty bad.) Instead of simply asserting that, could you respond to the specific points raised in my analysis? I think there's no way it can be bad. I am happy to be proven wrong, but I like to understand why it is that I am wrong before changing things. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Strange behavior with pg_locks and partitioning
All, In the course of debugging why a particular server required increasing max_locks_per_transaction, I found a peculiar behavior. If you do an UPDATE which doesn't match any CE constraint on the parent table in an inheritance chain, you get a RowExclusiveLock on every partition and every index on every partition. However, these RowExclusiveLocks have no page or tuple reference; it's a RowExclusiveLock with no row. Is this intentional? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange behavior with pg_locks and partitioning
On Fri, Jun 15, 2012 at 12:42 PM, Josh Berkus j...@agliodbs.com wrote: In the course of debugging why a particular server required increasing max_locks_per_transaction, I found a peculiar behavior. If you do an UPDATE which doesn't match any CE constraint on the parent table in an inheritance chain, you get a RowExclusiveLock on every partition and every index on every partition. However, these RowExclusiveLocks have no page or tuple reference; it's a RowExclusiveLock with no row. Is this intentional? RowExclusiveLock is a type of table lock, not a lock on a row. You're going to get that on all tables (and their indexes) involved in any write query. So it sounds unsurprising to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Saving snapshots for later use
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 15.06.2012 06:19, Nikolas Everett wrote: I'd like to be able to save the current snapshot and then at a later date roll the entire database back to that snapshot, essentially erasing everything that happened since the snapshot. To revert the database to the earlier state, you'll also need to somehow roll back all the already-committed transactions. At first sight, that seems easy - just modify clog to mark them as aborted. However, it's not that easy, because you'd also need to somehow clear hint bits that claim those transactions to be committed. Not to mention prevent VACUUM from removing rows deleted by those committed transactions. A saved snapshot of this sort would have to act like an open transaction from the standpoint of resource reclamation, which makes it (a) complicated and (b) very expensive if you intend to hold that snapshot for a long time. I wonder whether your actual use-case could be solved with 9.2's exportable-snapshots feature, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming-only Remastering
On 6/10/12 11:47 AM, Joshua Berkus wrote: So currently we have a major limitation in binary replication, where it is not possible to remaster your system (that is, designate the most caught-up standby as the new master) based on streaming replication only. This is a major limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than doubles the administrative overhead of managing replication. This becomes even more of a problem if you're doing cascading replication. Therefore I think this is a high priority for 9.3. As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a new record type to the streaming protocol, and then start writing the timeline change to that. Are there other steps required which I'm not seeing? *sound of crickets chirping* Is there other work involved which isn't immediately apparent? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
Robert Haas robertmh...@gmail.com writes: On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: (And from a performance standpoint, I'm not entirely convinced it's not a bug, anyway. Worst-case behavior could be pretty bad.) Instead of simply asserting that, could you respond to the specific points raised in my analysis? I think there's no way it can be bad. I am happy to be proven wrong, but I like to understand why it is that I am wrong before changing things. Maybe I missed something, but as far as I saw your argument was not that the performance wasn't bad but that the rest of the sort code would dominate the runtime anyway. I grant that entirely, but that doesn't mean that it's good for this piece of it to possibly have bad behavior. In any case, it seems to me that if the bottom line is that the performance of this piece of code isn't going to matter overall, then we might as well use the simplest, least surprising implementation we can. And I concur with Peter that a doubling-based approach meets that description, while what you've got here doesn't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange behavior with pg_locks and partitioning
So it sounds unsurprising to me. OK, I'll just submit a note for the docs for max_locks_per_transaction, then. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: avoid heavyweight locking on hash metapage
On Wed, May 30, 2012 at 3:14 PM, Robert Haas robertmh...@gmail.com wrote: I developed the attached patch to avoid taking a heavyweight lock on the metapage of a hash index. Instead, an exclusive buffer content lock is viewed as sufficient permission to modify the metapage, and a shared buffer content lock is used when such modifications need to be prevented. For the most part this is a trivial change, because we were already taking these locks: we were just taking the heavyweight locks in addition. The only sticking point is that, when we're searching or inserting, we previously locked the bucket before releasing the heavyweight metapage lock, which is unworkable when holding only a buffer content lock because (1) we might deadlock and (2) buffer content locks can't be held for long periods of time even when there's no deadlock risk. To fix this, I implemented a simple loop-and-retry system: we release the metapage content lock, acquire the heavyweight lock on the target bucket, and then reacquire the metapage content lock and check that the bucket mapping has not changed. Normally it hasn't, and we're done. But if by chance it has, we simply unlock the metapage, release the heavyweight lock we acquired previously, lock the new bucket, and loop around again. Even in the worst case we cannot loop very many times here, since we don't split the same bucket again until we've split all the other buckets, and 2^N gets big pretty fast. Do we need the retry flag (applies to two places)? If oldblkno is still InvalidBlockNumber then it can't equal blkno. I think the extra variable might be clearer than the magic value, but we already have the magic value so do we want to have both a flag variable and a magic value? + if (retry) + { + if (oldblkno == blkno) + break; + _hash_droplock(rel, oldblkno, HASH_SHARE); + } In the README, the psuedo codes probably needs to mention re-locking the meta page in the loop. Also, page is used to mean either the disk page or the shared buffer currently holding that page, depending on context. This is confusing. Maybe we should clarify Lock the meta page buffer. Of course this gripe precedes your patch and applies to other parts of the code as well, but since we mingle LW locks (on buffers) and heavy locks (on names of disk pages) it might make sense to be more meticulous here. exclusive-lock page 0 (assert the right to begin a split) is no longer true, nor is release X-lock on page 0 Also in the README, section To prevent deadlock we enforce these coding rules: would need to be changed as those rules are being changed. But, should we change them at all? In _hash_expandtable, the claim But since we are only trylocking here it should be OK doesn't completely satisfy me. Even a conditional heavy-lock acquire still takes a transient non-conditional LW Lock on the lock manager partition. Unless there is a global rule that no one can take a buffer content lock while holding a lock manager partition lock, this seems dangerous. Could this be redone to follow the pattern of heavy locking with no content lock, then reacquiring the buffer content lock to check to make sure we locked the correct things? I don't know if it would be better to loop, or just give up, if the meta page changed underneath us. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
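To make the loop-and-retry shape being debated here easier to picture, the following is a deliberately abstract, self-contained illustration of the pattern using plain pthreads rather than PostgreSQL's locking primitives (the names and the "mapping" are invented): drop the lock protecting the mapping, take the per-target lock, re-take the mapping lock, and re-check that the target is still the right one, retrying if it moved.

#include <pthread.h>

/* Toy model: map_lock protects current_target; each target has its own lock. */
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t target_locks[2] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static int current_target;         /* may change whenever map_lock is not held */

static int
lock_current_target(void)
{
    int     locked = -1;            /* index of the target lock we hold, if any */

    pthread_mutex_lock(&map_lock);
    for (;;)
    {
        int     target = current_target;

        if (target == locked)
            break;                  /* mapping unchanged: we hold the right lock */

        /* drop the mapping lock before taking the (possibly slow) target lock */
        pthread_mutex_unlock(&map_lock);
        if (locked >= 0)
            pthread_mutex_unlock(&target_locks[locked]);
        pthread_mutex_lock(&target_locks[target]);
        locked = target;

        /* re-take the mapping lock and loop to re-check the mapping */
        pthread_mutex_lock(&map_lock);
    }
    pthread_mutex_unlock(&map_lock);
    return locked;                  /* caller now holds target_locks[locked] */
}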
[HACKERS] [patch] libpq one-row-at-a-time API
The row-processor API is now in 9.2, but it solves only the different-row-storage problem, but not the one-row-at-a-time problem, as libpq is still in control until all rows are received. This means libpq still cannot be used to implement iterative result processing that almost all high-level languages are using. We discussed potential API for fetching one row at a time, but did not reach conclusion. Basic arguments were: 1) Tom: PQisBusy must keep current behaviour. Thus also PQgetResult() must keep current behaviour: * PQisBusy() -> 0: need to call PQgetResult(), which returns PGresult * PQisBusy() -> 1: need to call PQconsumeInput() * PQisBusy() must be callable several times in a row, thus be stateless from clients POV. 2) Me: behaviour must not be controlled by callback, but client code that uses PQgetResult() + PQisBusy(). Now, looking at the problem with some perspective, the solution is obvious: when in single-row mode, the PQgetResult() must return proper PGresult for that single row. And everything else follows that. Such API is implemented in attached patch: * PQsetSingleRowMode(conn): sets single-row mode. * PQisBusy(): stops after each row in single-row mode, sets PGASYNC_ROW_READY. Thus keeping the property of being repeatedly callable. * PQgetResult(): returns copy of the row if PGASYNC_ROW_READY. Sets row resultStatus to PGRES_SINGLE_TUPLE. This needs to be different from PGRES_TUPLES_OK to detect resultset end. * PQgetRowData(): can be called instead of PQgetResult() to get raw row data in buffer, for more efficient processing. This is an optional feature that provides the original row-callback promise of avoiding unnecessary row data copy. * Although PQgetRowData() makes callback API unnecessary, it is still fully compatible with it - the callback should not see any difference whether the resultset is processed in single-row mode or old single-PGresult mode. Unless it wants to - it can check PGRES_TUPLES_OK vs. PGRES_SINGLE_TUPLE. There is some duplicate code here that can be refactored (callback exec), but I did not do it yet to avoid affecting existing code too much. -- marko PS. If I squint, it seems like a fix of the existing API instead of a new feature, so perhaps it can still fit into 9.2? commit 4114613 (HEAD, single-row) Author: Marko Kreen mark...@gmail.com Date: Sat Apr 7 15:05:01 2012 +0300 Single-row based processing diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 5c5dd68..0ea2c1f 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -4018,6 +4018,75 @@ PGresult *PQgetResult(PGconn *conn); /note /listitem /varlistentry + +varlistentry id=libpq-pqsetsinglerowmode + term + functionPQsetSingleRowMode/function + indexterm + primaryPQsetSingleRowMode/primary + /indexterm + /term + + listitem + para + Instead buffering all rows in structnamePGresult/structname + until full resultset has arrived, this changes resultset processing + to return rows as soon as they arrive from network. + +synopsis +int PQsetSingleRowMode(PGconn *conn); +/synopsis + /para + + para + The mode can be changed directly after + functionPQsendQuery/function, + functionPQsendQueryParams/function, + functionPQsendQueryPrepared/function call, and before + any result rows have arrived from network. Then this functions + changes mode and returns 1. Otherwise the mode stays unchanged + and this functions returns 0. + /para + + para + The rows returned have PQresultStatus() of literalPGRES_SINGLE_TUPLE/literal.
+ There will be final PGresult that has either literalPGRES_TUPLES_OK/literal + or literalPGRES_FATAL_ERROR/literal result status. In case + of error status, the actual query failed in the middle and received rows + should be dropped. + /para + + /listitem +/varlistentry + +varlistentry id=libpq-pqgetrowdata + term + functionPQgetRowData/function + indexterm + primaryPQgetRowData/primary + /indexterm + /term + + listitem + para + In single row mode it is possible to get row data directly, + without constructing structnamePGresult/structname for + each row. + +synopsis +int PQgetRowData(PGconn *conn, PGresult **hdr, PGdataValue **columns); +/synopsis + /para + + para + It can be called everywhere functionPQgetResult/function can. + It returns 1 and fills pointers if there is row data avilable. + It returns 0 otherwise. Then functionPQgetResult/function + should be called to get final status. + /para + /listitem +/varlistentry + /variablelist /para diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index 1251455..a228a71 100644 --- a/src/interfaces/libpq/exports.txt +++
Re: [HACKERS] [RFC][PATCH] Logical Replication/BDR prototype and architecture
On Thu, Jun 14, 2012 at 4:13 PM, Andres Freund and...@2ndquadrant.com wrote: I don't plan to throw in loads of conflict resolution smarts. The aim is to get to the place where all the infrastructure is there so that a MM solution can be built by basically plugging in a conflict resolution mechanism. Maybe providing a very simple one. I think without in-core support its really, really hard to build a sensible MM implementation. Which doesn't mean it has to live entirely in core. Of course, several people have already done it, perhaps most notably Bucardo. Anyway, it would be good to get opinions from more people here. I am sure I am not the only person with an opinion on the appropriateness of trying to build a multi-master replication solution in core or, indeed, the only person with an opinion on any of these other issues. It is not good for those other opinions to be saved for a later date. Hm. Yes, you could do that. But I have to say I don't really see a point. Maybe the fact that I do envision multimaster systems at some point is clouding my judgement though as its far less easy in that case. Why? I don't think that particularly changes anything. It also complicates the wal format as you now need to specify whether you transport a full or a primary-key only tuple... Why? If the schemas are in sync, the target knows what the PK is perfectly well. If not, you're probably in trouble anyway. I think though that we do not want to enforce that mode of operation for tightly coupled instances. For those I was thinking of using command triggers to synchronize the catalogs. One of the big screwups of the current replication solutions is exactly that you cannot sensibly do DDL which is not a big problem if you have a huge system with loads of different databases and very knowledgeable people et al. but at the beginning it really sucks. I have no problem with making one of the nodes the schema master in that case. Also I would like to avoid the overhead of the proxy instance for use-cases where you really want one node replicated as fully as possible with the slight exception of being able to have summing tables, different indexes et al. In my view, a logical replication solution is precisely one in which the catalogs don't need to be in sync. If the catalogs have to be in sync, it's not logical replication. ISTM that what you're talking about is sort of a hybrid between physical replication (pages) and logical replication (tuples) - you want to ship around raw binary tuple data, but not entire pages. The problem with that is it's going to be tough to make robust. Users could easily end up with answers that are total nonsense, or probably even crash the server. To step back and talk about DDL more generally, you've mentioned a few times the idea of using an SR instance that has been filtered down to just the system catalogs as a means of generating logical change records. However, as things stand today, there's no reason to suppose that replicating anything less than the entire cluster is sufficient. For example, you can't translate enum labels to strings without access to the pg_enum catalog, which would be there, because enums are built-in types. But someone could supply a similar user-defined type that uses a user-defined table to do those lookups, and now you've got a problem. I think this is a contractual problem, not a technical one. 
From the point of view of logical replication, it would be nice if type output functions were basically guaranteed to look at nothing but the datum they get passed as an argument, or at the very least nothing other than the system catalogs, but there is no such guarantee. And, without such a guarantee, I don't believe that we can create a high-performance, robust, in-core replication solution. Now, the nice thing about being the people who make PostgreSQL happen is we get to decide what the C code that people load into the server is required to guarantee; we can change the rules. Before, types were allowed to do X, but now they're not. Unfortunately, in this case, I don't really find that an acceptable solution. First, it might break code that has worked with PostgreSQL for many years; but worse, it won't break it in any obvious way, but rather only if you're using logical replication, which will doubtless cause people to attribute the failure to logical replication rather than to their own code. Even if they do understand that we imposed a rule-change from on high, there's no really good workaround: an enum type is a good example of something that you *can't* implement without a side-table. Second, it flies in the face of our often-stated desire to make the server extensible. Also, even given the existence of such a restriction, you still need to run any output function that relies on catalogs with catalog contents that match what existed at the time that WAL was generated, and under the correct snapshot, which is not
Re: [HACKERS] sortsupport for text
On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: (And from a performance standpoint, I'm not entirely convinced it's not a bug, anyway. Worst-case behavior could be pretty bad.) Instead of simply asserting that, could you respond to the specific points raised in my analysis? I think there's no way it can be bad. I am happy to be proven wrong, but I like to understand why it is that I am wrong before changing things. Maybe I missed something, but as far as I saw your argument was not that the performance wasn't bad but that the rest of the sort code would dominate the runtime anyway. I grant that entirely, but that doesn't mean that it's good for this piece of it to possibly have bad behavior. That, plus the fact that not wasting memory in code paths where memory is at a premium seems important to me. I'm shocked that either of you think it's OK to overallocate by as much as 2X in a code path that's only going to be used when we're going through fantastic gyrations to make memory usage fit inside work_mem. The over-allocation by itself could easily exceed work_mem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 15-06-2012 11:39, Magnus Hagander wrote: As long as a free implementation exists, it can be ported to Java/.Net. Sure, it takes more work, but it *can be done*. Good point. IMHO, if there isn't a solution that cover all PostgreSQL (it seems it is not), we should pick the most appropriate one for *libpq* and let other drivers implement it at their time. I only partially agree. If there *is* no third party SSL libary that does support it, then they're stuck reimplementing an *entire SSL library*, which is surely many orders of magnitude more work, and suddenly steps into writing encryption code which is a lot more sensitive. Basically if they have to do that, then they're stuck *never* being able to fix the problem. If we can prove such a third party library *exists*, that makes it different. But from what I can tell so far, I haven't seen a single one - let alone one that supports compression. Using SSL-based compression could be a solution but I would like to emphasize that (i) I'm obligated to use cryptography library to compress data, (ii) I'm paying the price for SSL overhead and (iii) it will confuse people when we said that for compression we need a SSL connection or (iv) even transform the libpq communication code into a spaghetti to support compression using SSL in non-SSL connections). I see the point in not adding another dependencies or reinventing the wheel but I see more drawbacks than benefits in adopting a SSL-based compression. I like the Farina's idea in supporting compression outside libpq but I'm ok with adding a standard algorithm for compression (problem is that in the future others could want to add another interesting compression algorithms). -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 15-06-2012 11:10, k...@rice.edu wrote: I agree and think that the SSL-based compression is an excellent default compression scheme. The plugable compression approach allows for the choice of the most appropriate compression implementation based on the application needs. It really addresses corner cases such as high- performance system. That is my opinion too. I'm free to use to most appropriate algorithm for compression. It is just a matter of coding an interface for my favorite compression algorithm. We could even add some algorithms in a new contrib module. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC][PATCH] Logical Replication/BDR prototype and architecture
Robert Haas robertmh...@gmail.com wrote: So maybe instead of trying to cobble together a set of catalog contents that we can use for decoding any tuple whatsoever, we should instead divide the world into well-behaved types and poorly-behaved types. Well-behaved types are those that can be interpreted without the catalogs, provided that you know what type it is. Poorly-behaved types (records, enums) are those where you can't. For well-behaved types, we only need a small amount of additional information in WAL to identify which types we're trying to decode (not the type OID, which might fail in the presence of nasty catalog hacks, but something more universal, like a UUID that means this is text, or something that identifies the C entrypoint). And then maybe we handle poorly-behaved types by pushing some of the work into the foreground task that's generating the WAL: in the worst case, the process logs a record before each insert/update/delete containing the text representation of any values that are going to be hard to decode. In some cases (e.g. records all of whose constituent fields are well-behaved types) we could instead log enough additional information about the type to permit blind decoding. What about matching those values up to the correct table name and the respective columns names? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
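To make the "well-behaved type" idea concrete: an output function that depends only on the datum it is handed needs no catalog access at all, whereas something like enum_out must consult pg_enum to turn the stored OID into a label. A minimal sketch of the former, for a made-up type stored as int32 hundredths (names are invented; values assumed non-negative for brevity; not an assertion about any existing type):

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

Datum hundredths_out(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(hundredths_out);

Datum
hundredths_out(PG_FUNCTION_ARGS)
{
    int32   val = PG_GETARG_INT32(0);   /* assumed non-negative for brevity */
    char    buf[32];

    /* The text form is computed purely from the datum: no catalog lookups,
     * no dependence on other state -- "well-behaved" for decoding purposes. */
    snprintf(buf, sizeof(buf), "%d.%02d", val / 100, val % 100);
    PG_RETURN_CSTRING(pstrdup(buf));
}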
Re: [HACKERS] Backup docs
Magnus Hagander mag...@hagander.net writes: -The procedure for making a base backup is relatively simple: +The easiest way to perform a base backup is to use the +xref linkend=app-pgbasebackup tool. It can create +a base backup either as regular files or as a tar archive. If more +flexibility than xref linkend=app-pgbasebackup can provide is +required, you can also make a base backup using the low level API +(see xref linkend=backup-lowlevel-base-backup). + /para Good start. + para +It is not necessary to be concerned about the amount of time it takes +to make a base backup. However, if you normally run the Why not? +file, and can ordinarily be ignored.) Once you have safely archived +the file system backup and the WAL segment files used during the +backup (as specified in the backup history file), all archived WAL +segments with names numerically less are no longer needed to recover +the file system backup and can be deleted. However, you should +consider keeping several backup sets to be absolutely certain that +you can recover your data. + /para You're frightening off users when not detailing, I think. How to be certain I can recover my data, is there a way that I can't when a backup has been successfully made? How can I check? Also I don't see mention of basebackup+wal files all in one with the -x option, which I thought would have to be addressed here? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata
Le vendredi 15 juin 2012 03:27:11, Amit Kapila a écrit : I guess my first question is: why do we need this? There are lots of things in the TODO list that someone wanted once upon a time, but they're not all actually important. Do you have reason to believe that this one is? It's been six years since that email, so it's worth asking if this is actually relevant. As far as I know the pg_control is not WAL protected, which means if it gets corrupt due to any reason (disk crash during flush, so written partially), it might lead to failure in recovery of database. AFAIR pg_controldata fit on a disk sector so it can not be half written. So user can use pg_resetxlog to recover the database. Currently pg_resetxlog works on guessed values for pg_control. However this implementation can improve the logic that instead of guessing, it can try to regenerate the values from WAL. This implementation can allow better recovery in certain circumstances. The deadline for patches for this CommitFest is today, so I think you should target any work you're starting now for the NEXT CommitFest. Oh, I am sorry, as this was my first time I was not fully aware of the deadline. However I still seek your opinion whether it makes sense to work on this feature. -Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Friday, June 15, 2012 12:40 AM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata On Thu, Jun 14, 2012 at 11:39 AM, Amit Kapila amit.kap...@huawei.com wrote: I am planning to work on the below Todo list item for this CommitFest Allow WAL information to recover corrupted pg_controldata http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php The deadline for patches for this CommitFest is today, so I think you should target any work you're starting now for the NEXT CommitFest. I wanted to confirm my understanding about the work involved for this patch: The existing patch has following set of problems: 1. Memory leak and linked list code path is not proper 2. lock check for if the server is already running, is removed in patch which needs to be reverted 3. Refactoring of the code. Apart from above what I understood from the patch is that its intention is to generate values for ControlFile using WAL logs when -r option is used. The change in algorithm from current will be if control file is corrupt which essentialy means ReadControlFile() will return False, then it should generate values (checkPointCopy, checkPoint, prevCheckPoint, state) using WAL if -r option is enabled. Also for -r option, it doesn't need to call function FindEndOfXLOG() as the that work will be achieved by above point. It will just rewrite the control file and dont do other resets. The algorithm of restoring the pg_control value from old xlog file: 1. Retrieve all of the active xlog files from xlog direcotry into a list by increasing order, according their timeline, log id, segment id. 2. Search the list to find the oldest xlog file of the lastest time line. 3. Search the records from the oldest xlog file of latest time line to the latest xlog file of latest time line, if the checkpoint record has been found, update the latest checkpoint and previous checkpoint. Apart from above some changes in code will be required after the Xlog patch by Heikki. Suggest me if my understanding is correct? I guess my first question is: why do we need this? 
There are lots of things in the TODO list that someone wanted once upon a time, but they're not all actually important. Do you have reason to believe that this one is? It's been six years since that email, so it's worth asking if this is actually relevant. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] sortsupport for text
On 15 June 2012 21:06, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: (And from a performance standpoint, I'm not entirely convinced it's not a bug, anyway. Worst-case behavior could be pretty bad.) Instead of simply asserting that, could you respond to the specific points raised in my analysis? I think there's no way it can be bad. I am happy to be proven wrong, but I like to understand why it is that I am wrong before changing things. Maybe I missed something, but as far as I saw your argument was not that the performance wasn't bad but that the rest of the sort code would dominate the runtime anyway. I grant that entirely, but that doesn't mean that it's good for this piece of it to possibly have bad behavior. That, plus the fact that not wasting memory in code paths where memory is at a premium seems important to me. I'm shocked that either of you think it's OK to overallocate by as much as 2X in a code path that's only going to be used when we're going through fantastic gyrations to make memory usage fit inside work_mem. The over-allocation by itself could easily exceed work_mem. That seems pretty thin to me. We're talking about a couple of buffers whose ultimate size is only approximately just big enough to hold the largest text datum seen when sorting. Meanwhile, if it's the leading key we're dealing with (and of course, it usually will be), before exceeding work_mem all of the *entire* set of strings to be sorted are sitting in palloc()'d memory anyway. I'm surprised that you didn't immediately concede the point, to be honest. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] libpq one-row-at-a-time API
On Fri, Jun 15, 2012 at 1:21 PM, Marko Kreen mark...@gmail.com wrote: The row-processor API is now in 9.2, but it solves only the different-row-storage problem, but not the one-row-at-a-time problem, as libpq is still in control until all rows are received. This means libpq cannet still be used to implement iterative result processing that almost all high-level languages are using. We discussed potential API for fetching on single row at a time, but did not reach conclusion. Basic arguments were: 1) Tom: PQisBusy must keep current behaviour. Thus also PQgetResult() must keep current behaviour: * PQisBusy() - 0: need to call PQgetResult(), which returns PGresult * PQisBusy() - 1: need to call PQconsumeInput() * PQisBusy() must be callable several times in a row, thus be stateless from clients POV. 2) Me: behaviour must not be controlled by callback, but client code that uses PQgetResult() + PQisBusy(). Now, looking at the problem with some perspective, the solution is obvious: when in single-row mode, the PQgetResult() must return proper PGresult for that single row. And everything else follows that. Such API is implemented in attached patch: * PQsetSingleRowMode(conn): set's single-row mode. * PQisBusy(): stops after each row in single-row mode, sets PGASYNC_ROW_READY. Thus keeping the property of being repeatedly callable. * PQgetResult(): returns copy of the row if PGASYNC_ROW_READY. Sets row resultStatus to PGRES_SINGLE_TUPLE. This needs to be different from PGRES_TUPLES_OK to detect resultset end. * PQgetRowData(): can be called instead PQgetResult() to get raw row data in buffer, for more efficient processing. This is optional feature that provides the original row-callback promise of avoiding unnecessary row data copy. * Although PQgetRowData() makes callback API unnecessary, it is still fully compatible with it - the callback should not see any difference whether the resultset is processed in single-row mode or old single-PGresult mode. Unless it wants to - it can check PGRES_TUPLES_OK vs. PGRES_SINGLE_TUPLE. There is some duplicate code here that can be refactored (callback exec), but I did not do it yet to avoid affecting existing code too much. -- marko PS. If a squint it seems like fix of exising API instead of new feature, so perhaps it can still fit into 9.2? +1 on rushing in row processing for 9.2, but only if the API feels right (i'll spend some time to review). I found the lack of iterative row processing to be really unfortunate. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
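For anyone trying to picture how client code would drive this, here is a minimal sketch using only the calls described in the proposal (blocking style for brevity; the query text is just an example):

#include <stdio.h>
#include <libpq-fe.h>

/* Sketch of iterative row processing with the proposed single-row mode. */
static void
stream_rows(PGconn *conn)
{
    PGresult *res;

    if (!PQsendQuery(conn, "SELECT generate_series(1, 1000000)"))
    {
        fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
        return;
    }
    if (!PQsetSingleRowMode(conn))
        fprintf(stderr, "could not enter single-row mode\n");

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
            printf("row: %s\n", PQgetvalue(res, 0, 0));   /* one row per PGresult */
        else if (PQresultStatus(res) != PGRES_TUPLES_OK)  /* final status or error */
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
    }
}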
Re: [HACKERS] Resource Owner reassign Locks
On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila amit.kap...@huawei.com wrote: Yes, that means the list has over-flowed. Once it is over-flowed, it is now invalid for the reminder of the life of the resource owner. Don't we need any logic to clear the reference of locallock in owner-locks array. I don't think so. C doesn't ref count its pointers. MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any specific reason for 10. I instrumented the code to record the maximum number of locks held by a resource owner, and report the max when it was destroyed. (That code is not in this patch). During a large pg_dump, the vast majority of the resource owners had maximum locks of 2, with some more at 4 and 6.Then there was one resource owner, for the top-level transaction, at tens or hundreds of thousands (basically one for every lockable object). There was little between 6 and this top-level number, so I thought 10 was a good compromise, safely above 6 but not so large that searching through the list itself was likely to bog down. Also, Tom independently suggested the same number. Should it emit a FATAL rather than an ERROR? I thought ERROR was sufficient to make the backend quit, as it is not clear how it could meaningfully recover. I am not able to visualize any valid scenario in which it can happen unless some corruption happens. If this happens, user can close all statements and abort its transactions. According to me ERROR is okay. However in the message Can't find lock to remove, it could be better, if there is information about resource owner and lock. I think we might end up changing that entirely once someone more familiar with the error handling mechanisms takes a look at it. I don't think that lock tags have good human readable formats, and just a pointer dump probably wouldn't be much use when something that can never happen has happened. But I'll at least add a reference to the resource owner if this stays in. Thanks, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
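For readers trying to picture the mechanism under discussion: the idea is a small fixed-size array of lock pointers cached in each resource owner, with exceeding MAX_RESOWNER_LOCKS treated as a permanent overflow for that owner, after which the code falls back to scanning the whole local lock table. A schematic sketch (not the actual patch; field and function names are invented):

#define MAX_RESOWNER_LOCKS 10           /* the value being discussed */

/* Schematic only: in the real patch the entries are LOCALLOCK pointers. */
typedef struct LockCache
{
    int     nlocks;                     /* locks remembered so far */
    void   *locks[MAX_RESOWNER_LOCKS];  /* small per-owner cache */
} LockCache;

static void
remember_lock(LockCache *cache, void *locallock)
{
    if (cache->nlocks > MAX_RESOWNER_LOCKS)
        return;                         /* already overflowed; hash table is authoritative */
    if (cache->nlocks < MAX_RESOWNER_LOCKS)
        cache->locks[cache->nlocks] = locallock;
    cache->nlocks++;                    /* nlocks == MAX+1 marks permanent overflow */
}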
Re: [HACKERS] sortsupport for text
Robert Haas robertmh...@gmail.com writes: On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Maybe I missed something, but as far as I saw your argument was not that the performance wasn't bad but that the rest of the sort code would dominate the runtime anyway. I grant that entirely, but that doesn't mean that it's good for this piece of it to possibly have bad behavior. That, plus the fact that not wasting memory in code paths where memory is at a premium seems important to me. I'm shocked that either of you think it's OK to overallocate by as much as 2X in a code path that's only going to be used when we're going through fantastic gyrations to make memory usage fit inside work_mem. The over-allocation by itself could easily exceed work_mem. I would be concerned about this if it were per-sort-tuple wastage, but what I understood to be happening was that this was a single instance of an expansible buffer (per sort, perhaps, but still just one buffer). And, as you keep pointing out when it suits your argument, it's relatively uncommon to be sorting enormous values anyway. So no, I am not particularly worried about that. If you are, there are more important places to be concerned about allocation pad wastage, starting with palloc itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
Euler Taveira eu...@timbira.com writes: I see the point in not adding another dependencies or reinventing the wheel but I see more drawbacks than benefits in adopting a SSL-based compression. In the end, judging this tradeoff is a matter of opinion, but I come to the opposite conclusion. Transport-level compression is not part of the core competence of this project. As such, if we have an opportunity to farm out that work to other projects (particularly ones that we are already relying on), we should do so. Not expend our limited resources on re-inventing this wheel, which we'd be more likely than not to do so less well than it's already been done. To draw an analogy: on the basis of the arguments that have been made about how some users might not have access to an SSL library implementing feature X, we should drop our use of OpenSSL entirely and re-implement transport encryption from scratch, incompatibly with OpenSSL. Now that's obviously ridiculous, not least because it does nothing at all to ease the pain of people who need a non-C implementation. But arguing that we should not use OpenSSL's compression features because some people might need to use a different SSL implementation doesn't seem to me to be any different from that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Resource Owner reassign Locks
Jeff Janes jeff.ja...@gmail.com writes: On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila amit.kap...@huawei.com wrote: MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any specific reason for 10. I instrumented the code to record the maximum number of locks held by a resource owner, and report the max when it was destroyed. (That code is not in this patch). During a large pg_dump, the vast majority of the resource owners had maximum locks of 2, with some more at 4 and 6.Then there was one resource owner, for the top-level transaction, at tens or hundreds of thousands (basically one for every lockable object). There was little between 6 and this top-level number, so I thought 10 was a good compromise, safely above 6 but not so large that searching through the list itself was likely to bog down. Also, Tom independently suggested the same number. FYI, I had likewise suggested 10 on the basis of examining pg_dump's behavior. It might be a good idea to examine a few other use-cases before settling on a value. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Event Triggers reduced, v1
On 15 June 2012 21:27, Dimitri Fontaine dfonta...@hi-media.com wrote: The goal for this first patch is to avoid semantics issues so that we can get something technically clean in, and have more time to talk semantics next times. The main discussion to avoid is deciding if we want to fire event triggers for CREATE SEQUENCE and CREATE INDEX in a command that just did implement a SERIAL PRIMARY KEY in a table. So this patch triggers once per top level command, just with information about the set of nested events? I'm happy if we make sweeping initial points like don't generate events for sequences and indexes in the first version. We can always add more later. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
I wrote: Euler Taveira eu...@timbira.com writes: I see the point in not adding another dependencies or reinventing the wheel but I see more drawbacks than benefits in adopting a SSL-based compression. In the end, judging this tradeoff is a matter of opinion, but I come to the opposite conclusion. BTW, there is an additional technical argument that I don't think has been made yet. Assume that we implement our own transport compression, and then somebody runs an SSL connection using a recent OpenSSL version (in which, IIRC, SSL-level compression is enabled by default). Now, OpenSSL is trying to compress already-compressed data. That's not merely a waste of cycles but is very likely to be counterproductive, ie recompressed data usually gets larger not smaller. We could possibly address this by adding control logic to tell OpenSSL not to compress ... but that's almost exactly the code you don't want to write, just making a different option selection. And I wonder whether SSL implementations that don't support compression will accept a set-the-compression-option call at all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
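For what it's worth, the "tell OpenSSL not to compress" knob does exist in reasonably recent OpenSSL releases; a sketch of what such a call looks like, assuming an OpenSSL version that defines SSL_OP_NO_COMPRESSION (1.0.0 or later), with the #ifdef illustrating exactly the portability worry raised above:

#include <openssl/ssl.h>

/* Sketch: disable SSL-level compression on a context, where supported. */
static void
disable_ssl_compression(SSL_CTX *ctx)
{
#ifdef SSL_OP_NO_COMPRESSION
    SSL_CTX_set_options(ctx, SSL_OP_NO_COMPRESSION);
#else
    /* Older OpenSSL, or other SSL implementations, may not expose the
     * option at all, which is the concern about portability of such a call. */
    (void) ctx;
#endif
}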
Re: [HACKERS] Streaming-only Remastering
On 10 June 2012 19:47, Joshua Berkus j...@agliodbs.com wrote: So currently we have a major limitation in binary replication, where it is not possible to remaster your system (that is, designate the most caught-up standby as the new master) based on streaming replication only. This is a major limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than doubles the administrative overhead of managing replication. This becomes even more of a problem if you're doing cascading replication. The major limitation was solved by repmgr close to 2 years ago now. So while you're correct that the patch to fix that assumed that archiving worked as well, it has been possible to operate happily without it. http://www.repmgr.org New versions for 9.2 will be out soon. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 12:48:24PM +0200, Florian Pflug wrote: Yeah, but that alone is IMO a rather big blocker for claiming that this is the only way to do it :( And I think the fact that that wikipedia page doesn't list any other ones, is a sign that there might not be a lot of other choices out there in reality - expecially not opensource… Hm, but things get even harder for the JDBC and .NET folks if we go with a third-party compression method. Or would we require that the existence of a free Java (and maybe .NET) implementation of such a method would be an absolute must? The way I see it, if we use SSL-based compression then non-libpq clients there's at least a chance of those clients being able to use it easily (if their SSL implementation supports it). If we go with a third-party compression method, they *all* need to add yet another dependency, or may even need to re-implement the compression method in their implementation language of choice. Does OpenSSL use hardware acceleration for its compression? I know it often does for encryption --- that would be a big reason to do compression at the SSL layer. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] splitting htup.h
Hi, This patch splits htup.h in two pieces -- the first one (tupbasics.h; not wedded to the name) does not include many other headers and is just enough to have other parts of the code create tuples and pass them around, to be used by most other headers. The other one (which keeps the name htup.h) contains internal tuple stuff (struct declarations etc). Before patch, htup.h is directly or indirectly included by 364 .c files in src/backend; after patch, that's reduced to 299 files (that's 65 files less to compile if you modify the header). -- Álvaro Herrera alvhe...@alvh.no-ip.org tupbasics.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] splitting htup.h
Alvaro Herrera alvhe...@alvh.no-ip.org writes: This patch splits htup.h in two pieces -- the first one (tupbasics.h; not wedded to the name) does not include many other headers and is just enough to have other parts of the code create tuples and pass them around, to be used by most other headers. The other one (which keeps the name htup.h) contains internal tuple stuff (struct declarations etc). Before patch, htup.h is directly or indirectly included by 364 .c files in src/backend; after patch, that's reduced to 299 files (that's 65 files less to compile if you modify the header). That's kind of a disappointing result --- if we're going to split htup.h into public and private parts, I would have hoped for a much smaller inclusion footprint for the private part. Maybe you could adjust the boundary between public and private parts a bit more? If we can't cut the footprint I'm inclined to think this isn't worth the code churn. (Or perhaps I'm missing the point. Do you have a reason for doing this other than cutting the inclusion footprint?) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Resource Owner reassign Locks
On Fri, Jun 15, 2012 at 3:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Janes jeff.ja...@gmail.com writes: On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila amit.kap...@huawei.com wrote: MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any specific reason for 10. I instrumented the code to record the maximum number of locks held by a resource owner, and report the max when it was destroyed. (That code is not in this patch). During a large pg_dump, the vast majority of the resource owners had maximum locks of 2, with some more at 4 and 6. Then there was one resource owner, for the top-level transaction, at tens or hundreds of thousands (basically one for every lockable object). There was little between 6 and this top-level number, so I thought 10 was a good compromise, safely above 6 but not so large that searching through the list itself was likely to bog down. Also, Tom independently suggested the same number. FYI, I had likewise suggested 10 on the basis of examining pg_dump's behavior. It might be a good idea to examine a few other use-cases before settling on a value. Looking at the logging output of a make check run, there are many cases where the list would have overflown (max locks was 10), but in all of them the number of locks held at the time of destruction was equal to, or only slightly less than, the size of the local lock hash table. So iterating over a large memorized list would not save much computational complexity over iterating over the entire hash table (although the constant factor in iterating over pointers in an array might be smaller the constant factor for using a hash-iterator). Looking at pg_dump with more complex structures (table with multiple toasted columns and multiple unique indexes, and inherited tables) does use more max locks, but the number doesn't seem to depend on how many toast and indexes exist. There are very frequently a max of 9 locks occurring when the lock table is large, so that is uncomfortably close to overflowing. Adding sequences (or at least, using a type of serial) doesn't seem to increase the max used. I don't know if there a more principle-based way of approaching this. There are probably cases where maintaining the list of locks is loss rather than a gain, but since I don't how to create them I can't evaluate what the trade off might be to increasing the max. I'm inclined to increase the max from 10 to 15 to reclaim a margin of safety, and leave it at that, unless someone can recommend a better test case. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
On Tue, Jun 12, 2012 at 01:50:48PM -0400, Noah Misch wrote: On Mon, Jun 11, 2012 at 05:57:41PM -0400, Alvaro Herrera wrote: What about something like this in the root of the tree: find . -name \*.pl -o -name \*.pm | xargs perltidy -b -bl -nsfs -naws -l=100 -ole=unix There are files all over the place. The file that would most be affected with one run of this is the ECPG grammar generator. I checked the -et=4 business (which is basically entab). We're pretty inconsistent about tabs in perl code it seems; some files use tabs others use spaces. Honestly I would just settle on what we use on C files, even if the Perl devs don't recommend it because of maintainability and portability. I mean if it works well for us for C code, why would it be a problem in Perl code? However, I don't write much of that Perl code myself. +1 for formatting all our Perl scripts and for including -et=4. Since that will rewrite currently-tidy files anyway, this is a good time to audit our perltidy settings. Why -l=100 instead of -l=78 like our C code? perltidy changes this code: for ($long_variable_name_to_initialize = 0; $long_variable_name_to_initialize < $long_limit_variable_name; $long_variable_name_to_initialize++) { to this: for ( $long_variable_name_to_initialize = 0; $long_variable_name_to_initialize < $long_limit_variable_name; $long_variable_name_to_initialize++ ) { Using -vtc=2 removes the new trailing line break. Additionally using -vt=2 -nsak=for removes the new leading line break, but it also removes the space between for and (. Anyone know how to make perltidy format this like we do in C code? Why -naws? I would lean toward -aws -dws -pt=2 to change code like this: -my $dbi=DBI->connect('DBI:Pg:dbname='.$opt{d}); +my $dbi = DBI->connect('DBI:Pg:dbname=' . $opt{d}); I'd also consider -kbl=2 to preserve runs of blank lines that the author used to delineate related groups of functions. OK, based on this feedback, I have updated the pgindent README to use these Perl indent instructions: find . -name \*.pl -o -name \*.pm | xargs perltidy \ --backup-and-modify-in-place --opening-brace-on-new-line \ --vertical-tightness=2 --vertical-tightness-closing=2 \ --nospace-after-keyword=for --nospace-for-semicolon \ --add-whitespace --delete-old-whitespace --paren-tightness=2 \ --keep-old-blank-lines=2 --maximum-line-length=78 \ --entab-leading-whitespace=4 --output-line-ending=unix Unless I hear otherwise, I will run this new command on the 9.2 and HEAD Perl files. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
Bruce Momjian br...@momjian.us writes: OK, based on this feedback, I have updated the pgindent README to use these Perl indent instructions: find . -name \*.pl -o -name \*.pm | xargs perltidy \ --backup-and-modify-in-place --opening-brace-on-new-line \ --vertical-tightness=2 --vertical-tightness-closing=2 \ --nospace-after-keyword=for --nospace-for-semicolon \ --add-whitespace --delete-old-whitespace --paren-tightness=2 \ --keep-old-blank-lines=2 --maximum-line-length=78 \ --entab-leading-whitespace=4 --output-line-ending=unix Unless I hear otherwise, I will run this new command on the 9.2 and HEAD Perl files. No idea what all that stuff does. Would it be reasonable to post a diff showing what this would do to the files in question? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup docs
On Sat, Jun 16, 2012 at 4:39 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: - The procedure for making a base backup is relatively simple: + The easiest way to perform a base backup is to use the + <xref linkend="app-pgbasebackup"> tool. It can create + a base backup either as regular files or as a tar archive. If more + flexibility than <xref linkend="app-pgbasebackup"> can provide is + required, you can also make a base backup using the low level API + (see <xref linkend="backup-lowlevel-base-backup">). + </para> Good start. + <para> + It is not necessary to be concerned about the amount of time it takes + to make a base backup. However, if you normally run the Why not? This is copied from the old documentation. It used to say "It is not necessary to be concerned about the amount of time elapsed between pg_start_backup and the start of the actual backup, nor between the end of the backup and pg_stop_backup." And the whole idea was to simplify the text at the beginning ;) + file, and can ordinarily be ignored.) Once you have safely archived + the file system backup and the WAL segment files used during the + backup (as specified in the backup history file), all archived WAL + segments with names numerically less are no longer needed to recover + the file system backup and can be deleted. However, you should + consider keeping several backup sets to be absolutely certain that + you can recover your data. + </para> You're frightening off users when not detailing, I think. How to be This is copied exactly from what it is today. I'm sure it can be improved, but it's not the goal of this patch. Let's not let perfection get in the way of improvement... Also I don't see mention of basebackup+wal files all in one with the -x option, which I thought would have to be addressed here? It does; it's under standalone hot backups, the second to last part of the patch. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
On Fri, Jun 15, 2012 at 10:48:27PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: OK, based on this feedback, I have updated the pgindent README to use these Perl indent instructions: find . -name \*.pl -o -name \*.pm | xargs perltidy \ --backup-and-modify-in-place --opening-brace-on-new-line \ --vertical-tightness=2 --vertical-tightness-closing=2 \ --nospace-after-keyword=for --nospace-for-semicolon \ --add-whitespace --delete-old-whitespace --paren-tightness=2 \ --keep-old-blank-lines=2 --maximum-line-length=78 \ --entab-leading-whitespace=4 --output-line-ending=unix Unless I hear otherwise, I will run this new command on the 9.2 and HEAD Perl files. No idea what all that stuff does. Would it be reasonable to post a diff showing what this would do to the files in question? Sure: http://momjian.us/expire/perl.diff -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Sat, Jun 16, 2012 at 12:03 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 18:28, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 11:24 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:58, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 10:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: You could write a dummy SSL implementation that only does compression, not encryption, i.e. only support the 'null' encryption method. That should be about the same amount of work as writing an implementation of compression using whatever protocol we would decide to use for negotiating the compression. Sure, but then what do you do if you actually want both? Umm, then you use a real SSL library, not the dummy one? But (in this scenario, and so far nobody has proven it to be wrong) there exists no real SSL library that does support compression. Oh, I see. Then you're screwed. But I think the right solution to that is to write/extend a Java SSL implementation to support compression, not to invent our own in PostgreSQL. The JDK is open source nowadays. I don't have any personal experience with it, but it's my understanding that it's only open source in the "published open-source product" sense, meaning it's not really something that solicits (or even accepts? at least not easily...) contributions from the outside. And forgive me for being negative, but I think you're going to have an even harder time getting Oracle to accept a contribution if the motivation for having it is to make the PostgreSQL driver work better... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Sat, Jun 16, 2012 at 4:04 AM, Euler Taveira eu...@timbira.com wrote: On 15-06-2012 11:39, Magnus Hagander wrote: As long as a free implementation exists, it can be ported to Java/.Net. Sure, it takes more work, but it *can be done*. Good point. IMHO, if there isn't a solution that covers all of PostgreSQL (and it seems there is not), we should pick the most appropriate one for *libpq* and let other drivers implement it in their own time. Fair enough if we decide that - but we should make that decision knowing that we're leaving the JDBC and .Net people in a bad position where they are not likely to be able to implement this. The JDBC people have a theoretical chance if the JDK is open. The .Net people are stuck with schannel, which doesn't support it at this point. It might well do so in the future (since it's in the standard), but they're at the mercy of Microsoft. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Sat, Jun 16, 2012 at 6:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Euler Taveira eu...@timbira.com writes: I see the point in not adding another dependency or reinventing the wheel, but I see more drawbacks than benefits in adopting SSL-based compression. In the end, judging this tradeoff is a matter of opinion, but I come to the opposite conclusion. BTW, there is an additional technical argument that I don't think has been made yet. Assume that we implement our own transport compression, and then somebody runs an SSL connection using a recent OpenSSL version (in which, IIRC, SSL-level compression is enabled by default). Now, OpenSSL is trying to compress already-compressed data. That's not merely a waste of cycles but is very likely to be counterproductive, i.e. recompressed data usually gets larger, not smaller. We could possibly address this by adding control logic to tell OpenSSL not to compress ... but that's almost exactly the code you don't want to write, just making a different option selection. And I wonder whether SSL implementations that don't support compression will accept a set-the-compression-option call at all. Yes, but there's also a lot of such awkward logic we need to add if we *do* go with the SSL library doing the compression: For example, we can no longer trust the SSL library to always do encryption, since we specifically want to support null encryption. Meaning we need to teach pg_hba to treat a connection with null encryption as hostnossl, even if it's an openssl-backed connection, and mirrored. And in libpq, we have to make sure that a requiressl connection *does* fail even if we have ssl, when we're using null encryption. And we currently have no way to specify different encryption options on a per-host basis, which is something we'd have to do (e.g. I want to be able to say that subnet x requires encryption with these encryption methods and subnet y doesn't require encryption but should do compression). Which at first glance would require ssl_ciphers to be controllable from pg_hba.conf - but that doesn't work, since we don't get to pg_hba.conf until after we've negotiated the SSL mode... So there's quite a bit of complexity that needs to be put in there just to deal with the fact that we're using SSL to do compression, if we want to support it in a way that's not hackish. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
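For reference, the "control logic to tell OpenSSL not to compress" that Tom mentions is small on the OpenSSL side; here is a minimal sketch, assuming OpenSSL 1.0.0 or later where SSL_OP_NO_COMPRESSION is defined (the function name is illustrative, not actual PostgreSQL backend code):

#include <openssl/ssl.h>

/* Sketch: build an SSL_CTX with SSL-level compression disabled, so that a
 * hypothetical libpq-level compressor would not be recompressed by OpenSSL. */
static SSL_CTX *
make_ctx_without_ssl_compression(void)
{
	SSL_CTX    *ctx = SSL_CTX_new(SSLv23_method());

	if (ctx == NULL)
		return NULL;
#ifdef SSL_OP_NO_COMPRESSION
	/* Option exists since OpenSSL 1.0.0; older libraries simply skip this */
	SSL_CTX_set_options(ctx, SSL_OP_NO_COMPRESSION);
#endif
	return ctx;
}

Tom's caveat still applies: an SSL implementation without compression support may not expose any such option at all, which is why the call is guarded.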
Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim
On Fri, Jun 15, 2012 at 11:14 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 15.06.2012 17:54, Magnus Hagander wrote: On Fri, Jun 15, 2012 at 8:16 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 15, 2012 at 1:52 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Jun 15, 2012 at 1:29 AM, Robert Haas rh...@postgresql.org wrote: New SQL functions pg_backup_in_progress() and pg_backup_start_time() Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by Marco Nenciarini. Stylistic cleanup and OID fixes by me. How well is the term "on-line exclusive backup" really settled with people? I wonder if we need to add a specific note to the docs saying that the function doesn't consider streaming base backups at all, and that one should refer to pg_stat_replication for info about those? Or really, should the function be pg_exclusive_backup_in_progress() perhaps? Well, if we think that the term "exclusive backup" is not going to be easily comprehensible, then sticking that into the function name isn't going to help us much. I think that's just wordiness for the sake of being wordy. I do agree that we could probably improve the clarity of the documentation along the lines you suggest. It would alert people to the existence of the term, and thus help those who didn't actually read the documentation. I'm not sure we want to expose the "exclusive backup" term to users. It's a bit confusing. It makes sense in the limited scope in the code in xlog.c where it's currently used, but if I wanted to explain what it is to users, I don't think I'd choose that term. Which actually makes an argument for making that change *anyway*, because right now the function is incorrectly named. A function named pg_backup_in_progress() should answer the question "is a backup in progress?", and it doesn't answer that question. I agree that pg_backup_in_progress() is confusing, if it returns false while you're running pg_basebackup. In the doc changes you proposed, you call pg_start/stop_backup() a low-level API for taking backups. That's not suitable for a function name, but I think we should work on that, and find a better term that works. Backup mode? Filesystem backup mode? We already have "backup mode", and it covers both of them really. And "filesystem backup mode" is also what pg_basebackup does - it takes a filesystem backup... The easiest one I can think of is "manual backup mode", but in the other thread Simon didn't like that term. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming-only Remastering
On Sat, Jun 16, 2012 at 6:53 AM, Simon Riggs si...@2ndquadrant.com wrote: On 10 June 2012 19:47, Joshua Berkus j...@agliodbs.com wrote: So currently we have a major limitation in binary replication, where it is not possible to remaster your system (that is, designate the most caught-up standby as the new master) based on streaming replication only. This is a major limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than doubles the administrative overhead of managing replication. This becomes even more of a problem if you're doing cascading replication. The major limitation was solved by repmgr close to 2 years ago now. It was solved for limited (but important) cases. For example, repmgr does (afaik, maybe I missed a major update at some point?) still require you to have set up ssh with trusted keys between the servers. There are many use cases where that's not an acceptable solution, one of the more obvious ones being when you're on Windows. repmgr hasn't really *solved* it, it has provided a workaround that works well... IIRC repmgr is also GPLv3, which means that some companies just won't look at it... Not many, but some. And it's a license that's incompatible with PostgreSQL itself. So while you're correct that the patch to fix that assumed that archiving worked as well, it has been possible to operate happily without it. http://www.repmgr.org New versions for 9.2 will be out soon. That's certainly good, but that doesn't actually solve the problem either. It updates the good workaround. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On 16-06-2012 00:43, Magnus Hagander wrote: For example, we can no longer trust the SSL library to always do encryption, since we specifically want to support null encryption. Meaning we need to teach pg_hba to treat a connection with null encryption as hostnossl, even if it's an openssl-backed connection, and mirrored. And in libpq, we have to make sure that a requiressl connection *does* fail even if we have ssl, when we're using null encryption. And we currently have no way to specify different encryption options on a per-host basis, which is something we'd have to do (e.g. I want to be able to say that subnet x requires encryption with these encryption methods and subnet y doesn't require encryption but should do compression). Which at first glance would require ssl_ciphers to be controllable from pg_hba.conf - but that doesn't work, since we don't get to pg_hba.conf until after we've negotiated the SSL mode... So there's quite a bit of complexity that needs to be put in there just to deal with the fact that we're using SSL to do compression, if we want to support it in a way that's not hackish. That's exactly the complexity I wouldn't add to the code. I'm in favor of experimenting with a standard algorithm (zlib, for example -- let's say it is the easiest way to implement it) or even hooks (libpq and backend -- that seems complex, but less so than an openssl-backed connection just for compression). -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consulting, Development, 24x7 Support and Training -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
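To make the zlib suggestion concrete, the stock one-shot API is enough to show the footprint involved. A minimal sketch follows; it is illustrative only, not proposed libpq code, with error handling abbreviated and the function name invented here:

#include <stdlib.h>
#include <zlib.h>

/* Compress a message payload.  Returns a malloc'd buffer (caller frees)
 * and sets *dstlen, or returns NULL on failure. */
static unsigned char *
compress_payload(const unsigned char *src, uLong srclen, uLongf *dstlen)
{
	uLongf		bound = compressBound(srclen);
	unsigned char *dst = malloc(bound);

	if (dst == NULL)
		return NULL;
	if (compress2(dst, &bound, src, srclen, Z_DEFAULT_COMPRESSION) != Z_OK)
	{
		free(dst);
		return NULL;
	}
	*dstlen = bound;
	return dst;
}

A protocol-level implementation would presumably use the streaming deflate() interface with Z_SYNC_FLUSH at message boundaries rather than one-shot compress2(), but the library dependency is the same either way.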
Re: [HACKERS] libpq compression
On Sat, Jun 16, 2012 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote: On Sat, Jun 16, 2012 at 4:04 AM, Euler Taveira eu...@timbira.com wrote: On 15-06-2012 11:39, Magnus Hagander wrote: As long as a free implementation exists, it can be ported to Java/.Net. Sure, it takes more work, but it *can be done*. Good point. IMHO, if there isn't a solution that covers all of PostgreSQL (and it seems there is not), we should pick the most appropriate one for *libpq* and let other drivers implement it in their own time. Fair enough if we decide that - but we should make that decision knowing that we're leaving the JDBC and .Net people in a bad position where they are not likely to be able to implement this. The JDBC people have a theoretical chance if the JDK is open. The .Net people are stuck with schannel, which doesn't support it at this point. It might well do so in the future (since it's in the standard), but they're at the mercy of Microsoft. Both Java and C# are open-source enough that anybody can take an existing SSL implementation and add compression to it, then distribute it as an improved SSL library. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minimising windows installer password confusion
On 06/14/2012 11:59 PM, Dave Page wrote: On Thu, Jun 14, 2012 at 11:43 AM, Dave Page dp...@pgadmin.org wrote: I'll have a play with it and see if a simple switch to NetworkService seems feasible. OK, I worked up a patch which uses NT AUTHORITY\NetworkService as the service account by default. This doesn't need a password, so allows us to simply prompt during installation for the superuser password for the cluster, and not at all during upgrade. If you run the installer from the command line with --serviceaccount postgres (or some other account name), you get the current behaviour. I've posted it on our internal ReviewBoard system for the rest of the team to review and test on various platforms (I've only tried it on XP so far). Cool. Feel free to lob me a link if you want, I have several unimportant systems I can test it on too. -- Craig Ringer POST Newspapers 276 Onslow Rd, Shenton Park Ph: 08 9381 3088 Fax: 08 9388 2258 ABN: 50 008 917 717 http://www.postnewspapers.com.au/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
Marko Kreen mark...@gmail.com writes: On Sat, Jun 16, 2012 at 6:39 AM, Magnus Hagander mag...@hagander.net wrote: Fair enough if we decide that - but we should make that decision knowing that we're leaving the JDBC and .Net people in a bad position where they are not likely to be able to implement this. The JDBC people have a theoretical chance if the JDK is open. The .Net people are stuck with schannel, which doesn't support it at this point. It might well do so in the future (since it's in the standard), but they're at the mercy of Microsoft. Both Java and C# are open-source enough that anybody can take an existing SSL implementation and add compression to it, then distribute it as an improved SSL library. Possibly more to the point: that is work they might have to do, if nobody else steps up to the plate --- and if they do end up doing it, it could benefit other projects too. On the other hand, if we roll our own transport compression solution, that is work they *will* have to do, with no chance of sharing the effort with other projects. BTW, as far as the .Net case goes, it took only a moment's googling to find this: http://openssl-net.sourceforge.net/ which is a .Net wrapper around real OpenSSL. It doesn't appear to provide wrappers for the compression selection functions, but surely that's just a lack of round tuits, not that it would take more than five minutes to add them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
Magnus Hagander mag...@hagander.net writes: Yes, but there's also a lot of such awkward logic we need to add if we *do* go with the SSL library doing the compression: For example, we can no longer trust the SSL library to always do encryption, since we specifically want to support null encryption. True, but are you sure we don't need to do that anyway? What happens today if a non-libpq client connects with SSL and specifies null encryption? And we currently have no way to specify different encryption options on a per-host basis, which is something we'd have to do (e.g. I want to be able to say that subnet x requires encryption with these encryption methods and subnet y doesn't require encryption but should do compression). [ shrug... ] Having that sort of control over a homebrew compression solution will *also* require a lot of control logic that does not exist today. So there's quite a bit of complexity that needs to be put in there just to deal with the fact that we're using SSL to do compression, if we want to support it in a way that's not hackish. It's not obvious to me that we actually *need* anything except the ability to recognize that a null-encrypted SSL connection probably shouldn't be treated as matching a hostssl line; which is not something that requires any fundamental rearrangements, since it only requires an after-the-fact check of what was selected. Things like "subnet x requires encryption with these encryption methods" are features that are sensible with our existing feature set. But we don't have that now and nobody has asked for it, so I think you are moving the goalposts rather unfairly by claiming that a compression-related patch needs to add it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
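For readers who haven't followed the pg_hba.conf details: the distinction at issue maps onto the existing host / hostssl / hostnossl connection types, and under Tom's suggestion an SSL connection negotiated with a null cipher would simply stop matching hostssl lines. The addresses and auth methods below are placeholders, not recommended settings:

# pg_hba.conf (illustrative entries only)
# TYPE      DATABASE  USER  ADDRESS          METHOD
hostssl     all       all   192.168.1.0/24   md5    # matches SSL connections only
hostnossl   all       all   10.0.0.0/8       md5    # matches non-SSL connections only
host        all       all   0.0.0.0/0        md5    # matches either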
[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
On Fri, Jun 15, 2012 at 10:45:16PM -0400, Bruce Momjian wrote: I have updated the pgindent README to use these Perl indent instructions: find . -name \*.pl -o -name \*.pm | xargs perltidy \ --backup-and-modify-in-place --opening-brace-on-new-line \ --vertical-tightness=2 --vertical-tightness-closing=2 \ --nospace-after-keyword=for --nospace-for-semicolon \ --add-whitespace --delete-old-whitespace --paren-tightness=2 \ --keep-old-blank-lines=2 --maximum-line-length=78 \ --entab-leading-whitespace=4 --output-line-ending=unix I would lean against using --nospace-after-keyword=for. Not using it means we get wrong formatting when the for-loop conditions span multiple lines. Using it means we get wrong formatting (albeit less severe) on every for-loop. In any event, if we do use it for for-loops, we should probably use it for all control structure keywords. Otherwise, I like this. As a last idle idea, how about putting the options in a configuration file and passing --profile= as the only option? Besides keeping you from copying a 7-line shell command, this has the benefit of ignoring any ~/.perltidyrc. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
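As a sketch of that last idea: a profile file holds the same switches one per line, and the shell command shrinks to a single --profile option. The path below is hypothetical, and the option list simply mirrors Bruce's command above (including the contested --nospace-after-keyword=for):

# hypothetical src/tools/pgindent/perltidyrc; invoke as:
#   find . -name \*.pl -o -name \*.pm | xargs perltidy --profile=src/tools/pgindent/perltidyrc
--backup-and-modify-in-place
--opening-brace-on-new-line
--vertical-tightness=2
--vertical-tightness-closing=2
--nospace-after-keyword=for
--nospace-for-semicolon
--add-whitespace
--delete-old-whitespace
--paren-tightness=2
--keep-old-blank-lines=2
--maximum-line-length=78
--entab-leading-whitespace=4
--output-line-ending=unix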