Re: [HACKERS] How to pass around collation information
On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote: But in the end the only purpose of setting it on a column is to set which one will be used for operations on that column. And the user might still override it for a particular query. Of course. I'm just saying that it *can* be useful to attach a collation to a column definition, rather than only allowing it to be specified along with the sort operation. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Wed, 2010-06-02 at 16:00 -0400, Tom Lane wrote: the current situation that query grace periods go to zero Possibly a better way to handle this concern is to make the second parameter: min_standby_grace_period - the minimum time a query will be given in which to execute, even if max_standby_delay has been reached or exceeded. Would that more directly address you concerns? min_standby_grace_period (ms) SIGHUP -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE
On 3 June 2010 02:06, Bruce Momjian br...@momjian.us wrote: Also, foreign keys can't be defined that refer to a deferrable primary key. That isn't mentioned at all in the manual with regard to the DEFERRABLE clause, though it is mentioned in the FK section. You get this error message ERROR: cannot use a deferrable unique constraint for referenced table The use case for this feature looks a little narrow at present. Can we do something about usability? Not sure why that was a limitation. That's in accordance with the SQL spec. I didn't think of this case originally, but all sorts of complications would arise if we were to allow FKs to refer to deferrable PKs. For example, if there are 2 temporarily duplicated PKs, and you update one of them, what would the FK's ON UPDATE actions do? I'm not convinced there is any sensible answer to this question. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Fix leaky VIEWs for RLS
The attached patch is a proof of concept. It tries to fix the problem of leaky VIEWs for RLS. * The scenario of leaky VIEWs for RLS - When a view contains any table joins and user gives a function that takes arguments depending on only one-side table of the joins, the planner tries to distribute the qualifier into least unit of scans. It enables to minimize the scale of execution cost, but it means user given function may be invoked earlier than other qualifiers within a view, although row-level security is intended using views. If the user given function has a side-effect (E.g, it may try to insert given arguments into other tables), this behavior allows us to leak contents of invisible tuples. postgres=# SELECT * FROM v2 WHERE f_malicious(y); NOTICE: f_malicious: xxx NOTICE: f_malicious: yyy -- leaky contents NOTICE: f_malicious: zzz a | b | x | y ---+-+---+- 1 | aaa | 1 | xxx 3 | ccc | 3 | zzz (2 rows) * Discussions in -hackers - Although it is a security problem, we will face unignorable performance regression, if we disallow to distribute all user defined functions to inside of joins come from views originally. One idea is to distinguish a view into two types. The one is security oriented view, and the other is regular view. Because only creator of the view knows purpose of the view, an idea was suggested that takes a hint on CREATE VIEW, like CREATE SECURITY VIEW. (But I don't implement this feature in this patch yet.) In addition, I was pointed out it may be harmless as long as a person executes the query (not view's owner) has enough privileges on underlaying tables, even if a view is marked as security view. However, it was also pointed out the idea needs to check privileges on planner stage, not execution stage. Is it really preferable? It is not concluded yet. But inline_set_returning_function() checks ACL_EXECUTE permission on planner stage, then it makes a FuncExpr flatten prior to optimization. At least, it seems to me something violation to the dogma. Please point out any issues which I missed. * About this patch -- This patch mainly consists of two parts. The first part is at pull_up_simple_subquery(). It checks privileges of underlaying tables in the subquery to be pulled up, then it marks FromExpr-security_view as TRUE. Note that it does not distinct a subquery come from a view and a pure subquery in the original query. But it eventually causes access violation error, if current user does not have privileges on a pure subquery. So, no matter. And, also note that it does not have statement support right now. So, it assumes all the subqueries are possibly security views. The second part is at distribute_qual_to_rels(). It computes what relations does the given clause depend on at first. If the given clause depends on a part of relations to be joined inside of the security view, it expands the dependency of the clause into whole of the security view. In the result, the clause is not distributed into inside of the join loop. (*) This patch uses check_relation_privileges() to check permissions on DML execution, so please apply another my patch also on testing. * Execution result -- postgres=# CREATE TABLE t1 (a int primary key, b text); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for table t1 CREATE TABLE postgres=# CREATE TABLE t2 (x int primary key, y text); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index t2_pkey for table t2 CREATE TABLE postgres=# CREATE TABLE t3 (s int primary key, t text); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index t3_pkey for table t3 CREATE TABLE postgres=# CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x; CREATE VIEW postgres=# CREATE VIEW v2 AS SELECT * FROM v1 JOIN t3 ON v1.a = t3.s; CREATE VIEW postgres=# CREATE USER tak; CREATE ROLE postgres=# GRANT SELECT ON v1, v2, t3 TO tak; GRANT The user: tak is allowed to select v1, v2 and t3. postgres=# CREATE OR REPLACE FUNCTION f_malicious(text) RETURNS bool AS 'BEGIN RAISE NOTICE ''f_malicious: %'', $1; RETURN true; END;' LANGUAGE plpgsql; CREATE FUNCTION Because superuser can access t1 and t2 directly, f_malicious(y) shall be distributed to Seq-scan on t2. postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(y); QUERY PLAN --- Hash Join (cost=334.93..365.94 rows=410 width=72) Hash Cond: (t1.a = t2.x) - Seq Scan on t1 (cost=0.00..22.30 rows=1230 width=36) - Hash (cost=329.80..329.80 rows=410 width=36) - Seq Scan on t2 (cost=0.00..329.80 rows=410 width=36) Filter: f_malicious(y) (6 rows) But tak cannot access t1 and t2 directly, f_malicious(y) shall be distributed to outside of the join.
Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE
On Wed, 2010-06-02 at 21:06 -0400, Bruce Momjian wrote: Simon Riggs wrote: Deferrable unique constraints seem an interesting feature, though I have either some questions or some issues, not sure which. I don't seem to be able to find any way to do an ALTER TABLE that adds this new capability to an existing table. I was able to do it: test= create table test (x int unique DEFERRABLE INITIALLY DEFERRED); NOTICE: CREATE TABLE / UNIQUE will create implicit index test_x_key for table test CREATE TABLE test= alter table test add column y int; ALTER TABLE test= alter table test add unique (y) DEFERRABLE INITIALLY DEFERRED; NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index test_y_key for table test ALTER TABLE Is that what you were asking? No. I wanted to defer an existing UNIQUE constraint. That doesn't seem to be possible. You *can* add a whole new constraint and then drop the old one, though that's not quite as cool. There is no way to add a constraint via a CREATE TABLE AS SELECT, so that means there is no way to use the feature at all in that case. Uh, CREATE TABLE AS SELECT seems to be very limited, but I have not heard any complaints about it before. Also, foreign keys can't be defined that refer to a deferrable primary key. That isn't mentioned at all in the manual with regard to the DEFERRABLE clause, though it is mentioned in the FK section. You get this error message ERROR: cannot use a deferrable unique constraint for referenced table The use case for this feature looks a little narrow at present. Can we do something about usability? Not sure why that was a limitation. Regrettably it makes it an unusable limitation for many people. All large tables are referenced in a typical database that uses PKs/FKs. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I don't understand why you want to use a different delay when you're restoring from archive vs. when you're streaming (what about existing WAL files found in pg_xlog, BTW?). The source of WAL shouldn't make a difference. Yes. The pace of a recovery has nothing to do with that of log shipping. So to hurry up a recovery when restoring from archive seems to be useless. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, 2010-06-03 at 17:56 +0900, Fujii Masao wrote: On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I don't understand why you want to use a different delay when you're restoring from archive vs. when you're streaming (what about existing WAL files found in pg_xlog, BTW?). The source of WAL shouldn't make a difference. Yes. The pace of a recovery has nothing to do with that of log shipping. So to hurry up a recovery when restoring from archive seems to be useless. When streaming drops for some reason we revert to scanning the archive for files. There is clearly two modes of operation. So it makes sense that you might want to set different times for the parameter in each case. We might think of those modes as connected and degraded modes rather than streaming and file shipping. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] caught_up status in walsender
On Thu, Jun 3, 2010 at 4:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 02/06/10 21:44, Tom Lane wrote: In the current coding, the effect of not setting *caughtup here is just that we uselessly call XLogSend an extra time for each transmission (because the main loop won't ever delay immediately after a transmission). But without this, we'd never send caughtup = true to the slave. That's intentional. It could take some time for the WAL to be sent, if the network is busy, so by the time XLogSend returns you might well not be caught up anymore. It may have been intentional, but it's still wrong. If you were able to pull all of WAL into the record-to-be-sent, you should sleep afterwards, not send an extra record containing a few more bytes. For reducing the workload of walsender? This seems OK in 9.0 since only asynchronous replication is supported. But when we'll implement synchronous replication in the future, we might have to revert that change. Since a transaction commit might wait for such an extra record to be replicated, walsender should aggressively send all sendable WAL. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2010-06-03 at 17:56 +0900, Fujii Masao wrote: On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I don't understand why you want to use a different delay when you're restoring from archive vs. when you're streaming (what about existing WAL files found in pg_xlog, BTW?). The source of WAL shouldn't make a difference. Yes. The pace of a recovery has nothing to do with that of log shipping. So to hurry up a recovery when restoring from archive seems to be useless. When streaming drops for some reason we revert to scanning the archive for files. There is clearly two modes of operation. Yes. So it makes sense that you might want to set different times for the parameter in each case. What purpose would that serve? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
So I think the scheme in the original post of this thread is workable. Not as described but could be made to work. In which case I think it's preferable to a freeze map -- which I had previously assumed we would need eventually. The problem with the scheme originally described is that it assumed you could have an cycle counter and then arrange that all the xids on the page are within that cycle. That doesn't work because you could easily have two live xids on the page that belong to two cycles -- one FirstNormalTransactionId and one MaxTransactionId. I think to make it work you need to store a whole 64-bit reference transaction id consisting of both a cycle counter and a transaction id. The invariant for the page is that every xid on the page can be compared to that reference transaction id using normal transactionid semantics. Actually I think the easiest way to do that is to set it to the oldest xid on the page. The first thing to do before comparing any transaction id on the page with a real transaction id would be to figure out whether the reference xid is comparable to the live xid, which if it's the oldest xid on the page implies they'll all be comparable. The way to maintain that invariant would be that any xid insertion on the page must advance the reference xid if it's not comparable to the newly inserted xid. It has to be advanced to the oldest xid that's still comparable with the newly inserted xid. Any xids on the page that are older than the new refernce xid have to be frozen or removed. I'm not sure how to do that without keeping clog forever though. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, 2010-06-03 at 18:39 +0900, Fujii Masao wrote: What purpose would that serve? Tom has already explained this and it makes sense for me. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] rfc: changing documentation about xpath
Hi! I'm to face a problem, and not at once resolve it. When I do: -- // -- test=# SELECT xpath('//domain:name/text()', $$?xml version='1.0' encoding='UTF-8'? epp xmlns='urn:ietf:params:xml:ns:epp-1.0' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd' response result code='1301' msg lang='en-US'Command completed successfully; ack to dequeue/msg /result msgQ count='3' id='114004' qDate2010-06-02T12:35:33.0Z/qDate msg lang='en-US'Transfer Requested./msg /msgQ resData domain:trnData xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 domain-1.0.xsd' domain:namexxx.xx/domain:name domain:trStatuspending/domain:trStatus domain:reIDadmin/domain:reID domain:reDate2010-06-02T12:35:33.0Z/domain:reDate domain:acIDxxx-xx/domain:acID domain:acDate2010-06-07T12:35:33.0Z/domain:acDate domain:exDate2014-05-11T12:52:07.0Z/domain:exDate /domain:trnData /resData trID svTRIDxx-xxx/svTRID /trID /response /epp$$, array[array['domain','urn:ietf:params:xml:ns:domain-1.0']]); xpath - {xxx.xx} (1 row) test=# -- // -- worked... and: -- // -- test=# SELECT xpath('//qDate/text()', $$?xml version='1.0' encoding='UTF-8'? epp xmlns='urn:ietf:params:xml:ns:epp-1.0' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd' response result code='1301' msg lang='en-US'Command completed successfully; ack to dequeue/msg /result msgQ count='3' id='114004' qDate2010-06-02T12:35:33.0Z/qDate msg lang='en-US'Transfer Requested./msg /msgQ resData domain:trnData xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 domain-1.0.xsd' domain:namexxx.xx/domain:name domain:trStatuspending/domain:trStatus domain:reIDadmin/domain:reID domain:reDate2010-06-02T12:35:33.0Z/domain:reDate domain:acIDxxx-xx/domain:acID domain:acDate2010-06-07T12:35:33.0Z/domain:acDate domain:exDate2014-05-11T12:52:07.0Z/domain:exDate /domain:trnData /resData trID svTRIDxx-xxx/svTRID /trID /response /epp$$); xpath --- {} (1 row) test=# -- // -- not worked. I think that the nuance about default namespaces should add to xpath's doc. This select: -- // -- SELECT xpath('//global:qDate/text()', $$?xml version='1.0' encoding='UTF-8'? epp xmlns='urn:ietf:params:xml:ns:epp-1.0' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd' response result code='1301' msg lang='en-US'Command completed successfully; ack to dequeue/msg /result msgQ count='3' id='114004' qDate2010-06-02T12:35:33.0Z/qDate msg lang='en-US'Transfer Requested./msg /msgQ resData domain:trnData xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 domain-1.0.xsd' domain:namexxx.xx/domain:name domain:trStatuspending/domain:trStatus domain:reIDadmin/domain:reID domain:reDate2010-06-02T12:35:33.0Z/domain:reDate domain:acIDxxx-xx/domain:acID domain:acDate2010-06-07T12:35:33.0Z/domain:acDate domain:exDate2014-05-11T12:52:07.0Z/domain:exDate /domain:trnData /resData trID svTRIDxx-xxx/svTRID /trID /response /epp$$, array[array['global', 'urn:ietf:params:xml:ns:epp-1.0']]); xpath -- {2010-06-02T12:35:33.0Z} (1 row) -- // -- work. Thanks to Ёж on www.sql.ru and pgsql-ru-general ... :) -- Best regards, Denis I. Polukarov developer Garant-Park-Telekom http://www.gpt.ru -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rfc: changing documentation about xpath
Denis I. Polukarov wrote: Hi! I'm to face a problem, and not at once resolve it. [default namespace mapped in xml xmlns= attribute requires corresponding mapping in third param of xpath()] It's a tolerably subtle point, and I'm not sure it's really PostgreSQL-specific. But if you think the docs need improvement, then please suggest a patch with the extra wording you think would make things clearer. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 5:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: I stand by my suggestion from yesterday: Let's define max_standby_delay as the difference between a piece of WAL becoming available in the standby, and applying it. My proposal is essentially the same as yours plus allowing the DBA to choose different max delays for the caught-up and not-caught-up cases. Maybe everybody will end up setting the two delays the same, but I think we don't have enough experience to decide that for them now. Applying WAL that arrives via SR is not always the sign of the caught-up or not-caught-up. OTOH, applying WAL restored from archive is not always the sign of either of them. So isn't it nonsense to separate the delay in order to control the behavior of a recovery for those cases? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Simon Riggs wrote: On Wed, 2010-06-02 at 20:28 -0400, Bruce Momjian wrote: Simon Riggs wrote: On Wed, 2010-06-02 at 15:20 -0400, Bruce Momjian wrote: The attached patch allows wal_keep_segments = -1 to keep all segements; this is particularly useful for taking a base backup, where you need all the WAL files during startup of the standby. I have documented this usage in the patch as well. I am thinking of applying this after 9.0 beta2 if there is no objection. It's not clear to me why keep all files until server breaks is a good setting. Surely you would set this parameter to the size of your disk. Why allow it to go higher? Well, the -1 allows them to set it temporarily without having to compute their free disk space. Frankly, because the disk space varies, it is impossible to know exactly how large the disk is at the time it would fill up. I think the normal computation would be: 1) How long is my file system backup and restore to standby going to take 2) How often do I generate a 16MB WAL file You would do some computation to figure that out, then maybe multiply it by 10x and set that for wal_keep_segments. I figured allowing a simple -1 would be easier. I think its much easier to find out your free disk space than it is to calculate how much WAL might be generated during backup. Disk space doesn't vary significantly on a production database. If we encourage that laziness then we will get reports that replication doesn't work and Postgres crashes. Well, we don't clean out the archive directory so I don't see this as anything new. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to pass around collation information
On Jun 3, 2010, at 2:43 AM, Peter Eisentraut wrote: On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote: But in the end the only purpose of setting it on a column is to set which one will be used for operations on that column. And the user might still override it for a particular query. Of course. I'm just saying that it *can* be useful to attach a collation to a column definition, rather than only allowing it to be specified along with the sort operation. How does collation relate to per-table/column encodings? For that matter, are per-table/column encodings spec, and/or something that we're looking to implement down the line? Regards, David -- David Christensen End Point Corporation da...@endpoint.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
On 03/06/10 15:15, Bruce Momjian wrote: Simon Riggs wrote: I think its much easier to find out your free disk space than it is to calculate how much WAL might be generated during backup. Disk space doesn't vary significantly on a production database. If we encourage that laziness then we will get reports that replication doesn't work and Postgres crashes. Well, we don't clean out the archive directory so I don't see this as anything new. We leave that up to the DBA to clean out one way or another. We provide restartpoint_command and the %r option in restore_command to help with that. Surely we don't expect DBAs to delete old files in pg_xlog? I agree with Simon here, I think it would be better to not provide -1 as an option here. At least you better document well that you should only do that temporarily or you will eventually run out of disk space. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Heikki Linnakangas wrote: On 03/06/10 15:15, Bruce Momjian wrote: Simon Riggs wrote: I think its much easier to find out your free disk space than it is to calculate how much WAL might be generated during backup. Disk space doesn't vary significantly on a production database. If we encourage that laziness then we will get reports that replication doesn't work and Postgres crashes. Well, we don't clean out the archive directory so I don't see this as anything new. We leave that up to the DBA to clean out one way or another. We provide restartpoint_command and the %r option in restore_command to help with that. Surely we don't expect DBAs to delete old files in pg_xlog? I agree with Simon here, I think it would be better to not provide -1 as an option here. At least you better document well that you should only do that temporarily or you will eventually run out of disk space. Using this only temporarily is mentioned in the doc patch. Do I need more? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.280 diff -c -c -r1.280 config.sgml *** doc/src/sgml/config.sgml 31 May 2010 15:50:48 - 1.280 --- doc/src/sgml/config.sgml 2 Jun 2010 19:19:18 - *** *** 1887,1893 Specifies the number of past log file segments kept in the filenamepg_xlog/ directory, in case a standby server needs to fetch them for streaming ! replication. Each segment is normally 16 megabytes. If a standby server connected to the primary falls behind by more than varnamewal_keep_segments/ segments, the primary might remove a WAL segment still needed by the standby, in which case the --- 1887,1893 Specifies the number of past log file segments kept in the filenamepg_xlog/ directory, in case a standby server needs to fetch them for streaming ! replication. Each segment is normally 16 megabytes. If a standby server connected to the primary falls behind by more than varnamewal_keep_segments/ segments, the primary might remove a WAL segment still needed by the standby, in which case the *** *** 1901,1908 is zero (the default), the system doesn't keep any extra segments for standby purposes, and the number of old WAL segments available for standbys is determined based only on the location of the previous ! checkpoint and status of WAL archiving. ! This parameter can only be set in the filenamepostgresql.conf/ file or on the server command line. /para /listitem --- 1901,1909 is zero (the default), the system doesn't keep any extra segments for standby purposes, and the number of old WAL segments available for standbys is determined based only on the location of the previous ! checkpoint and status of WAL archiving. If literal-1/ is ! specified, log file segments are kept indefinitely. This ! parameter can only be set in the filenamepostgresql.conf/ file or on the server command line. /para /listitem Index: doc/src/sgml/high-availability.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/high-availability.sgml,v retrieving revision 1.70 diff -c -c -r1.70 high-availability.sgml *** doc/src/sgml/high-availability.sgml 29 May 2010 09:01:10 - 1.70 --- doc/src/sgml/high-availability.sgml 2 Jun 2010 19:19:19 - *** *** 750,756 If you use streaming replication without file-based continuous archiving, you have to set varnamewal_keep_segments/ in the master to a value high enough to ensure that old WAL segments are not recycled ! too early, while the standby might still need them to catch up. If the standby falls behind too much, it needs to be reinitialized from a new base backup. If you set up a WAL archive that's accessible from the standby, wal_keep_segments is not required as the standby can always --- 750,760 If you use streaming replication without file-based continuous archiving, you have to set varnamewal_keep_segments/ in the master to a value high enough to ensure that old WAL segments are not recycled ! too early, while the standby might still need them to catch up. This ! is particularly important when performing a base backup because the ! standby will need all WAL segments generated since the start of the ! backup; consider setting varnamewal_keep_segments/ to ! literal-1/ temporarily in such cases. If the
[HACKERS] PITR Recovery Question
Hi, My production server is running PostgreSQL v8.2.3 on CentOS release 5.2 (Final). I've setup PITR in my production server. For some reason, after setting up PITR, we're not able to manage and maintain it. Because of this our WAL archive drive become full (100% use) approximately after 1 month. PITR SETUP DETAILS We've 2 drives. Primary drive (pgsql/data/ directory resides) is 400 GB and secondary drive (WAL archive) is 30 GB. All WAL archives are written to secondary drive. Base backup taken on: Aug03, 2009 WAL archive drive become full (100% use) on: Sep05, 2009 Because this WAL archive drive has become full, all WAL archive segments to be archived are accumulated into pg_xlog/ directory itself. Eventually, 9 months (as of today from Sep05, 2009) of WAL archives are residing in pg_xlog/ directory. My question is, in case if I would like to perform recovery process as it is in this situation, will this work out? That is, I'm seeing/finding out whether recovery process would perform successfully anywhere between the date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009. Reason I'm asking this is still all my WAL archives are residing in pg_xlog/ directory. Experts advice/idea/suggestion on this appreciated. Regards, Gnanam -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] caught_up status in walsender
Fujii Masao masao.fu...@gmail.com writes: On Thu, Jun 3, 2010 at 4:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: That's intentional. It could take some time for the WAL to be sent, if the network is busy, so by the time XLogSend returns you might well not be caught up anymore. It may have been intentional, but it's still wrong. If you were able to pull all of WAL into the record-to-be-sent, you should sleep afterwards, not send an extra record containing a few more bytes. For reducing the workload of walsender? This seems OK in 9.0 since only asynchronous replication is supported. But when we'll implement synchronous replication in the future, we might have to revert that change. Since a transaction commit might wait for such an extra record to be replicated, walsender should aggressively send all sendable WAL. It *is* aggressively sending all sendable WAL. The ideal steady state behavior of this loop ought to be that once per sleep interval, we send out one record containing all new WAL since the last time. We do not want it sending 1 bytes, then another record with 100 bytes, then another record with 10 bytes, etc etc. That's inefficient and ultimately pointless. You'll always be behind again a millisecond later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Heikki Linnakangas wrote: On 03/06/10 15:15, Bruce Momjian wrote: Simon Riggs wrote: I think its much easier to find out your free disk space than it is to calculate how much WAL might be generated during backup. Disk space doesn't vary significantly on a production database. If we encourage that laziness then we will get reports that replication doesn't work and Postgres crashes. Well, we don't clean out the archive directory so I don't see this as anything new. We leave that up to the DBA to clean out one way or another. We provide restartpoint_command and the %r option in restore_command to help with that. Surely we don't expect DBAs to delete old files in pg_xlog? I agree with Simon here, I think it would be better to not provide -1 as an option here. At least you better document well that you should only do that temporarily or you will eventually run out of disk space. I have updated the doc text to mention temporarily everywhere '-1' is mentioned. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.280 diff -c -c -r1.280 config.sgml *** doc/src/sgml/config.sgml31 May 2010 15:50:48 - 1.280 --- doc/src/sgml/config.sgml3 Jun 2010 14:05:21 - *** *** 1901,1908 is zero (the default), the system doesn't keep any extra segments for standby purposes, and the number of old WAL segments available for standbys is determined based only on the location of the previous ! checkpoint and status of WAL archiving. ! This parameter can only be set in the filenamepostgresql.conf/ file or on the server command line. /para /listitem --- 1901,1909 is zero (the default), the system doesn't keep any extra segments for standby purposes, and the number of old WAL segments available for standbys is determined based only on the location of the previous ! checkpoint and status of WAL archiving. To temporarily keep ! all log file segments, use the value literal-1/. This ! parameter can only be set in the filenamepostgresql.conf/ file or on the server command line. /para /listitem Index: doc/src/sgml/high-availability.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/high-availability.sgml,v retrieving revision 1.70 diff -c -c -r1.70 high-availability.sgml *** doc/src/sgml/high-availability.sgml 29 May 2010 09:01:10 - 1.70 --- doc/src/sgml/high-availability.sgml 3 Jun 2010 14:05:21 - *** *** 750,756 If you use streaming replication without file-based continuous archiving, you have to set varnamewal_keep_segments/ in the master to a value high enough to ensure that old WAL segments are not recycled ! too early, while the standby might still need them to catch up. If the standby falls behind too much, it needs to be reinitialized from a new base backup. If you set up a WAL archive that's accessible from the standby, wal_keep_segments is not required as the standby can always --- 750,760 If you use streaming replication without file-based continuous archiving, you have to set varnamewal_keep_segments/ in the master to a value high enough to ensure that old WAL segments are not recycled ! too early, while the standby might still need them to catch up. This ! is particularly important when performing a base backup because the ! standby will need all WAL segments generated since the start of the ! backup; consider setting varnamewal_keep_segments/ to ! literal-1/ temporarily in such cases. If the standby falls behind too much, it needs to be reinitialized from a new base backup. If you set up a WAL archive that's accessible from the standby, wal_keep_segments is not required as the standby can always Index: src/backend/access/transam/xlog.c === RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.415 diff -c -c -r1.415 xlog.c *** src/backend/access/transam/xlog.c 2 Jun 2010 09:28:44 - 1.415 --- src/backend/access/transam/xlog.c 3 Jun 2010 14:05:22 - *** *** 7337,7343 * Delete old log files (those no longer needed even for previous * checkpoint or the standbys in XLOG streaming). */ ! if (_logId || _logSeg) { /* * Calculate the last segment that we need to retain because of --- 7337,7343 * Delete old log files
Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE
Simon Riggs wrote: On Wed, 2010-06-02 at 21:06 -0400, Bruce Momjian wrote: Simon Riggs wrote: Deferrable unique constraints seem an interesting feature, though I have either some questions or some issues, not sure which. I don't seem to be able to find any way to do an ALTER TABLE that adds this new capability to an existing table. I was able to do it: test= create table test (x int unique DEFERRABLE INITIALLY DEFERRED); NOTICE: CREATE TABLE / UNIQUE will create implicit index test_x_key for table test CREATE TABLE test= alter table test add column y int; ALTER TABLE test= alter table test add unique (y) DEFERRABLE INITIALLY DEFERRED; NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index test_y_key for table test ALTER TABLE Is that what you were asking? No. I wanted to defer an existing UNIQUE constraint. That doesn't seem to be possible. You *can* add a whole new constraint and then drop the old one, though that's not quite as cool. Do we allow any kind of constraint modification via ALTER TABLE? I don't see much listed in the manual. There is no way to add a constraint via a CREATE TABLE AS SELECT, so that means there is no way to use the feature at all in that case. Uh, CREATE TABLE AS SELECT seems to be very limited, but I have not heard any complaints about it before. Also, foreign keys can't be defined that refer to a deferrable primary key. That isn't mentioned at all in the manual with regard to the DEFERRABLE clause, though it is mentioned in the FK section. You get this error message ERROR: cannot use a deferrable unique constraint for referenced table The use case for this feature looks a little narrow at present. Can we do something about usability? Not sure why that was a limitation. Regrettably it makes it an unusable limitation for many people. All large tables are referenced in a typical database that uses PKs/FKs. Yeah, no question. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Bruce Momjian br...@momjian.us writes: Heikki Linnakangas wrote: Surely we don't expect DBAs to delete old files in pg_xlog? I agree with Simon here, I think it would be better to not provide -1 as an option here. At least you better document well that you should only do that temporarily or you will eventually run out of disk space. I have updated the doc text to mention temporarily everywhere '-1' is mentioned. FWIW, I've come to agree with Simon. Allowing -1 doesn't do anything that you can't do with a large positive setting, and what it does do is to encourage people to set the variable to an unsafe value as a substitute for thinking. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to pass around collation information
On tor, 2010-06-03 at 07:30 -0500, David Christensen wrote: How does collation relate to per-table/column encodings? There is some connection between collations and character sets or encodings, because a collation is tied to one of those, just as a necessity of implementation (depending on implementation details). You could have per-column (or per-some-other-small-unit) collation with a global encoding, but a per-column encoding with a global collation wouldn't work. For that matter, are per-table/column encodings spec, Yes. and/or something that we're looking to implement down the line? I don't think anyone is seriously planning that. But per-column collations would have to come first, anyway. Of course there is always the possibility that someone comes up with an alternative plan that invalidates the above, but the above represents the facts from the SQL standard and other implementations. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Did we really want to force an initdb in beta2?
Because that's the consequences of fooling with pg_control. I committed the PG_CONTROL_VERSION bump that was missing from the patch Robert committed last night, but I wonder whether we shouldn't revert the whole thing instead. It's not apparent to me that what it bought is worth forcing beta testers to initdb. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark gsst...@mit.edu writes: I was assuming the walreceiver only requests more wal in relatively small chunks and only when replay has caught up and needs more data. I haven't actually read this code so if that assumption is wrong then I'm off-base. It is off-base. The receiver does not request data, the sender is what determines how much WAL is sent when. Hm, so what happens if the slave blocks, doesn't the sender block when the kernel buffers fill up? So I think this isn't necessarily such a blue moon event. As I understand it, all it would take is a single long-running report and a vacuum or HOT cleanup occurring on the master. I think this is mostly FUD too. How often do you see vacuum blocked for an hour now? No, that's not comparable. On the master vacuum will just ignore tuples that are still visible to live transactions. On the slave it doesn't have a choice, it sees the cleanup record and must pause recovery until those transactions finish. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Peter Eisentraut wrote: On m?n, 2010-05-31 at 18:23 -0400, Tom Lane wrote: My feeling is that (a) there is no hurry to do anything about an unreleased draft of the standard, and (b) perhaps Peter could lobby the committee to change the standard before it does get published. Given that Oracle and DB2 already support that syntax in released products, and I'm not even a member of any relevant body, that seems pretty much impossible. With beta2 being wrapped today, we are going to be releasing ':=' as our method for function parameter assignment, but also with the likely outcome that we are going to need to support '=' very soon. Are we sure we want hstore compatibility to drive this decision? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PITR Recovery Question
Hi, I'll try to answer your questions below, but in the future please post questions concerning the usage and administration of postgres to pgsql-general or pgsql-admin. This list focus is the development of new features and bugfixes. On Jun 3, 2010, at 15:37 , Gnanakumar wrote: PITR SETUP DETAILS We've 2 drives. Primary drive (pgsql/data/ directory resides) is 400 GB and secondary drive (WAL archive) is 30 GB. All WAL archives are written to secondary drive. Base backup taken on: Aug03, 2009 WAL archive drive become full (100% use) on: Sep05, 2009 Because this WAL archive drive has become full, all WAL archive segments to be archived are accumulated into pg_xlog/ directory itself. Eventually, 9 months (as of today from Sep05, 2009) of WAL archives are residing in pg_xlog/ directory. This is by design. WAL logs are only removed from pg_xlog once they have been archived successfully. Since your archive_command fails due to the disk being full, they remain in pg_xlog. Once you enlarge the filesystem holding the WAL archive they should be copied and subsequently removed from pg_xlog. Note that you'd usually take a new base backup once in a while to limit the number of WAL segments you need to retain. You can take a base backup while postgres is running by issuing pg_start_backup() before you start the copy and pg_stop_backup() after it finished. Apart from creating additional IO load, doing so won't interfere with normal query execution in any way. My question is, in case if I would like to perform recovery process as it is in this situation, will this work out? That is, I'm seeing/finding out whether recovery process would perform successfully anywhere between the date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009. Reason I'm asking this is still all my WAL archives are residing in pg_xlog/ directory. For PITR, you'll obviously need the WAL segment starting from the time your base backup started up until the point you want to recover to. If some of those WAL segments still reside in pg_xlog, you'll either need to teach your restore_command to fetch them from there. Note that you cannot recover in reverse. To recover up to a certain point in time you always need to start from a base backup taken *before* that time. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
On 03/06/10 17:54, Tom Lane wrote: Because that's the consequences of fooling with pg_control. I committed the PG_CONTROL_VERSION bump that was missing from the patch Robert committed last night, but I wonder whether we shouldn't revert the whole thing instead. It's not apparent to me that what it bought is worth forcing beta testers to initdb. Hmph, good point, I did not think of that at all when I reviewed the patch. If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, we would stay backwards-compatible. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] caught_up status in walsender
I wrote: On further contemplation, it seems like the protocol needs another field besides that: each record should also carry a boolean indicating whether walsender.c thinks it is currently caught up, ie the record carries all WAL data up to the current end of WAL. Actually, there's a better way to do that: let's have the record carry not just a boolean but the actual current end-of-WAL LSN. The receiver could then not just determine am I behind but find out *how far* behind it is, and thereby perhaps adjust its behavior in more subtle ways than just a binary on/off fashion. (Actually doing anything like that is material for future work, of course, but I think we should try to get the SR protocol right now.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Bruce Momjian br...@momjian.us writes: Are we sure we want hstore compatibility to drive this decision? hstore is what it is, and has been that way for a long time. We can't just ignore it. And I don't think breaking it (and probably other code) on zero notice is an acceptable outcome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
On Thu, Jun 3, 2010 at 11:25 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 03/06/10 17:54, Tom Lane wrote: Because that's the consequences of fooling with pg_control. I committed the PG_CONTROL_VERSION bump that was missing from the patch Robert committed last night, but I wonder whether we shouldn't revert the whole thing instead. It's not apparent to me that what it bought is worth forcing beta testers to initdb. Hmph, good point, I did not think of that at all when I reviewed the patch. If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, we would stay backwards-compatible. Ugh, sorry about that. I didn't realize this either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Greg Stark gsst...@mit.edu writes: Well, if the slave can't keep up, that's a separate problem. It will not fail to keep up as a result of the transmission mechanism. No, I mean if the slave is paused due to a conflict. Does it stop reading data from the master or does it buffer it up on disk? If it stops reading it from the master then the effect is the same as if the slave stopped requesting data even if there's no actual request. The data keeps coming in and getting dumped into the slave's pg_xlog. walsender/walreceiver are not at all tied to the slave's application of WAL. In principle we could have the code around max_standby_delay notice just how far behind it's gotten and adjust the delay tolerance based on that; but I think designing a feedback loop for that is 9.1 material. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Are we sure we want hstore compatibility to drive this decision? hstore is what it is, and has been that way for a long time. We can't just ignore it. And I don't think breaking it (and probably other code) on zero notice is an acceptable outcome. Well, it seems we are going to be stuck supporting = because it is hard to argue that the SQL standards committee should adopt := instead of = because of hstore. ;-) I hate eventually having two documented ways of doing something, but it appears by releasing := we are doing exactly that. Is telling hstore users they have to change = to something else such a large major version incompatibility that it is worth supporting and documenting two syntaxes for parameter assignment? It is that calculus that has me questioning our approach. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 4:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: The data keeps coming in and getting dumped into the slave's pg_xlog. walsender/walreceiver are not at all tied to the slave's application of WAL. In principle we could have the code around max_standby_delay notice just how far behind it's gotten and adjust the delay tolerance based on that; but I think designing a feedback loop for that is 9.1 material. Er, no. In that case my first concern was misguided. I'm happy there's no feedback loop -- my fear was that there was and it would mean the time received could be decoupled from the time the wal was generated. But as you describe it then the time received might be slightly delayed from the time the wal was generated but to some constant degree -- not in a way that will be influenced by the log application being blocked on the slave. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
On Thu, Jun 3, 2010 at 11:34 AM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Are we sure we want hstore compatibility to drive this decision? hstore is what it is, and has been that way for a long time. We can't just ignore it. And I don't think breaking it (and probably other code) on zero notice is an acceptable outcome. Well, it seems we are going to be stuck supporting = because it is hard to argue that the SQL standards committee should adopt := instead of = because of hstore. ;-) I hate eventually having two documented ways of doing something, but it appears by releasing := we are doing exactly that. Is telling hstore users they have to change = to something else such a large major version incompatibility that it is worth supporting and documenting two syntaxes for parameter assignment? It is that calculus that has me questioning our approach. I don't mind supporting := and = as much as I mind supporting only :=, and I think that's the other reasonable alternative. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Are we sure we want hstore compatibility to drive this decision? hstore is what it is, and has been that way for a long time. We can't just ignore it. And I don't think breaking it (and probably other code) on zero notice is an acceptable outcome. Well, it seems we are going to be stuck supporting = because it is hard to argue that the SQL standards committee should adopt := instead of = because of hstore. ;-) I hate eventually having two documented ways of doing something, but it appears by releasing := we are doing exactly that. Is telling hstore users they have to change = to something else such a large major version incompatibility that it is worth supporting and documenting two syntaxes for parameter assignment? It is that calculus that has me questioning our approach. Thinking some more, what is the value of keeping = in hstore for 9.0? Perhaps we could create a script they could run on 8.4 that would add support for the new hstore operator to replace =, and then they can upgrade to 9.0 when they are ready. I see only three mentions of = in hstore.sql. Do we really want to keep the := baggage forever just for hstore? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to pass around collation information
On Thu, Jun 3, 2010 at 3:43 AM, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote: But in the end the only purpose of setting it on a column is to set which one will be used for operations on that column. And the user might still override it for a particular query. Of course. I'm just saying that it *can* be useful to attach a collation to a column definition, rather than only allowing it to be specified along with the sort operation. Oh, I agree with that, for sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Bruce Momjian br...@momjian.us writes: Is telling hstore users they have to change = to something else such a large major version incompatibility that it is worth supporting and documenting two syntaxes for parameter assignment? It is that calculus that has me questioning our approach. Well it's not only hstore. Anyone can CREATE OPERATOR = on any released version of PostgreSQL currently. I don't think we can deprecate it on short notice, so we'll have := in 9.0, and maybe a deprecation notice for =. Now that it's pretty clear from Peter that the standard is not going to change its choice here, I'll vote adding a WARNING each time an operator called = is created, so that we get a chance to move later on. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Bruce Momjian br...@momjian.us writes: Thinking some more, what is the value of keeping = in hstore for 9.0? Backwards compatibility. You have not made any argument today that we have not been over many times before. I do not have time to argue about this today --- I have to go fix max_standby_delay. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
On Jun 3, 2010, at 8:54 AM, Tom Lane wrote: Thinking some more, what is the value of keeping = in hstore for 9.0? Backwards compatibility. You have not made any argument today that we have not been over many times before. I do not have time to argue about this today --- I have to go fix max_standby_delay. I'm sure there's a lot of code using = in the wild. We can't just drop them without a deprecation cycle. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
On Jun 3, 2010, at 8:53 AM, Dimitri Fontaine wrote: Now that it's pretty clear from Peter that the standard is not going to change its choice here, I'll vote adding a WARNING each time an operator called = is created, so that we get a chance to move later on. Should support for == be added to hstore for 9.0? So both = and == will work? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Thinking some more, what is the value of keeping = in hstore for 9.0? Backwards compatibility. You have not made any argument today that we have not been over many times before. I do not have time to argue about this today --- I have to go fix max_standby_delay. Agreed. I am just making sure we are going in the right direction. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
David E. Wheeler wrote: On Jun 3, 2010, at 8:53 AM, Dimitri Fontaine wrote: Now that it's pretty clear from Peter that the standard is not going to change its choice here, I'll vote adding a WARNING each time an operator called = is created, so that we get a chance to move later on. Should support for == be added to hstore for 9.0? So both = and == will work? I have added the idea to the 9.0 open items wiki: http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items#Code -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, we would stay backwards-compatible. I don't think that's a terribly workable idea; the enum is laid out so that inequality tests are sensible, and I'm not sure there aren't any. The code would look mighty ugly in any case. What exactly was the reason for this patch? Could it be held over till 9.1? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 9.0 release notes
The 9.0 release notes have been updated to current. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
On 03/06/10 19:16, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, we would stay backwards-compatible. I don't think that's a terribly workable idea; the enum is laid out so that inequality tests are sensible, and I'm not sure there aren't any. Hmm, the only inequality tests on that field I can see check that the value is valid, i.e between the first and last valid value. The code would look mighty ugly in any case. True. One more hacky idea: Keep the code as it is and change pg_control version back to what it was in beta1. Add a note in the release notes that if you're upgrading from beta1, you must shut down the database cleanly first. When you do that, control file is in DB_SHUTDOWNED state, and the enum value for that did not change. One caveat is that a standby server will be DB_IN_ARCHIVE_RECOVERY, which did change value so that with beta2 binaries it will look like DB_IN_CRASH_RECOVERY. I think that would still work, though (and if not, in the worst case you'll just have to reinitialize the standby from a new base backup). What exactly was the reason for this patch? Could it be held over till 9.1? Before the patch, when you shut down a standby server, you get this message in the log on the next startup: LOG: database system was interrupted while in recovery at log time 2010-06-02 14:48:28 EEST HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. The problem is that that hint is pretty alarming. The data should be fine if the standby server was shut down cleanly with pg_ctl stop -m fast/smart. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Simon Riggs si...@2ndquadrant.com writes: On Wed, 2010-06-02 at 13:14 -0400, Tom Lane wrote: This patch seems to me to be going in fundamentally the wrong direction. It's adding complexity and overhead (far more than is needed), and it's failing utterly to resolve the objections that I raised to start with. Having read your proposal, it seems changing from time-on-sender to time-on-receiver is a one line change to the patch. What else are you thinking of removing, if anything? Basically, we need to get rid of everything that feeds timestamps from the WAL content into the kill-delay logic. In particular, Simon seems to be basically refusing to do anything about the complaint that the code fails unless master and standby clocks are in close sync. I do not believe that this is acceptable, and since he won't fix it, I guess I'll have to. Syncing two servers in replication is common practice, as has been explained here; I'm still surprised people think otherwise. Doesn't affect the complaint in the least: I do not find it acceptable to have that be *mandated* in order for our code to work sensibly. I would be OK with having something approaching what you want as a non-default optional behavior (with a clearly-documented dependency on having synchronized clocks). But in any case the current behavior is still quite broken as regards reading stale timestamps from WAL. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Simon Riggs si...@2ndquadrant.com writes: On Wed, 2010-06-02 at 16:00 -0400, Tom Lane wrote: the current situation that query grace periods go to zero Possibly a better way to handle this concern is to make the second parameter: min_standby_grace_period - the minimum time a query will be given in which to execute, even if max_standby_delay has been reached or exceeded. Would that more directly address you concerns? min_standby_grace_period (ms) SIGHUP A minimum grace period seems like a good idea to me, but I think it's somewhat orthogonal to the core problem here. I think we all intuitively feel that there should be a way to dial back the grace period when a slave is far behind on applying WAL. The problem is first how to figure out what far behind means, and second how to adjust the grace period in a way that doesn't have surprising misbehaviors. A minimum grace period would prevent some of the very worst misbehaviors but it's not really addressing the core problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 03/06/10 19:16, Tom Lane wrote: What exactly was the reason for this patch? Could it be held over till 9.1? Before the patch, when you shut down a standby server, you get this message in the log on the next startup: LOG: database system was interrupted while in recovery at log time 2010-06-02 14:48:28 EEST HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. The problem is that that hint is pretty alarming. Maybe we should just get rid of the hint. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Wed, 2010-06-02 at 13:14 -0400, Tom Lane wrote: This patch seems to me to be going in fundamentally the wrong direction. It's adding complexity and overhead (far more than is needed), and it's failing utterly to resolve the objections that I raised to start with. Having read your proposal, it seems changing from time-on-sender to time-on-receiver is a one line change to the patch. What else are you thinking of removing, if anything? Basically, we need to get rid of everything that feeds timestamps from the WAL content into the kill-delay logic. I understand your wish to do this, though it isn't always accurate to do that in the case where there is a backlog. The patch already does that *mostly* for the streaming case. The only time it does use the WAL content timestamp is when the WAL content timestamp is later than the oldest receive time, so in that case it is used as a correction. (see code comments and comments below also) In particular, Simon seems to be basically refusing to do anything about the complaint that the code fails unless master and standby clocks are in close sync. I do not believe that this is acceptable, and since he won't fix it, I guess I'll have to. Syncing two servers in replication is common practice, as has been explained here; I'm still surprised people think otherwise. Doesn't affect the complaint in the least: I do not find it acceptable to have that be *mandated* in order for our code to work sensibly. OK, accepted. I would be OK with having something approaching what you want as a non-default optional behavior (with a clearly-documented dependency on having synchronized clocks). Yes, that's what I'd been thinking also. So that gives us a way forwards. standby_delay_base = apply (default) | send determines whether the standby_delay is calculated solely with reference to the standby server (apply) or whether times from the sending server are used (send). Use of send implies that the clocks on primary and standby are synchronised to within a useful accuracy, in which case it is usual to enforce that with time synchronisation software such as ntp. But in any case the current behavior is still quite broken as regards reading stale timestamps from WAL. Agreed. That wasn't the objective of this patch or a priority. If you're reading from an archive, you need to set max_standby_delay appropriately, current recommendation is -1. We can address that concern once the main streaming case is covered, or we can add that now. Are you planning to work on these things now as you said? How about I apply my patch now, then you do another set of changes afterwards to add the other items you mentioned, since that is now 2 additional parameters and related code? -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-hackers] Daily digest v1.10705 (13 messages)
On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org wrote: [ . . . ] In my current idea, when a qual-node that contains FuncExpr tries to reference a part of relations within a security view, its referencing relations will be expanded to whole of the security view at distribute_qual_to_rels(). [ . . . ] I may be missing something here but this seems a bit too simplistic and, I think, fails to deal with an important use case. Security views, that do anything useful at all, tend to introduce performance issues. I am concerned that placing a conceptual barrier between the secured and unsecured parts of queries is going to unnecessarily restrict what the optimiser can do. For example consider that we have three secured views, each of the form: create view s_x as select * from x where i_can_see(x.key); and consider the query: select stuff from s_x inner join s_y on s_y.key = s_x.key inner join s_z on s_z.key = s_x.key where fn(s_x.a) = 3; The optimiser ought to be able to spot the fact that i_can_see() need only be called once for each joined result. By placing a barrier (if I understand your proposal correctly) between the outermost joins and the inner views, doesn't this optimisation become impossible? I think a simpler solution may be possible here. If you can tag the function i_can_see() as a security function, at least in the context of its use in the security views, and then create the rule that security functions are always considered to be lower cost than user-defined non-security functions, don't we achieve the result of preventing the insecure function from seeing rows that it shouldn't? I guess my concern is that a query may be constructed a=out of secured and unsecured parts and the optimiser should be free to group all of the secured parts together before considering the unsecured parts. Sorry for the imprecise language and terminolgy, and also if I have completely misunderstood the implications. Best Wishes __ Marc (the veil guy) signature.asc Description: This is a digitally signed message part
Re: [HACKERS] 9.0 release notes
On 06/03/2010 09:34 AM, Bruce Momjian wrote: The 9.0 release notes have been updated to current. OK. I'll be doing editing for readability starting on Monday. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Simon Riggs si...@2ndquadrant.com writes: On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote: But in any case the current behavior is still quite broken as regards reading stale timestamps from WAL. Agreed. That wasn't the objective of this patch or a priority. If you're reading from an archive, you need to set max_standby_delay appropriately, current recommendation is -1. That's not a fix; the problem is that there *is no* appropriate setting for max_standby_delay when you're reading from archive. It is unlikely that the values of the timestamps you're reading should be considered to have any bearing on what grace period to allow; but nonetheless it's desirable to be able to have a non-infinite grace time. Basically what I think is that we want what you called apply semantics always for reading from archive (and I still think the DBA should be able to set that grace period separately from the one that applies in SR operation). Paying attention to the master's timestamps is only reasonable in the SR context. And yes, I want the dependency on WAL timestamps to be gone completely, not just mostly. I think that driving the delay logic off of SR receipt time (and/or the timestamp we agreed to add to the SR protocol) is the appropriate and sufficient way to deal with the problem. Looking at the embedded timestamps just confuses the feedback loop behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, 2010-06-03 at 13:32 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote: But in any case the current behavior is still quite broken as regards reading stale timestamps from WAL. Agreed. That wasn't the objective of this patch or a priority. If you're reading from an archive, you need to set max_standby_delay appropriately, current recommendation is -1. That's not a fix; the problem is that there *is no* appropriate setting for max_standby_delay when you're reading from archive. It is unlikely that the values of the timestamps you're reading should be considered to have any bearing on what grace period to allow; but nonetheless it's desirable to be able to have a non-infinite grace time. I'm not saying that's a fix; I agree we should change that also. Basically what I think is that we want what you called apply semantics always for reading from archive (and I still think the DBA should be able to set that grace period separately from the one that applies in SR operation). Paying attention to the master's timestamps is only reasonable in the SR context. Agreed. And yes, I want the dependency on WAL timestamps to be gone completely, not just mostly. I think that driving the delay logic off of SR receipt time (and/or the timestamp we agreed to add to the SR protocol) is the appropriate and sufficient way to deal with the problem. Looking at the embedded timestamps just confuses the feedback loop behavior. I disagree with sufficient, with good reason. Please look at the code comments and see what will happen if we don't make the correction suggested. If after that you still disagree, then do it your way and we'll wait for comments in the beta; I think there will be some, but I am tired and prone to agreement to allow this to go through. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
On Jun 3, 2010, at 19:00 , Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 03/06/10 19:16, Tom Lane wrote: What exactly was the reason for this patch? Could it be held over till 9.1? Before the patch, when you shut down a standby server, you get this message in the log on the next startup: LOG: database system was interrupted while in recovery at log time 2010-06-02 14:48:28 EEST HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. The problem is that that hint is pretty alarming. Maybe we should just get rid of the hint. FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doing that is that a real crash during recovery *is* something to be quite alarmed about. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Excerpts from Bruce Momjian's message of jue jun 03 08:36:28 -0400 2010: Using this only temporarily is mentioned in the doc patch. Do I need more? Yeah, it's far too easy to miss. Besides, I think the wording you used is ambiguous -- it can be read as the server will temporarily keep all segments if you set it to -1, which is not the same thing at all. If you can't add a 20-point-font red blinking warning with a pink dancing elephant in a tutu, maybe it's best to not offer the dangerous setting in the first place. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of jue jun 03 08:36:28 -0400 2010: Using this only temporarily is mentioned in the doc patch. Do I need more? Yeah, it's far too easy to miss. Besides, I think the wording you used is ambiguous -- it can be read as the server will temporarily keep all segments if you set it to -1, which is not the same thing at all. If you can't add a 20-point-font red blinking warning with a pink dancing elephant in a tutu, maybe it's best to not offer the dangerous setting in the first place. Well, it seems enough people don't want this features that I am not going to add it. If we decide we want it later, we can add it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, Jun 3, 2010 at 4:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark gsst...@mit.edu writes: On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: It is off-base. The receiver does not request data, the sender is what determines how much WAL is sent when. Hm, so what happens if the slave blocks, doesn't the sender block when the kernel buffers fill up? Well, if the slave can't keep up, that's a separate problem. It will not fail to keep up as a result of the transmission mechanism. No, I mean if the slave is paused due to a conflict. Does it stop reading data from the master or does it buffer it up on disk? If it stops reading it from the master then the effect is the same as if the slave stopped requesting data even if there's no actual request. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Greg Stark gsst...@mit.edu writes: On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: It is off-base. The receiver does not request data, the sender is what determines how much WAL is sent when. Hm, so what happens if the slave blocks, doesn't the sender block when the kernel buffers fill up? Well, if the slave can't keep up, that's a separate problem. It will not fail to keep up as a result of the transmission mechanism. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On 6/2/2010 7:49 PM, Greg Stark wrote: On Wed, Jun 2, 2010 at 6:45 PM, Chris Browne cbbro...@acm.org wrote: It would make it easy to conclude: This next transaction did 8328194 updates. Maybe we should do some kind of checkpoint (e.g. - commit transaction or such) before working on it. versus This transaction we're thinking of working on had 7 updates. No big deal... I'm puzzled how you would define this value. How do you add 7 inserts, 7 deletes, and 7 updates? Is that 21 rows modified? Why are the 7 inserts and 7 deletes worth twice as much as the 7 updates when they're basically the same thing? What if the inserts fired triggers which inserted 7 more rows, is that 14? What if the 7 updates modified 2 TB of TOAST data but the 8238194 updates were all to the same record and they were all HOT updates so all it did was change 8kB? In any case you'll have all the actual data from your triggers or hooks or whatever so what value does having the system keep track of this add? The point is not that we don't have that information now. The point is having a hint BEFORE wading through possibly gigabytes of WAL or log data. If getting that information requires to read all the log data twice or the need to read gigabytes of otherwise useless WAL data (as per Bruce's suggestion), we better not get it at all and just keep doing what we are doing now. I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Jan Wieck wrote: The point is not that we don't have that information now. The point is having a hint BEFORE wading through possibly gigabytes of WAL or log data. If getting that information requires to read all the log data twice or the need to read gigabytes of otherwise useless WAL data (as per Bruce's suggestion), we better not get it at all and just keep doing what we are doing now. I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Jan Wieck janwi...@yahoo.com wrote: I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? FWIW, once I came to understand the use case, it seems to me a perfectly reasonable and useful thing to have. It does strike me that there may be value to add one more xid to support certain types of integrity for some use cases, but that's certainly something which could be added later, if at all. Once I realized that, I just dropped out of the discussion; perhaps I should have bowed out with an endorsement. Unless my memory is failing me worse than usual, Dan Ports, who is working on the serializable implementation so he can use the predicate locking with a transaction-aware caching feature, needs the ability to track commit order of transactions by xid; so the use cases go beyond Slony and Londiste. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] clarification on walsender protocol document
protocol.sgml says termSTART_REPLICATION XXX/XXX/term listitem para Instructs backend to start streaming WAL, starting at point XXX/XXX. Am I correct in thinking that this is wrong, and streaming actually starts with the byte after XXX/XXX? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] clarification on walsender protocol document
On 03/06/10 23:17, Tom Lane wrote: protocol.sgml says termSTART_REPLICATION XXX/XXX/term listitem para Instructs backend to start streaming WAL, starting at point XXX/XXX. Am I correct in thinking that this is wrong, and streaming actually starts with the byte after XXX/XXX? No, I believe the documentation is correct. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] clarification on walsender protocol document
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 03/06/10 23:17, Tom Lane wrote: Am I correct in thinking that this is wrong, and streaming actually starts with the byte after XXX/XXX? No, I believe the documentation is correct. OK, I read the code again. Thanks for the cross-check. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On 6/3/2010 4:04 PM, Bruce Momjian wrote: If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. That comment was rather unprofessional. I think the rest of us still try to find the best solution for the problem, not kill the discussion. You may want to rejoin that effort. I care about an efficient, low overhead way to get a certain information, that is otherwise extremely difficult, expensive and version dependent to get. I care about cleaning up more of the mistakes, made in the original development of Slony. Namely using hacks and kluges to implement details, not supported by a current version of PostgreSQL. Londiste and Slony made a good leap on that with the txid data type. Slony made another step like that with 2.0, switching to the (for that very purpose developed and contributed) native trigger configuration instead of hacking system catalogs. This would be another step in that direction and we would be able to unify Londiste's and Slony's transport mechanism and eliminating the tick/sync kluge. Care to explain what exactly you care about? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
br...@momjian.us (Bruce Momjian) writes: Jan Wieck wrote: The point is not that we don't have that information now. The point is having a hint BEFORE wading through possibly gigabytes of WAL or log data. If getting that information requires to read all the log data twice or the need to read gigabytes of otherwise useless WAL data (as per Bruce's suggestion), we better not get it at all and just keep doing what we are doing now. I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. Are you caring or opposing? It seems rather uncharitable to imply that Jan doesn't care. I know *I'm* not interested in a forked Postgres for this - I would prefer to find out what things could be done that don't involve gross amounts of WAL file grovelling for data that mayn't necessarily even be available. -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/internet.html MS apparently now has a team dedicated to tracking problems with Linux and publicizing them. I guess eventually they'll figure out this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
gsst...@mit.edu (Greg Stark) writes: On Wed, Jun 2, 2010 at 6:45 PM, Chris Browne cbbro...@acm.org wrote: It would make it easy to conclude: This next transaction did 8328194 updates. Maybe we should do some kind of checkpoint (e.g. - commit transaction or such) before working on it. versus This transaction we're thinking of working on had 7 updates. No big deal... I'm puzzled how you would define this value. How do you add 7 inserts, 7 deletes, and 7 updates? Is that 21 rows modified? Why are the 7 inserts and 7 deletes worth twice as much as the 7 updates when they're basically the same thing? What if the inserts fired triggers which inserted 7 more rows, is that 14? What if the 7 updates modified 2 TB of TOAST data but the 8238194 updates were all to the same record and they were all HOT updates so all it did was change 8kB? The presence of those questions (and their ambiguity) is the reason why there's a little squirming as to whether this is super-useful and super-necessary. What this offers is *SOME* idea of how much updating work a particular transaction did. It's a bit worse than you suggest: - If replication triggers have captured tuples, those would get counted. - TOAST updates might lead to extra updates being counted. But back to where you started, I'd anticipate 7 inserts, 7 deletes, and 7 updates being counted as something around 21 updates. And if that included 5 TOAST changes, it might bump up to 26. If there were replication triggers in place, that might bump the count up to 45 (which I chose arbitrarily). In any case you'll have all the actual data from your triggers or hooks or whatever so what value does having the system keep track of this add? This means that when we'd pull the list of transactions to consider, we'd get something like: select * from next_transactions('4218:23', 50); [list of 50 transactions returned, each with... - txid - START timestamp - COMMIT timestamp - Approximate # of updates Then, for each of the 50, I'd pull replication log data for the corresponding transaction. If I have the approximate # of updates, that might lead me to stop short, and say: That next update looks like a doozy! I'm going to stop and commit what I've got before doing that one. It's not strictly necessary, but would surely be useful for flow control. -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/internet.html MS apparently now has a team dedicated to tracking problems with Linux and publicizing them. I guess eventually they'll figure out this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow wal_keep_segments to keep all segments
Heikki Linnakangas wrote: We leave that up to the DBA to clean out one way or another. We provide restartpoint_command and the %r option in restore_command to help with that. I was in fact just looking into this, and I see that there is no example restartpoint_comand script given in the docs, nor in the wiki. A sample of such a command would be useful. This is all going to feel a bit strange to lots of users, and the more we can hold their hands the better off we and they will be. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On Thu, Jun 3, 2010 at 8:50 PM, Jan Wieck janwi...@yahoo.com wrote: I'm puzzled how you would define this value. How do you add 7 inserts, 7 deletes, and 7 updates? Is that 21 rows modified? I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? The post you were responding to was regarding the meaninglessness of the number of records attribute you wanted. Your response is a non sequitor. I think the commit order of transactions would be a good thing to expose though I've asked repeatedly what kind of interface you need and never gotten answers to all the questions. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET CONSTRAINTS todo
I see what went wrong in my example. Unique constraints must have unique names since they create an index. I'll try again, sorry for the noise. --Dan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Greg Stark gsst...@mit.edu wrote: what kind of interface you need For the potential uses I can see, it would be great to have a SRF which took two parameters: xid of last known commit and a limit how many commits past that to return. Perhaps a negative number could move earlier in time, if that seems reasonable to others. I think that's also consistent with Jan's posts. A GUC to enable it and some way to specify retention (or force cleanup) are the only other user-facing features which come to mind for me. (Not sure what form that last should take, but didn't Jan say something about both of these early in the thread?) Do you see a need for something else (besides, obviously, docs)? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Did we really want to force an initdb in beta2?
Florian Pflug f...@phlo.org writes: On Jun 3, 2010, at 19:00 , Tom Lane wrote: Maybe we should just get rid of the hint. FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doing that is that a real crash during recovery *is* something to be quite alarmed about. After some discussion among core we're going to leave it as-is. Anybody who doesn't want to initdb for beta2 can test out pg_upgrade ;-) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Greg Stark gsst...@mit.edu wrote: what kind of interface you need For the potential uses I can see, it would be great to have a SRF which took two parameters: xid of last known commit and a limit how many commits past that to return. Jan's very first post had it right; my idea was flawed: | Exposing the data will be done via a set returning function. The | SRF takes two arguments. The maximum number of rows to return and | the last serial number processed by the reader. The advantage of | such SRF is that the result can be used in a query that right away | delivers audit or replication log information in transaction | commit order. The SRF can return an empty set if no further | transactions have committed since, or an error if data segments | needed to answer the request have already been purged. | | Purging of the data will be possible in several different ways. | Autovacuum will call a function that drops segments of the data | that are outside the postgresql.conf configuration with respect to | maximum age or data volume. There will also be a function reserved | for superusers to explicitly purge the data up to a certain serial | number. Apologies for not looking back to the start of the thread before that last post. It was all laid out right at the start. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Jan Wieck wrote: On 6/3/2010 4:04 PM, Bruce Momjian wrote: If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. That comment was rather unprofessional. I think the rest of us still try to find the best solution for the problem, not kill the discussion. You may want to rejoin that effort. I care about an efficient, low overhead way to get a certain information, that is otherwise extremely difficult, expensive and version dependent to get. I care about cleaning up more of the mistakes, made in the original development of Slony. Namely using hacks and kluges to implement details, not supported by a current version of PostgreSQL. Londiste and Slony made a good leap on that with the txid data type. Slony made another step like that with 2.0, switching to the (for that very purpose developed and contributed) native trigger configuration instead of hacking system catalogs. This would be another step in that direction and we would be able to unify Londiste's and Slony's transport mechanism and eliminating the tick/sync kluge. Care to explain what exactly you care about? Here is what I was replying to: I actually have a hard time understanding why people are so opposed t$ feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? Jan's comment is why should others care what he wants because it has zero impact? I am saying the community cares because we have to maintain the code. I stand by my comment. I remember a dismissive comment by Jan when 'session_replication_role' was added, and a similar strong comment from me at that time as well. It seems we are doing this again. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Bruce Momjian wrote: Jan Wieck wrote: On 6/3/2010 4:04 PM, Bruce Momjian wrote: If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. That comment was rather unprofessional. I think the rest of us still try to find the best solution for the problem, not kill the discussion. You may want to rejoin that effort. I care about an efficient, low overhead way to get a certain information, that is otherwise extremely difficult, expensive and version dependent to get. I care about cleaning up more of the mistakes, made in the original development of Slony. Namely using hacks and kluges to implement details, not supported by a current version of PostgreSQL. Londiste and Slony made a good leap on that with the txid data type. Slony made another step like that with 2.0, switching to the (for that very purpose developed and contributed) native trigger configuration instead of hacking system catalogs. This would be another step in that direction and we would be able to unify Londiste's and Slony's transport mechanism and eliminating the tick/sync kluge. Care to explain what exactly you care about? Here is what I was replying to: I actually have a hard time understanding why people are so opposed t$ feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? Jan's comment is why should others care what he wants because it has zero impact? I am saying the community cares because we have to maintain the code. I stand by my comment. I remember a dismissive comment by Jan when 'session_replication_role' was added, and a similar strong comment from me at that time as well. It seems we are doing this again. Of course, if I am misintepreting what Jan said, please let me know. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
On Thu, 2010-06-03 at 18:18 +0100, Simon Riggs wrote: Are you planning to work on these things now as you said? Are you? Or do you want me to? -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Keepalive for max_standby_delay
Simon Riggs si...@2ndquadrant.com writes: On Thu, 2010-06-03 at 18:18 +0100, Simon Riggs wrote: Are you planning to work on these things now as you said? Are you? Or do you want me to? I decided there wasn't time to get anything useful done on it before the beta2 deadline (which is, more or less, right now). I will take another look over the next few days. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Excerpts from Chris Browne's message of jue jun 03 16:21:35 -0400 2010: What this offers is *SOME* idea of how much updating work a particular transaction did. It's a bit worse than you suggest: - If replication triggers have captured tuples, those would get counted. - TOAST updates might lead to extra updates being counted. But back to where you started, I'd anticipate 7 inserts, 7 deletes, and 7 updates being counted as something around 21 updates. And if that included 5 TOAST changes, it might bump up to 26. If there were replication triggers in place, that might bump the count up to 45 (which I chose arbitrarily). Why not send separate numbers of tuple inserts/updates/deletes, which we already have from pgstats? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.0 release notes
On 06/03/2010 10:27 AM, Josh Berkus wrote: On 06/03/2010 09:34 AM, Bruce Momjian wrote: The 9.0 release notes have been updated to current. OK. I'll be doing editing for readability starting on Monday. Actually, the section tags appear to be broken in release-9.0.sgml in the current cvs. Looks like a section got added somewhere and the sections were not renumbered. Fix? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] SET CONSTRAINTS todo
I wanted to work on this todo item and I have a few questions about the semantics of it. Essentially, it is not possible to have more than one relname for a constraint, even though the comment in trigger.c says otherwise. I have used this code to test this: CREATE TABLE products ( product_no integer CONSTRAINT must_be_different UNIQUE DEFERRABLE, name text, price numeric ); CREATE TABLE products2 ( product_no integer CONSTRAINT must_be_different UNIQUE DEFERRABLE, name text, price numeric ); which results in the following error: ERROR: relation must_be_different already exists Therefore prefixing them with a table name does not seem to make sense. Additionally, there is already the feature of prefixing the constraint relname with a schema to limit the search space. Is the intention of the todo to allow the user to specify a tablename which limits the search path to that table's schema or is the feature to extend constraints to allow per table naming. In other words, would the feature allow multiple constraints of the same name in a schema since they would be table specific? --Dan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On 6/3/2010 5:58 PM, Greg Stark wrote: On Thu, Jun 3, 2010 at 8:50 PM, Jan Wieck janwi...@yahoo.com wrote: I'm puzzled how you would define this value. How do you add 7 inserts, 7 deletes, and 7 updates? Is that 21 rows modified? I actually have a hard time understanding why people are so opposed to a feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? The post you were responding to was regarding the meaninglessness of the number of records attribute you wanted. Your response is a non sequitor. I never proposed a number of records attribute. I proposed a sum of the row counts in the statistics collector. That row count would be a mix of insert, update, delete and toast operations. It's not an exact indicator of anything, but a good enough hint of how much data may come down the pipe if I were to select all replication data belonging to that transaction. I think the commit order of transactions would be a good thing to expose though I've asked repeatedly what kind of interface you need and never gotten answers to all the questions. In the original email that started this whole thread I wrote: Exposing the data will be done via a set returning function. The SRF takes two arguments. The maximum number of rows to return and the last serial number processed by the reader. The advantage of such SRF is that the result can be used in a query that right away delivers audit or replication log information in transaction commit order. The SRF can return an empty set if no further transactions have committed since, or an error if data segments needed to answer the request have already been purged. Did that not answer your question? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On 6/3/2010 6:24 PM, Kevin Grittner wrote: Apologies for not looking back to the start of the thread before that last post. It was all laid out right at the start. No need to apologize. Happens. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On 6/3/2010 7:11 PM, Alvaro Herrera wrote: Excerpts from Chris Browne's message of jue jun 03 16:21:35 -0400 2010: What this offers is *SOME* idea of how much updating work a particular transaction did. It's a bit worse than you suggest: - If replication triggers have captured tuples, those would get counted. - TOAST updates might lead to extra updates being counted. But back to where you started, I'd anticipate 7 inserts, 7 deletes, and 7 updates being counted as something around 21 updates. And if that included 5 TOAST changes, it might bump up to 26. If there were replication triggers in place, that might bump the count up to 45 (which I chose arbitrarily). Why not send separate numbers of tuple inserts/updates/deletes, which we already have from pgstats? We only have them for the entire database. The purpose of this is just a guesstimate about what data volume to expect if I were to select all log from a particular transaction. This datum isn't critical, just handy for the overall feature to be useful. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS
I fixed up the subject. (2010/06/04 2:23), Marc Munro wrote: On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org wrote: [ . . . ] In my current idea, when a qual-node that contains FuncExpr tries to reference a part of relations within a security view, its referencing relations will be expanded to whole of the security view at distribute_qual_to_rels(). [ . . . ] I may be missing something here but this seems a bit too simplistic and, I think, fails to deal with an important use case. Security views, that do anything useful at all, tend to introduce performance issues. I am concerned that placing a conceptual barrier between the secured and unsecured parts of queries is going to unnecessarily restrict what the optimiser can do. For example consider that we have three secured views, each of the form: create view s_x as select * from x where i_can_see(x.key); and consider the query: select stuff from s_x inner join s_y on s_y.key = s_x.key inner join s_z on s_z.key = s_x.key where fn(s_x.a) = 3; The optimiser ought to be able to spot the fact that i_can_see() need only be called once for each joined result. By placing a barrier (if I understand your proposal correctly) between the outermost joins and the inner views, doesn't this optimisation become impossible? It seems to me incorrect. If i_can_see() can filter a part of tuples within table: x, the optimizer prefers to call i_can_see() on scanning each tables rather than result of join, because it effectively reduce the size of scan. In fact, the existing optimizer make the following plan: postgres=# CREATE FUNCTION i_can_see(int) RETURNS bool IMMUTABLE AS 'begin return $1 % 1 = 1; end;' LANGUAGE 'plpgsql'; CREATE FUNCTION postgres=# CREATE VIEW v1 as select * from t1 where i_can_see(a); CREATE VIEW postgres=# CREATE VIEW v2 as select * from t2 where i_can_see(x); CREATE VIEW postgres=# CREATE VIEW v3 as select * from t3 where i_can_see(s); CREATE VIEW postgres=# EXPLAIN SELECT * FROM (v1 JOIN v2 ON v1.a=v2.x) JOIN v3 on v1.a=v3.s; QUERY PLAN --- Nested Loop (cost=0.00..860.47 rows=410 width=108) - Nested Loop (cost=0.00..595.13 rows=410 width=72) - Seq Scan on t1 (cost=0.00..329.80 rows=410 width=36) Filter: i_can_see(a) - Index Scan using t2_pkey on t2 (cost=0.00..0.63 rows=1 width=36) Index Cond: (t2.x = t1.a) Filter: i_can_see(t2.x) - Index Scan using t3_pkey on t3 (cost=0.00..0.63 rows=1 width=36) Index Cond: (t3.s = t1.a) Filter: i_can_see(t3.s) (10 rows) Sorry, I may miss something your point. I think a simpler solution may be possible here. If you can tag the function i_can_see() as a security function, at least in the context of its use in the security views, and then create the rule that security functions are always considered to be lower cost than user-defined non-security functions, don't we achieve the result of preventing the insecure function from seeing rows that it shouldn't? Sorry, it seems to me you mixed up different issues. My patch tries to tackle the problem that optimizer distributes outermost functions into inside of the view when the function takes arguments only from a side of join, even if security views. This issue is independent from cost of functions. On the other hand, when a scan plan has multiple qualifiers to filter fetched tuples, we have to pay attention on the order to be called. If outermost functions are called prior to view's policy functions, it may cause unexpected leaks. In my opinion, we should consider the nestlevel of the function where it is originally put. But it is not concluded yet, and the patch does not tackle anything about this issue. I guess my concern is that a query may be constructed a=out of secured and unsecured parts and the optimiser should be free to group all of the secured parts together before considering the unsecured parts. Yes, it is an opinion, but it may cause unnecessary performance regression when user given condition is obviously harmless. E.g, If table: t has primary key t.a, and a security view is defined as: CREATE VIEW v AS SELECT * FROM t WHERE f_policy(t.b); When user gives the following query: SELECT * FROM v WHERE a = 100; If we strictly separate secured and unsecure part prior to optimizer, it will break a chance to scan the table: t with index. I also think security is not free, so it may take a certain level of performance degradation. But it should not bring down more than necessity. Thanks, -- KaiGai Kohei kai...@ak.jp.nec.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Open item: slave to standby in docs
Ther is an open item: Standby instead of slave in documentation http://archives.postgresql.org/message-id/1273682033.12754.1.ca...@vanquo.pezone.net I replacesd almost all slave to standby or standby servers not only in HS+SR but also in other places like third-party tools. There are still 3 places where slave is used. - Terminology: ... are called standby or slave servers. - Words in old release notes for 8.2 and 8.4. Could you check those replacements are proper? Regards, --- Takahiro Itagaki NTT Open Source Software Center slave_to_standby.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On 5/27/2010 4:31 PM, Bruce Momjian wrote: Heikki Linnakangas wrote: BTW, I think we're going to need a separate config file for listing the standbys anyway. There you can write per-server rules and options, but explicitly knowing about all the standbys also allows the master to recycle WAL as soon as it has been streamed to all the registered standbys. Currently we just keep wal_keep_segments files around, just in case there's a standby out there that needs them. Ideally we could set 'slave_sync_count' and 'slave_commit_continue_mode' on the master, and allow the sync/async mode to be set on each slave, e.g. if slave_sync_count = 2 and slave_commit_continue_mode = #2, then two slaves with sync mode of #2 or stricter have to complete before the master can continue. Naming the slaves on the master seems very confusing because I am unclear how we would identify named slaves, and the names have to match, etc. Also, what would be cool would be if you could run a query on the master to view the SR commit mode of each slave. What would be the use case for such a query? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On Thu, Jun 3, 2010 at 6:29 PM, Bruce Momjian br...@momjian.us wrote: Jan Wieck wrote: On 6/3/2010 4:04 PM, Bruce Momjian wrote: If you want to fork Postgres and add it, go ahead, but if the community has to maintain the code and document it, we care. That comment was rather unprofessional. I think the rest of us still try to find the best solution for the problem, not kill the discussion. You may want to rejoin that effort. I care about an efficient, low overhead way to get a certain information, that is otherwise extremely difficult, expensive and version dependent to get. I care about cleaning up more of the mistakes, made in the original development of Slony. Namely using hacks and kluges to implement details, not supported by a current version of PostgreSQL. Londiste and Slony made a good leap on that with the txid data type. Slony made another step like that with 2.0, switching to the (for that very purpose developed and contributed) native trigger configuration instead of hacking system catalogs. This would be another step in that direction and we would be able to unify Londiste's and Slony's transport mechanism and eliminating the tick/sync kluge. Care to explain what exactly you care about? Here is what I was replying to: I actually have a hard time understanding why people are so opposed t$ feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? Jan's comment is why should others care what he wants because it has zero impact? I am saying the community cares because we have to maintain the code. I stand by my comment. I remember a dismissive comment by Jan when 'session_replication_role' was added, and a similar strong comment from me at that time as well. It seems we are doing this again. I think it's entirely legitimate and proper for us to make a decision about whether this feature is worth including in core PostgreSQL. We are obviously not in the business of adding random features solely for the benefit of third-party applications. That having been said, there are several reasons why I believe that this particular feature is an excellent candidate for inclusion in core. 1. It solves a problem for which there is no easy workaround. Rereading all the WAL to extract the commit records is not an easy workaround, nor is what Slony and Londiste are doing now. 2. It is usable by multiple projects, not just one. It may well have applications beyond replication (e.g. distributed transactions), but at a very minimum it is usable by and useful to multiple replication solutions. 3. It has a clear specification which can be easily understood even by people who do not fully understand how replication solutions will make use of it, which makes code maintenance much less burdensome. Obviously, Jan's original email on this topic was just a sketch, but I find it to be pretty clear. 4. We have an existing precedent of being willing to add limited support into core to allow replication solutions to do their thing (session_replication_role, ALTER TABLE ... ENABLE REPLICA TRIGGER, etc). Even though we now have built-in replication via HS and SR, there is still a BIG use case for Slony, Londiste, and other add-on tools. Making those tools more successful and performant is good for PostgreSQL. 5. It does not involve highly invasive changes to the core code. 6. It can be turned off for users who don't want it. I find the skeptical attitude on this thread altogether unwarranted. Jan made his case and, at least IMHO, presented it pretty clearly. He then answered, multiple times, numerous questions which were already addressed in the original email, as well as various others. I think we should be very careful about assuming that we understand replication and its needs better than someone who has spent many years developing one of the major PostgreSQL replication solutions. Independent of Jan's qualifications, there are clearly several people on this thread who understand why this is useful and valuable, including me. I am obviously not in a position to insist that we accept this feature (assuming Jan produces a patch rather than getting discouraged and giving up) but I would like us to think very, very carefully before rejecting it, and not to do so unless we have a DARN good reason. Most patches add code, and therefore require code maintenance - that is not, by itself, a reason to reject them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Bruce Momjian wrote: I care about cleaning up more of the mistakes, made in the original development of Slony. Namely using hacks and kluges to implement details, not supported by a current version of PostgreSQL. Londiste and Slony made a good leap on that with the txid data type. Slony made another step like that with 2.0, switching to the (for that very purpose developed and contributed) native trigger configuration instead of hacking system catalogs. This would be another step in that direction and we would be able to unify Londiste's and Slony's transport mechanism and eliminating the tick/sync kluge. Care to explain what exactly you care about? Here is what I was replying to: I actually have a hard time understanding why people are so opposed t$ feature that has zero impact at all unless a DBA actually turns in ON. What is the problem with exposing the commit order of transactions? Jan's comment is why should others care what he wants because it has zero impact? I am saying the community cares because we have to maintain the code. I stand by my comment. I remember a dismissive comment by Jan when 'session_replication_role' was added, and a similar strong comment from me at that time as well. It seems we are doing this again. FYI, I talked to Jan on the phone and we have resolved this issue. :-) -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS
I summarized the series of discussion at: http://wiki.postgresql.org/wiki/RLS#Issue:_A_leaky_VIEWs_for_RLS Please point out, if I missed or misunderstood something. Thanks, (2010/06/03 11:36), KaiGai Kohei wrote: (2010/06/02 12:02), KaiGai Kohei wrote: Here's another thought. If we're leaning toward explicit syntax to designate security views (and I do mean IF, since only one person has signed on to that, even if it is Tom Lane!), then maybe we should think about ripping out the logic that causes regular views to be evaluated using the credentials of the view owner rather than the person selecting from it. A security view would still use that logic, plus whatever additional stuff we come up with to prevent leakage. Perhaps this would be viewed as a nasty backward compatibility break, but the upside is that we'd then be being absolutely clear that a non-security view isn't and can never be trusted to be a security barrier. Right now we're shipping something that purports to act as a barrier but really doesn't. Sorry, should we make clear the purpose of explicit syntax for security views being issued now? In my understanding, if security views, the planner tries to checks privileges of the person selecting it to reference underlaying tables without any ereport. If violated, the planner prevents user given quals (perhaps FuncExpr only?) to come into inside of the join scan. Otherwise, if regular views, the planner works as is. Right? I'd like to check up the problem in details. We can sort out a view into three types, as follows: (1) A view which cannot be pulled up (2) A view which can be pulled up, but does not contain any joins (3) A view which can be pulled up, and contains joins. For the type (1), we don't need to handle anything special, because no external quals are unavailable to come into. For the type (2), we also don't need to handle anything special except for the evaluation order matter in ExecQual(), because it is impossible to distribute external quals into a part of relations. So, the type (3) is all we have to consider here. Right? Then, where should we distinguish a security view and others? At least, it should be decided at pull_up_subqueries() or earlier, because it merges subqueries into the upper query. In subquery_planner(), the pull_up_subqueries() is called just after inline_set_returning_functions() which makes RTE_FUNCTION entry flatten if available. It seems to me not a strange logic to check privileges on underlaying relations in pull_up_subqueries(), because inline_set_returning_functions() also checks ACL_EXECUTE here on the functions to be flatten. Then, once we can identify what is a security view and not, it shall be marked on somewhere. Maybe, FromExpr of the pulled-up subquery? In my current idea, when a qual-node that contains FuncExpr tries to reference a part of relations within a security view, its referencing relations will be expanded to whole of the security view at distribute_qual_to_rels(). Then, it will prevent a user defined function to come into inside of security views. At least, it seems to me reasonable to implement. Shall I launch to develop a proof of concept patch? Thanks, -- KaiGai Kohei kai...@ak.jp.nec.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.0 release notes
Josh Berkus wrote: On 06/03/2010 10:27 AM, Josh Berkus wrote: On 06/03/2010 09:34 AM, Bruce Momjian wrote: The 9.0 release notes have been updated to current. OK. I'll be doing editing for readability starting on Monday. Actually, the section tags appear to be broken in release-9.0.sgml in the current cvs. Looks like a section got added somewhere and the sections were not renumbered. Fix? What are you seeing that is wrong? I don't see it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
On 5/27/2010 11:52 PM, Andrew Dunstan wrote: Bruce Momjian wrote: Peter Eisentraut wrote: On tor, 2010-05-27 at 12:59 -0400, Tom Lane wrote: I think we should fix it now. Quick thought: maybe we could use FOR instead of AS: select myfunc(7 for a, 6 for b); I'm afraid FOR doesn't work either; it'll create a conflict with the spec-defined SUBSTRING(x FOR y) syntax. How about select myfunc(a := 7, b := 6); One concern I have is that in PL/pgSQL, := and = behave the same, while in SQL, they would not. That might cause confusion. That is a sad wart that we should never have done, IMNSHO (it was before my time or I would have objected ;-) ). But beyond that, = is an operator in SQL and := is never an operator, IIRC. As far as I can tell, this was already in the code when Bruce moved it into core as -r1.1 on my behalf (before I had commit privileges). I do not recall if the = as alternative to := was my idea or not. But I'm willing to take the blame for it because it dates back to a time where convenience seemed important. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-hackers] Daily digest v1.10705 (13 messages)
On Thu, Jun 3, 2010 at 1:23 PM, Marc Munro m...@bloodnok.com wrote: On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org wrote: [ . . . ] In my current idea, when a qual-node that contains FuncExpr tries to reference a part of relations within a security view, its referencing relations will be expanded to whole of the security view at distribute_qual_to_rels(). [ . . . ] I may be missing something here but this seems a bit too simplistic and, I think, fails to deal with an important use case. If anything, you're putting it mildly. This is quite a bit too simplistic and fails to deal with several important issues, at least some of which have already been mentioned on this thread. The optimiser ought to be able to spot the fact that i_can_see() need only be called once for each joined result. By placing a barrier (if I understand your proposal correctly) between the outermost joins and the inner views, doesn't this optimisation become impossible? I think a simpler solution may be possible here. If you can tag the function i_can_see() as a security function, at least in the context of its use in the security views, and then create the rule that security functions are always considered to be lower cost than user-defined non-security functions, don't we achieve the result of preventing the insecure function from seeing rows that it shouldn't? So, yes and no. You DO need a security barrier between the view and the rest of the query, but if a function can be trusted not to do evil things, then it should be allowed to be pushed down. What we need to prevent is the pushdown of untrusted functions (or operators). A (very) important part of this problem is determining which quals are safe to push down. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Thu, Jun 3, 2010 at 8:47 PM, Jan Wieck janwi...@yahoo.com wrote: On 5/27/2010 4:31 PM, Bruce Momjian wrote: Heikki Linnakangas wrote: BTW, I think we're going to need a separate config file for listing the standbys anyway. There you can write per-server rules and options, but explicitly knowing about all the standbys also allows the master to recycle WAL as soon as it has been streamed to all the registered standbys. Currently we just keep wal_keep_segments files around, just in case there's a standby out there that needs them. Ideally we could set 'slave_sync_count' and 'slave_commit_continue_mode' on the master, and allow the sync/async mode to be set on each slave, e.g. if slave_sync_count = 2 and slave_commit_continue_mode = #2, then two slaves with sync mode of #2 or stricter have to complete before the master can continue. Naming the slaves on the master seems very confusing because I am unclear how we would identify named slaves, and the names have to match, etc. Also, what would be cool would be if you could run a query on the master to view the SR commit mode of each slave. What would be the use case for such a query? Monitoring? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] functional call named notation clashes with SQL feature
Jan Wieck wrote: That is a sad wart that we should never have done, IMNSHO (it was before my time or I would have objected ;-) ). But beyond that, = is an operator in SQL and := is never an operator, IIRC. As far as I can tell, this was already in the code when Bruce moved it into core as -r1.1 on my behalf (before I had commit privileges). I do not recall if the = as alternative to := was my idea or not. But I'm willing to take the blame for it because it dates back to a time where convenience seemed important. I forgive you ;=) If my worst sin were only this bad I'd be truly happy. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fillfactor gets set to zero for toast tables
Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote: Could you explain why default_only entries idea is better than adjusting those fields in the toast-specific codes? It's my understanding that reloption-framework is just a tool to fill reloption parameters, and it's not responsible for unused fields. Here is my proposal to fix the issue. I didn't introduce default_only field but simply adjust parameters for TOAST reloptions. BTW, not only heap and toast relations but also btree, hash, and gist indexes use StdRdOptions. However, they actually don't support autovacuum options. So, StdRdOptions is a bloated all-in-one descriptor now. Instead, they should use fillfactor-only reloptions that is defined separately from options for heap. Regards, --- Takahiro Itagaki NTT Open Source Software Center toast-default-only-relopts.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS
(2010/06/04 11:55), Robert Haas wrote: On Thu, Jun 3, 2010 at 1:23 PM, Marc Munrom...@bloodnok.com wrote: On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org wrote: [ . . . ] In my current idea, when a qual-node that contains FuncExpr tries to reference a part of relations within a security view, its referencing relations will be expanded to whole of the security view at distribute_qual_to_rels(). [ . . . ] I may be missing something here but this seems a bit too simplistic and, I think, fails to deal with an important use case. If anything, you're putting it mildly. This is quite a bit too simplistic and fails to deal with several important issues, at least some of which have already been mentioned on this thread. Hmm. Was it too simplified even if just a proof of concept? The optimiser ought to be able to spot the fact that i_can_see() need only be called once for each joined result. By placing a barrier (if I understand your proposal correctly) between the outermost joins and the inner views, doesn't this optimisation become impossible? I think a simpler solution may be possible here. If you can tag the function i_can_see() as a security function, at least in the context of its use in the security views, and then create the rule that security functions are always considered to be lower cost than user-defined non-security functions, don't we achieve the result of preventing the insecure function from seeing rows that it shouldn't? So, yes and no. You DO need a security barrier between the view and the rest of the query, but if a function can be trusted not to do evil things, then it should be allowed to be pushed down. What we need to prevent is the pushdown of untrusted functions (or operators). A (very) important part of this problem is determining which quals are safe to push down. At least, I don't have an idea to distinguish trusted functions from others without any additional hints, because we support variable kind of PL languages. :( An idea is to add TRUSTED (or other) keyword for CREATE FUNCTION to mark this function is harmless to execute, but only DBA can define functions with the option. (Perhaps, most of built-in functions should be marked as trusted?) If we should identify a function as either trusted or untrusted, is there any other ideas? Thanks, -- KaiGai Kohei kai...@ak.jp.nec.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET CONSTRAINTS todo
Dan Colish dcol...@gmail.com writes: I wanted to work on this todo item and I have a few questions about the semantics of it. Essentially, it is not possible to have more than one relname for a constraint, That is per SQL spec: SQL92 10.6 syntax rule 2 saith 2) The qualified identifier of constraint name shall be differ- ent from the qualified identifier of the constraint name of any other constraint defined in the same schema. I believe we are already laxer than the spec, because we don't enforce that restriction except for index-based constraints. I'm not terribly excited about trying to make it weaker yet. Is the intention of the todo to allow the user to specify a tablename which limits the search path to that table's schema or is the feature to extend constraints to allow per table naming. I think the TODO item you're looking at is just about how narrowly you can specify the target(s) of a SET CONSTRAINTS command. It's not meant to say anything about what constraint names can be created in the first place. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS
KaiGai Kohei kai...@ak.jp.nec.com writes: (2010/06/04 11:55), Robert Haas wrote: A (very) important part of this problem is determining which quals are safe to push down. At least, I don't have an idea to distinguish trusted functions from others without any additional hints, because we support variable kind of PL languages. :( The proposal some time back in this thread was to trust all built-in functions and no others. That's a bit simplistic, no doubt, but it seems to me to largely solve the performance problem and to do so with minimal effort. When and if you get to a solution that's committable with respect to everything else, it might be time to think about more flexible answers to that particular point. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers