Re: [HACKERS] How to pass around collation information

2010-06-03 Thread Peter Eisentraut
On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote:
 But in the end the only purpose of setting it on a column is to set
 which one will be used for operations on that column.  And the user
 might still override it for a particular query.

Of course.  I'm just saying that it *can* be useful to attach a
collation to a column definition, rather than only allowing it to be
specified along with the sort operation.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Wed, 2010-06-02 at 16:00 -0400, Tom Lane wrote:
 the current situation that query grace periods go to zero

Possibly a better way to handle this concern is to make the second
parameter: min_standby_grace_period - the minimum time a query will be
given in which to execute, even if max_standby_delay has been reached or
exceeded.

Would that more directly address your concerns?

min_standby_grace_period (ms) SIGHUP 
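For illustration, the two settings might sit together in postgresql.conf like
this (min_standby_grace_period is only a proposal here, and the values are
made up):

  max_standby_delay = 30s           # total allowed delay for WAL application
  min_standby_grace_period = 500ms  # minimum time any single query still gets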

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE

2010-06-03 Thread Dean Rasheed
On 3 June 2010 02:06, Bruce Momjian br...@momjian.us wrote:
 Also, foreign keys can't be defined that refer to a deferrable primary
 key. That isn't mentioned at all in the manual with regard to the
 DEFERRABLE clause, though it is mentioned in the FK section. You get
 this error message
 ERROR:  cannot use a deferrable unique constraint for referenced table

 The use case for this feature looks a little narrow at present. Can we
 do something about usability?

 Not sure why that was a limitation.


That's in accordance with the SQL spec.

I didn't think of this case originally, but all sorts of complications
would arise if we were to allow FKs to refer to deferrable PKs. For
example, if there are 2 temporarily duplicated PKs, and you update one
of them, what would the FK's ON UPDATE actions do? I'm not convinced
there is any sensible answer to this question.

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
The attached patch is a proof of concept.
It tries to fix the problem of leaky VIEWs for RLS.

* The scenario of leaky VIEWs for RLS
-
When a view contains a table join and the user supplies a function whose
arguments depend on only one side of the join, the planner tries to push
the qualifier down to the smallest unit of scan. That minimizes execution
cost, but it means the user-supplied function may be invoked earlier than
the other qualifiers within the view, even though the view is intended to
provide row-level security. If the user-supplied function has a side effect
(e.g., it inserts its arguments into another table), this behavior allows
the user to leak the contents of invisible tuples.

  postgres=# SELECT * FROM v2 WHERE f_malicious(y);
  NOTICE:  f_malicious: xxx
  NOTICE:  f_malicious: yyy -- leaky contents
  NOTICE:  f_malicious: zzz
   a |  b  | x |  y
  ---+-----+---+-----
   1 | aaa | 1 | xxx
   3 | ccc | 3 | zzz
  (2 rows)


* Discussions in -hackers
-
Although this is a security problem, we would face a significant performance
regression if we refused to push any user-defined function down into joins
that originally came from views.

One idea is to distinguish two types of views: security-oriented views and
regular views. Because only the creator of a view knows its purpose, it was
suggested to take a hint at CREATE VIEW time, such as CREATE SECURITY VIEW.
(I have not implemented this feature in this patch yet.)
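For example, the hinted form would look something like this (not implemented
by this patch; the definition simply mirrors the v1 example further below):

  CREATE SECURITY VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;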

In addition, it was pointed out that pushing the qualifier down may be
harmless as long as the person executing the query (not the view's owner)
has enough privileges on the underlying tables, even if the view is marked
as a security view.

However, it was also pointed out that this idea requires checking privileges
at planning time rather than at execution time. Is that really acceptable?
That is not settled yet. But inline_set_returning_function() already checks
ACL_EXECUTE permission at planning time and flattens the FuncExpr prior to
optimization, so at least that already seems to bend the same rule.

Please point out any issues which I missed.


* About this patch
--
This patch mainly consists of two parts.

The first part is in pull_up_simple_subquery().
It checks privileges on the underlying tables of the subquery being pulled
up, and then marks FromExpr->security_view as TRUE.

Note that it does not distinguish between a subquery that came from a view
and a pure subquery written in the original query. But that eventually
raises a permission error anyway if the current user lacks privileges on a
pure subquery's tables, so it does not matter.

Also note that there is no statement-level support (such as CREATE SECURITY
VIEW) yet, so the patch assumes that every subquery is potentially a
security view.

The second part is in distribute_qual_to_rels().
It first computes which relations the given clause depends on.
If the clause depends on only some of the relations joined inside a
security view, the clause's dependency is expanded to cover the whole
security view. As a result, the clause is not distributed into the inside
of the join.

(*) This patch uses check_relation_privileges() to check permissions
on DML execution, so please also apply my other patch when testing.

* Execution result
--

  postgres=# CREATE TABLE t1 (a int primary key, b text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for 
table t1
  CREATE TABLE
  postgres=# CREATE TABLE t2 (x int primary key, y text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t2_pkey for 
table t2
  CREATE TABLE
  postgres=# CREATE TABLE t3 (s int primary key, t text);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index t3_pkey for 
table t3
  CREATE TABLE
  postgres=# CREATE VIEW v1 AS SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;
  CREATE VIEW
  postgres=# CREATE VIEW v2 AS SELECT * FROM v1 JOIN t3 ON v1.a = t3.s;
  CREATE VIEW
  postgres=# CREATE USER tak;
  CREATE ROLE
  postgres=# GRANT SELECT ON v1, v2, t3 TO tak;
  GRANT

The user tak is allowed to select from v1, v2 and t3.

  postgres=# CREATE OR REPLACE FUNCTION f_malicious(text) RETURNS bool
  AS 'BEGIN RAISE NOTICE ''f_malicious: %'', $1; RETURN true; END;'
  LANGUAGE plpgsql;
  CREATE FUNCTION

Because the superuser can access t1 and t2 directly, f_malicious(y) is
distributed down to the Seq Scan on t2.

  postgres=# EXPLAIN SELECT * FROM v1 WHERE f_malicious(y);
                             QUERY PLAN
  -----------------------------------------------------------------
   Hash Join  (cost=334.93..365.94 rows=410 width=72)
     Hash Cond: (t1.a = t2.x)
     ->  Seq Scan on t1  (cost=0.00..22.30 rows=1230 width=36)
     ->  Hash  (cost=329.80..329.80 rows=410 width=36)
           ->  Seq Scan on t2  (cost=0.00..329.80 rows=410 width=36)
                 Filter: f_malicious(y)
  (6 rows)

But because tak cannot access t1 and t2 directly, f_malicious(y) is kept
outside of the join.
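In other words, for tak one would expect a plan shaped roughly like the
following, with the user-supplied qualifier evaluated above the join instead
of inside the scan of t2 (illustrative shape only, not actual output from
the patch):

  postgres=> EXPLAIN SELECT * FROM v1 WHERE f_malicious(y);
   Hash Join  (cost=... rows=... width=...)
     Hash Cond: (t1.a = t2.x)
     Join Filter: f_malicious(y)
     ->  Seq Scan on t1  (cost=...)
     ->  Hash  (cost=...)
           ->  Seq Scan on t2  (cost=...)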

  

Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE

2010-06-03 Thread Simon Riggs
On Wed, 2010-06-02 at 21:06 -0400, Bruce Momjian wrote:
 Simon Riggs wrote:
  
  Deferrable unique constraints seem an interesting feature, though I have
  either some questions or some issues, not sure which.
  
  I don't seem to be able to find any way to do an ALTER TABLE that adds
  this new capability to an existing table.
 
 I was able to do it:
 
  test=> create table test (x int unique DEFERRABLE INITIALLY DEFERRED);
   NOTICE:  CREATE TABLE / UNIQUE will create implicit index test_x_key
   for table test
   CREATE TABLE
 
  test=> alter table test add column y int;
   ALTER TABLE
 
  test=> alter table test add unique (y) DEFERRABLE INITIALLY DEFERRED;
   NOTICE:  ALTER TABLE / ADD UNIQUE will create implicit index
   test_y_key for table test
   ALTER TABLE
 
 Is that what you were asking?

No. I wanted to defer an existing UNIQUE constraint. That doesn't seem
to be possible. You *can* add a whole new constraint and then drop the
old one, though that's not quite as cool.

  There is no way to add a constraint via a CREATE TABLE AS SELECT, so
  that means there is no way to use the feature at all in that case.
 
 Uh, CREATE TABLE AS SELECT seems to be very limited, but I have not
 heard any complaints about it before.
 
  Also, foreign keys can't be defined that refer to a deferrable primary
  key. That isn't mentioned at all in the manual with regard to the
  DEFERRABLE clause, though it is mentioned in the FK section. You get
  this error message
  ERROR:  cannot use a deferrable unique constraint for referenced table
  
  The use case for this feature looks a little narrow at present. Can we
  do something about usability?
 
 Not sure why that was a limitation.

Regrettably, that limitation makes the feature unusable for many people.

In a typical database that uses PKs/FKs, all large tables are referenced.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Fujii Masao
On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I don't understand why you want to use a different delay when you're
 restoring from archive vs. when you're streaming (what about existing WAL
 files found in pg_xlog, BTW?). The source of WAL shouldn't make a
 difference.

Yes. The pace of recovery has nothing to do with the pace of log shipping,
so hurrying recovery along when restoring from the archive seems pointless.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Thu, 2010-06-03 at 17:56 +0900, Fujii Masao wrote:
 On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  I don't understand why you want to use a different delay when you're
  restoring from archive vs. when you're streaming (what about existing WAL
  files found in pg_xlog, BTW?). The source of WAL shouldn't make a
  difference.
 
 Yes. The pace of a recovery has nothing to do with that of log shipping.
 So to hurry up a recovery when restoring from archive seems to be useless.

When streaming drops for some reason we revert to scanning the archive
for files. There are clearly two modes of operation, so it makes sense
that you might want to set different times for the parameter in each
case.

We might think of those modes as connected and degraded modes rather
than streaming and file shipping.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] caught_up status in walsender

2010-06-03 Thread Fujii Masao
On Thu, Jun 3, 2010 at 4:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 02/06/10 21:44, Tom Lane wrote:
 In the current coding, the effect of not setting *caughtup here is just
 that we uselessly call XLogSend an extra time for each transmission
 (because the main loop won't ever delay immediately after a
 transmission).  But without this, we'd never send caughtup = true
 to the slave.

 That's intentional. It could take some time for the WAL to be sent, if
 the network is busy, so by the time XLogSend returns you might well not
 be caught up anymore.

 It may have been intentional, but it's still wrong.  If you were able to
 pull all of WAL into the record-to-be-sent, you should sleep afterwards,
 not send an extra record containing a few more bytes.

Is that for reducing the workload of walsender?

This seems OK in 9.0, since only asynchronous replication is supported.
But when we implement synchronous replication in the future, we
might have to revert that change. Since a transaction commit might wait
for such an extra record to be replicated, walsender should aggressively
send all sendable WAL.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Fujii Masao
On Thu, Jun 3, 2010 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, 2010-06-03 at 17:56 +0900, Fujii Masao wrote:
 On Thu, Jun 3, 2010 at 4:41 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  I don't understand why you want to use a different delay when you're
  restoring from archive vs. when you're streaming (what about existing WAL
  files found in pg_xlog, BTW?). The source of WAL shouldn't make a
  difference.

 Yes. The pace of a recovery has nothing to do with that of log shipping.
 So to hurry up a recovery when restoring from archive seems to be useless.

 When streaming drops for some reason we revert to scanning the archive
 for files. There is clearly two modes of operation.

Yes.

 So it makes sense
 that you might want to set different times for the parameter in each
 case.

What purpose would that serve?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-03 Thread Greg Stark
So I think the scheme in the original post of this thread is workable.
Not as described but could be made to work. In which case I think it's
preferable to a freeze map -- which I had previously assumed we would
need eventually.

The problem with the scheme originally described is that it assumed
you could have a cycle counter and then arrange that all the xids on
the page are within that cycle. That doesn't work because you could
easily have two live xids on the page that belong to two cycles -- one
FirstNormalTransactionId and one MaxTransactionId.

I think to make it work you need to store a whole 64-bit reference
transaction id consisting of both a cycle counter and a transaction
id. The invariant for the page is that every xid on the page can be
compared to that reference transaction id using normal transactionid
semantics. Actually I think the easiest way to do that is to set it to
the oldest xid on the page. The first thing to do before comparing any
transaction id on the page with a real transaction id would be to
figure out whether the reference xid is comparable to the live xid,
which if it's the oldest xid on the page implies they'll all be
comparable.

The way to maintain that invariant would be that any xid insertion on
the page must advance the reference xid if it's not comparable to the
newly inserted xid. It has to be advanced to the oldest xid that's
still comparable with the newly inserted xid. Any xids on the page
that are older than the new reference xid have to be frozen or removed.
I'm not sure how to do that without keeping clog forever though.
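To make that concrete, here is a rough sketch of the reference in C (the
names, layout and helper are purely illustrative, not from any patch, and
special xids below FirstNormalTransactionId would need separate handling):

  #include <stdint.h>

  /* Illustrative only: a 64-bit reference xid stored per page, taken from
   * the oldest xid on the page (wraparound cycle plus 32-bit xid). */
  typedef struct PageRefXid
  {
      uint32_t    ref_epoch;      /* wraparound cycle of ref_xid */
      uint32_t    ref_xid;        /* oldest xid present on the page */
  } PageRefXid;

  /* Reconstruct the full 64-bit value of an xid found on the page, relying
   * on the invariant that every xid on the page is comparable to the
   * reference: an xid numerically below ref_xid must belong to the next
   * wraparound cycle. */
  static inline uint64_t
  page_xid_to_full(const PageRefXid *ref, uint32_t xid)
  {
      uint32_t    epoch = ref->ref_epoch;

      if (xid < ref->ref_xid)
          epoch++;
      return ((uint64_t) epoch << 32) | xid;
  }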

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Thu, 2010-06-03 at 18:39 +0900, Fujii Masao wrote:

 What purpose would that serve?

Tom has already explained this and it makes sense for me.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] rfc: changing documentation about xpath

2010-06-03 Thread Denis I. Polukarov
 Hi!

I ran into a problem and could not resolve it right away. When I do:

-- // --

test=# SELECT xpath('//domain:name/text()', 
$$<?xml version='1.0' encoding='UTF-8'?>
<epp xmlns='urn:ietf:params:xml:ns:epp-1.0' 
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' 
xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 
epp-1.0.xsd'>
 <response>
  <result code='1301'>
   <msg lang='en-US'>Command completed successfully; ack to dequeue</msg>
  </result>
  <msgQ count='3' id='114004'>
   <qDate>2010-06-02T12:35:33.0Z</qDate>
   <msg lang='en-US'>Transfer Requested.</msg>
  </msgQ>
  <resData>
   <domain:trnData 
xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' 
xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 
domain-1.0.xsd'>
<domain:name>xxx.xx</domain:name>
<domain:trStatus>pending</domain:trStatus>
<domain:reID>admin</domain:reID>
<domain:reDate>2010-06-02T12:35:33.0Z</domain:reDate>
<domain:acID>xxx-xx</domain:acID>
<domain:acDate>2010-06-07T12:35:33.0Z</domain:acDate>
<domain:exDate>2014-05-11T12:52:07.0Z</domain:exDate>
   </domain:trnData>
  </resData>
  <trID>
   <svTRID>xx-xxx</svTRID>
  </trID>
 </response>
</epp>$$, 
array[array['domain','urn:ietf:params:xml:ns:domain-1.0']]);
   xpath
-----------
 {xxx.xx}
(1 row)

test=#

-- // --

worked... and:

-- // --

test=# SELECT xpath('//qDate/text()',
$$<?xml version='1.0' encoding='UTF-8'?>
<epp xmlns='urn:ietf:params:xml:ns:epp-1.0' 
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' 
xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 
epp-1.0.xsd'>
 <response>
  <result code='1301'>
   <msg lang='en-US'>Command completed successfully; ack to dequeue</msg>
  </result>
  <msgQ count='3' id='114004'>
   <qDate>2010-06-02T12:35:33.0Z</qDate>
   <msg lang='en-US'>Transfer Requested.</msg>
  </msgQ>
  <resData>
   <domain:trnData 
xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' 
xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 
domain-1.0.xsd'>
<domain:name>xxx.xx</domain:name>
<domain:trStatus>pending</domain:trStatus>
<domain:reID>admin</domain:reID>
<domain:reDate>2010-06-02T12:35:33.0Z</domain:reDate>
<domain:acID>xxx-xx</domain:acID>
<domain:acDate>2010-06-07T12:35:33.0Z</domain:acDate>
<domain:exDate>2014-05-11T12:52:07.0Z</domain:exDate>
   </domain:trnData>
  </resData>
  <trID>
   <svTRID>xx-xxx</svTRID>
  </trID>
 </response>
</epp>$$);
 xpath
-------
 {}
(1 row)

test=#

-- // --

did not work.

I think this nuance about default namespaces should be added to xpath's docs.

This select:

-- // --

SELECT xpath('//global:qDate/text()',
$$<?xml version='1.0' encoding='UTF-8'?>
<epp xmlns='urn:ietf:params:xml:ns:epp-1.0' 
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' 
xsi:schemaLocation='urn:ietf:params:xml:ns:epp-1.0 
epp-1.0.xsd'>
 <response>
  <result code='1301'>
   <msg lang='en-US'>Command completed successfully; ack to dequeue</msg>
  </result>
  <msgQ count='3' id='114004'>
   <qDate>2010-06-02T12:35:33.0Z</qDate>
   <msg lang='en-US'>Transfer Requested.</msg>
  </msgQ>
  <resData>
   <domain:trnData 
xmlns:domain='urn:ietf:params:xml:ns:domain-1.0' 
xsi:schemaLocation='urn:ietf:params:xml:ns:domain-1.0 
domain-1.0.xsd'>
<domain:name>xxx.xx</domain:name>
<domain:trStatus>pending</domain:trStatus>
<domain:reID>admin</domain:reID>
<domain:reDate>2010-06-02T12:35:33.0Z</domain:reDate>
<domain:acID>xxx-xx</domain:acID>
<domain:acDate>2010-06-07T12:35:33.0Z</domain:acDate>
<domain:exDate>2014-05-11T12:52:07.0Z</domain:exDate>
   </domain:trnData>
  </resData>
  <trID>
   <svTRID>xx-xxx</svTRID>
  </trID>
 </response>
</epp>$$, array[array['global', 'urn:ietf:params:xml:ns:epp-1.0']]);
          xpath
--------------------------
 {2010-06-02T12:35:33.0Z}
(1 row)

-- // --

works. Thanks to Ёж on www.sql.ru and pgsql-ru-general ... :)

-- 
Best regards,
Denis I. Polukarov
developer
Garant-Park-Telekom
http://www.gpt.ru

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rfc: changing documentation about xpath

2010-06-03 Thread Andrew Dunstan



Denis I. Polukarov wrote:

 Hi!

I'm to face a problem, and not at once resolve it. 

  
[default namespace mapped in xml xmlns= attribute requires 
corresponding mapping in third param of xpath()]


It's a tolerably subtle point, and I'm not sure it's really 
PostgreSQL-specific. But if you think the docs need improvement, then 
please suggest a patch with the extra wording you think would make 
things clearer.
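As a minimal illustration of the point (a sketch, not proposed doc wording):
an element in a default namespace still has to be addressed through a prefix
that is mapped to that namespace URI in xpath()'s third argument, e.g.

  SELECT xpath('/x:root/x:item/text()',
               '<root xmlns="urn:example"><item>42</item></root>'::xml,
               ARRAY[ARRAY['x', 'urn:example']]);
  -- should return {42}; the same query using '/root/item/text()' and no
  -- namespace array returns an empty array.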


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Fujii Masao
On Thu, Jun 3, 2010 at 5:00 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I stand by my suggestion from yesterday: Let's define max_standby_delay
 as the difference between a piece of WAL becoming available in the
 standby, and applying it.

 My proposal is essentially the same as yours plus allowing the DBA to
 choose different max delays for the caught-up and not-caught-up cases.
 Maybe everybody will end up setting the two delays the same, but I think
 we don't have enough experience to decide that for them now.

Applying WAL that arrives via SR is not always a sign of being caught up
(or of not being caught up), and likewise applying WAL restored from the
archive is not a sign of either. So isn't it nonsense to use separate delays
to control recovery behavior in those two cases?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Bruce Momjian
Simon Riggs wrote:
 On Wed, 2010-06-02 at 20:28 -0400, Bruce Momjian wrote:
  Simon Riggs wrote:
   On Wed, 2010-06-02 at 15:20 -0400, Bruce Momjian wrote:
   
The attached patch allows wal_keep_segments = -1 to keep all segments; 
this is particularly useful for taking a base backup, where you need all
the WAL files during startup of the standby.  I have documented this
usage in the patch as well.

I am thinking of applying this after 9.0 beta2 if there is no objection.
   
   It's not clear to me why "keep all files until server breaks" is a good
   setting. Surely you would set this parameter to the size of your disk.
   Why allow it to go higher?
  
  Well, the -1 allows them to set it temporarily without having to compute
  their free disk space.  Frankly, because the disk space varies, it is
  impossible to know exactly how large the disk is at the time it would
  fill up.
  
  I think the normal computation would be:
  
  1) How long is my file system backup and restore to standby
 going to take
  2) How often do I generate a 16MB WAL file
  
  You would do some computation to figure that out, then maybe multiply it
  by 10x and set that for wal_keep_segments.  I figured allowing a simple
  -1 would be easier.
 
 I think its much easier to find out your free disk space than it is to
 calculate how much WAL might be generated during backup. Disk space
 doesn't vary significantly on a production database.
 
 If we encourage that laziness then we will get reports that replication
 doesn't work and Postgres crashes.

Well, we don't clean out the archive directory so I don't see this as
anything new.
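As a rough worked example of the computation quoted above (all numbers are
invented):

  backup + restore duration     ~ 2 hours                     (assumed)
  WAL generation rate           ~ one 16MB segment / 30 sec   (assumed)
  segments produced meanwhile   = 2 * 3600 / 30 = 240
  with a 10x margin             wal_keep_segments = 2400  (~ 37.5GB of pg_xlog)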

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to pass around collation information

2010-06-03 Thread David Christensen

On Jun 3, 2010, at 2:43 AM, Peter Eisentraut wrote:

 On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote:
 But in the end the only purpose of setting it on a column is to set
 which one will be used for operations on that column.  And the user
 might still override it for a particular query.
 
 Of course.  I'm just saying that it *can* be useful to attach a
 collation to a column definition, rather than only allowing it to be
 specified along with the sort operation.


How does collation relate to per-table/column encodings?  For that matter, are 
per-table/column encodings in the spec, and/or something that we're looking to 
implement down the line?

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Heikki Linnakangas

On 03/06/10 15:15, Bruce Momjian wrote:

Simon Riggs wrote:

I think its much easier to find out your free disk space than it is to
calculate how much WAL might be generated during backup. Disk space
doesn't vary significantly on a production database.

If we encourage that laziness then we will get reports that replication
doesn't work and Postgres crashes.


Well, we don't clean out the archive directory so I don't see this as
anything new.


We leave that up to the DBA to clean out one way or another. We provide 
restartpoint_command and the %r option in restore_command to help with that.


Surely we don't expect DBAs to delete old files in pg_xlog? I agree with 
Simon here, I think it would be better to not provide -1 as an option 
here. At least you better document well that you should only do that 
temporarily or you will eventually run out of disk space.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Bruce Momjian
Heikki Linnakangas wrote:
 On 03/06/10 15:15, Bruce Momjian wrote:
  Simon Riggs wrote:
  I think its much easier to find out your free disk space than it is to
  calculate how much WAL might be generated during backup. Disk space
  doesn't vary significantly on a production database.
 
  If we encourage that laziness then we will get reports that replication
  doesn't work and Postgres crashes.
 
  Well, we don't clean out the archive directory so I don't see this as
  anything new.
 
 We leave that up to the DBA to clean out one way or another. We provide 
 restartpoint_command and the %r option in restore_command to help with that.
 
 Surely we don't expect DBAs to delete old files in pg_xlog? I agree with 
 Simon here, I think it would be better to not provide -1 as an option 
 here. At least you better document well that you should only do that 
 temporarily or you will eventually run out of disk space.

Using this only temporarily is mentioned in the doc patch.  Do I need
more?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.280
diff -c -c -r1.280 config.sgml
*** doc/src/sgml/config.sgml	31 May 2010 15:50:48 -	1.280
--- doc/src/sgml/config.sgml	2 Jun 2010 19:19:18 -
***************
*** 1887,1893 ****
  Specifies the number of past log file segments kept in the
  <filename>pg_xlog</>
  directory, in case a standby server needs to fetch them for streaming
! replication. Each segment is normally 16 megabytes. If a standby
  server connected to the primary falls behind by more than
  <varname>wal_keep_segments</> segments, the primary might remove
  a WAL segment still needed by the standby, in which case the
--- 1887,1893 ----
  Specifies the number of past log file segments kept in the
  <filename>pg_xlog</>
  directory, in case a standby server needs to fetch them for streaming
! replication.  Each segment is normally 16 megabytes. If a standby
  server connected to the primary falls behind by more than
  <varname>wal_keep_segments</> segments, the primary might remove
  a WAL segment still needed by the standby, in which case the
***************
*** 1901,1908 ****
  is zero (the default), the system doesn't keep any extra segments
  for standby purposes, and the number of old WAL segments available
  for standbys is determined based only on the location of the previous
! checkpoint and status of WAL archiving.
! This parameter can only be set in the <filename>postgresql.conf</>
  file or on the server command line.
 </para>
 </listitem>
--- 1901,1909 ----
  is zero (the default), the system doesn't keep any extra segments
  for standby purposes, and the number of old WAL segments available
  for standbys is determined based only on the location of the previous
! checkpoint and status of WAL archiving.  If <literal>-1</> is
! specified, log file segments are kept indefinitely. This
! parameter can only be set in the <filename>postgresql.conf</>
  file or on the server command line.
 </para>
 </listitem>
Index: doc/src/sgml/high-availability.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/high-availability.sgml,v
retrieving revision 1.70
diff -c -c -r1.70 high-availability.sgml
*** doc/src/sgml/high-availability.sgml	29 May 2010 09:01:10 -	1.70
--- doc/src/sgml/high-availability.sgml	2 Jun 2010 19:19:19 -
***************
*** 750,756 ****
  If you use streaming replication without file-based continuous
  archiving, you have to set <varname>wal_keep_segments</> in the master
  to a value high enough to ensure that old WAL segments are not recycled
! too early, while the standby might still need them to catch up. If the
  standby falls behind too much, it needs to be reinitialized from a new
  base backup. If you set up a WAL archive that's accessible from the
  standby, wal_keep_segments is not required as the standby can always
--- 750,760 ----
  If you use streaming replication without file-based continuous
  archiving, you have to set <varname>wal_keep_segments</> in the master
  to a value high enough to ensure that old WAL segments are not recycled
! too early, while the standby might still need them to catch up. This
! is particularly important when performing a base backup because the
! standby will need all WAL segments generated since the start of the
! backup;  consider setting <varname>wal_keep_segments</> to
! <literal>-1</> temporarily in such cases.  If the
  

[HACKERS] PITR Recovery Question

2010-06-03 Thread Gnanakumar
Hi,

My production server is running PostgreSQL v8.2.3 on CentOS release 5.2
(Final).

I've set up PITR on my production server.  For some reason, after setting up
PITR, we were not able to manage and maintain it.  Because of this, our WAL
archive drive became full (100% used) after approximately one month.

PITR SETUP DETAILS
We've 2 drives.  Primary drive (pgsql/data/ directory resides) is 400 GB and
secondary drive (WAL archive) is 30 GB.  All WAL archives are written to
secondary drive.

Base backup taken on: Aug03, 2009
WAL archive drive became full (100% used) on: Sep05, 2009

Because the WAL archive drive is full, all WAL segments waiting to be
archived have accumulated in the pg_xlog/ directory itself.  As a result,
nine months of WAL archives (from Sep05, 2009 to today) are residing in
the pg_xlog/ directory.

My question is: if I have to perform a recovery in this situation, will it
work?  That is, I want to find out whether recovery would succeed for any
point in time between Aug03, 2009 (my base backup date) and today, Jun03,
2010.  I ask because all of my WAL archives are still residing in the
pg_xlog/ directory.

Expert advice/ideas/suggestions on this are appreciated.

Regards,
Gnanam


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] caught_up status in walsender

2010-06-03 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Jun 3, 2010 at 4:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 That's intentional. It could take some time for the WAL to be sent, if
 the network is busy, so by the time XLogSend returns you might well not
 be caught up anymore.
 
 It may have been intentional, but it's still wrong.  If you were able to
 pull all of WAL into the record-to-be-sent, you should sleep afterwards,
 not send an extra record containing a few more bytes.

 For reducing the workload of walsender?

 This seems OK in 9.0 since only asynchronous replication is supported.
 But when we'll implement synchronous replication in the future, we
 might have to revert that change. Since a transaction commit might wait
 for such an extra record to be replicated, walsender should aggressively
 send all sendable WAL.

It *is* aggressively sending all sendable WAL.  The ideal steady state
behavior of this loop ought to be that once per sleep interval, we send
out one record containing all new WAL since the last time.   We do not
want it sending 1 bytes, then another record with 100 bytes, then
another record with 10 bytes, etc etc.  That's inefficient and
ultimately pointless.  You'll always be behind again a millisecond
later.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Bruce Momjian
Heikki Linnakangas wrote:
 On 03/06/10 15:15, Bruce Momjian wrote:
  Simon Riggs wrote:
  I think its much easier to find out your free disk space than it is to
  calculate how much WAL might be generated during backup. Disk space
  doesn't vary significantly on a production database.
 
  If we encourage that laziness then we will get reports that replication
  doesn't work and Postgres crashes.
 
  Well, we don't clean out the archive directory so I don't see this as
  anything new.
 
 We leave that up to the DBA to clean out one way or another. We provide 
 restartpoint_command and the %r option in restore_command to help with that.
 
 Surely we don't expect DBAs to delete old files in pg_xlog? I agree with 
 Simon here, I think it would be better to not provide -1 as an option 
 here. At least you better document well that you should only do that 
 temporarily or you will eventually run out of disk space.

I have updated the doc text to mention temporarily everywhere '-1' is
mentioned.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.280
diff -c -c -r1.280 config.sgml
*** doc/src/sgml/config.sgml31 May 2010 15:50:48 -  1.280
--- doc/src/sgml/config.sgml3 Jun 2010 14:05:21 -
***************
*** 1901,1908 ****
  is zero (the default), the system doesn't keep any extra segments
  for standby purposes, and the number of old WAL segments available
  for standbys is determined based only on the location of the previous
! checkpoint and status of WAL archiving.
! This parameter can only be set in the <filename>postgresql.conf</>
  file or on the server command line.
 </para>
 </listitem>
--- 1901,1909 ----
  is zero (the default), the system doesn't keep any extra segments
  for standby purposes, and the number of old WAL segments available
  for standbys is determined based only on the location of the previous
! checkpoint and status of WAL archiving.  To temporarily keep
! all log file segments, use the value <literal>-1</>. This
! parameter can only be set in the <filename>postgresql.conf</>
  file or on the server command line.
 </para>
 </listitem>
Index: doc/src/sgml/high-availability.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/high-availability.sgml,v
retrieving revision 1.70
diff -c -c -r1.70 high-availability.sgml
*** doc/src/sgml/high-availability.sgml 29 May 2010 09:01:10 -  1.70
--- doc/src/sgml/high-availability.sgml 3 Jun 2010 14:05:21 -
***************
*** 750,756 ****
  If you use streaming replication without file-based continuous
  archiving, you have to set <varname>wal_keep_segments</> in the master
  to a value high enough to ensure that old WAL segments are not recycled
! too early, while the standby might still need them to catch up. If the
  standby falls behind too much, it needs to be reinitialized from a new
  base backup. If you set up a WAL archive that's accessible from the
  standby, wal_keep_segments is not required as the standby can always
--- 750,760 ----
  If you use streaming replication without file-based continuous
  archiving, you have to set <varname>wal_keep_segments</> in the master
  to a value high enough to ensure that old WAL segments are not recycled
! too early, while the standby might still need them to catch up. This
! is particularly important when performing a base backup because the
! standby will need all WAL segments generated since the start of the
! backup;  consider setting <varname>wal_keep_segments</> to
! <literal>-1</> temporarily in such cases.  If the
  standby falls behind too much, it needs to be reinitialized from a new
  base backup. If you set up a WAL archive that's accessible from the
  standby, wal_keep_segments is not required as the standby can always
Index: src/backend/access/transam/xlog.c
===
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.415
diff -c -c -r1.415 xlog.c
*** src/backend/access/transam/xlog.c   2 Jun 2010 09:28:44 -   1.415
--- src/backend/access/transam/xlog.c   3 Jun 2010 14:05:22 -
***************
*** 7337,7343 ****
 * Delete old log files (those no longer needed even for previous
 * checkpoint or the standbys in XLOG streaming).
 */
!   if (_logId || _logSeg)
{
/*
 * Calculate the last segment that we need to retain because of
--- 7337,7343 ----
 * Delete old log files 

Re: [HACKERS] ALTER TABLE .... make constraint DEFERRABLE

2010-06-03 Thread Bruce Momjian
Simon Riggs wrote:
 On Wed, 2010-06-02 at 21:06 -0400, Bruce Momjian wrote:
  Simon Riggs wrote:
   
   Deferrable unique constraints seem an interesting feature, though I have
   either some questions or some issues, not sure which.
   
   I don't seem to be able to find any way to do an ALTER TABLE that adds
   this new capability to an existing table.
  
  I was able to do it:
  
  test=> create table test (x int unique DEFERRABLE INITIALLY DEFERRED);
  NOTICE:  CREATE TABLE / UNIQUE will create implicit index test_x_key
  for table test
  CREATE TABLE
  
  test=> alter table test add column y int;
  ALTER TABLE
  
  test=> alter table test add unique (y) DEFERRABLE INITIALLY DEFERRED;
  NOTICE:  ALTER TABLE / ADD UNIQUE will create implicit index
  test_y_key for table test
  ALTER TABLE
  
  Is that what you were asking?
 
 No. I wanted to defer an existing UNIQUE constraint. That doesn't seem
 to be possible. You *can* add a whole new constraint and then drop the
 old one, though that's not quite as cool.

Do we allow any kind of constraint modification via ALTER TABLE?  I
don't see much listed in the manual.

   There is no way to add a constraint via a CREATE TABLE AS SELECT, so
   that means there is no way to use the feature at all in that case.
  
  Uh, CREATE TABLE AS SELECT seems to be very limited, but I have not
  heard any complaints about it before.
  
   Also, foreign keys can't be defined that refer to a deferrable primary
   key. That isn't mentioned at all in the manual with regard to the
   DEFERRABLE clause, though it is mentioned in the FK section. You get
   this error message
   ERROR:  cannot use a deferrable unique constraint for referenced table
   
   The use case for this feature looks a little narrow at present. Can we
   do something about usability?
  
  Not sure why that was a limitation.
 
 Regrettably it makes it an unusable limitation for many people.
 
 All large tables are referenced in a typical database that uses PKs/FKs.

Yeah, no question.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Heikki Linnakangas wrote:
 Surely we don't expect DBAs to delete old files in pg_xlog? I agree with 
 Simon here, I think it would be better to not provide -1 as an option 
 here. At least you better document well that you should only do that 
 temporarily or you will eventually run out of disk space.

 I have updated the doc text to mention temporarily everywhere '-1' is
 mentioned.

FWIW, I've come to agree with Simon.  Allowing -1 doesn't do anything
that you can't do with a large positive setting, and what it does do
is to encourage people to set the variable to an unsafe value as a
substitute for thinking.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to pass around collation information

2010-06-03 Thread Peter Eisentraut
On tor, 2010-06-03 at 07:30 -0500, David Christensen wrote:
 How does collation relate to per-table/column encodings?

There is some connection between collations and character sets or
encodings, because a collation is tied to one of those, just as a
necessity of implementation (depending on implementation details).  You
could have per-column (or per-some-other-small-unit) collation with a
global encoding, but a per-column encoding with a global collation
wouldn't work.

 For that matter, are per-table/column encodings spec,

Yes.

  and/or something that we're looking to implement down the line?

I don't think anyone is seriously planning that.  But per-column
collations would have to come first, anyway.

Of course there is always the possibility that someone comes up with an
alternative plan that invalidates the above, but the above represents
the facts from the SQL standard and other implementations.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Tom Lane
Because that's the consequence of fooling with pg_control.
I committed the PG_CONTROL_VERSION bump that was missing from
the patch Robert committed last night, but I wonder whether
we shouldn't revert the whole thing instead.  It's not apparent
to me that what it bought is worth forcing beta testers to initdb.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Greg Stark
On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 I was assuming the walreceiver only requests more wal in relatively
 small chunks and only when replay has caught up and needs more data. I
 haven't actually read this code so if that assumption is wrong then
 I'm off-base.

 It is off-base.  The receiver does not request data, the sender is
 what determines how much WAL is sent when.

Hm, so what happens if the slave blocks, doesn't the sender block when
the kernel buffers fill up?

 So I think this isn't necessarily such a blue moon event. As I
 understand it, all it would take is a single long-running report and a
 vacuum or HOT cleanup occurring on the master.

 I think this is mostly FUD too.  How often do you see vacuum blocked for
 an hour now?

No, that's not comparable. On the master, vacuum will just ignore
tuples that are still visible to live transactions. On the slave it
doesn't have a choice: it sees the cleanup record and must pause
recovery until those transactions finish.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Bruce Momjian
Peter Eisentraut wrote:
 On mån, 2010-05-31 at 18:23 -0400, Tom Lane wrote:
  My feeling is that (a) there is no hurry to do anything about an
  unreleased draft of the standard, and (b) perhaps Peter could lobby
  the committee to change the standard before it does get published. 
 
 Given that Oracle and DB2 already support that syntax in released
 products, and I'm not even a member of any relevant body, that seems
 pretty much impossible.

With beta2 being wrapped today, we are going to be releasing ':=' as our
method for function parameter assignment, but also with the likely
outcome that we are going to need to support '=>' very soon.

Are we sure we want hstore compatibility to drive this decision?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PITR Recovery Question

2010-06-03 Thread Florian Pflug
Hi,

I'll try to answer your questions below, but in the future please post 
questions concerning the usage and administration of postgres to pgsql-general 
or pgsql-admin. This list's focus is the development of new features and bugfixes.

On Jun 3, 2010, at 15:37 , Gnanakumar wrote:
 PITR SETUP DETAILS
 We've 2 drives.  Primary drive (pgsql/data/ directory resides) is 400 GB and
 secondary drive (WAL archive) is 30 GB.  All WAL archives are written to
 secondary drive.
 
 Base backup taken on: Aug03, 2009
 WAL archive drive become full (100% use) on: Sep05, 2009
 
 Because this WAL archive drive has become full, all WAL archive segments to
 be archived are accumulated into pg_xlog/ directory itself.  Eventually, 9
 months (as of today from Sep05, 2009) of WAL archives are residing in
 pg_xlog/ directory.

This is by design. WAL logs are only removed from pg_xlog once they have been 
archived successfully. Since your archive_command fails due to the disk being 
full, they remain in pg_xlog. Once you enlarge the filesystem holding the WAL 
archive they should be copied and subsequently removed from pg_xlog.

Note that you'd usually take a new base backup once in a while to limit the 
number of WAL segments you need to retain. You can take a base backup while 
postgres is running by issuing pg_start_backup() before you start the copy and 
pg_stop_backup() after it finished. Apart from creating additional IO load, 
doing so won't interfere with normal query execution in any way.
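For reference, that low-level procedure is roughly the following (the label
and the copy step are placeholders):

  SELECT pg_start_backup('monthly base backup');
  -- copy the data directory with an external tool such as tar or rsync
  SELECT pg_stop_backup();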

 My question is, in case if I would like to perform recovery process as it is
 in this situation, will this work out?  That is, I'm seeing/finding out
 whether recovery process would perform successfully anywhere between the
 date range Aug03, 2009 (my base backup date) and as of today - Jun03, 2009.
 Reason I'm asking this is still all my WAL archives are residing in pg_xlog/
 directory.

For PITR, you'll obviously need the WAL segments from the time your 
base backup started up until the point you want to recover to. If some of those 
WAL segments still reside in pg_xlog, you'll need to teach your 
restore_command to fetch them from there. Note that you cannot recover in 
reverse: to recover up to a certain point in time you always need to start 
from a base backup taken *before* that time.
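Such a restore_command could, for example, fall back to a directory holding
the old pg_xlog contents when a segment is missing from the archive (the
paths below are placeholders):

  restore_command = 'cp /mnt/wal_archive/%f "%p" || cp /data/old_pg_xlog/%f "%p"'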

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Heikki Linnakangas

On 03/06/10 17:54, Tom Lane wrote:

Because that's the consequences of fooling with pg_control.
I committed the PG_CONTROL_VERSION bump that was missing from
the patch Robert committed last night, but I wonder whether
we shouldn't revert the whole thing instead.  It's not apparent
to me that what it bought is worth forcing beta testers to initdb.


Hmph, good point, I did not think of that at all when I reviewed the patch.

If we moved the new DB_SHUTDOWNED_IN_RECOVERY to be the last item in the 
enum, we would stay backwards-compatible.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] caught_up status in walsender

2010-06-03 Thread Tom Lane
I wrote:
 On further contemplation, it seems like the protocol needs another field
 besides that: each record should also carry a boolean indicating whether
 walsender.c thinks it is currently caught up, ie the record carries
 all WAL data up to the current end of WAL.

Actually, there's a better way to do that: let's have the record carry
not just a boolean but the actual current end-of-WAL LSN.  The receiver
could then not just determine "am I behind" but find out *how far*
behind it is, and thereby perhaps adjust its behavior in more subtle
ways than just a binary on/off fashion.

(Actually doing anything like that is material for future work, of
course, but I think we should try to get the SR protocol right now.)
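Concretely, the message header might just grow one more field, along these
lines (purely a sketch of the idea, not the actual 9.0 wire format; the
field names are invented):

  #include <stdint.h>

  /* Sketch only: each WAL-data message carries the sender's current end of
   * WAL, so the receiver can compute how far behind it is. */
  typedef struct WalDataMsgSketch
  {
      uint64_t    dataStart;      /* WAL position of the first byte sent */
      uint64_t    walEnd;         /* sender's current end-of-WAL position */
      /* the WAL data bytes follow */
  } WalDataMsgSketch;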

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Are we sure we want hstore compatibility to drive this decision?

hstore is what it is, and has been that way for a long time.  We can't
just ignore it.  And I don't think breaking it (and probably other code)
on zero notice is an acceptable outcome.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 11:25 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 03/06/10 17:54, Tom Lane wrote:

 Because that's the consequences of fooling with pg_control.
 I committed the PG_CONTROL_VERSION bump that was missing from
 the patch Robert committed last night, but I wonder whether
 we shouldn't revert the whole thing instead.  It's not apparent
 to me that what it bought is worth forcing beta testers to initdb.

 Hmph, good point, I did not think of that at all when I reviewed the patch.

 If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum,
 we would stay backwards-compatible.

Ugh, sorry about that.  I didn't realize this either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Well, if the slave can't keep up, that's a separate problem.  It will
 not fail to keep up as a result of the transmission mechanism.

 No, I mean if the slave is paused due to a conflict. Does it stop
 reading data from the master or does it buffer it up on disk? If it
 stops reading it from the master then the effect is the same as if the
 slave stopped requesting data even if there's no actual request.

The data keeps coming in and getting dumped into the slave's pg_xlog.
walsender/walreceiver are not at all tied to the slave's application
of WAL.  In principle we could have the code around max_standby_delay
notice just how far behind it's gotten and adjust the delay tolerance
based on that; but I think designing a feedback loop for that is 9.1
material.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Are we sure we want hstore compatibility to drive this decision?
 
 hstore is what it is, and has been that way for a long time.  We can't
 just ignore it.  And I don't think breaking it (and probably other code)
 on zero notice is an acceptable outcome.

Well, it seems we are going to be stuck supporting => because it is hard
to argue that the SQL standards committee should adopt := instead of =>
because of hstore.  ;-)

I hate eventually having two documented ways of doing something, but it
appears by releasing := we are doing exactly that.

Is telling hstore users they have to change => to something else such a
large major version incompatibility that it is worth supporting and
documenting two syntaxes for parameter assignment?  It is that calculus
that has me questioning our approach.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Greg Stark
On Thu, Jun 3, 2010 at 4:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The data keeps coming in and getting dumped into the slave's pg_xlog.
 walsender/walreceiver are not at all tied to the slave's application
 of WAL.  In principle we could have the code around max_standby_delay
 notice just how far behind it's gotten and adjust the delay tolerance
 based on that; but I think designing a feedback loop for that is 9.1
 material.

Er, no. In that case my first concern was misguided. I'm happy there's
no feedback loop -- my fear was that there was one, and that it would mean
the time received could be decoupled from the time the WAL was
generated. But as you describe it, the time received might be
slightly delayed from the time the WAL was generated, but only by some
constant degree -- not in a way that will be influenced by the log
application being blocked on the slave.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 11:34 AM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Are we sure we want hstore compatibility to drive this decision?

 hstore is what it is, and has been that way for a long time.  We can't
 just ignore it.  And I don't think breaking it (and probably other code)
 on zero notice is an acceptable outcome.

 Well, it seems we are going to be stuck supporting => because it is hard
 to argue that the SQL standards committee should adopt := instead of =>
 because of hstore.  ;-)

 I hate eventually having two documented ways of doing something, but it
 appears by releasing := we are doing exactly that.

 Is telling hstore users they have to change => to something else such a
 large major version incompatibility that it is worth supporting and
 documenting two syntaxes for parameter assignment?  It is that calculus
 that has me questioning our approach.

I don't mind supporting := and => as much as I mind supporting only
:=, and I think that's the other reasonable alternative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   Are we sure we want hstore compatibility to drive this decision?
  
  hstore is what it is, and has been that way for a long time.  We can't
  just ignore it.  And I don't think breaking it (and probably other code)
  on zero notice is an acceptable outcome.
 
 Well, it seems we are going to be stuck supporting => because it is hard
 to argue that the SQL standards committee should adopt := instead of =>
 because of hstore.  ;-)
 
 I hate eventually having two documented ways of doing something, but it
 appears by releasing := we are doing exactly that.
 
 Is telling hstore users they have to change => to something else such a
 large major version incompatibility that it is worth supporting and
 documenting two syntaxes for parameter assignment?  It is that calculus
 that has me questioning our approach.

Thinking some more, what is the value of keeping => in hstore for 9.0?
Perhaps we could create a script they could run on 8.4 that would add
support for the new hstore operator to replace =>, and then they can
upgrade to 9.0 when they are ready.  I see only three mentions of => in
hstore.sql.  Do we really want to keep the := baggage forever just for
hstore?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to pass around collation information

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 3:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 On ons, 2010-06-02 at 16:56 -0400, Robert Haas wrote:
 But in the end the only purpose of setting it on a column is to set
 which one will be used for operations on that column.  And the user
 might still override it for a particular query.

 Of course.  I'm just saying that it *can* be useful to attach a
 collation to a column definition, rather than only allowing it to be
 specified along with the sort operation.

Oh, I agree with that, for sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Dimitri Fontaine
Bruce Momjian br...@momjian.us writes:
 Is telling hstore users they have to change => to something else such a
 large major version incompatibility that it is worth supporting and
 documenting two syntaxes for parameter assignment?  It is that calculus
 that has me questioning our approach.

Well it's not only hstore. Anyone can CREATE OPERATOR => on any released
version of PostgreSQL currently. I don't think we can deprecate it on
short notice, so we'll have := in 9.0, and maybe a deprecation notice
for =>.
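
Just to make that concrete for the archives, any user can already do
something like this (pair_concat is only a stand-in function for the
example):

    CREATE FUNCTION pair_concat(text, text) RETURNS text
        AS $$ SELECT $1 || '=' || $2 $$ LANGUAGE sql;

    CREATE OPERATOR => (LEFTARG = text, RIGHTARG = text,
                        PROCEDURE = pair_concat);

    SELECT 'key' => 'value';   -- now resolves to pair_concat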

Now that it's pretty clear from Peter that the standard is not going to
change its choice here, I'll vote adding a WARNING each time an operator
called => is created, so that we get a chance to move later on.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Thinking some more, what is the value of keeping => in hstore for 9.0?

Backwards compatibility.  You have not made any argument today that we
have not been over many times before.  I do not have time to argue
about this today --- I have to go fix max_standby_delay.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread David E. Wheeler
On Jun 3, 2010, at 8:54 AM, Tom Lane wrote:

 Thinking some more, what is the value of keeping => in hstore for 9.0?
 
 Backwards compatibility.  You have not made any argument today that we
 have not been over many times before.  I do not have time to argue
 about this today --- I have to go fix max_standby_delay.

I'm sure there's a lot of code using => in the wild. We can't just drop them 
without a deprecation cycle.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread David E. Wheeler
On Jun 3, 2010, at 8:53 AM, Dimitri Fontaine wrote:

 Now that it's pretty clear from Peter that the standard is not going to
 change its choice here, I'll vote adding a WARNING each time an operator
 called => is created, so that we get a chance to move later on.

Should support for == be added to hstore for 9.0? So both => and == will work?

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Thinking some more, what is the value of keeping => in hstore for 9.0?
 
 Backwards compatibility.  You have not made any argument today that we
 have not been over many times before.  I do not have time to argue
 about this today --- I have to go fix max_standby_delay.

Agreed.  I am just making sure we are going in the right direction.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Bruce Momjian
David E. Wheeler wrote:
 On Jun 3, 2010, at 8:53 AM, Dimitri Fontaine wrote:
 
  Now that it's pretty clear from Peter that the standard is not going to
  change its choice here, I'll vote adding a WARNING each time an operator
  called => is created, so that we get a chance to move later on.
 
 Should support for == be added to hstore for 9.0? So both => and == will 
 work?

I have added the idea to the 9.0 open items wiki:

http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items#Code

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the 
 enum, we would stay backwards-compatible.

I don't think that's a terribly workable idea; the enum is laid out so
that inequality tests are sensible, and I'm not sure there aren't any.
The code would look mighty ugly in any case.

What exactly was the reason for this patch?  Could it be held over till
9.1?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.0 release notes

2010-06-03 Thread Bruce Momjian
The 9.0 release notes have been updated to current.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Heikki Linnakangas

On 03/06/10 19:16, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the
enum, we would stay backwards-compatible.


I don't think that's a terribly workable idea; the enum is laid out so
that inequality tests are sensible, and I'm not sure there aren't any.


Hmm, the only inequality tests on that field I can see check that the 
value is valid, i.e between the first and last valid value.



The code would look mighty ugly in any case.


True.

One more hacky idea: Keep the code as it is and change pg_control 
version back to what it was in beta1. Add a note in the release notes 
that if you're upgrading from beta1, you must shut down the database 
cleanly first. When you do that, control file is in DB_SHUTDOWNED state, 
and the enum value for that did not change.


One caveat is that a standby server will be DB_IN_ARCHIVE_RECOVERY, 
which did change value so that with beta2 binaries it will look like 
DB_IN_CRASH_RECOVERY. I think that would still work, though (and if not, 
in the worst case you'll just have to reinitialize the standby from a 
new base backup).



What exactly was the reason for this patch?  Could it be held over till
9.1?


Before the patch, when you shut down a standby server, you get this 
message in the log on the next startup:


LOG:  database system was interrupted while in recovery at log time 
2010-06-02 14:48:28 EEST
HINT:  If this has occurred more than once some data might be corrupted 
and you might need to choose an earlier recovery target.


The problem is that that hint is pretty alarming. The data should be 
fine if the standby server was shut down cleanly with pg_ctl stop -m 
fast/smart.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Wed, 2010-06-02 at 13:14 -0400, Tom Lane wrote:
 This patch seems to me to be going in fundamentally the wrong direction.
 It's adding complexity and overhead (far more than is needed), and it's
 failing utterly to resolve the objections that I raised to start with.

 Having read your proposal, it seems changing from time-on-sender to
 time-on-receiver is a one line change to the patch.

 What else are you thinking of removing, if anything?

Basically, we need to get rid of everything that feeds timestamps from
the WAL content into the kill-delay logic.

 In particular, Simon seems to be basically refusing to do anything about
 the complaint that the code fails unless master and standby clocks are
 in close sync.  I do not believe that this is acceptable, and since he
 won't fix it, I guess I'll have to.

 Syncing two servers in replication is common practice, as has been
 explained here; I'm still surprised people think otherwise.

Doesn't affect the complaint in the least: I do not find it acceptable
to have that be *mandated* in order for our code to work sensibly.
I would be OK with having something approaching what you want as a
non-default optional behavior (with a clearly-documented dependency
on having synchronized clocks).  But in any case the current behavior is
still quite broken as regards reading stale timestamps from WAL.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Wed, 2010-06-02 at 16:00 -0400, Tom Lane wrote:
 the current situation that query grace periods go to zero

 Possibly a better way to handle this concern is to make the second
 parameter: min_standby_grace_period - the minimum time a query will be
 given in which to execute, even if max_standby_delay has been reached or
 exceeded.

 Would that more directly address your concerns?
 min_standby_grace_period (ms) SIGHUP 

A minimum grace period seems like a good idea to me, but I think it's
somewhat orthogonal to the core problem here.  I think we all
intuitively feel that there should be a way to dial back the grace
period when a slave is far behind on applying WAL.  The problem is
first how to figure out what far behind means, and second how to
adjust the grace period in a way that doesn't have surprising
misbehaviors.  A minimum grace period would prevent some of the very
worst misbehaviors but it's not really addressing the core problem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 03/06/10 19:16, Tom Lane wrote:
 What exactly was the reason for this patch?  Could it be held over till
 9.1?

 Before the patch, when you shut down a standby server, you get this 
 message in the log on the next startup:

 LOG:  database system was interrupted while in recovery at log time 
 2010-06-02 14:48:28 EEST
 HINT:  If this has occurred more than once some data might be corrupted 
 and you might need to choose an earlier recovery target.

 The problem is that that hint is pretty alarming.

Maybe we should just get rid of the hint.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Wed, 2010-06-02 at 13:14 -0400, Tom Lane wrote:
  This patch seems to me to be going in fundamentally the wrong direction.
  It's adding complexity and overhead (far more than is needed), and it's
  failing utterly to resolve the objections that I raised to start with.
 
  Having read your proposal, it seems changing from time-on-sender to
  time-on-receiver is a one line change to the patch.
 
  What else are you thinking of removing, if anything?
 
 Basically, we need to get rid of everything that feeds timestamps from
 the WAL content into the kill-delay logic.

I understand your wish to do this, though it isn't always accurate to do
that in the case where there is a backlog.

The patch already does that *mostly* for the streaming case. The only
time it does use the WAL content timestamp is when the WAL content
timestamp is later than the oldest receive time, so in that case it is
used as a correction. (see code comments and comments below also)

  In particular, Simon seems to be basically refusing to do anything about
  the complaint that the code fails unless master and standby clocks are
  in close sync.  I do not believe that this is acceptable, and since he
  won't fix it, I guess I'll have to.
 
  Syncing two servers in replication is common practice, as has been
  explained here; I'm still surprised people think otherwise.
 
 Doesn't affect the complaint in the least: I do not find it acceptable
 to have that be *mandated* in order for our code to work sensibly.

OK, accepted.

 I would be OK with having something approaching what you want as a
 non-default optional behavior (with a clearly-documented dependency
 on having synchronized clocks).  

Yes, that's what I'd been thinking also. So that gives us a way
forwards.

standby_delay_base = apply (default) | send

determines whether the standby_delay is calculated solely with reference
to the standby server (apply) or whether times from the sending server
are used (send). Use of send implies that the clocks on primary and
standby are synchronised to within a useful accuracy, in which case it
is usual to enforce that with time synchronisation software such as ntp.

 But in any case the current behavior is
 still quite broken as regards reading stale timestamps from WAL.

Agreed. That wasn't the objective of this patch or a priority.

If you're reading from an archive, you need to set max_standby_delay
appropriately, current recommendation is -1.

We can address that concern once the main streaming case is covered, or
we can add that now.

Are you planning to work on these things now as you said? How about I
apply my patch now, then you do another set of changes afterwards to add
the other items you mentioned, since that is now 2 additional parameters
and related code?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [pgsql-hackers] Daily digest v1.10705 (13 messages)

2010-06-03 Thread Marc Munro
On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]

I may be missing something here but this seems a bit too simplistic and,
I think, fails to deal with an important use case.

Security views, that do anything useful at all, tend to introduce
performance issues.  I am concerned that placing a conceptual barrier
between the secured and unsecured parts of queries is going to
unnecessarily restrict what the optimiser can do.

For example consider that we have three secured views, each of the form:

  create view s_x as select * from x where i_can_see(x.key);

and consider the query:

  select stuff from s_x 
inner join s_y on s_y.key = s_x.key
inner join s_z on s_z.key = s_x.key  
  where fn(s_x.a) = 3;

The optimiser ought to be able to spot the fact that i_can_see() need
only be called once for each joined result.  By placing a barrier (if I
understand your proposal correctly) between the outermost joins and the
inner views, doesn't this optimisation become impossible?
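
To spell out the hand-written equivalent I have in mind (same made-up
tables, columns and functions as above), the planner could in principle be
free to produce something like:

  select stuff
    from x
    join y on y.key = x.key
    join z on z.key = x.key
   where i_can_see(x.key)   -- one call per joined row covers all three
                            -- view quals, since the join keys are equal
     and fn(x.a) = 3;       -- still must not run on rows the user cannot see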

I think a simpler solution may be possible here.  If you can tag the
function i_can_see() as a security function, at least in the context of
its use in the security views, and then create the rule that security
functions are always considered to be lower cost than user-defined
non-security functions, don't we achieve the result of preventing the
insecure function from seeing rows that it shouldn't?
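
The nearest existing knob I can think of is the function cost estimate,
which nudges -- but in no way guarantees -- the order in which quals at the
same level are evaluated.  Purely as a sketch of the idea (argument types
assumed to be integer here), not as a security mechanism:

  -- make the security qual look cheap and the user function expensive,
  -- so the planner prefers to evaluate i_can_see() first
  ALTER FUNCTION i_can_see(integer) COST 1;
  ALTER FUNCTION fn(integer) COST 10000;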

I guess my concern is that a query may be constructed out of secured
and unsecured parts and the optimiser should be free to group all of the
secured parts together before considering the unsecured parts.

Sorry for the imprecise language and terminology, and also if I have
completely misunderstood the implications.

Best Wishes

__
Marc (the veil guy)





Re: [HACKERS] 9.0 release notes

2010-06-03 Thread Josh Berkus

On 06/03/2010 09:34 AM, Bruce Momjian wrote:

The 9.0 release notes have been updated to current.


OK. I'll be doing editing for readability starting on Monday.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote:
 But in any case the current behavior is
 still quite broken as regards reading stale timestamps from WAL.

 Agreed. That wasn't the objective of this patch or a priority.

 If you're reading from an archive, you need to set max_standby_delay
 appropriately, current recommendation is -1.

That's not a fix; the problem is that there *is no* appropriate setting
for max_standby_delay when you're reading from archive.  It is unlikely
that the values of the timestamps you're reading should be considered to
have any bearing on what grace period to allow; but nonetheless it's
desirable to be able to have a non-infinite grace time.

Basically what I think is that we want what you called apply semantics
always for reading from archive (and I still think the DBA should be
able to set that grace period separately from the one that applies in
SR operation).  Paying attention to the master's timestamps is only
reasonable in the SR context.

And yes, I want the dependency on WAL timestamps to be gone completely,
not just mostly.  I think that driving the delay logic off of SR receipt
time (and/or the timestamp we agreed to add to the SR protocol) is the
appropriate and sufficient way to deal with the problem.  Looking at the
embedded timestamps just confuses the feedback loop behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Thu, 2010-06-03 at 13:32 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Thu, 2010-06-03 at 12:47 -0400, Tom Lane wrote:
  But in any case the current behavior is
  still quite broken as regards reading stale timestamps from WAL.
 
  Agreed. That wasn't the objective of this patch or a priority.
 
  If you're reading from an archive, you need to set max_standby_delay
  appropriately, current recommendation is -1.
 
 That's not a fix; the problem is that there *is no* appropriate setting
 for max_standby_delay when you're reading from archive.  It is unlikely
 that the values of the timestamps you're reading should be considered to
 have any bearing on what grace period to allow; but nonetheless it's
 desirable to be able to have a non-infinite grace time.

I'm not saying that's a fix; I agree we should change that also.

 Basically what I think is that we want what you called apply semantics
 always for reading from archive (and I still think the DBA should be
 able to set that grace period separately from the one that applies in
 SR operation).  Paying attention to the master's timestamps is only
 reasonable in the SR context.

Agreed.

 And yes, I want the dependency on WAL timestamps to be gone completely,
 not just mostly.  I think that driving the delay logic off of SR receipt
 time (and/or the timestamp we agreed to add to the SR protocol) is the
 appropriate and sufficient way to deal with the problem.  Looking at the
 embedded timestamps just confuses the feedback loop behavior.

I disagree with sufficient, with good reason. Please look at the code
comments and see what will happen if we don't make the correction
suggested. If after that you still disagree, then do it your way and
we'll wait for comments in the beta; I think there will be some, but I
am tired and prone to agreement to allow this to go through. 

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Florian Pflug
On Jun 3, 2010, at 19:00 , Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 03/06/10 19:16, Tom Lane wrote:
 What exactly was the reason for this patch?  Could it be held over till
 9.1?
 
 Before the patch, when you shut down a standby server, you get this 
 message in the log on the next startup:
 
 LOG:  database system was interrupted while in recovery at log time 
 2010-06-02 14:48:28 EEST
 HINT:  If this has occurred more than once some data might be corrupted 
 and you might need to choose an earlier recovery target.
   
 The problem is that that hint is pretty alarming.
 
 Maybe we should just get rid of the hint.


FYI, Robert Haas suggested the same in the thread that led to this patch being 
applied. The argument against doing that is that a real crash during recovery 
*is* something to be quite alarmed about.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of jue jun 03 08:36:28 -0400 2010:

 Using this only temporarily is mentioned in the doc patch.  Do I need
 more?

Yeah, it's far too easy to miss.  Besides, I think the wording you used
is ambiguous -- it can be read as the server will temporarily keep all
segments if you set it to -1, which is not the same thing at all.  If
you can't add a 20-point-font red blinking warning with a pink dancing
elephant in a tutu, maybe it's best to not offer the dangerous setting
in the first place.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of jue jun 03 08:36:28 -0400 2010:
 
  Using this only temporarily is mentioned in the doc patch.  Do I need
  more?
 
 Yeah, it's far too easy to miss.  Besides, I think the wording you used
 is ambiguous -- it can be read as the server will temporarily keep all
 segments if you set it to -1, which is not the same thing at all.  If
 you can't add a 20-point-font red blinking warning with a pink dancing
 elephant in a tutu, maybe it's best to not offer the dangerous setting
 in the first place.

Well, it seems enough people don't want this feature that I am not
going to add it.  If we decide we want it later, we can add it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Greg Stark
On Thu, Jun 3, 2010 at 4:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It is off-base.  The receiver does not request data, the sender is
 what determines how much WAL is sent when.

 Hm, so what happens if the slave blocks, doesn't the sender block when
 the kernel buffers fill up?

 Well, if the slave can't keep up, that's a separate problem.  It will
 not fail to keep up as a result of the transmission mechanism.

No, I mean if the slave is paused due to a conflict. Does it stop
reading data from the master or does it buffer it up on disk? If it
stops reading it from the master then the effect is the same as if the
slave stopped requesting data even if there's no actual request.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Thu, Jun 3, 2010 at 12:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It is off-base.  The receiver does not request data, the sender is
 what determines how much WAL is sent when.

 Hm, so what happens if the slave blocks, doesn't the sender block when
 the kernel buffers fill up?

Well, if the slave can't keep up, that's a separate problem.  It will
not fail to keep up as a result of the transmission mechanism.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Jan Wieck

On 6/2/2010 7:49 PM, Greg Stark wrote:

On Wed, Jun 2, 2010 at 6:45 PM, Chris Browne cbbro...@acm.org wrote:

It would make it easy to conclude:

  This next transaction did 8328194 updates.  Maybe we should do
  some kind of checkpoint (e.g. - commit transaction or such) before
  working on it.

   versus

  This transaction we're thinking of working on had 7 updates.  No
  big deal...


I'm puzzled how you would define this value. How do you add 7 inserts,
7 deletes, and 7 updates? Is that 21 rows modified? Why are the 7
inserts and 7 deletes worth twice as much as the 7 updates when
they're basically the same thing? What if the inserts fired triggers
which inserted 7 more rows, is that 14? What if the 7 updates modified
2 TB of TOAST data but the 8238194 updates were all to the same record
and they were all HOT updates so all it did was change 8kB?

In any case you'll have all the actual data from your triggers or
hooks or whatever so what value does having the system keep track of
this add?


The point is not that we don't have that information now. The point is 
having a hint BEFORE wading through possibly gigabytes of WAL or log data.


If getting that information requires to read all the log data twice or 
the need to read gigabytes of otherwise useless WAL data (as per Bruce's 
suggestion), we better not get it at all and just keep doing what we are 
doing now.


I actually have a hard time understanding why people are so opposed to a 
feature that has zero impact at all unless a DBA actually turns it ON. 
What is the problem with exposing the commit order of transactions?



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Bruce Momjian
Jan Wieck wrote:
 The point is not that we don't have that information now. The point is 
 having a hint BEFORE wading through possibly gigabytes of WAL or log data.
 
 If getting that information requires to read all the log data twice or 
 the need to read gigabytes of otherwise useless WAL data (as per Bruce's 
 suggestion), we better not get it at all and just keep doing what we are 
 doing now.
 
 I actually have a hard time understanding why people are so opposed to a 
 feature that has zero impact at all unless a DBA actually turns in ON. 
 What is the problem with exposing the commit order of transactions?

If you want to fork Postgres and add it, go ahead, but if the community
has to maintain the code and document it, we care.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Kevin Grittner
Jan Wieck janwi...@yahoo.com wrote:
 
 I actually have a hard time understanding why people are so
 opposed to a feature that has zero impact at all unless a DBA
 actually turns in ON.  What is the problem with exposing the
 commit order of transactions?
 
FWIW, once I came to understand the use case, it seems to me a
perfectly reasonable and useful thing to have.  It does strike me
that there may be value to add one more xid to support certain
types of integrity for some use cases, but that's certainly
something which could be added later, if at all.  Once I realized
that, I just dropped out of the discussion; perhaps I should have
bowed out with an endorsement.
 
Unless my memory is failing me worse than usual, Dan Ports, who is
working on the serializable implementation so he can use the
predicate locking with a transaction-aware caching feature, needs
the ability to track commit order of transactions by xid; so the use
cases go beyond Slony and Londiste.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] clarification on walsender protocol document

2010-06-03 Thread Tom Lane
protocol.sgml says

<term>START_REPLICATION XXX/XXX</term>
<listitem>
 <para>
  Instructs backend to start streaming WAL, starting at point XXX/XXX.

Am I correct in thinking that this is wrong, and streaming actually
starts with the byte after XXX/XXX?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clarification on walsender protocol document

2010-06-03 Thread Heikki Linnakangas

On 03/06/10 23:17, Tom Lane wrote:

protocol.sgml says

 <term>START_REPLICATION XXX/XXX</term>
 <listitem>
  <para>
   Instructs backend to start streaming WAL, starting at point XXX/XXX.

Am I correct in thinking that this is wrong, and streaming actually
starts with the byte after XXX/XXX?


No, I believe the documentation is correct.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clarification on walsender protocol document

2010-06-03 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 03/06/10 23:17, Tom Lane wrote:
 Am I correct in thinking that this is wrong, and streaming actually
 starts with the byte after XXX/XXX?

 No, I believe the documentation is correct.

OK, I read the code again.  Thanks for the cross-check.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Jan Wieck

On 6/3/2010 4:04 PM, Bruce Momjian wrote:

If you want to fork Postgres and add it, go ahead, but if the community
has to maintain the code and document it, we care.


That comment was rather unprofessional. I think the rest of us still try 
to find the best solution for the problem, not kill the discussion. You 
may want to rejoin that effort.


I care about an efficient, low overhead way to get a certain 
information, that is otherwise extremely difficult, expensive and 
version dependent to get.


I care about cleaning up more of the mistakes, made in the original 
development of Slony. Namely using hacks and kluges to implement 
details, not supported by a current version of PostgreSQL. Londiste and 
Slony made a good leap on that with the txid data type. Slony made 
another step like that with 2.0, switching to the (for that very purpose 
developed and contributed) native trigger configuration instead of 
hacking system catalogs. This would be another step in that direction 
and we would be able to unify Londiste's and Slony's transport mechanism 
and eliminating the tick/sync kluge.


Care to explain what exactly you care about?


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Chris Browne
br...@momjian.us (Bruce Momjian) writes:
 Jan Wieck wrote:
 The point is not that we don't have that information now. The point is 
 having a hint BEFORE wading through possibly gigabytes of WAL or log data.
 
 If getting that information requires to read all the log data twice or 
 the need to read gigabytes of otherwise useless WAL data (as per Bruce's 
 suggestion), we better not get it at all and just keep doing what we are 
 doing now.
 
 I actually have a hard time understanding why people are so opposed to a 
 feature that has zero impact at all unless a DBA actually turns in ON. 
 What is the problem with exposing the commit order of transactions?

 If you want to fork Postgres and add it, go ahead, but if the community
 has to maintain the code and document it, we care.

Are you caring or opposing?  It seems rather uncharitable to imply
that Jan doesn't care.

I know *I'm* not interested in a forked Postgres for this - I would
prefer to find out what things could be done that don't involve gross
amounts of WAL file grovelling for data that mayn't necessarily even
be available.
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
MS  apparently now  has a  team dedicated  to tracking  problems with
Linux  and publicizing them.   I guess  eventually they'll  figure out
this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Chris Browne
gsst...@mit.edu (Greg Stark) writes:
 On Wed, Jun 2, 2010 at 6:45 PM, Chris Browne cbbro...@acm.org wrote:
 It would make it easy to conclude:

   This next transaction did 8328194 updates.  Maybe we should do
   some kind of checkpoint (e.g. - commit transaction or such) before
   working on it.

    versus

   This transaction we're thinking of working on had 7 updates.  No
   big deal...

 I'm puzzled how you would define this value. How do you add 7 inserts,
 7 deletes, and 7 updates? Is that 21 rows modified? Why are the 7
 inserts and 7 deletes worth twice as much as the 7 updates when
 they're basically the same thing? What if the inserts fired triggers
 which inserted 7 more rows, is that 14? What if the 7 updates modified
 2 TB of TOAST data but the 8238194 updates were all to the same record
 and they were all HOT updates so all it did was change 8kB?

The presence of those questions (and their ambiguity) is the reason
why there's a little squirming as to whether this is super-useful and
super-necessary.

What this offers is *SOME* idea of how much updating work a particular
transaction did.  It's a bit worse than you suggest:

 - If replication triggers have captured tuples, those would get
   counted.

 - TOAST updates might lead to extra updates being counted.

But back to where you started, I'd anticipate 7 inserts, 7 deletes,
and 7 updates being counted as something around 21 updates.

And if that included 5 TOAST changes, it might bump up to 26.

If there were replication triggers in place, that might bump the count
up to 45 (which I chose arbitrarily).

 In any case you'll have all the actual data from your triggers or
 hooks or whatever so what value does having the system keep track of
 this add?

This means that when we'd pull the list of transactions to consider,
we'd get something like:

 select * from next_transactions('4218:23', 50);

[list of 50 transactions returned, each with...
   - txid
   - START timestamp
   - COMMIT timestamp
   - Approximate # of updates

Then, for each of the 50, I'd pull replication log data for the
corresponding transaction.

If I have the approximate # of updates, that might lead me to stop
short, and say:

  That next update looks like a doozy!  I'm going to stop and commit
  what I've got before doing that one.

It's not strictly necessary, but would surely be useful for flow
control.
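
For concreteness, a stub of the sort of interface I'm picturing; the
function name follows my example above, but the argument and column
names/types are only guesses, not anything agreed on:

  -- illustrative stub only; a real implementation would read the proposed
  -- commit-order data instead of returning an empty set
  CREATE FUNCTION next_transactions(last_serial text, max_rows integer)
      RETURNS TABLE (serial text, xid bigint,
                     xact_start timestamptz, xact_commit timestamptz,
                     approx_updates bigint)
  AS $$
      SELECT NULL::text, NULL::bigint, NULL::timestamptz,
             NULL::timestamptz, NULL::bigint
      WHERE false
  $$ LANGUAGE sql;

  SELECT * FROM next_transactions('4218:23', 50);
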
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
MS  apparently now  has a  team dedicated  to tracking  problems with
Linux  and publicizing them.   I guess  eventually they'll  figure out
this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow wal_keep_segments to keep all segments

2010-06-03 Thread Andrew Dunstan



Heikki Linnakangas wrote:


We leave that up to the DBA to clean out one way or another. We 
provide restartpoint_command and the %r option in restore_command to 
help with that.





I was in fact just looking into this, and I see that there is no example 
restartpoint_command script given in the docs, nor in the wiki.


A sample of such a command would be useful. This is all going to feel a 
bit strange to lots of users, and the more we can hold their hands the 
better off we and they will be.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Greg Stark
On Thu, Jun 3, 2010 at 8:50 PM, Jan Wieck janwi...@yahoo.com wrote:
 I'm puzzled how you would define this value. How do you add 7 inserts,
 7 deletes, and 7 updates? Is that 21 rows modified?

 I actually have a hard time understanding why people are so opposed to a
 feature that has zero impact at all unless a DBA actually turns in ON. What
 is the problem with exposing the commit order of transactions?

The post you were responding to was regarding the meaninglessness of
the number of records attribute you wanted. Your response is a non
sequitur.

I think the commit order of transactions would be a good thing to
expose though I've asked repeatedly what kind of interface you need
and never gotten answers to all the questions.




-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET CONSTRAINTS todo

2010-06-03 Thread Dan Colish
I see what went wrong in my example. Unique constraints must have unique names 
since they create an index. I'll try again, sorry for the noise.

--Dan
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote:
 
 what kind of interface you need
 
For the potential uses I can see, it would be great to have a SRF
which took two parameters: xid of last known commit and a limit how
many commits past that to return.  Perhaps a negative number could
move earlier in time, if that seems reasonable to others.  I think
that's also consistent with Jan's posts.  A GUC to enable it and
some way to specify retention (or force cleanup) are the only other
user-facing features which come to mind for me.  (Not sure what form
that last should take, but didn't Jan say something about both of
these early in the thread?)
 
Do you see a need for something else (besides, obviously, docs)?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Did we really want to force an initdb in beta2?

2010-06-03 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Jun 3, 2010, at 19:00 , Tom Lane wrote:
 Maybe we should just get rid of the hint.

 FYI, Robert Haas suggested the same in the thread that lead to this patch 
 being applied. The arguments against doing that is that a real crash during 
 recovery *is* something to be quite alarmed about.

After some discussion among core we're going to leave it as-is.  Anybody
who doesn't want to initdb for beta2 can test out pg_upgrade ;-)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Greg Stark gsst...@mit.edu wrote:
  
 what kind of interface you need
  
 For the potential uses I can see, it would be great to have a SRF
 which took two parameters: xid of last known commit and a limit
 how many commits past that to return.
 
Jan's very first post had it right; my idea was flawed:
 
| Exposing the data will be done via a set returning function. The
| SRF takes two arguments. The maximum number of rows to return and
| the last serial number processed by the reader. The advantage of
| such SRF is that the result can be used in a query that right away
| delivers audit or replication log information in transaction
| commit order. The SRF can return an empty set if no further
| transactions have committed since, or an error if data segments
| needed to answer the request have already been purged.
| 
| Purging of the data will be possible in several different ways.
| Autovacuum will call a function that drops segments of the data
| that are outside the postgresql.conf configuration with respect to
| maximum age or data volume. There will also be a function reserved
| for superusers to explicitly purge the data up to a certain serial
| number.
 
Apologies for not looking back to the start of the thread before
that last post.  It was all laid out right at the start.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Bruce Momjian
Jan Wieck wrote:
 On 6/3/2010 4:04 PM, Bruce Momjian wrote:
  If you want to fork Postgres and add it, go ahead, but if the community
  has to maintain the code and document it, we care.
 
 That comment was rather unprofessional. I think the rest of us still try 
 to find the best solution for the problem, not kill the discussion. You 
 may want to rejoin that effort.
 
 I care about an efficient, low overhead way to get a certain 
 information, that is otherwise extremely difficult, expensive and 
 version dependent to get.
 
 I care about cleaning up more of the mistakes, made in the original 
 development of Slony. Namely using hacks and kluges to implement 
 details, not supported by a current version of PostgreSQL. Londiste and 
 Slony made a good leap on that with the txid data type. Slony made 
 another step like that with 2.0, switching to the (for that very purpose 
 developed and contributed) native trigger configuration instead of 
 hacking system catalogs. This would be another step in that direction 
 and we would be able to unify Londiste's and Slony's transport mechanism 
 and eliminating the tick/sync kluge.
 
 Care to explain what exactly you care about?

Here is what I was replying to:

  I actually have a hard time understanding why people are so opposed to a
   feature that has zero impact at all unless a DBA actually turns in ON.
  What is the problem with exposing the commit order of transactions?

Jan's comment is why should others care what he wants because it has
zero impact?  I am saying the community cares because we have to
maintain the code.  I stand by my comment.

I remember a dismissive comment by Jan when 'session_replication_role'
was added, and a similar strong comment from me at that time as well. 
It seems we are doing this again.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Bruce Momjian
Bruce Momjian wrote:
 Jan Wieck wrote:
  On 6/3/2010 4:04 PM, Bruce Momjian wrote:
   If you want to fork Postgres and add it, go ahead, but if the community
   has to maintain the code and document it, we care.
  
  That comment was rather unprofessional. I think the rest of us still try 
  to find the best solution for the problem, not kill the discussion. You 
  may want to rejoin that effort.
  
  I care about an efficient, low overhead way to get a certain 
  information, that is otherwise extremely difficult, expensive and 
  version dependent to get.
  
  I care about cleaning up more of the mistakes, made in the original 
  development of Slony. Namely using hacks and kluges to implement 
  details, not supported by a current version of PostgreSQL. Londiste and 
  Slony made a good leap on that with the txid data type. Slony made 
  another step like that with 2.0, switching to the (for that very purpose 
  developed and contributed) native trigger configuration instead of 
  hacking system catalogs. This would be another step in that direction 
  and we would be able to unify Londiste's and Slony's transport mechanism 
  and eliminating the tick/sync kluge.
  
  Care to explain what exactly you care about?
 
 Here is what I was replying to:
 
   I actually have a hard time understanding why people are so opposed to a
feature that has zero impact at all unless a DBA actually turns in ON.
   What is the problem with exposing the commit order of transactions?
 
 Jan's comment is why should others care what he wants because it has
 zero impact?  I am saying the community cares because we have to
 maintain the code.  I stand by my comment.
 
 I remember a dismissive comment by Jan when 'session_replication_role'
 was added, and a similar strong comment from me at that time as well. 
 It seems we are doing this again.

Of course, if I am misintepreting what Jan said, please let me know.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Simon Riggs
On Thu, 2010-06-03 at 18:18 +0100, Simon Riggs wrote:

 Are you planning to work on these things now as you said?

Are you? Or do you want me to?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keepalive for max_standby_delay

2010-06-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Thu, 2010-06-03 at 18:18 +0100, Simon Riggs wrote:
 Are you planning to work on these things now as you said?

 Are you? Or do you want me to?

I decided there wasn't time to get anything useful done on it before the
beta2 deadline (which is, more or less, right now).  I will take another
look over the next few days.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Alvaro Herrera
Excerpts from Chris Browne's message of jue jun 03 16:21:35 -0400 2010:

 What this offers is *SOME* idea of how much updating work a particular
 transaction did.  It's a bit worse than you suggest:
 
  - If replication triggers have captured tuples, those would get
counted.
 
  - TOAST updates might lead to extra updates being counted.
 
 But back to where you started, I'd anticipate 7 inserts, 7 deletes,
 and 7 updates being counted as something around 21 updates.
 
 And if that included 5 TOAST changes, it might bump up to 26.
 
 If there were replication triggers in place, that might bump the count
 up to 45 (which I chose arbitrarily).

Why not send separate numbers of tuple inserts/updates/deletes, which we
already have from pgstats?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.0 release notes

2010-06-03 Thread Josh Berkus

On 06/03/2010 10:27 AM, Josh Berkus wrote:

On 06/03/2010 09:34 AM, Bruce Momjian wrote:

The 9.0 release notes have been updated to current.


OK. I'll be doing editing for readability starting on Monday.



Actually, the section tags appear to be broken in release-9.0.sgml in 
the current cvs.  Looks like a section got added somewhere and the 
sections were not renumbered.  Fix?


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SET CONSTRAINTS todo

2010-06-03 Thread Dan Colish
I wanted to work on this todo item and I have a few questions about the 
semantics of it. Essentially, it is not possible to have more than one relname 
for a constraint, even though the comment in trigger.c says otherwise. I have 
used this code to test this:


CREATE TABLE products (
    product_no integer CONSTRAINT must_be_different UNIQUE DEFERRABLE,
    name text,
    price numeric
);

CREATE TABLE products2 (
    product_no integer CONSTRAINT must_be_different UNIQUE DEFERRABLE,
    name text,
    price numeric
);


which results in the following error:


 ERROR:  relation must_be_different already exists


Therefore prefixing them with a table name does not seem to make sense. 
Additionally, there is already the feature of prefixing the constraint relname 
with a schema to limit the search space.
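
For example, using the table above (and assuming it lives in the public
schema), the schema-qualified form already works:

    BEGIN;
    SET CONSTRAINTS public.must_be_different DEFERRED;
    -- ... statements that may transiently violate the constraint ...
    COMMIT;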

Is the intention of the todo to allow the user to specify a tablename which 
limits the search path to that table's schema or is the feature to extend 
constraints to allow per table naming. In other words, would the feature allow 
multiple constraints of the same name in a schema since they would be table 
specific?

--Dan
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Jan Wieck

On 6/3/2010 5:58 PM, Greg Stark wrote:

On Thu, Jun 3, 2010 at 8:50 PM, Jan Wieck janwi...@yahoo.com wrote:

I'm puzzled how you would define this value. How do you add 7 inserts,
7 deletes, and 7 updates? Is that 21 rows modified?


I actually have a hard time understanding why people are so opposed to a
feature that has zero impact at all unless a DBA actually turns in ON. What
is the problem with exposing the commit order of transactions?


The post you were responding to was regarding the meaninglessness of
the number of records attribute you wanted. Your response is a non
sequitor.


I never proposed a number of records attribute. I proposed a sum of 
the row counts in the statistics collector. That row count would be a 
mix of insert, update, delete and toast operations. It's not an exact 
indicator of anything, but a good enough hint of how much data may come 
down the pipe if I were to select all replication data belonging to that 
transaction.




I think the commit order of transactions would be a good thing to
expose though I've asked repeatedly what kind of interface you need
and never gotten answers to all the questions.


In the original email that started this whole thread I wrote:


Exposing the data will be done via a set returning function. The SRF
takes two arguments. The maximum number of rows to return and the last
serial number processed by the reader. The advantage of such SRF is that
the result can be used in a query that right away delivers audit or
replication log information in transaction commit order. The SRF can
return an empty set if no further transactions have committed since, or
an error if data segments needed to answer the request have already been
purged.


Did that not answer your question?


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Jan Wieck

On 6/3/2010 6:24 PM, Kevin Grittner wrote:


Apologies for not looking back to the start of the thread before
that last post.  It was all laid out right at the start.


No need to apologize. Happens.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Jan Wieck

On 6/3/2010 7:11 PM, Alvaro Herrera wrote:

Excerpts from Chris Browne's message of jue jun 03 16:21:35 -0400 2010:


What this offers is *SOME* idea of how much updating work a particular
transaction did.  It's a bit worse than you suggest:

 - If replication triggers have captured tuples, those would get
   counted.

 - TOAST updates might lead to extra updates being counted.

But back to where you started, I'd anticipate 7 inserts, 7 deletes,
and 7 updates being counted as something around 21 updates.

And if that included 5 TOAST changes, it might bump up to 26.

If there were replication triggers in place, that might bump the count
up to 45 (which I chose arbitrarily).


Why not send separate numbers of tuple inserts/updates/deletes, which we
already have from pgstats?



We only have them for the entire database. The purpose of this is just a 
guesstimate of what data volume to expect if I were to select all the log 
data belonging to a particular transaction.


This datum isn't critical, just handy for the overall feature to be useful.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
I fixed up the subject.

(2010/06/04 2:23), Marc Munro wrote:
 On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
 wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]
 
 I may be missing something here but this seems a bit too simplistic and,
 I think, fails to deal with an important use case.
 
 Security views, that do anything useful at all, tend to introduce
 performance issues.  I am concerned that placing a conceptual barrier
 between the secured and unsecured parts of queries is going to
 unnecessarily restrict what the optimiser can do.
 
 For example consider that we have three secured views, each of the form:
 
create view s_x as select * from x where i_can_see(x.key);
 
 and consider the query:
 
select stuff from s_x
  inner join s_y on s_y.key = s_x.key
  inner join s_z on s_z.key = s_x.key
where fn(s_x.a) = 3;
 
 The optimiser ought to be able to spot the fact that i_can_see() need
 only be called once for each joined result.  By placing a barrier (if I
 understand your proposal correctly) between the outermost joins and the
 inner views, doesn't this optimisation become impossible?
 
That seems incorrect to me.

If i_can_see() filters out some of the tuples in table x, the optimizer
prefers to call i_can_see() while scanning each table rather than on the
result of the join, because that effectively reduces the amount of data
fed into the join.

In fact, the existing optimizer produces the following plan:

  postgres=# CREATE FUNCTION i_can_see(int) RETURNS bool IMMUTABLE
  AS 'begin return $1 % 1 = 1; end;' LANGUAGE 'plpgsql';
  CREATE FUNCTION
  postgres=# CREATE VIEW v1 as select * from t1 where i_can_see(a);
  CREATE VIEW
  postgres=# CREATE VIEW v2 as select * from t2 where i_can_see(x);
  CREATE VIEW
  postgres=# CREATE VIEW v3 as select * from t3 where i_can_see(s);
  CREATE VIEW

  postgres=# EXPLAIN SELECT * FROM (v1 JOIN v2 ON v1.a=v2.x) JOIN v3 on v1.a=v3.s;
                                      QUERY PLAN
  --------------------------------------------------------------------------------
   Nested Loop  (cost=0.00..860.47 rows=410 width=108)
     ->  Nested Loop  (cost=0.00..595.13 rows=410 width=72)
           ->  Seq Scan on t1  (cost=0.00..329.80 rows=410 width=36)
                 Filter: i_can_see(a)
           ->  Index Scan using t2_pkey on t2  (cost=0.00..0.63 rows=1 width=36)
                 Index Cond: (t2.x = t1.a)
                 Filter: i_can_see(t2.x)
     ->  Index Scan using t3_pkey on t3  (cost=0.00..0.63 rows=1 width=36)
           Index Cond: (t3.s = t1.a)
           Filter: i_can_see(t3.s)
  (10 rows)

Sorry if I am missing your point.

 I think a simpler solution may be possible here.  If you can tag the
 function i_can_see() as a security function, at least in the context of
 its use in the security views, and then create the rule that security
 functions are always considered to be lower cost than user-defined
 non-security functions, don't we achieve the result of preventing the
 insecure function from seeing rows that it shouldn't?
 
Sorry, it seems to me you are mixing up two different issues.

My patch tries to tackle the problem that the optimizer pushes outermost
functions down into the view when the function takes arguments from only
one side of a join, even for security views.
This issue is independent of the cost of functions.

On the other hand, when a scan plan has multiple qualifiers to filter the
fetched tuples, we also have to pay attention to the order in which they
are called. If outermost functions are called before the view's policy
functions, it may cause unexpected leaks.

In my opinion, we should consider the nest level at which the function was
originally placed. But that is not settled yet, and the patch does not
tackle this issue at all.

 I guess my concern is that a query may be constructed out of secured
 and unsecured parts and the optimiser should be free to group all of the
 secured parts together before considering the unsecured parts.
 
Yes, that is one opinion, but it may cause unnecessary performance
regressions when the user-given condition is obviously harmless.

E.g., if table t has primary key t.a, and a security view is defined as:

  CREATE VIEW v AS SELECT * FROM t WHERE f_policy(t.b);

and the user gives the following query:

  SELECT * FROM v WHERE a = 100;

then, if we strictly separate the secured and unsecured parts before the
optimizer runs, we lose the chance to scan table t with its index.
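
(With the push-down allowed, the planner can effectively evaluate something
like

  SELECT * FROM t WHERE a = 100 AND f_policy(t.b);

which can use the primary key index on t.a; behind a hard barrier it has to
run f_policy() over every row of t first.)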

I also think security is not free, so it may cost a certain level of
performance degradation. But it should not cost more than necessary.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Open item: slave to standby in docs

2010-06-03 Thread Takahiro Itagaki
There is an open item:
Standby instead of slave in documentation
http://archives.postgresql.org/message-id/1273682033.12754.1.ca...@vanquo.pezone.net

I replaced almost all occurrences of slave with standby or standby servers,
not only in HS+SR but also in other places, like the sections on third-party tools.

There are still 3 places where slave is used.
  - Terminology: ... are called standby or slave servers.
  - Words in old release notes for 8.2 and 8.4.

Could you check whether those replacements are appropriate?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


slave_to_standby.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-06-03 Thread Jan Wieck

On 5/27/2010 4:31 PM, Bruce Momjian wrote:

Heikki Linnakangas wrote:
BTW, I think we're going to need a separate config file for listing the 
standbys anyway. There you can write per-server rules and options, but 
explicitly knowing about all the standbys also allows the master to 
recycle WAL as soon as it has been streamed to all the registered 
standbys. Currently we just keep wal_keep_segments files around, just in 
case there's a standby out there that needs them.


Ideally we could set 'slave_sync_count' and 'slave_commit_continue_mode'
on the master, and allow the sync/async mode to be set on each slave,
e.g. if slave_sync_count = 2 and slave_commit_continue_mode = #2, then
two slaves with sync mode of #2 or stricter have to complete before the
master can continue.

Naming the slaves on the master seems very confusing because I am
unclear how we would identify named slaves, and the names have to match,
etc.  


Also, what would be cool would be if you could run a query on the master
to view the SR commit mode of each slave.


What would be the use case for such a query?


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 6:29 PM, Bruce Momjian br...@momjian.us wrote:
 Jan Wieck wrote:
 On 6/3/2010 4:04 PM, Bruce Momjian wrote:
  If you want to fork Postgres and add it, go ahead, but if the community
  has to maintain the code and document it, we care.

 That comment was rather unprofessional. I think the rest of us still try
 to find the best solution for the problem, not kill the discussion. You
 may want to rejoin that effort.

 I care about an efficient, low overhead way to get a certain
 information, that is otherwise extremely difficult, expensive and
 version dependent to get.

 I care about cleaning up more of the mistakes, made in the original
 development of Slony. Namely using hacks and kluges to implement
 details, not supported by a current version of PostgreSQL. Londiste and
 Slony made a good leap on that with the txid data type. Slony made
 another step like that with 2.0, switching to the (for that very purpose
 developed and contributed) native trigger configuration instead of
 hacking system catalogs. This would be another step in that direction
 and we would be able to unify Londiste's and Slony's transport mechanism
 and eliminate the tick/sync kluge.

 Care to explain what exactly you care about?

 Here is what I was replying to:

  I actually have a hard time understanding why people are so opposed to a
  feature that has zero impact at all unless a DBA actually turns it ON.
  What is the problem with exposing the commit order of transactions?

 Jan's comment amounts to: why should others care what he wants, since it
 has zero impact?  I am saying the community cares because we have to
 maintain the code.  I stand by my comment.

 I remember a dismissive comment by Jan when 'session_replication_role'
 was added, and a similar strong comment from me at that time as well.
 It seems we are doing this again.

I think it's entirely legitimate and proper for us to make a decision
about whether this feature is worth including in core PostgreSQL.  We
are obviously not in the business of adding random features solely for
the benefit of third-party applications.  That having been said, there
are several reasons why I believe that this particular feature is an
excellent candidate for inclusion in core.

1. It solves a problem for which there is no easy workaround.
Rereading all the WAL to extract the commit records is not an easy
workaround, nor is what Slony and Londiste are doing now.

2. It is usable by multiple projects, not just one.  It may well have
applications beyond replication (e.g. distributed transactions), but
at a very minimum it is usable by and useful to multiple replication
solutions.

3. It has a clear specification which can be easily understood even by
people who do not fully understand how replication solutions will make
use of it, which makes code maintenance much less burdensome.
Obviously, Jan's original email on this topic was just a sketch, but I
find it to be pretty clear.

4. We have an existing precedent of being willing to add limited
support into core to allow replication solutions to do their thing
(session_replication_role, ALTER TABLE ... ENABLE REPLICA TRIGGER,
etc).  Even though we now have built-in replication via HS and SR,
there is still a BIG use case for Slony, Londiste, and other add-on
tools.  Making those tools more successful and performant is good for
PostgreSQL.

5. It does not involve highly invasive changes to the core code.

6. It can be turned off for users who don't want it.

I find the skeptical attitude on this thread altogether unwarranted.
Jan made his case and, at least IMHO, presented it pretty clearly.  He
then answered, multiple times, numerous questions which were already
addressed in the original email, as well as various others.  I think
we should be very careful about assuming that we understand
replication and its needs better than someone who has spent many years
developing one of the major PostgreSQL replication solutions.
Independent of Jan's qualifications, there are clearly several people
on this thread who understand why this is useful and valuable,
including me.  I am obviously not in a position to insist that we
accept this feature (assuming Jan produces a patch rather than getting
discouraged and giving up) but I would like us to think very, very
carefully before rejecting it, and not to do so unless we have a DARN
good reason.  Most patches add code, and therefore require code
maintenance - that is not, by itself, a reason to reject them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the Xact commit order to the user

2010-06-03 Thread Bruce Momjian
Bruce Momjian wrote:
  I care about cleaning up more of the mistakes, made in the original 
  development of Slony. Namely using hacks and kluges to implement 
  details, not supported by a current version of PostgreSQL. Londiste and 
  Slony made a good leap on that with the txid data type. Slony made 
  another step like that with 2.0, switching to the (for that very purpose 
  developed and contributed) native trigger configuration instead of 
  hacking system catalogs. This would be another step in that direction 
  and we would be able to unify Londiste's and Slony's transport mechanism 
  and eliminate the tick/sync kluge.
  
  Care to explain what exactly you care about?
 
 Here is what I was replying to:
 
   I actually have a hard time understanding why people are so opposed to a
   feature that has zero impact at all unless a DBA actually turns it ON.
   What is the problem with exposing the commit order of transactions?
 
 Jan's comment amounts to: why should others care what he wants, since it
 has zero impact?  I am saying the community cares because we have to
 maintain the code.  I stand by my comment.
 
 I remember a dismissive comment by Jan when 'session_replication_role'
 was added, and a similar strong comment from me at that time as well. 
 It seems we are doing this again.

FYI, I talked to Jan on the phone and we have resolved this issue.  :-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] A tackle to the leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
I summarized the series of discussion at:
  http://wiki.postgresql.org/wiki/RLS#Issue:_A_leaky_VIEWs_for_RLS

Please point out, if I missed or misunderstood something.

Thanks,

(2010/06/03 11:36), KaiGai Kohei wrote:
 (2010/06/02 12:02), KaiGai Kohei wrote:
 Here's another thought.  If we're leaning toward explicit syntax to
 designate security views (and I do mean IF, since only one person has
 signed on to that, even if it is Tom Lane!), then maybe we should
 think about ripping out the logic that causes regular views to be
 evaluated using the credentials of the view owner rather than the
 person selecting from it.  A security view would still use that logic,
 plus whatever additional stuff we come up with to prevent leakage.
 Perhaps this would be viewed as a nasty backward compatibility break,
 but the upside is that we'd then be being absolutely clear that a
 non-security view isn't and can never be trusted to be a security
 barrier.  Right now we're shipping something that purports to act as a
 barrier but really doesn't.


 Sorry, should we first make clear the purpose of the explicit syntax for
 security views that is being discussed now?
 In my understanding, for security views the planner checks the privileges
 of the person selecting the view to reference the underlying tables,
 without raising any ereport. If that check fails, the planner prevents
 user-given quals (perhaps FuncExpr only?) from being pushed down inside
 the join scan. Otherwise, for regular views, the planner works as it does
 now. Right?

 
 I'd like to look into the problem in detail.
 
 We can sort views into three types, as follows:
 (1) A view which cannot be pulled up
 (2) A view which can be pulled up, but does not contain any joins
 (3) A view which can be pulled up, and contains joins.
 
 For type (1), we don't need to handle anything special, because no
 external quals can be pushed into it.
 
 For type (2), we also don't need to handle anything special, except for
 the evaluation-order issue in ExecQual(), because it is impossible to
 distribute external quals to only a part of the relations.
 
 So, type (3) is all we have to consider here. Right?
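
 (Rough examples of the three cases -- the table, view and function names
 are made up just for illustration:)

   -- (1) cannot be pulled up: the view contains an aggregate
   CREATE VIEW v_agg   AS SELECT a, count(*) AS n FROM t GROUP BY a;

   -- (2) can be pulled up, and has no join
   CREATE VIEW v_plain AS SELECT * FROM t WHERE f_policy(t.b);

   -- (3) can be pulled up, and contains a join
   CREATE VIEW v_join  AS SELECT t.a, u.x FROM t JOIN u ON t.a = u.a
                           WHERE f_policy(t.b);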
 
 
 Then, where should we distinguish a security view from other views?
 At the latest, it has to be decided at pull_up_subqueries() or earlier,
 because that is what merges subqueries into the upper query.
 In subquery_planner(), pull_up_subqueries() is called just after
 inline_set_returning_functions(), which flattens RTE_FUNCTION entries
 when possible. It does not seem strange to check privileges on the
 underlying relations in pull_up_subqueries(), because
 inline_set_returning_functions() also checks ACL_EXECUTE on the
 functions to be flattened.
 
 
 Then, once we can identify whether a view is a security view or not, it
 should be marked somewhere. Maybe on the FromExpr of the pulled-up subquery?
 
 
 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 Then, it will prevent a user defined function to come into inside of
 security views.
 
 
 At least, it seems reasonable to me to implement.
 Shall I go ahead and develop a proof-of-concept patch?
 
 Thanks,


-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.0 release notes

2010-06-03 Thread Bruce Momjian
Josh Berkus wrote:
 On 06/03/2010 10:27 AM, Josh Berkus wrote:
  On 06/03/2010 09:34 AM, Bruce Momjian wrote:
  The 9.0 release notes have been updated to current.
 
  OK. I'll be doing editing for readability starting on Monday.
 
 
 Actually, the section tags appear to be broken in release-9.0.sgml in 
 the current cvs.  Looks like a section got added somewhere and the 
 sections were not renumbered.  Fix?

What are you seeing that is wrong?  I don't see it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Jan Wieck

On 5/27/2010 11:52 PM, Andrew Dunstan wrote:


Bruce Momjian wrote:

Peter Eisentraut wrote:
  

On tor, 2010-05-27 at 12:59 -0400, Tom Lane wrote:


I think we should fix it now.  Quick thought: maybe we could use FOR
instead of AS: select myfunc(7 for a, 6 for b);


I'm afraid FOR doesn't work either; it'll create a conflict with the
spec-defined SUBSTRING(x FOR y) syntax.
  

How about

select myfunc(a := 7, b := 6);



One concern I have is that in PL/pgSQL, := and = behave the same, while
in SQL, they would not.  That might cause confusion.
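
(To spell that out with the function from the example above -- a minimal
sketch, nothing authoritative:

  CREATE FUNCTION myfunc(a int, b int) RETURNS int AS $$
  DECLARE
      total int;
  BEGIN
      total := a + b;    -- PL/pgSQL assignment; a plain = works here as well
      RETURN total;
  END;
  $$ LANGUAGE plpgsql;

  SELECT myfunc(a := 7, b := 6);   -- the proposed SQL named-parameter syntax
)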

  


That is a sad wart that we should never have done, IMNSHO (it was before 
my time or I would have objected ;-) ). But beyond that, = is an 
operator in SQL and := is never an operator, IIRC.


As far as I can tell, this was already in the code when Bruce moved it 
into core as -r1.1 on my behalf (before I had commit privileges). I do 
not recall if the = as alternative to := was my idea or not. But I'm 
willing to take the blame for it because it dates back to a time where 
convenience seemed important.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [pgsql-hackers] Daily digest v1.10705 (13 messages)

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 1:23 PM, Marc Munro m...@bloodnok.com wrote:
 On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
 wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]

 I may be missing something here but this seems a bit too simplistic and,
 I think, fails to deal with an important use case.

If anything, you're putting it mildly.  This is quite a bit too
simplistic and fails to deal with several important issues, at least
some of which have already been mentioned on this thread.

 The optimiser ought to be able to spot the fact that i_can_see() need
 only be called once for each joined result.  By placing a barrier (if I
 understand your proposal correctly) between the outermost joins and the
 inner views, doesn't this optimisation become impossible?

 I think a simpler solution may be possible here.  If you can tag the
 function i_can_see() as a security function, at least in the context of
 its use in the security views, and then create the rule that security
 functions are always considered to be lower cost than user-defined
 non-security functions, don't we achieve the result of preventing the
 insecure function from seeing rows that it shouldn't?

So, yes and no.  You DO need a security barrier between the view and
the rest of the query, but if a function can be trusted not to do evil
things, then it should be allowed to be pushed down.  What we need to
prevent is the pushdown of untrusted functions (or operators).  A
(very) important part of this problem is determining which quals are
safe to push down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronization levels in SR

2010-06-03 Thread Robert Haas
On Thu, Jun 3, 2010 at 8:47 PM, Jan Wieck janwi...@yahoo.com wrote:
 On 5/27/2010 4:31 PM, Bruce Momjian wrote:

 Heikki Linnakangas wrote:

 BTW, I think we're going to need a separate config file for listing the
 standbys anyway. There you can write per-server rules and options, but
 explicitly knowing about all the standbys also allows the master to recycle
 WAL as soon as it has been streamed to all the registered standbys.
 Currently we just keep wal_keep_segments files around, just in case there's
 a standby out there that needs them.

 Ideally we could set 'slave_sync_count' and 'slave_commit_continue_mode'
 on the master, and allow the sync/async mode to be set on each slave,
 e.g. if slave_sync_count = 2 and slave_commit_continue_mode = #2, then
 two slaves with sync mode of #2 or stricter have to complete before the
 master can continue.

 Naming the slaves on the master seems very confusing because I am
 unclear how we would identify named slaves, and the names have to match,
 etc.
 Also, what would be cool would be if you could run a query on the master
 to view the SR commit mode of each slave.

 What would be the use case for such a query?

Monitoring?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] functional call named notation clashes with SQL feature

2010-06-03 Thread Andrew Dunstan



Jan Wieck wrote:


That is a sad wart that we should never have done, IMNSHO (it was 
before my time or I would have objected ;-) ). But beyond that, = is 
an operator in SQL and := is never an operator, IIRC.


As far as I can tell, this was already in the code when Bruce moved it 
into core as -r1.1 on my behalf (before I had commit privileges). I do 
not recall if the = as alternative to := was my idea or not. But I'm 
willing to take the blame for it because it dates back to a time where 
convenience seemed important.




I forgive you ;=) If my worst sin were only this bad I'd be truly happy.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-06-03 Thread Takahiro Itagaki

Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote:

 Could you explain why the default_only entries idea is
 better than adjusting those fields in the TOAST-specific code?
 My understanding is that the reloptions framework is just a tool to fill
 in reloption parameters, and it's not responsible for unused fields.

Here is my proposal to fix the issue. I didn't introduce a default_only
field, but simply adjusted the parameters for TOAST reloptions.


BTW, not only heap and toast relations but also btree, hash, and gist
indexes use StdRdOptions, although they don't actually support the
autovacuum options. So StdRdOptions has become a bloated all-in-one
descriptor. Instead, those indexes should use a fillfactor-only reloptions
struct defined separately from the heap options.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



toast-default-only-relopts.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread KaiGai Kohei
(2010/06/04 11:55), Robert Haas wrote:
 On Thu, Jun 3, 2010 at 1:23 PM, Marc Munrom...@bloodnok.com  wrote:
 On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-ow...@postgresql.org
 wrote:
 [ . . . ]

 In my current idea, when a qual-node that contains FuncExpr tries to
 reference a part of relations within a security view, its referencing
 relations will be expanded to whole of the security view at
 distribute_qual_to_rels().
 [ . . . ]

 I may be missing something here but this seems a bit too simplistic and,
 I think, fails to deal with an important use case.
 
 If anything, you're putting it mildly.  This is quite a bit too
 simplistic and fails to deal with several important issues, at least
 some of which have already been mentioned on this thread.
 
Hmm. Was it too simplistic even as just a proof of concept?

 The optimiser ought to be able to spot the fact that i_can_see() need
 only be called once for each joined result.  By placing a barrier (if I
 understand your proposal correctly) between the outermost joins and the
 inner views, doesn't this optimisation become impossible?

 I think a simpler solution may be possible here.  If you can tag the
 function i_can_see() as a security function, at least in the context of
 its use in the security views, and then create the rule that security
 functions are always considered to be lower cost than user-defined
 non-security functions, don't we achieve the result of preventing the
 insecure function from seeing rows that it shouldn't?
 
 So, yes and no.  You DO need a security barrier between the view and
 the rest of the query, but if a function can be trusted not to do evil
 things, then it should be allowed to be pushed down.  What we need to
 prevent is the pushdown of untrusted functions (or operators).  A
 (very) important part of this problem is determining which quals are
 safe to push down.
 
At least, I don't have an idea how to distinguish trusted functions from
others without any additional hints, because we support various kinds
of PL languages. :(

One idea is to add a TRUSTED (or some other) keyword to CREATE FUNCTION to
mark a function as harmless to execute, with only the DBA allowed to define
functions using that option.
(Perhaps most of the built-in functions should be marked as trusted?)
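
Just to make that idea concrete, the syntax might look something like the
following -- purely hypothetical, nothing like this exists today:

  CREATE TRUSTED FUNCTION f_is_public(color text) RETURNS bool
      AS 'SELECT $1 = ''public''' LANGUAGE sql;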

If we need to identify each function as either trusted or untrusted, are
there any other ideas?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SET CONSTRAINTS todo

2010-06-03 Thread Tom Lane
Dan Colish dcol...@gmail.com writes:
 I wanted to work on this todo item and I have a few questions about
 the semantics of it. Essentially, it is not possible to have more than
 one relname for a constraint,

That is per SQL spec: SQL92 10.6 syntax rule 2 saith

 2) The qualified identifier of constraint name shall be differ-
ent from the qualified identifier of the constraint name of
any other constraint defined in the same schema.

I believe we are already laxer than the spec, because we don't enforce
that restriction except for index-based constraints.  I'm not terribly
excited about trying to make it weaker yet.

 Is the intention of the todo to allow the user to specify a tablename
 which limits the search path to that table's schema or is the feature to
 extend constraints to allow per table naming.

I think the TODO item you're looking at is just about how narrowly you
can specify the target(s) of a SET CONSTRAINTS command.  It's not meant
to say anything about what constraint names can be created in the first
place.
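
That is, today you can write things like

  SET CONSTRAINTS ALL DEFERRED;
  SET CONSTRAINTS must_be_different IMMEDIATE;

and, if I read the TODO correctly, the idea would be to also accept a
table-qualified target, roughly

  SET CONSTRAINTS products.must_be_different DEFERRED;   -- hypothetical

so that only that table's constraint is affected.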

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix leaky VIEWs for RLS

2010-06-03 Thread Tom Lane
KaiGai Kohei kai...@ak.jp.nec.com writes:
 (2010/06/04 11:55), Robert Haas wrote:
 A (very) important part of this problem is determining which quals are
 safe to push down.
 
 At least, I don't have an idea how to distinguish trusted functions from
 others without any additional hints, because we support various kinds
 of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others.  That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort.  When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.
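
(For reference, built-in here essentially means created at initdb time,
which can be approximated by looking at the function's OID -- everything
below FirstNormalObjectId, i.e. 16384, comes from initdb:

  SELECT proname, oid < 16384 AS built_in
    FROM pg_proc
   WHERE proname IN ('texteq', 'lower');
)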

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

