Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Markus Wanner

Hi,

Quoting Tom Lane t...@sss.pgh.pa.us:

Not at the table level.  If you could lock only at the tuple level
maybe you'd have something


AFAIUI this is about the tuple level lock, yes.


but it seems like you can't find the
target tuples without having acquired a snapshot.


Maybe not *the* target tuple, but we could certainly find candidate  
target tuples. Of course it's impossible to determine visibility  
without a snapshot (and thus find *the* one).


But it seems to me it might suffice to optimistically pick a plausible  
tuple (i.e. one determined by a candidate snapshot) and try to lock that.  
Only after we hold the lock do we get a real snapshot and re-check the  
visibility of the tuple we've locked.


If it's the visible target tuple we want to update, we are all fine; if  
not, another transaction updated the tuple and we have to look for the  
new version of that tuple. As we've taken our real snapshot *after*  
that other transaction updated the tuple of interest, we now see the  
*new* tuple and can continue with our transaction normally (instead of  
having to abort the transaction with a serialization failure).


So yes, to me this looks like a theoretically possible performance  
gain. It certainly only helps the very first tuple write. And it seems  
to apply to SERIALIZABLE transactions exclusively.


Another minor gotcha exists, though. There's another possible cause  
for the visibility check to fail: our initial pick with the candidate  
snapshot might have been wrong. In that unfortunate case we can  
continue as described, but it's worth mentioning that we were waiting  
for the wrong lock (i.e. a tuple that's not visible according to the  
real snapshot might have been one from an aborted transaction, for  
example). The candidate snapshot should thus be rather good, but  
that's not much of an issue, I think.


If we want to completely get rid of serialization failures in the  
first (writing) operation within a transaction, we'd have to repeat  
these steps whenever a visibility check fails, meaning we'd degrade the  
real snapshot acquired after the first lock into a candidate snapshot  
for the second tuple lock we try.
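The scheme sketched above might look like this in pseudocode (purely an
illustration of the idea; all function names are invented, this is not
actual PostgreSQL code):

```text
-- first writing statement of a SERIALIZABLE transaction
candidate := take_candidate_snapshot()
loop
    tuple := find_target_tuple(candidate)  -- plausible pick, not authoritative
    lock_tuple(tuple)                      -- may block on a concurrent writer
    real := take_real_snapshot()           -- acquired only *after* the lock
    if tuple_is_visible(tuple, real) then
        exit                               -- locked the right version, proceed
    end if
    unlock_tuple(tuple)                    -- wrong pick: retry, degrading the
    candidate := real                      --   real snapshot to a candidate
end loop
-- continue the transaction under 'real'
```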


Maybe "candidate snapshots" is a good short name for such a feature?

Another line of thought: isn't this like READ COMMITTED for just the  
first operation in a SERIALIZABLE transaction?


Regards

Markus Wanner


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns

2009-12-17 Thread KaiGai Kohei
Here is a patch for the matter which I reported before.

When a column is inherited from multiple relations, ALTER TABLE with
the RENAME TO option is problematic.
This patch fixes the matter. More precisely, it prevents renaming columns
that are inherited from multiple relations and merged.

Also see the past discussion:
  http://archives.postgresql.org/pgsql-hackers/2009-11/msg00138.php

  postgres=# CREATE TABLE t1 (a int, b int);
  CREATE TABLE
  postgres=# CREATE TABLE t2 (b int, c int);
  CREATE TABLE
  postgres=# CREATE TABLE t3 (x text) inherits (t1, t2);
  NOTICE:  merging multiple inherited definitions of column b
  CREATE TABLE
  postgres=# SELECT * FROM t3;
   a | b | c | x
  ---+---+---+---
  (0 rows)

It looks fine to me.

  postgres=# ALTER TABLE t1 RENAME b TO y;
  ALTER TABLE
  postgres=# SELECT * FROM t3;
   a | y | c | x
  ---+---+---+---
  (0 rows)

  postgres=# SELECT * FROM t1;
   a | y
  ---+---
  (0 rows)

It looks fine to me.

  postgres=# SELECT * FROM t2;
  ERROR:  could not find inherited attribute b of relation t3

Oops, when we refer to t3 via t2, it expects the inherited relation
also to have the column b, but it was already renamed.


One trouble is the regression test. The misc_create test creates the
a_star table, which is inherited by b_star and c_star, and these in
turn are inherited by the d_star table. The misc test then renames
a_star.a, but this patch prevents it.

It looks like it should work, but it is a corner case: d_star.a is
ultimately inherited from a_star via both b_star and c_star, and all of
the inheriting relations are involved.
In general, we have no reasonable way to rename all the related columns
both up and down the inheritance hierarchy.

Thanks,

(2009/11/05 9:57), KaiGai Kohei wrote:
 Tom Lane wrote:
 Thom Brown thombr...@gmail.com writes:
 2009/11/4 Alvaro Herrera alvhe...@commandprompt.com:
 KaiGai Kohei wrote:
 I think we should not allow to rename a column with attinhcount > 1.

 I think we should fix ALTER TABLE to cope with multiple inheritance.

 I'd be interested to see how this should work.

 Yeah.  I don't think a fix is possible, because there is no
 non-astonishing way for it to behave.  I think KaiGai is right that
 forbidding the rename is the best solution.
 
 The attached patch forbids rename when the attribute is inherited
 from multiple parents.
 
postgres=# CREATE TABLE t1 (a int, b int);
CREATE TABLE
postgres=# CREATE TABLE t2 (b int, c int);
CREATE TABLE
postgres=# CREATE TABLE t3 (d int) INHERITS (t1, t2);
NOTICE:  merging multiple inherited definitions of column b
CREATE TABLE
postgres=# SELECT * FROM t3;
 a | b | c | d
---+---+---+---
(0 rows)
 
postgres=# ALTER TABLE t1 RENAME b TO x;
ERROR:  cannot rename multiple inherited column b
 
 
 The regression test detected a problem in the misc test.
 
 It tries to rename column a of the a_star table, but it fails due to
 the new restriction.
 
--
-- test the star operators a bit more thoroughly -- this time,
-- throw in lots of NULL fields...
--
-- a is the type root
-- b and c inherit from a (one-level single inheritance)
-- d inherits from b and c (two-level multiple inheritance)
-- e inherits from c (two-level single inheritance)
-- f inherits from e (three-level single inheritance)
--
CREATE TABLE a_star (
class   char,
a   int4
);
 
CREATE TABLE b_star (
b   text
) INHERITS (a_star);
 
CREATE TABLE c_star (
c   name
) INHERITS (a_star);
 
CREATE TABLE d_star (
d   float8
) INHERITS (b_star, c_star);
 
 In the misc test,
 
--- 242,278 
  ALTER TABLE c_star* RENAME COLUMN c TO cc;
  ALTER TABLE b_star* RENAME COLUMN b TO bb;
  ALTER TABLE a_star* RENAME COLUMN a TO aa;
+ ERROR:  cannot rename multiple inherited column a
  SELECT class, aa
 FROM a_star* x
 WHERE aa ISNULL;
! ERROR:  column aa does not exist
! LINE 1: SELECT class, aa
!
 
 It seems to me this is a case where the regression test needs to be fixed up.
 (We don't have any reasonable way to know whether a certain attribute
 has the same source or not.)
 
 Any comments?
 
 
 
 


-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com
*** src/test/regress/sql/inherit.sql	(revision 2486)
--- src/test/regress/sql/inherit.sql	(working copy)
***
*** 336,338 
--- 336,350 
  CREATE TABLE inh_error2 (LIKE t4 INCLUDING STORAGE) INHERITS (t1);
  
  DROP TABLE t1, t2, t3, t4, t12_storage, t12_comments, t1_inh, t13_inh, t13_like, t_all;
+ 
+ -- Test for renaming
+ CREATE TABLE t1 (a int, b int);
+ CREATE TABLE t2 (b int, c int);
+ CREATE TABLE t3 (d int) INHERITS(t1, t2);
+ ALTER TABLE t1 RENAME a TO x;
+ ALTER TABLE t1 RENAME b TO y;		-- to be failed
+ ALTER TABLE t3 RENAME d TO z;
+ SELECT * FROM t3;
+ DROP TABLE t3;
+ DROP TABLE t2;
+ DROP TABLE t1;
*** src/test/regress/expected/inherit.out	

Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-17 Thread Fujii Masao
On Wed, Dec 16, 2009 at 6:53 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Great! The logical next step is move the handling of TimelineID and
 system identifier out of libpq as well.

All right.

 0. Begin by connecting to the master just like a normal backend does. We
 don't necessarily need the new ProtocolVersion code either, though it's
 probably still a good idea to reject connections to older server versions.

And, I think that such a backend should switch to walsender mode when the startup
packet arrives. Otherwise, we would have to authenticate such a backend twice
in different contexts, i.e., as a normal backend and as walsender. So settings for
each context would be required in pg_hba.conf. This is odd, I think. Thoughts?

 1. Get the system identifier of the master.

 Slave -> Master: Query message, with a query string like
 GET_SYSTEM_IDENTIFIER

 Master -> Slave: RowDescription, DataRow, CommandComplete, and
 ReadyForQuery messages. The system identifier is returned in the DataRow
 message.

 This is identical to what happens when a query is executed against a
 normal backend using the simple query protocol, so walsender can use
 PQexec() for this.

s/walsender/walreceiver ?

A signal cannot cancel PQexec() while it is waiting for a message from the
server. We might need to change the SIGTERM handler of walreceiver so that it
calls proc_exit() immediately if we are inside PQexec().

 2. Another query exchange like above, for timeline ID. (or these two
 steps can be joined into one query, to eliminate one round-trip).

 3. Request a backup history file, if needed:

 Slave -> Master: Query message, with a query string like
 GET_BACKUP_HISTORY_FILE XXX where XXX is XLogRecPtr or file name.

 Master -> Slave: RowDescription, DataRow, CommandComplete and
 ReadyForQuery messages as usual. The file contents are returned in the
 DataRow message.

 4. Start replication

 Slave -> Master: Query message, with query string START REPLICATION:
 XXX, where XXX is the RecPtr of the starting point.

 Master -> Slave: CopyOutResponse followed by a continuous stream of
 CopyData messages with WAL contents.

Seems OK.
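For illustration, here is a walreceiver-side sketch of that handshake using
plain libpq. The query strings (GET_SYSTEM_IDENTIFIER, START REPLICATION) are
only the strings proposed in this thread, and the conninfo, starting RecPtr
and error handling are made up; this is a sketch of the proposed flow, not
code that works against any released server:

```c
#include <stdio.h>
#include "libpq-fe.h"

static void
walreceiver_handshake(const char *conninfo)
{
	PGconn	   *conn = PQconnectdb(conninfo);
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return;
	}

	/* Step 1: fetch the master's system identifier via a simple query. */
	res = PQexec(conn, "GET_SYSTEM_IDENTIFIER");
	if (PQresultStatus(res) == PGRES_TUPLES_OK)
		printf("system identifier: %s\n", PQgetvalue(res, 0, 0));
	PQclear(res);

	/* Step 4: start streaming; the master answers with CopyOutResponse,
	 * so libpq reports PGRES_COPY_OUT. */
	res = PQexec(conn, "START REPLICATION: 0/1000000");
	if (PQresultStatus(res) == PGRES_COPY_OUT)
	{
		char	   *buf;
		int			len;

		/* Each CopyData message carries a chunk of WAL. */
		while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
		{
			/* ... hand 'len' bytes of WAL in 'buf' to recovery ... */
			PQfreemem(buf);
		}
	}
	PQclear(res);
	PQfinish(conn);
}
```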

 This minimizes the changes to the protocol and libpq, with a clear way
 of extending by adding new commands. Similar to what you did a long time
 ago, connecting as an actual backend at first and then switching to
 walsender mode after running a few queries, but this would all be
 handled in a separate loop in walsender instead of running as a
 full-blown backend.

Agreed. Only walsender should be allowed to handle the query strings that
you proposed, so that we avoid touching the parser.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Range types

2009-12-17 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 foreach p2_member in unnest(p2) loop
 p1 := array(select period_except(p1_member, p2_member)
 from unnest(p1) p1_member);
 end loop;
 
 But maybe it can be done in a single SQL command.

 Yeah, as soon as you have LATERAL, I think. Without it there's no way to
 compose SRFs in SQL, AFAIK.

 Hm, how would you do it with LATERAL?  The problem is not so much
 composition as the need for a variable number of rounds of
 composition.

Let's have a try at it:

select p2_member, array_accum(p1)
  from unnest(p2) as p2_member
   lateral (select period_except(p1_member, p2_member)
  from unnest(p1) p1_member) as x(p1);

I'm not sure I understand how the explicit looping over unnest(p2) is different
from using lateral, or even if that's what you're talking about when you
mention a variable number of rounds.

Regards,
-- 
dim



[HACKERS] How should the notice message from the primary be handled in the standby?

2009-12-17 Thread Fujii Masao
On Tue, Dec 15, 2009 at 12:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Use PQsetNoticeReceiver.  The other one is just there for backwards
 compatibility.

Thanks!

I'm thinking of changing nothing about the handling of notice messages,
because there is no notice message (from the primary) that is worth also
being logged in the standby. Thoughts?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-17 Thread Albe Laurenz
Robert Haas wrote:
  A predicate can include columns from an index plus others.
  Am I missing something?
 
 Hmm, interesting point.  In that case you couldn't use the index to
 enforce predicate locking under MVCC without disabling HOT.  But there
 will be other cases where that wouldn't help anyway - a predicate
 could also include unindexed columns exclusively.  For those, the
 traditional approach (not the one discussed in this paper) probably
 requires locking against any heap insert, or checking each new heap
 insert against the constraint, or... something.

If I understand that correctly

   [...] by acquiring a shared lock on the next
   row in order, as a scan is made to check whether rows match a predicate.
   The scan might be through the data records or through an index

I would say that in the case of a table scan, the whole table will
be SILOCKed. I guess that's pretty much unavoidable if you want
serializability.

Yours,
Laurenz Albe



Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-17 Thread Nicolas Barbier
[ Forgot the list, resending. ]

2009/12/16 Boszormenyi Zoltan z...@cybertec.at:

 Robert Haas wrote:

 On Wed, Dec 16, 2009 at 1:25 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Dec 16, 2009 at 1:14 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 So you'd have to disable HOT updates when true serializability was
 active?

 I thought about that, but I don't think so.   HOT only applies to
 updates, and predicate locking only applies to inserts.  Unless I have
 my head in the sand?

 Err, no, wait.  Predicate locking can apply to updates, but since HOT
 updates never update an indexed column, I think we might still be OK?

 A predicate can include columns from an index plus others.
 Am I missing something?

This whole concept (next-key locking) also applies in case there are
no indexes. In the case of a table scan, the next key is either the
next row relative to the scanned range (if the DBMS supports the
notion of non-full table scans, for example if the table contents are
themselves stored in sorted order), or something that indicates that
the whole table was scanned (i.e., a table lock).

Therefore, with next-key locking you had better not have too many table
scans if you want to allow any concurrent transactions.
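The next-key rule Nicolas describes can be sketched in pseudocode (purely
illustrative; the names are invented):

```text
-- serializable scan of rows with key in [lo, hi] under next-key locking
for each row R with lo <= R.key <= hi (in key order):
    acquire shared lock on R
K := smallest key > hi                -- the "next key"
if K exists:
    acquire shared lock on the row holding K
else:
    acquire shared lock on the whole table    -- end-of-table case
-- a full table scan degenerates directly to the whole-table lock
```

A concurrent INSERT into [lo, hi] would have to lock the next key of its
insertion point, which conflicts with one of the locks above, so phantoms
are prevented.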

Nicolas



Re: [HACKERS] Hot Standby, release candidate?

2009-12-17 Thread Peter Eisentraut
On sön, 2009-12-13 at 19:20 +, Simon Riggs wrote:
 Barring resolving a few points and subject to even more testing, this
 is the version I expect to commit to CVS on Wednesday.

So it's Thursday now.  Please keep us updated on the schedule, as we
need to decide when to wrap alpha3 and whether to reopen the floodgates
for post-CF commits.




Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-17 Thread Albe Laurenz
Magnus Hagander wrote:
 On Wed, Dec 16, 2009 at 20:34, David E. Wheeler wrote:
  *** a/doc/src/sgml/hstore.sgml
  --- b/doc/src/sgml/hstore.sgml
 
 Heh, interesting. That clearly shouldn't be there. Applied.

Does this count as catch-22?

Yours,
Laurenz Albe



Re: [HACKERS] Hot Standby, release candidate?

2009-12-17 Thread Simon Riggs
On Thu, 2009-12-17 at 12:01 +0200, Peter Eisentraut wrote:
 On sön, 2009-12-13 at 19:20 +, Simon Riggs wrote:
  Barring resolving a few points and subject to even more testing, this
  is the version I expect to commit to CVS on Wednesday.
 
 So it's Thursday now.  Please keep us updated on the schedule, as we
 need to decide when to wrap alpha3 and whether to reopen the floodgates
 for post-CF commits.

I've been looking at a semaphore deadlock problem reported by Hiroyuki
Yamada. It's a serious issue, though luckily somewhat restricted in
scope.

I don't think it's wise to rush in with a solution, since that involves
some work with semaphores, and I could easily make that area less stable
by acting too quickly.

What I will do now is put in a restriction on lock waits in Hot Standby,
which will only happen for Alpha3. That avoids the deadlock issue in an
easy and safe way, though it is heavy handed and I aim to replace it
fairly soon, following discussion and testing.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-17 Thread David Fetter
On Thu, Dec 17, 2009 at 11:01:19AM +0100, Albe Laurenz wrote:
 Magnus Hagander wrote:
  On Wed, Dec 16, 2009 at 20:34, David E. Wheeler wrote:
   *** a/doc/src/sgml/hstore.sgml
   --- b/doc/src/sgml/hstore.sgml
  
  Heh, interesting. That clearly shouldn't be there. Applied.
 
 Does this count as catch-22?

Patch-22, perhaps?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Hiroyuki Yamada

On Wed, 2009-12-16 at 19:35 +0900, Hiroyuki Yamada wrote:

  * There is a window between gathering lock information in
    GetRunningTransactionLocks() and writing WAL in LogAccessExclusiveLocks().
  * In the current lock redo algorithm, locks are released when the
    transaction holding the lock is committed or aborted.
 
 ... then what happens if a transaction holding an ACCESS EXCLUSIVE lock
 commits in that window?

Yes, that was a problem in that code. Fixed in git.

We were doing it for prepared transactions but not for normal xacts.
I will look again at that code.

Thanks very much for reading the code. Any more?!?


Well, I've read some more and have a question.

The implementation assumes that transactions write a COMMIT/ABORT WAL record at
the end, but it does not seem to write an ABORT record on an immediate shutdown. So,

1. acquire ACCESS EXCLUSIVE lock in table A in xact 1
2. execute immediate shutdown of the active node
3. restart it
4. acquire ACCESS EXCLUSIVE lock in table A in xact 2

...then, duplicate lock acquisition by two different transactions can occur in
the standby node.

Am I missing something? Or is this already reported?


regards,

--
  Hiroyuki YAMADA
  Kokolink Corporation
  yam...@kokolink.net



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Heikki Linnakangas
Hiroyuki Yamada wrote:
 The implementation assumes that transactions write COMMIT/ABORT WAL at the end
 of them, while it does not seem to write ABORT WAL in immediate shutdown. So,
 
 1. acquire ACCESS EXCLUSIVE lock in table A in xact 1
 2. execute immediate shutdown of the active node
 3. restart it
 4. acquire ACCESS EXCLUSIVE lock in table A in xact 2
 
 ...then, duplicate lock acquisition by two different transactions can occur in
 the standby node.
 
 Am I missing something? Or is this already reported?

Replay of the shutdown checkpoint that is written when the server is
restarted should abort the old transaction and release its locks.

Hmm, looking at the code, I think Simon threw that baby with the
bathwater when he removed support for starting standby from a shutdown
checkpoint.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Simon Riggs
On Thu, 2009-12-17 at 19:55 +0900, Hiroyuki Yamada wrote:

 Well, I've read some more and have a question.
 
 The implementation assumes that transactions write COMMIT/ABORT WAL at the end
 of them, while it does not seem to write ABORT WAL in immediate shutdown. So,
 
 1. acquire ACCESS EXCLUSIVE lock in table A in xact 1
 2. execute immediate shutdown of the active node
 3. restart it
 4. acquire ACCESS EXCLUSIVE lock in table A in xact 2
 
 ...then, duplicate lock acquisition by two different transactions can occur in
 the standby node.
 
 Am I missing something? Or is this already reported?

This was a tricky point in the design because we already knew that abort
records are not always written for every transaction.

ProcArrayApplyRecoveryInfo() was specifically designed to prune away
older transactions that might have become stuck in that state, which
calls StandbyReleaseOldLocks() to remove any such locks. The pruning
operation is also one of the reasons why we need to log
XLOG_RUNNING_XACTS on a regular basis.

Duplicate lock acquisition isn't specifically checked for, since it is
blocked on the primary, but lock release is handled correctly even if
duplicates did exist.

I've checked the code some more to see if the above is possible. At step
(3) we perform a shutdown checkpoint, which would/should be responsible
for performing the prune operation that would prevent your case from
being a problem.

The code around the shutdown checkpoint has changed a few times, and it
looks like this bug was introduced by my edit on Dec 6 to prevent shutdown
checkpoints from being used as starting places. They still need to be
handled, since we no longer write an XLOG_RUNNING_XACTS record during a
shutdown checkpoint.

I will re-add this now.

Thanks again: keep going, you're on a roll.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-17 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Wed, Dec 16, 2009 at 6:53 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 0. Begin by connecting to the master just like a normal backend does. We
 don't necessarily need the new ProtocolVersion code either, though it's
 probably still a good idea to reject connections to older server versions.
 
 And, I think that such backend should switch to walsender mode when the 
 startup
 packet arrives. Otherwise, we would have to authenticate such backend twice
 on different context, i.e., a normal backend and walsender. So the settings 
 for
 each context would be required in pg_hba.conf. This is odd, I think. Thought?

True.

 This is identical to what happens when a query is executed against a
 normal backend using the simple query protocol, so walsender can use
 PQexec() for this.
 
 s/walsender/walreceiver ?

Right.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-17 Thread Fujii Masao
On Thu, Dec 17, 2009 at 9:02 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 And, I think that such backend should switch to walsender mode when the 
 startup
 packet arrives. Otherwise, we would have to authenticate such backend twice
 on different context, i.e., a normal backend and walsender. So the settings 
 for
 each context would be required in pg_hba.conf. This is odd, I think. Thought?

 True.

Currently this switch depends on whether XLOG_STREAMING_CODE is sent from the
standby or not, which in turn depends on whether PQstartXLogStreaming() is
called or not. But, as the next step, we should also get rid of such changes
to libpq.

I'm thinking of making the standby send the walsender-switch-code the same way
as application_name; walreceiver always specifies an option like
replication=on
in the conninfo string and calls PQconnectdb(), which sends the code as part
of the startup packet. And, the environment variable for that should not be
defined, to avoid misconfiguration by users.

Thoughts? Better ideas?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Simon Riggs
On Thu, 2009-12-17 at 13:24 +0200, Heikki Linnakangas wrote:

 Hmm, looking at the code, I think Simon threw that baby with the
 bathwater when he removed support for starting standby from a shutdown
 checkpoint.

Hmm, I think that code was just for starting points only. It would not
have been executed in normal running of the standby, so it appears the
bug was older than that. Absence of baby now appears inconclusive.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane  wrote:
 Kevin Grittner  writes:
 Tom Lane  wrote:
 (Besides which the lock acquired by UPDATE isn't exclusive and
 wouldn't block anyway...)
 
 It blocks other UPDATEs.
 
 Not at the table level.
 
The question was about whether we could change the timing of when
we get the current locks, not about getting different locks than we
currently do.  Clearly, adding a table lock would make no sense.
 
 If you could lock only at the tuple level maybe you'd have
 something, but it seems like you can't find the target tuples
 without having acquired a snapshot.
 
That would be the fatal difference then.  InnoDB doesn't actually
keep old versions of a row; it generates them from undo records in
the log when needed, which might be why it was feasible to get the
lock before the snapshot there.  If we need the snapshot before we
can get the lock, it's not an optimization which is available to us.
 
Thanks,
 
-Kevin



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Thu, 2009-12-17 at 13:24 +0200, Heikki Linnakangas wrote:
 
 Hmm, looking at the code, I think Simon threw that baby with the
 bathwater when he removed support for starting standby from a shutdown
 checkpoint.
 
 Hmm, I think that code was just for starting points only. It would not
 have been executed in normal running of the standby, so it appears the
 bug was older than that. Absence of baby now appears inconclusive.

This is the piece of code we're talking about, in xlog_redo():

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7340,41 +7340,6 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
 		ShmemVariableCache->oldestXid = checkPoint.oldestXid;
 		ShmemVariableCache->oldestXidDB = checkPoint.oldestXidDB;
 
-		/*
-		 * We know nothing was running on the master at this point, though
-		 * we can only use it as a starting point iff wal_standby_info was
-		 * enabled, otherwise we may not get further information about
-		 * changes from the master.
-		 */
-		if (standbyState >= STANDBY_UNINITIALIZED &&
-			checkPoint.XLogStandbyInfoMode)
-		{
-			RunningTransactionsData running;
-			TransactionId oldestRunningXid;
-			TransactionId *xids;
-			int			nxids;
-
-			oldestRunningXid = PrescanPreparedTransactions(&xids, &nxids);
-
-			/*
-			 * Construct a RunningTransactions snapshot representing a shut
-			 * down server, with only prepared transactions still alive.
-			 *
-			 * StandbyRecoverPreparedTransactions will call SubTransSetParent
-			 * for any subtransactions, so we consider this a non-overflowed
-			 * snapshot.
-			 */
-			running.xcnt = nxids;
-			running.subxid_overflow = false;
-			running.nextXid = checkPoint.nextXid;
-			running.oldestRunningXid = oldestRunningXid;
-			running.xids = xids;
-
-			ProcArrayInitRecoveryInfo(oldestRunningXid);
-			ProcArrayApplyRecoveryInfo(&running);
-
-			StandbyRecoverPreparedTransactions();
-		}
-
 		/* Check to see if any changes to max_connections give problems */
 		if (standbyState != STANDBY_DISABLED)
 			CheckRequiredParameterValues(checkPoint);


That removed piece of code was executed in the standby whenever we saw a
shutdown checkpoint. It calls ProcArrayApplyRecoveryInfo(), which calls
ExpireOldKnownAssignedTransactionIds() and StandbyReleaseOldLocks() to
clean up known-assigned-xid entries and locks of the implicitly-aborted
transactions.

I see now that in the presence of prepared transactions, we would fail
to clean up failed transactions with XID < the oldest prepared
transaction, but other than that it looks fine to me.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-17 Thread Heikki Linnakangas
Fujii Masao wrote:
 I'm thinking of making the standby send the walsender-switch-code the same 
 way
 as application_name; walreceiver always specifies the option like
 replication=on
 in conninfo string and calls PQconnectdb(), which sends the code as a part of
 startup packet. And, the environment variable for that should not be defined 
 to
 avoid user's mis-configuration, I think.

Sounds good.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Markus Wanner  wrote:
 Another line of thought: isn't this like READ COMMITTED for just
 the first operation in a SERIALIZABLE transaction?
 
Hmmm...  You have a point.
 
-Kevin



Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-17 Thread Florian Pflug

On 16.12.09 16:40, Kevin Grittner wrote:

Nicolas Barbier nicolas.barb...@gmail.com wrote:


I am not sure whether the serialization failures that it may cause
 are dependent on the plan used.


They are.


But so are failures due to deadlocks even today, no? The processing
order of UPDATEs which involve joins isn't any better defined than
the order of rows returned by a similarly complex SELECT, I think.

Actually, I think the whole SIREAD-lock idea can be seen as a kind of
2PL with opportunistic locking and deferred deadlock detection. Instead
of making readers and writers block each other, you let them proceed in
parallel, and check if that resulted in any mutual toe-stepping later.
The surprising part is that SIREAD locks and those inConflict,
outConflict flags actually provide enough information to detect possible
problems. Or at least this is the idea I got after skimming through the
thesis for an hour or so.

best regards,
Florian Pflug



Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-17 Thread Kevin Grittner
Albe Laurenz  wrote:
 I would say that in the case of a table scan, the whole table will
 be SILOCKed. I guess that's pretty much unavoidable if you want
 serializability.
 
Agreed.
 
-Kevin




Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Simon Riggs
On Thu, 2009-12-17 at 15:18 +0200, Heikki Linnakangas wrote:

 That removed piece of code was executed in the standby whenever we saw a
 shutdown checkpoint. It calls ProcArrayApplyRecoveryInfo(), which calls
 ExpireOldKnownAssignedTransactionIds() and StandbyReleaseOldLocks() to
 clean up known-assigned-xid entries and locks of the implicitly-aborted
 transactions.

OK, I was presuming that running StandbyRecoverPreparedTransactions()
and ProcArrayInitRecoveryInfo() twice would cause problems.

 I see now that in the presence of prepared transactions, we would fail
 to clean up failed transactions with XID < the oldest prepared
 transaction

Good! I just spotted that also, just prior to posting my fix, so
rewriting it again now.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Update on true serializable techniques in MVCC

2009-12-17 Thread Kevin Grittner
Nicolas Barbier  wrote:
 Therefore, with next-key locking you had better not have too many table
 scans if you want to have any concurrent transactions.
 
Well, I would say that you don't want too many table scans on heavily
updated tables if you don't want too many serialization failures.  Keep in
mind that the SIREAD locks would not block anything.  A dangerous structure,
involving two adjacent rw dependencies, must be found before anything is
rolled back.
 
-Kevin




Re: [HACKERS] Largeobject Access Controls (r2460)

2009-12-17 Thread Robert Haas
2009/12/17 Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp:

 Robert Haas robertmh...@gmail.com wrote:

 2009/12/16 KaiGai Kohei kai...@ak.jp.nec.com:
   long desc: When turned on, privilege checks on large objects perform with
              backward compatibility as 8.4.x or earlier releases.

 Mostly English quality, but there are some other issues too.  Proposed
 patch attached.

 I remember we had discussions about the version number, like
 Don't use '8.5' because it might be released as '9.0', no?

I chose to follow the style which Tom indicated that he preferred
here.  We don't use the phrase 8.4.x series anywhere else in the
documentation, so this doesn't seem like a good time to start.  Tom or
I will go through and renumber everything if we end up renaming the
release to 9.0.

 Another comment is I'd like to keep <link linkend="catalog-pg-largeobject-metadata">
 for the first <structname>pg_largeobject</structname> in each topic.

Those two things aren't the same.  Perhaps you meant <link
linkend="catalog-pg-largeobject">? I'll tweak the pg_largeobject and
pg_largeobject_metadata sections to make sure each has a link to the
other and commit this.  I also found one more spelling mistake so I
will include that correction as well.

...Robert



[HACKERS] NOT IN Doesn't use Anti Joins?

2009-12-17 Thread Rod Taylor
I'm sure there is a good reason why NOT IN will not use an Anti-Join
plan equivalent to NOT EXISTS due to NULL handling, but in this
particular case the value being compared is in the PRIMARY KEY of both
structures being joined.

The NOT IN plan was killed after 10 minutes. The NOT EXISTS plan
returned data in roughly 10ms.

Is there a reason why the NOT IN plan could not use Anti-Joins when
the column being compared against is guaranteed to be NOT NULL? Too
much planner overhead to determine nullness of the column?


sk=# explain select * from source_reb_listing where listing_id not in
(select listing_id from source_reb_listing_specs) order by file_id
desc limit 5;

                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Limit  (cost=729015.39..3420463.83 rows=5 width=28)
   ->  Index Scan Backward using source_reb_listing_fileid_idx on source_reb_listing
       (cost=729015.39..169537219655.96 rows=314954 width=28)
         Filter: (NOT (SubPlan 1))
         SubPlan 1
           ->  Materialize  (cost=729015.39..1185280.74 rows=32810035 width=8)
                 ->  Seq Scan on source_reb_listing_specs  (cost=0.00..568040.35 rows=32810035 width=8)
(6 rows)

sk=# explain select * from source_reb_listing where not exists (select
* from source_reb_listing_specs as t where t.listing_id =
source_reb_listing.listing_id) order by file_id desc limit 5;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..35.31 rows=5 width=28)
   ->  Nested Loop Anti Join  (cost=0.00..3880495.87 rows=549496 width=28)
         ->  Index Scan Backward using source_reb_listing_fileid_idx on source_reb_listing
             (cost=0.00..1107142.20 rows=629907 width=28)
         ->  Index Scan using source_reb_listing_specs_pkey on source_reb_listing_specs t
             (cost=0.00..1592.74 rows=408 width=8)
               Index Cond: (t.listing_id = source_reb_listing.listing_id)
(5 rows)



Re: [HACKERS] NOT IN Doesn't use Anti Joins?

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 9:02 AM, Rod Taylor rod.tay...@gmail.com wrote:
 Is there a reason why the NOT IN plan could not use Anti-Joins when
 the column being compared against is guaranteed to be NOT NULL? Too
 much planner overhead to determine nullness of the column?

I doubt it.  I think it's just a question of round tuits.  I think Tom
hasn't felt it worth the effort since you can work around it by
rewriting the query, and nobody else has bothered to work up a patch.
If you feel like working on it, I think it would be a nice
improvement.
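
For reference, the rewrite in miniature (table and column names here are
illustrative, not from the schema above):

```sql
-- NOT IN form: the planner will not turn this into an anti-join.
SELECT * FROM a WHERE id NOT IN (SELECT id FROM b);

-- Equivalent when b.id is declared NOT NULL, and anti-join-eligible:
SELECT *
  FROM a
 WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id);
```

The two forms only coincide when the subquery column cannot be NULL,
which is exactly the condition the planner would have to prove.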

...Robert



Re: [HACKERS] Range types

2009-12-17 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 Hm, how would you do it with LATERAL?  The problem is not so much
 composition as the need for a variable number of rounds of
 composition.

 Let's have a try at it:

 select p2_member, array_accum(p1)
   from unnest(p2) as p2_member
lateral (select period_except(p1_member, p2_member)
   from unnest(p1) p1_member) as x(p1);

I don't think that does it.  Maybe I misunderstand LATERAL, but what
that looks like to me is that each p1 will be separately filtered by
each p2, giving rise to a distinct element in the output.  What we
need is for each p1 to be filtered by *all* p2's, successively
(though in any order).

regards, tom lane



Re: [HACKERS] Hot Standby and prepared transactions

2009-12-17 Thread Simon Riggs
On Thu, 2009-12-17 at 13:38 +, Simon Riggs wrote:

  I see now that in the presence of prepared transactions, we would fail
  to clean up failed transactions with XID < the oldest prepared
  transaction
 
 Good! I just spotted that also, just prior to posting my fix, so
 rewriting it again now.

solution pushed to git

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Markus Wanner mar...@bluegap.ch wrote:
 
 Another line of thought: isn't this like READ COMMITTED for just
 the first operation in a SERIALIZABLE transaction?
 
I've mulled it over and I have two different logical proofs that
this is safe; if anyone is dubious I'd be happy to share.
 
This seems likely to be of significant benefit in some workloads,
and I can't see anywhere that it is likely to cost much.  Any
objections to adding this to the TODO list as a performance item?
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 10:05 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Markus Wanner mar...@bluegap.ch wrote:

 Another line of thought: isn't this like READ COMMITTED for just
 the first operation in a SERIALIZABLE transaction?

 I've mulled it over and I have two different logical proofs that
 this is safe; if anyone is dubious I'd be happy to share.

 This seems likely to be of significant benefit in some workloads,
 and I can't see anywhere that it is likely to cost much.  Any
 objections to adding this to the TODO list as a performance item?

I thought you concluded two emails ago that it wouldn't work for PG?
It's certainly not clear to me what exactly the TODO would be.

...Robert



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Markus Wanner mar...@bluegap.ch wrote:

 Another line of thought: isn't this like READ COMMITTED for just
 the first operation in a SERIALIZABLE transaction?

 I've mulled it over and I have two different logical proofs that
 this is safe; if anyone is dubious I'd be happy to share.

 This seems likely to be of significant benefit in some workloads,
 and I can't see anywhere that it is likely to cost much.  Any
 objections to adding this to the TODO list as a performance item?
 
 I thought you concluded two emails ago that it wouldn't work for
 PG?  It's certainly not clear to me what exactly the TODO would
 be.
 
Tom's emails had me pretty convinced that this technique wouldn't
work in PostgreSQL, but Markus put a fresh perspective on it which
makes it seem relatively painless.  (Although, as is often the case,
my perspective may be naive.)
 
Basically, in a SERIALIZABLE transaction, if the first statement
which would require a snapshot would currently fail with ERROR:
could not serialize access due to concurrent update, we would
instead get a fresh snapshot and retry -- which is what we do in a
READ COMMITTED transaction.
 
One way of looking at this is that any transaction which fails with
a serialization error can be retried with a reasonable chance of
success.  There is no evidence of anything wrong with the
transaction itself, just that its actions conflicted with those of a
concurrent transaction.  For the case we're discussing, that other
transaction has now committed.  (We blocked waiting to see whether
it would commit or roll back.)  If this is the first statement which
needed a snapshot, retrying it with a new snapshot can't create any
conflicting views of the data.  We *could* view this sort of as an
automatic transaction retry in the limited situations where the
database engine can determine what to do.  (If there had been prior
statements, you can't really know that the current statement would
have been issued by the client had the prior statements been run
against a different snapshot.)  Where this view of things is a
little off is that explicit locks obtained earlier in the
transaction would still be held; we're not really starting the
*whole* transaction over.  While this doesn't seem a fatal flaw, it
does mean the other way of looking at it is more technically
correct.
 
The other way of looking at it is that until a statement succeeds
with a given snapshot, you have not fixed your snapshot for the
serializable transaction.  A retry similar to what we do for READ
COMMITTED would just be part of obtaining the one snapshot used for
the SERIALIZABLE transaction -- it isn't fixed until that first
statement succeeds.
 
I'm assuming that this could be a fairly small change because we
already have code to do exactly the right thing for READ COMMITTED
transactions.  The logic to choose which way to handle the commit of
a transaction which held a competing lock would need to be modified
to use the READ COMMITTED lock on the first statement which obtains
a snapshot in a SERIALIZABLE transaction, and the snapshot for a
SERIALIZABLE transaction would not be fixed until the completion
of the first statement needing a snapshot.
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Basically, in a SERIALIZABLE transaction, if the first statement
 which would require a snapshot would currently fail with ERROR: 
 could not serialize access due to concurrent update we would
 instead get a fresh snapshot and retry -- which is what we do in a
 READ COMMITTED transaction.

This sounds like a pretty horrid kluge.  For one thing, the statement
might already have done a great deal of work before you hit the failure.
(Admittedly, that work will be lost anyway if we abort, but it's not
a localized change to make it all happen all over again.)  Also,
aborting that statement without also losing any previously-acquired
locks would require establishing a hidden subtransaction, with ensuing
extra costs to be paid even when there isn't a failure.

I think you misunderstand how READ COMMITTED works; it does not change
the snapshot for the entire statement, it only follows the update chain
for a particular tuple that's been chosen for update or delete.
 
 I'm assuming that this could be a fairly small change 

It would not be.

regards, tom lane



Re: [HACKERS] Range types

2009-12-17 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

Someone mentioned LATERAL?

  Tom Lane t...@sss.pgh.pa.us writes:
  Hm, how would you do it with LATERAL?  The problem is not so much
  composition as the need for a variable number of rounds of
  composition.

  Let's have a try at it:

  select p2_member, array_accum(p1)
  from unnest(p2) as p2_member
  lateral (select period_except(p1_member, p2_member)
  from unnest(p1) p1_member) as x(p1);

 Tom I don't think that does it.  Maybe I misunderstand LATERAL, but
 Tom what that looks like to me is that each p1 will be separately
 Tom filtered by each p2, giving rise to a distinct element in the
 Tom output.  What we need is for each p1 to be filtered by *all*
 Tom p2's, successively (though in any order).

Right, that's not a job for LATERAL, though it could be done easily
enough in one statement with a recursive CTE, I think.
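
Something along these lines, perhaps (a sketch only: it assumes a period
type and a set-returning period_except(a, b) yielding the parts of a not
covered by b, with p1 and p2 as period arrays):

```sql
WITH RECURSIVE strip(i, parts) AS (
    SELECT 1, p1                        -- start from all of p1
  UNION ALL
    SELECT i + 1,
           ARRAY(SELECT x
                   FROM unnest(parts) AS part,
                        period_except(part, p2[i]) AS x)
      FROM strip
     WHERE i <= array_length(p2, 1)
)
SELECT parts FROM strip WHERE i = array_length(p2, 1) + 1;
```

Each step removes p2[i] from every surviving fragment, so the final row
holds the p1 pieces filtered by *all* p2 members, in some order.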

-- 
Andrew (irc:RhodiumToad)



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Greg Stark
On Thu, Dec 17, 2009 at 3:39 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Basically, in a SERIALIZABLE transaction, if the first statement
 which would require a snapshot would currently fail with ERROR:
 could not serialize access due to concurrent update we would
 instead get a fresh snapshot and retry -- which is what we do in a
 READ COMMITTED transaction.

So for multi-statement transactions I don't see what this buys you.
You'll still have to write the code to retry, and Postgres retrying in
the cases where it can isn't really going to be a whole lot better.

Moreover I think it would kick in less often than you might expect and
sometimes surprise people by not kicking in when they expect it to.
Any internal queries could count (though I think you get away with
catalog operations under SnapshotNow), any volatile functions, etc. So
people might write a single-statement SQL transaction, not bother
writing retry logic, and then be surprised by errors.

I'm unclear why serialization failures would be rare. It depends
entirely on the application. If you're querying records which are busy
from concurrent updates you could get a continuous stream of
serialization failures. It seems better to report the situation to the
user all the time, since they have to handle it already and might want
to know about the problem and implement some kind of backoff, rather
than hiding it from them only sometimes, so that they still have to
write the handling code but cannot apply it consistently.

This isn't the first time that we've seen advantages that could be had
from packaging up a whole transaction so the database can see
everything the transaction needs to do. Perhaps we should have an
interface for saying you're going to feed a series of commands which
you want the database to repeat for you verbatim automatically on
serialization failures. Since you can't construct the queries based on
the results of previous queries the database would be free to buffer
them all up and run them together at the end of the transaction which
would allow the other tricky optimizations we've pondered in the past
as well.

-- 
greg



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 we would instead get a fresh snapshot and retry -- which is what
 we do in a READ COMMITTED transaction.
 
 I think you misunderstand how READ COMMITTED works; it does not
 change the snapshot for the entire statement, it only follows the
 update chain for a particular tuple that's been chosen for update
 or delete.
 
Thanks for the clarification.  That does not work for SERIALIZABLE
at all, because other tables or rows referenced in that first
statement would be using the original snapshot.  Indeed, the
behavior under READ COMMITTED could be astonishing in certain
circumstances as it breaks atomicity:
 
atomicity: all of the results of a transaction should be visible in
the database, or none of them should be. It should never be possible
to see the results of some operations in a transaction without the
others.
 
connection1:
 
test=# create table t (c1 int not null primary key, c2 int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
t_pkey for table t
CREATE TABLE
test=# insert into t values (1, 101), (2, 7);
INSERT 0 2
test=# start TRANSACTION ISOLATION LEVEL READ COMMITTED ;
START TRANSACTION
test=# update t set c2 = c2 + (select c2 from t where c1 = 2) where
c1 = 1;
UPDATE 1
test=# update t set c2 = 11 where c1 = 2;
UPDATE 1
 
connection2:

test=# START TRANSACTION ISOLATION LEVEL READ COMMITTED ;
START TRANSACTION
test=# update t set c2 = c2 + (select c2 from t where c1 = 2) where
c1 = 1;
[blocks]
 
connection1:

test=# commit;
COMMIT
 
connection2:

UPDATE 1
test=# commit;
COMMIT
test=# select * from t;
 c1 | c2
----+-----
  2 |  11
  1 | 115
(2 rows)
 
The update on connection2 added the modified value of the first
update from connection1 to the unmodified value from the second
update on connection1.  In other words, the atomicity of the update
on connection1 is broken in this case.
 
I'm not sure why this is considered OK.  At a minimum it should be
mentioned in our documentation of our implementation of the READ
COMMITTED isolation level.
 
-Kevin



Re: [HACKERS] Range types

2009-12-17 Thread Scott Bailey

Tom Lane wrote:

Dimitri Fontaine dfonta...@hi-media.com writes:

Tom Lane t...@sss.pgh.pa.us writes:

Hm, how would you do it with LATERAL?  The problem is not so much
composition as the need for a variable number of rounds of
composition.



Let's have a try at it:



select p2_member, array_accum(p1)
  from unnest(p2) as p2_member
   lateral (select period_except(p1_member, p2_member)
  from unnest(p1) p1_member) as x(p1);


I don't think that does it.  Maybe I misunderstand LATERAL, but what
that looks like to me is that each p1 will be separately filtered by
each p2, giving rise to a distinct element in the output.  What we
need is for each p1 to be filtered by *all* p2's, successively
(though in any order).

regards, tom lane


That approach will only work if you coalesce your inputs into 
non-contiguous sets (NCS) first. Overlapping ranges would break it in a 
hurry. In addition to two coalesce operations, period_except would be 
calculated 1000x for a pair of 100 element arrays. The original solution, 
while not short, was probably a little more elegant than Tom gave it credit 
for. In a single pass it pulls out only the data points needed to build 
the resultant NCS, without assuming that the inputs were coalesced.


I think I'll still be able to do a single pass solution for continuous 
ranges. I just won't be able to do the coalesce operations inline with 
the set operations.


Scott



[HACKERS] Re: determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote: 
 So for multi-statement transactions I don't see what this buys
 you.
 
Well, I became interested when Dr. Cahill said that adding this
optimization yielded dramatic improvements in his high contention
benchmarks.  Clearly it won't help every load pattern.
 
 You'll still have to write the code to retry, and postgres
 retrying in the cases where it can isn't really going to be a
 whole lot better.
 
In my view, any use of a relational database always carries with it
the possibility of a serialization error.  In other database
products I've run into situations where a simple SELECT at READ
COMMITTED can result in a serialization failure, so in my view all
application software should use a framework capable of recognizing
and automatically recovering from these.  I just try to keep them to
a manageable level.
 
 people might write a single-statement SQL transaction and not
 bother writing retry logic and then be surprised by errors.
 
As has often been said here -- you can't always protect people from
their own stupidity.
 
 I'm unclear why serialization failures would be rare.
 
Did I say that somewhere???
 
 It seems better to report the situation to the user all the time
 since they have to handle it already and might want to know about
 the problem and implement some kind of backoff
 
The point was to avoid a serialization failure and its related
rollback.  Do you think we should be reporting something to the
users every time a READ COMMITTED transaction blocks and then picks
the updated row?  (Actually, given that the results may be based on
an inconsistent view of the database, maybe we should.)
 
 This isn't the first time that we've seen advantages that could be
 had from packaging up a whole transaction so the database can see
 everything the transaction needs to do. Perhaps we should have an
 interface for saying you're going to feed a series of commands
 which you want the database to repeat for you verbatim
 automatically on serialization failures. Since you can't construct
 the queries based on the results of previous queries the database
 would be free to buffer them all up and run them together at the
 end of the transaction which would allow the other tricky
 optimizations we've pondered in the past as well.
 
How is that different from putting the logic into a function and
retrying on serialization failure?  Are you just proposing a more
convenient mechanism to do the same thing?
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Thanks for the clarification.  That does not work for SERIALIZABLE
 at all, because other tables or rows referenced in that first
 statement would be using the original snapshot.  Indeed, the
 behavior under READ COMMITTED could be astonishing in certain
 circumstances as it breaks atomicity:

Yup.  That is stated fairly clearly already in the description of
READ COMMITTED mode, no?
http://developer.postgresql.org/pgdocs/postgres/transaction-iso.html#XACT-READ-COMMITTED

regards, tom lane



[HACKERS] COPY IN as SELECT target

2009-12-17 Thread Andrew Dunstan


Recently there was discussion about allowing a COPY statement to be a 
SELECT target, returning a text array, although the syntax wasn't really 
nailed down, as I recall. I was thinking that we might have


   COPY RETURNING ARRAY FROM ...

instead of

   COPY tablename opt_column_list FROM ...


then we possibly could do things like:

   SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY 
FROM STDIN CSV) as t;


Thoughts?

cheers

andrew




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 the behavior under READ COMMITTED could be astonishing in certain
 circumstances as it breaks atomicity:
 
 Yup.  That is stated fairly clearly already in the description of
 READ COMMITTED mode, no?

http://developer.postgresql.org/pgdocs/postgres/transaction-iso.html#XACT-READ-COMMITTED
 
: it is possible for an updating command to see an inconsistent
: snapshot: it can see the effects of concurrent updating commands
: on the same rows it is trying to update, but it does not see
: effects of those commands on other rows in the database. This
: behavior makes Read Committed mode unsuitable for commands that
: involve complex search conditions
 
I don't know how many times I've read that page (many), yet I never
properly comprehended the impact of that part.  I think the last bit
I quoted above is somewhat misleading, in that it implies that the
issue is limited to complex search conditions.  In the failing case
I showed in this thread, the search conditions involved are
comparisons for equality of an integer literal to the one-column
integer primary key.  It seems like any join or subquery which
references a table is vulnerable, yes?
 
-Kevin



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Heikki Linnakangas
Andrew Dunstan wrote:
 
 Recently there was discussion about allowing a COPY statement to be a
 SELECT target, returning a text array, although the syntax wasn't really
 nailed down that I recall. I was thinking that  we might have
 
COPY RETURNING ARRAY FROM ...
 
 instead of
 
COPY tablename opt_column_list FROM ...

It's not really returning an array, is it? It's returning a bag of rows
like a (sub)query.

 the we possibly could do things like:
 
SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY
 FROM STDIN CSV) as t;

How about just COPY FROM? As in

SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN CSV) as t

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 How about just COPY FROM? As in

 SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN CSV) as t

I had the same thought.  Though it would also be nice to allow something like:

COPY (type1, type2, type3, type4) FROM STDIN CSV

...which is obviously going to create a horrible parser problem if you
actually tried to use that syntax.

...Robert



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread David Fetter
On Thu, Dec 17, 2009 at 12:28:50PM -0500, Robert Haas wrote:
 On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  How about just COPY FROM? As in
 
  SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN
  CSV) as t
 
 I had the same thought.  Though it would also be nice to allow
 something like:
 
 COPY (type1, type2, type3, type4) FROM STDIN CSV
 
 ...which is obviously going to create a horrible parser problem if
 you actually tried to use that syntax.

How about using the CTE syntax?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 http://developer.postgresql.org/pgdocs/postgres/transaction-iso.html#XACT-READ-COMMITTED

 I don't know how many times I've read that page (many), yet I never
 properly comprehended the impact of that part.  I think the last bit
 I quoted above is somewhat misleading, in that it implies that the
 issue is limited to complex search conditions.  In the failing case
 I showed in this thread, the search conditions involved are
 comparisons for equality of an integer literal to the one-column
 integer primary key.  It seems like any join or subquery which
 references a table is vulnerable, yes?

Well, it would all depend on what you're trying to do.  Typical
single-row UPDATE commands aren't really affected by this problem,
and in fact the behavior is pretty much exactly what they want as
long as the WHERE conditions don't involve columns that are being
changed.

Maybe we should replace the bit about complex search conditions
with something about referencing multiple rows to perform one update.
I'm not very sure what a clearer explanation would look like though.

regards, tom lane



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 12:38 PM, David Fetter da...@fetter.org wrote:
 On Thu, Dec 17, 2009 at 12:28:50PM -0500, Robert Haas wrote:
 On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  How about just COPY FROM? As in
 
  SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN
  CSV) as t

 I had the same thought.  Though it would also be nice to allow
 something like:

 COPY (type1, type2, type3, type4) FROM STDIN CSV

 ...which is obviously going to create a horrible parser problem if
 you actually tried to use that syntax.

 How about using the CTE syntax?

I'm not sure what you're suggesting exactly, but the problem with the
syntax I suggested is that COPY (...) TO whatever expects the ...
part to be a subselect.  You can't make COPY (...) FROM have something
in there other than a subselect, because the parser can't fast-forward
and look at the word FROM and then go back and decide how to parse the
parenthesized stuff.  That's almost magic in the general case.  You'd
have to stick a keyword in there before the opening parenthesis.
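A minimal sketch of the ambiguity Robert describes; the second statement uses the proposed, unimplemented syntax, and the table name `users` is made up for illustration:

```sql
-- Existing, legal syntax: the parenthesized part of COPY ... TO
-- is a subselect, and the parser commits to that reading early.
COPY (SELECT id, name FROM users) TO STDOUT CSV;

-- Proposed syntax: here the parentheses would hold a type list,
-- but the parser cannot know that until it reaches FROM, long
-- after it has had to decide how to parse the parenthesized part.
COPY (integer, text, integer, date) FROM STDIN CSV;
```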

...Robert



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Andrew Dunstan wrote:
 COPY RETURNING ARRAY FROM ...

 It's not really returning an array, is it? It's returning a bag of rows
 like a (sub)query.

 How about just COPY FROM?

The problem with COPY FROM is that it hard-wires a decision that there
is one and only one possible result format, which I think we pretty
much proved already is the wrong thing.  I'm not thrilled with RETURNING
ARRAY either, but we need to leave ourselves wiggle room to have more
than one result format from the same source file.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Greg Stark
On Thu, Dec 17, 2009 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, it would all depend on what you're trying to do.  Typical
 single-row UPDATE commands aren't really affected by this problem,
 and in fact the behavior is pretty much exactly what they want as
 long as the WHERE conditions don't involve columns that are being
 changed.

 Maybe we should replace the bit about complex search conditions
 with something about referencing multiple rows to perform one update.
 I'm not very sure what a clearer explanation would look like though.

I wonder if RETURNING hasn't created a whole new set of cases where
our READ COMMITTED behaviour is bogus. I guess it's equivalent to
having used SELECT FOR UPDATE.

-- 
greg



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm not very sure what a clearer explanation would look like
 
As a stab at it, how about?:
 
This behavior makes Read Committed mode unsuitable for many UPDATE
or DELETE commands with joins or subqueries
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 I wonder if RETURNING hasn't created a whole new set of cases where
 our READ COMMITTED behaviour is bogus.

I don't see how.  It just gives you access to the same values that were
actually used by the UPDATE.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 12:51 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 I'm not very sure what a clearer explanation would look like

 As a stab at it, how about?:

 This behavior makes Read Committed mode unsuitable for many UPDATE
 or DELETE commands with joins or subqueries

I don't think that's any clearer, though it is more disparaging.  :-)

Note we also say: "The partial transaction isolation provided by Read
Committed mode is adequate for many applications, and this mode is
fast and simple to use; however, it is not sufficient for all cases.
Applications that do complex queries and updates might require a more
rigorously consistent view of the database than Read Committed mode
provides."

...Robert



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Joshua D. Drake
On Thu, 2009-12-17 at 12:58 -0500, Robert Haas wrote:
 On Thu, Dec 17, 2009 at 12:51 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
  Tom Lane t...@sss.pgh.pa.us wrote:
 
  I'm not very sure what a clearer explanation would look like
 
  As a stab at it, how about?:
 
  This behavior makes Read Committed mode unsuitable for many UPDATE
  or DELETE commands with joins or subqueries
 
 I don't think that's any clearer, though it is more disparaging.  :-)
 
 Note we also say: "The partial transaction isolation provided by Read
 Committed mode is adequate for many applications, and this mode is
 fast and simple to use; however, it is not sufficient for all cases.
 Applications that do complex queries and updates might require a more
 rigorously consistent view of the database than Read Committed mode
 provides."

What is needed here is a layman's context of what isolation modes are
good for what type of operation. Neither your explanation nor Tom's is
particularly useful except to say, "Crap, I might be screwed, but I don't
know if I am... how do I find out?"

Joshua D. Drake





-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - 
Salamander




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not very sure what a clearer explanation would look like
 
 As a stab at it, how about?:
 
 This behavior makes Read Committed mode unsuitable for many UPDATE
 or DELETE commands with joins or subqueries

After thinking a bit, I'd be inclined to add a new paragraph.
In particular, now that FOR UPDATE actually works in subqueries,
it'd be worth pointing out that you can add that to guard against
this type of issue.  Perhaps, after the DELETE FROM website
example, we could add something like

UPDATEs and DELETEs involving joins or subqueries are particularly
at risk, since they may perform an update based on a combination of
old rows from other tables with an up-to-date target row.  This risk
can be mitigated by adding FOR UPDATE or FOR SHARE to subqueries, so
that all rows directly involved in an update are guaranteed current.
However that will also increase the risk of deadlock failures.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 1:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not very sure what a clearer explanation would look like

 As a stab at it, how about?:

 This behavior makes Read Committed mode unsuitable for many UPDATE
 or DELETE commands with joins or subqueries

 After thinking a bit, I'd be inclined to add a new paragraph.
 In particular, now that FOR UPDATE actually works in subqueries,
 it'd be worth pointing out that you can add that to guard against
 this type of issue.  Perhaps, after the DELETE FROM website
 example, we could add something like

 UPDATEs and DELETEs involving joins or subqueries are particularly
 at risk, since they may perform an update based on a combination of
 old rows from other tables with an up-to-date target row.  This risk
 can be mitigated by adding FOR UPDATE or FOR SHARE to subqueries, so
 that all rows directly involved in an update are guaranteed current.
 However that will also increase the risk of deadlock failures.

I like that.  It might also be worth trying to explain that if you
select some data out of the database, do a computation with it, and
then use the results to drive an update, you're going to want to make
the initial select be FOR SHARE.
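Robert's read-compute-write pattern might look like the sketch below; the `accounts` table and its columns are made up for illustration:

```sql
BEGIN;
-- Lock the row we read so a concurrent transaction cannot change it
-- between our read and our later write.
SELECT balance FROM accounts WHERE id = 1 FOR SHARE;
-- ... the application computes the new balance from the value read ...
UPDATE accounts SET balance = 90 WHERE id = 1;
COMMIT;
```

Without the FOR SHARE, another transaction could update the row after the SELECT and before the UPDATE, and the computation would be based on stale data.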

...Robert



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I don't think that's any clearer, though it is more disparaging. 
 :-)
 
It's certainly not my goal to knock PostgreSQL.  The precise
conditions in which an UPDATE or DELETE can view an inconsistent
database state (and therefore potentially persist something based on
that inconsistent state) are that it has a FROM clause and/or
subqueries which reference data changed by a concurrent database
transaction which also affects rows which are targets of the UPDATE
or DELETE.  Precise descriptions of problem circumstances seem more
useful to developers than vague statements like "it's usually good  
enough, except when it isn't."
 
If an accurate description of the behavior is considered
disparaging, perhaps it's the behavior which should change, not just
the description of it.  Since I never use READ COMMITTED for
updates, I'm not going to weigh in on whether this is a big enough
problem to merit the effort and overhead of a different
implementation; I'm just suggesting we should put the information
out there more explicitly.  My wording was still a little on the
vague side, in an attempt to keep it short; perhaps that was a
mistake.
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 What is needed here is a layman's context of what isolation modes are
 good for what type of operation. Neither your explanation nor Tom's is
 particularly useful except to say, "Crap, I might be screwed, but I don't
 know if I am... how do I find out?"

If we had a simple way to characterize that, we'd not be having this
discussion :-(

One possibility is to try to list the risky cases.  So far I can think
of:

* updates using a WHERE clause that tests columns being changed by other
transactions

* updates using subqueries/joins so that the result depends on other rows
besides the one directly updated/deleted, and those other rows are
subject to concurrent changes

But I'm not sure this is a complete list, and an incomplete one might do
more harm than good ...
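A sketch of the first risky case, using a hypothetical table `jobs(id, active, worker)`; session B's WHERE clause tests a column session A is concurrently changing:

```sql
-- Session A:
BEGIN;
UPDATE jobs SET active = false WHERE id = 1;
-- (A also sets active = true on id = 2, then commits.)

-- Session B, READ COMMITTED, running concurrently:
UPDATE jobs SET worker = 'b' WHERE active;
-- B finds id = 1 under its snapshot and blocks on A's row lock.
-- After A commits, B rechecks WHERE on the updated row and skips it,
-- but it never revisits id = 2, which now satisfies "active".
```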

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 After thinking a bit, I'd be inclined to add a new paragraph.
 In particular, now that FOR UPDATE actually works in subqueries,
 it'd be worth pointing out that you can add that to guard against
 this type of issue.  Perhaps, after the DELETE FROM website
 example, we could add something like
 
 UPDATEs and DELETEs involving joins or subqueries are particularly
 at risk, since they may perform an update based on a combination
 of old rows from other tables with an up-to-date target row.  This
 risk can be mitigated by adding FOR UPDATE or FOR SHARE to
 subqueries, so that all rows directly involved in an update are
 guaranteed current.  However that will also increase the risk of
 deadlock failures.
 
Much better than my suggestion.  Including both the problem
conditions and the solution is ideal.
 
I'd missed that we now allow FOR UPDATE and FOR SHARE on subqueries.
Nice enhancement.
 
-Kevin



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 ...  The precise
 conditions in which an UPDATE or DELETE can view an inconsistent
 database state (and therefore potentially persist something based on
 that inconsistent state) are that it has a FROM clause and/or
 subqueries which reference data changed by a concurrent database
 transaction which also affects rows which are targets of the UPDATE
 or DELETE.

Are we sure that's a precise and complete description?  I don't have
a problem with putting a description just like that in the docs, but
I'm not yet convinced it's right.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 Are we sure that's a precise and complete description?  I don't
 have a problem with putting a description just like that in the
 docs, but I'm not yet convinced it's right.
 
Well, I thought it was when I typed it.  You mentioned referencing
other columns in the updated rows; I'll test to see how that
behaves.
 
-Kevin



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Andrew Dunstan



Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  Andrew Dunstan wrote:
  COPY RETURNING ARRAY FROM ...
 
  It's not really returning an array, is it? It's returning a bag of rows
  like a (sub)query.
 
  How about just COPY FROM?
 
 The problem with COPY FROM is that it hard-wires a decision that there
 is one and only one possible result format, which I think we pretty
 much proved already is the wrong thing.  I'm not thrilled with RETURNING
 ARRAY either, but we need to leave ourselves wiggle room to have more
 than one result format from the same source file.

Well, we could have RETURNING type-expression with text[] supported 
for the first iteration.

In answer to Heikki's argument, what I wanted was exactly to return an 
array of text for each row. Whatever we have needs to be able to handle 
the possibility of ragged input (see previous discussion) so we can't tie 
it down too tightly.


cheers

andrew



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 The problem with COPY FROM is that it hard-wires a decision that there
 is one and only one possible result format, which I think we pretty
 much proved already is the wrong thing.  I'm not thrilled with RETURNING
 ARRAY either, but we need to leave ourselves wiggle room to have more
 than one result format from the same source file.

 Well, we could have RETURNING type-expression with  text[] supported 
 for the first iteration.

 In answer to Heikki's argument, what I wanted was exactly to return an 
 array of text for each row. Whatever we have needs to be able to handle 
 the possibility of ragged input (see previous discussion) so we can't tie 
 it down too tightly.

I think that there are two likely possibilities for the result format:

* Raw data after just the de-escaping and column separation steps.
Array of text is probably the right thing here, at least for a text COPY
(doesn't seem to cover the binary case though).

* The data converted to some specified row type.

RETURNING type-expression is probably not good since it looks more
like the second case than the first --- and in fact it could be outright
ambiguous, what if your data actually is one column that is a text
array?

If we're willing to assume these are the *only* possibilities then we
could use COPY FROM ... for the first and COPY RETURNING type-list
FROM ... for the second.  I'm a bit uncomfortable with that assumption
though; it seems likely that we'll want to shoehorn in some more
alternatives later.  (Like, what about the binary case?)

regards, tom lane



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Josh Berkus

 In answer to Heikki's argument, what I wanted was exactly to return an
 array of text for each row. Whatever we have needs to be able to handle
 the possibility of ragged input (see previous discussion) so we can't tie
 it down too tightly.

I would have *lots* of use for this feature.

Mind you, returning (arbitrary expression) would be even better, but if
we can get returning TEXT[] for 8.5, I think it's worth doing on its own.

--Josh Berkus




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 1:12 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 I don't think that's any clearer, though it is more disparaging.
 :-)

 It's certainly not my goal to knock PostgreSQL.  The precise
 conditions in which an UPDATE or DELETE can view an inconsistent
 database state (and therefore potentially persist something based on
 that inconsistent state) are that it has a FROM clause and/or
 subqueries which reference data changed by a concurrent database
 transaction which also affects rows which are targets of the UPDATE
 or DELETE.  Precise descriptions of problem circumstances seem more
 useful to developers than vague statements like "it's usually good
 enough, except when it isn't."

 If an accurate description of the behavior is considered
 disparaging, perhaps it's the behavior which should change, not just
 the description of it.  Since I never use READ COMMITTED for
 updates, I'm not going to weigh in on whether this is a big enough
 problem to merit the effort and overhead of a different
 implementation; I'm just suggesting we should put the information
 out there more explicitly.  My wording was still a little on the
 vague side, in an attempt to keep it short; perhaps that was a
 mistake.

Don't get me wrong, I don't love the current behavior.  (I don't have
a competing proposal either.)  But I think we want to describe it with
precision, because there are also many cases where _it works fine_.
Telling people when it works and when it doesn't work is a lot more
useful than attempting to qualitatively estimate how good or bad it
is.

...Robert



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 1:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 The problem with COPY FROM is that it hard-wires a decision that there
 is one and only one possible result format, which I think we pretty
 much proved already is the wrong thing.  I'm not thrilled with RETURNING
 ARRAY either, but we need to leave ourselves wiggle room to have more
 than one result format from the same source file.

 Well, we could have RETURNING type-expression with  text[] supported
 for the first iteration.

  In answer to Heikki's argument, what I wanted was exactly to return an
  array of text for each row. Whatever we have needs to be able to handle
  the possibility of ragged input (see previous discussion) so we can't tie
  it down too tightly.

 I think that there are two likely possibilities for the result format:

 * Raw data after just the de-escaping and column separation steps.
 Array of text is probably the right thing here, at least for a text COPY
 (doesn't seem to cover the binary case though).

 * The data converted to some specified row type.

Agreed.

 RETURNING type-expression is probably not good since it looks more
 like the second case than the first --- and in fact it could be outright
 ambiguous, what if your data actually is one column that is a text
 array?

 If we're willing to assume these are the *only* possibilities then we
 could use COPY FROM ... for the first and COPY RETURNING type-list
 FROM ... for the second.  I'm a bit uncomfortable with that assumption
 though; it seems likely that we'll want to shoehorn in some more
 alternatives later.  (Like, what about the binary case?)

You might want to specify column names as well as types, in
this second case.

...Robert



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  
 Are we sure that's a precise and complete description?  I don't
 have a problem with putting a description just like that in the
 docs, but I'm not yet convinced it's right.
  
 Well, I thought it was when I typed it.  You mentioned referencing
 other columns in the updated rows; I'll test to see how that
 behaves.
 
Some quick testing seems to show that for the rows on which we were
blocking, all columns reflect all updates from the concurrent
transaction on which we were waiting, including columns used in the
WHERE clause.  I'm not sure exactly what other tests might be
necessary.  I'm having trouble coming up with anything which doesn't
involve a join or subquery, but that could be a failure of
imagination.
 
-Kevin



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 You might want to specify column names as well as types, in
 this second case.

Well, we could do it like VALUES: arbitrarily name the columns column1
... columnN and tell people to use an alias if they want other names.
If it's convenient to fit column names into the syntax, good, but we
don't absolutely have to.

[ thinks... ] Although actually the obvious SQL-ish syntax for a rowtype
specification is

( colname typename [ , ... ] )

so that's probably what we'd want to do in the processed-data case.
Not sure about the raw-data case --- maybe a predetermined name is
okay there.
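Under those assumptions, the two cases might read as sketched below; none of this syntax exists in PostgreSQL, it only illustrates the proposal being discussed:

```sql
-- Raw case: a predetermined column name, rows coming back as text[]
-- (PostgreSQL arrays are 1-based).
SELECT t.column1[1], t.column1[2]
  FROM (COPY FROM STDIN CSV) AS t;

-- Processed case: an SQL-ish (colname typename [, ...]) rowtype list.
SELECT a, b + 1
  FROM (COPY RETURNING (a text, b integer) FROM STDIN CSV) AS t;
```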

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Some quick testing seems to show that for the rows on which we were
 blocking, all columns reflect all updates from the concurrent
 transaction on which we were waiting, including columns used in the
 WHERE clause.  I'm not sure exactly what other tests might be
 necessary.  I'm having trouble coming up with anything which doesn't
 involve a join or subquery, but that could be a failure of
 imagination.

The issue that I was thinking about is that there are actually two
rounds of WHERE testing involved in a READ COMMITTED update: first
we fetch a row that matches the WHERE clause *in the query snapshot*,
and then we fetch its most up-to-date version and recheck the WHERE
condition for that.  If the updated version no longer satisfies WHERE
we ignore it.  The trouble with this is that the same transaction that
changed that row to not satisfy WHERE might have also changed some
other row so that it now *does* satisfy WHERE, but we won't ever find
that other row because in the query snapshot it doesn't pass the WHERE.
The website example in the docs is meant to illustrate this hazard.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Don't get me wrong, I don't love the current behavior.  (I don't
 have a competing proposal either.)  But I think we want to
 describe it with precision, because there are also many cases
 where _it works fine_. Telling people when it works and when it
 doesn't work is a lot more useful than attempting to qualitatively
 estimate how good or bad it is.
 
It sounds like we're in violent agreement.  I'm not sure what I said
which might have led you to believe I felt otherwise.
 
[reviews thread]
 
The suggestion you felt was more disparaging was:
 
: This behavior makes Read Committed mode unsuitable for
: many UPDATE or DELETE commands with joins or subqueries
 
You do realize that what is already in the documentation, for which
this was a suggested replacement, was?:
 
: This behavior makes Read Committed mode unsuitable for
: commands that involve complex search conditions
 
I'm not seeing where I made it more disparaging; I was trying to
clarify under what circumstances it was a problem.  If you have a
suggestion for a better way to phrase the part I left alone, feel
free to suggest something.
 
-Kevin



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 1:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 You might want to specify column names as well as types, in
 this second case.

 Well, we could do it like VALUES: arbitrarily name the columns column1
 ... columnN and tell people to use an alias if they want other names.
 If it's convenient to fit column names into the syntax, good, but we
 don't absolutely have to.

 [ thinks... ] Although actually the obvious SQL-ish syntax for a rowtype
 specification is

        ( colname typename [ , ... ] )

 so that's probably what we'd want to do in the processed-data case.

Yeah, I think that's good.

 Not sure about the raw-data case --- maybe a predetermined name is
 okay there.

I would expect so.

...Robert



Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Dimitri Fontaine
Hi,

Le 17 déc. 2009 à 19:39, Josh Berkus a écrit :
 Mind you, returning (arbitrary expression) would be even better, but if
 we can get returning TEXT[] for 8.5, I think it's worth doing on its own.

Well, you already have it as soon as you have text[]:

 INSERT INTO destination
 SELECT row[1], row[2], myfunction(row[1], row[2]), row[3]::int + 1
   FROM (COPY RETURNING text[] FROM '/path/to/file.csv' CSV HEADER) AS 
file(row);

Of course, as Andrew said already, what this syntax does not cover is ragged 
file processing, that is, accepting file content where the rows do not all 
have the same number of columns.

But if you have ragged input reading and COPY as a relation in a query, then 
you're able to apply any expression you want in the query itself, such as 
transforming the input slightly so that it conforms to PostgreSQL datatype 
input syntaxes.
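For instance, with ragged rows exposed as text[] (hypothetical syntax again), missing or oddly formatted fields could be patched up directly in the query:

```sql
SELECT row[1],
       COALESCE(row[3], '0')::int,          -- default a missing column
       replace(row[2], ',', '.')::numeric   -- adapt a locale decimal mark
  FROM (COPY RETURNING text[] FROM STDIN CSV) AS file(row);
```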

Regards,
-- 
dim

Let's deprecate pgloader.


Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 [a transaction] might have also changed some other row so that it
 now *does* satisfy WHERE, but we won't ever find that other row
 because in the query snapshot it doesn't pass the WHERE.
 
OK; got it.  No way to fix that, really, without getting a fresh
snapshot and re-starting the command, is there?  I take it from your
earlier posts that wouldn't be pretty. On the bright side, to be
taken as showing an inconsistent state, the transaction on which we
block has to both move one or more rows into the matching set and
move one or more rows out.
 
Another example of the phenomenon:
 
connection1:

test=# create table t (name text not null primary key, is_it boolean
not null);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
t_pkey for table t
CREATE TABLE
test=# insert into t values ('huey', true), ('dewey', false),
('louie', false);
INSERT 0 3
test=# start transaction isolation level read committed;
START TRANSACTION
test=# update t set is_it = not is_it where name in ('huey',
'dewey');
UPDATE 2
 
connection2:

test=# start transaction isolation level read committed;
START TRANSACTION
test=# select * from t where is_it for update;
[blocks]
 
connection1:

test=# commit;
COMMIT
 
connection2:

 name | is_it
--+---
(0 rows)
 
test=# select * from t where is_it for update;
 name  | is_it
-------+-------
 dewey | t
(1 row)
 
So this particular issue means that rows affected will be the
intersection of rows matching the WHERE clause before and after the
conflicting concurrent transaction(s) commit.  The join/subquery
issue means that all values used would be based on the snapshot at
the start of the statement except that values from rows updated by
concurrent transactions on which we blocked would be based on the
updated rows.  Any other issues?
 
-Kevin



Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-17 Thread Bruce Momjian
David Fetter wrote:
 On Thu, Dec 17, 2009 at 11:01:19AM +0100, Albe Laurenz wrote:
  Magnus Hagander wrote:
   On Wed, Dec 16, 2009 at 20:34, David E. Wheeler wrote:
*** a/doc/src/sgml/hstore.sgml
--- b/doc/src/sgml/hstore.sgml
   
   Heh, interesting. That clearly shouldn't be there. Applied.
  
  Does this count as catch-22?
 
 Patch-22, perhaps?

Wait, that 22 was holding up the entire database system.  ;-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Stephen Frost
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 The patch was not attached...

This patch either does too much, or not enough.

I would either leave the Assert() in-place as a double-check (I presume
that's why it was there in the first place, and if that Assert() fails
then our assumption about the permissions check being already done on
the object in question would be wrong, since the check is done against
the passed-in 'rel' and the assert is that 'rel' and 'ruletup->ev_class'
are the same; if they're not, then we might need to do perms checking on
ruletup->ev_class)

Or

Remove the now-unused variable eventRelationOid.

Overall, I agree with removing this check as it's already done by
ATSimplePermissions() and we don't double-check the permissions in the
other things called through ATExecCmd() (though there are cases where
specific commands have to do *additional* checks beyond what
ATSimplePermissions() does..  it might be worth looking into what those
are and thinking about if we should move/consolidate/etc them, or if it
makes sense to leave them where they are).

Thanks,

Stephen




Re: [HACKERS] An example of bugs for Hot Standby

2009-12-17 Thread Simon Riggs
On Wed, 2009-12-16 at 14:05 +, Simon Riggs wrote:
 On Wed, 2009-12-16 at 10:33 +, Simon Riggs wrote:
  On Tue, 2009-12-15 at 20:25 +0900, Hiroyuki Yamada wrote:
   Hot Standby node can freeze when startup process calls 
   LockBufferForCleanup().
   This bug can be reproduced by the following procedure.
  
  Interesting. Looks like this can happen, which is a shame cos I just
  removed the wait checking code after not ever having seen a wait.
  
  Thanks for the report.
  
  Must-fix item for HS.
 
 So this deadlock can happen at two places:
 
 1. When a relation lock waits behind an AccessExclusiveLock and then
 Startup runs LockBufferForCleanup()
 
 2. When Startup is a pin count waiter and a lock acquire begins to wait
 on a relation lock
 
 So we must put in direct deadlock detection in both places. We can't use
 the normal deadlock detector because in case (1) the backend might
 already have exceeded deadlock_timeout.
 
 Proposal:

Better proposal

 * It's possible for 3-way deadlocks to occur in Hot Standby mode.
 * If a user backend sleeps on a lock while it holds a buffer pin, that
 * leaves open the risk of deadlock. The user backend will only sleep
 * if it waits behind an AccessExclusiveLock held by Startup process.
 * If the Startup process then tries to access any buffer that is pinned
 * then it too will sleep and neither process will ever wake.
 *
 * We need to make a deadlock check in two places: in the user backend
 * when we sleep on a lock, and in the Startup process when we sleep
 * on a buffer pin. We need both checks because the deadlock can occur
 * from both directions.
 *
 * Just before a user backend sleeps on a lock, we accumulate a list of
 * buffers pinned by the backend. We then grab an LWLock
 * and then check each of the buffers to see if the Startup process is
 * waiting on them. If so, we release the lock and throw deadlock error.
 * If Startup process is not waiting we then record the pinned buffers
 * in the BufferDeadlockRisk data structure and release the lock.
 * When we later get the lock we remove the deadlock risk.
 *
 * When the Startup process is about to wait on a buffer pin it checks
 * the buffer it is about to pin in the BufferDeadlockRisk list. If the
 * buffer is already held by one or more lock waiters then we send a
 * conflict cancel to them and wait for them to die before rechecking
 * the buffer lock.

This way we only cancel direct deadlocks.

It doesn't solve the general problem of buffer waits, but those may be
solvable by a different mechanism.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] PATCH: Spurious 22 in hstore.sgml

2009-12-17 Thread Greg Williamson
Bruce Momjian wrote:



 
 David Fetter wrote:
  On Thu, Dec 17, 2009 at 11:01:19AM +0100, Albe Laurenz wrote:
   Magnus Hagander wrote:
On Wed, Dec 16, 2009 at 20:34, David E. Wheeler wrote:
 *** a/doc/src/sgml/hstore.sgml
 --- b/doc/src/sgml/hstore.sgml

Heh, interesting. That clearly shouldn't be there. Applied.
   
   Does this count as catch-22?
  
  Patch-22, perhaps?
 
 Wait, that 22 was holding up the entire database system.  ;-)


You mean holding it up as in "that was a load-bearing integer!", or holding 
it up as in "this is a small-caliber firearm -- gimme yer schema!"?

Greg W.



  



Re: [HACKERS] [GENERAL] Installing PL/pgSQL by default

2009-12-17 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   I installed PL/pgSQL by default via initdb with the attached patch.  The
   only problem is that pg_dump still dumps out the language creation:
 CREATE PROCEDURAL LANGUAGE plpgsql;
 ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;
   What is odd is that I used the same process that initdb uses to create
   other objects.  Does anyone know why this is happening?
  
  I think pg_dump pays attention to what schema the objects are in,
  and that's most likely creating them in PUBLIC.  Try adding
  set search_path = pg_catalog.
  
  It's not impossible that we'll have to tweak pg_dump a bit; it's
  never had to deal with languages that shouldn't be dumped ...
 
 I found that pg_dump tests for pg_language.lanispl == true, which is
 true for all the stored procedure languages.  I can easily special case
 plpgsql, or check for FirstNormalObjectId, though I don't see that used
 in pg_dump currently.
 
 A more difficult issue is whether we should preserve the fact that
 plpgsql was _removed_ in the pg_dump output, i.e., if someone removes
 plpgsql from a database, do we issue a DROP LANGUAGE in pg_dump?  I
 don't remember us having to deal with anything like this before.

OK, the attached patch installs plpgsql by default from initdb, and
suppresses the dumping of CREATE LANGUAGE in 8.5, and in 8.3/8.4 when
binary upgrade is used (because you know you are upgrading to a release
that has plpgsql installed by default).  The 8.3/8.4 change is necessary
so the schema load doesn't generate any errors and cause pg_migrator to
exit.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/installation.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v
retrieving revision 1.333
diff -c -c -r1.333 installation.sgml
*** doc/src/sgml/installation.sgml	15 Dec 2009 22:59:53 -	1.333
--- doc/src/sgml/installation.sgml	17 Dec 2009 23:35:36 -
***
*** 2266,2279 
   is commandcreatelang/command failing with unusual errors.
   For example, running as the owner of the PostgreSQL installation:
  screen
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": A memory address is not in the address space for the process.
  /screen
  Running as a non-owner in the group posessing the PostgreSQL
  installation:
  screen
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": Bad address
  /screen
   Another example is out of memory errors in the PostgreSQL server
   logs, with every memory allocation near or greater than 256 MB
--- 2266,2279 
   is commandcreatelang/command failing with unusual errors.
   For example, running as the owner of the PostgreSQL installation:
  screen
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": A memory address is not in the address space for the process.
  /screen
  Running as a non-owner in the group posessing the PostgreSQL
  installation:
  screen
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": Bad address
  /screen
   Another example is out of memory errors in the PostgreSQL server
   logs, with every memory allocation near or greater than 256 MB
Index: src/bin/initdb/initdb.c
===
RCS file: /cvsroot/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.178
diff -c -c -r1.178 initdb.c
*** src/bin/initdb/initdb.c	11 Dec 2009 03:34:56 -	1.178
--- src/bin/initdb/initdb.c	17 Dec 2009 23:35:36 -
***
*** 176,181 
--- 176,182 
  static void setup_privileges(void);
  static void set_info_version(void);
  static void setup_schema(void);
+ static void load_plpgsql(void);
  static void vacuum_db(void);
  static void make_template0(void);
  static void make_postgres(void);
***
*** 1894,1899 
--- 1895,1925 
  }
  
  /*
+  * load PL/pgsql server-side language
+  */
+ static void
+ load_plpgsql(void)
+ {
+ 	PG_CMD_DECL;
+ 
+ 	fputs(_("loading PL/pgSQL server-side language ... "), stdout);
+ 	fflush(stdout);
+ 
+ 	snprintf(cmd, sizeof(cmd),
+ 			 "\"%s\" %s template1 >%s",
+ 			 backend_exec, backend_options,
+ 			 DEVNULL);
+ 
+ 	PG_CMD_OPEN;
+ 
+ 	PG_CMD_PUTS("CREATE LANGUAGE plpgsql;\n");
+ 
+ 	PG_CMD_CLOSE;
+ 
+ 	check_ok();
+ }
+ 
+ /*
   * clean everything up in template1
   */
  static void
***
*** 3126,3131 
--- 3152,3159 

Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread KaiGai Kohei
(2009/12/18 6:38), Stephen Frost wrote:
 KaiGai,
 
 * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
 The patch was not attached...
 
 This patch either does too much, or not enough.
 
 I would either leave the Assert() in-place as a double-check (I presume
 that's why it was there in the first place, and if that Assert() fails
 then our assumption about the permissions check being already done on
 the object in question would be wrong, since the check is done against
 the passed-in 'rel' and the assert is that 'rel' and 'ruletup->ev_class'
 are the same; if they're not, then we might need to do perms checking on
 ruletup->ev_class)
 
 Or
 
 Remove the now-unused variable eventRelationOid.

My preference is the latter option, because the pg_rewrite entry to be
checked is fetched using the RULERELNAME syscache, which takes the OID of
the relation and the name of the rule.
If this Assert() failed, it would imply that the syscache mechanism has a
problem, not merely that the integrity of the pg_rewrite system catalog is
broken.

The attached patch is a revised one.

Thanks,

 Overall, I agree with removing this check as it's already done by
 ATSimplePermissions() and we don't double-check the permissions in the
 other things called through ATExecCmd() (though there are cases where
 specific commands have to do *additional* checks beyond what
 ATSimplePermissions() does..  it might be worth looking into what those
 are and thinking about if we should move/consolidate/etc them, or if it
 makes sense to leave them where they are).
 
   Thanks,
 
   Stephen


-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com
*** base/src/backend/rewrite/rewriteDefine.c	(revision 2486)
--- base/src/backend/rewrite/rewriteDefine.c	(working copy)
***
*** 671,677 
  {
  	Relation	pg_rewrite_desc;
  	Oid			owningRel = RelationGetRelid(rel);
- 	Oid			eventRelationOid;
  	HeapTuple	ruletup;
  	bool		changed = false;
  
--- 671,676 
***
*** 690,704 
  		rulename, get_rel_name(owningRel))));
  
  	/*
- 	 * Verify that the user has appropriate permissions.
- 	 */
- 	eventRelationOid = ((Form_pg_rewrite) GETSTRUCT(ruletup))->ev_class;
- 	Assert(eventRelationOid == owningRel);
- 	if (!pg_class_ownercheck(eventRelationOid, GetUserId()))
- 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
- 	   get_rel_name(eventRelationOid));
- 
- 	/*
  	 * Change ev_enabled if it is different from the desired new state.
  	 */
  	if (DatumGetChar(((Form_pg_rewrite) GETSTRUCT(ruletup))->ev_enabled) !=
--- 689,694 



Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Tom Lane
KaiGai Kohei kai...@ak.jp.nec.com writes:
 [ patch to remove EnableDisableRule's permissions check ]

I don't particularly like this patch, mainly because I disagree with
randomly removing permissions checks without any sort of plan about
where they ought to go.  There are two principal entry points in
rewriteDefine.c (the other one being DefineQueryRewrite), and currently
they both do permissions checks.  There is also a third major function
RenameRewriteRule, currently ifdef'd out because it's not used, which
is commented as Note that it lacks a permissions check, indicating
that somebody (possibly me, I don't recall for sure) thought it was
surprising that there wasn't such a check there.  This is sensible if
you suppose that this file implements rule utility commands that are
more or less directly called by the user, which is in fact the case for
DefineQueryRewrite, and it's not obvious why it wouldn't be the case for
EnableDisableRule.  Since that's a globally exposed function, it's quite
possible that there's third-party code calling it and expecting it to do
the appropriate permissions check.  (A quick look at Slony, in
particular, would be a good idea here.)

If we're going to start moving these checks around we need a very
well-defined notion of where permissions checks should be made, so that
everyone knows what to expect.  I have not seen any plan for that.
Removing one check at a time because it appears to not be necessary
in the code paths you've looked at is not a plan.

regards, tom lane



Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Joshua D. Drake
On Thu, 2009-12-17 at 13:13 -0500, Tom Lane wrote:
 Joshua D. Drake j...@commandprompt.com writes:
  What is needed here is a layman's context of what isolation modes are
  good for what type of operation. Neither your explanation or Tom's is
  particularly useful except to say, Crap, I might be screwed but I don't
  know if I am... how do I find out?
 
 If we had a simple way to characterize that, we'd not be having this
 discussion :-(

Certainly true. Sorry if I came off harsh; my intent was to illustrate
that more verbose yet less detailed information isn't going to help.

Joshua D. Drake



-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use of Mr. or 
Sir.




Re: [HACKERS] determine snapshot after obtaining locks for first statement

2009-12-17 Thread Joshua D. Drake
On Thu, 2009-12-17 at 12:16 -0600, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  
  After thinking a bit, I'd be inclined to add a new paragraph.
  In particular, now that FOR UPDATE actually works in subqueries,
  it'd be worth pointing out that you can add that to guard against
  this type of issue.  Perhaps, after the DELETE FROM website
  example, we could add something like
  
  UPDATEs and DELETEs involving joins or subqueries are particularly
  at risk, since they may perform an update based on a combination
  of old rows from other tables with an up-to-date target row.  This
  risk can be mitigated by adding FOR UPDATE or FOR SHARE to
  subqueries, so that all rows directly involved in an update are
  guaranteed current.  However that will also increase the risk of
  deadlock failures.
  
 Much better than my suggestion.  Including both the problem
 conditions and the solution is ideal.
  
 I'd missed that we now allow FOR UPDATE and FOR SHARE on subqueries.
 Nice enhancement.

+1

Joshua D. Drake


  
 -Kevin
 


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use of Mr. or 
Sir.




Re: [HACKERS] Largeobject Access Controls (r2460)

2009-12-17 Thread Takahiro Itagaki

Robert Haas robertmh...@gmail.com wrote:

  Another comment is I'd like to keep link 
  linkend=catalog-pg-largeobject-metadata
  for the first structnamepg_largeobject/structname in each topic.
 
 Those two things aren't the same.  Perhaps you meant link
 linkend=catalog-pg-largeobject?

Oops, yes. Thank you for the correction.

We also have mentions of the 8.4.x series in the core code. Do you think
we should also rewrite those messages? One of them is a user-visible
message.

LargeObjectAlterOwner() : pg_largeobject.c
 * The 'lo_compat_privileges' is not checked here, because we
 * don't have any access control features in the 8.4.x series
 * or earlier release.
 * So, it is not a place we can define a compatible behavior.

guc.c
{"lo_compat_privileges", PGC_SUSET, COMPAT_OPTIONS_PREVIOUS,
	gettext_noop("Enables backward compatibility in privilege checks on large objects"),
	gettext_noop("When turned on, privilege checks on large objects perform with backward compatibility as 8.4.x or earlier releases.")


Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] patch - per-tablespace random_page_cost/seq_page_cost

2009-12-17 Thread Robert Haas
On Thu, Dec 3, 2009 at 11:00 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Nov 28, 2009 at 9:54 PM, David Rowley dgrow...@gmail.com wrote:
 Robert Haas Wrote:
 Hmm.  I'm not able to reliably detect a performance difference between
 unpatched CVS HEAD (er... git master branch) and same with spcoptions-
 v2.patch applied.  I figured that if there were going to be an impact,
 it would be most likely to manifest itself in a query that touches lots
 and lots of tables but does very little actual work.  So I used the
 attached script to create 200 empty tables, 100 in the default
 tablespace and 100 in tablespace dork (also known as, why I am
 working on this at 11 PM on Thanksgiving).  Then I did:

 SELECT * FROM a1, a2, a3, ..., a100;

 (I've not read the patch, but I've just read the thread)
 If you're just benchmarking the planner times to see if the extra lookups
 are affecting the planning times, would it not be better to benchmark
 EXPLAIN SELECT * FROM a1, a2, a3, ..., a100; ?
 Otherwise any small changes might be drowned out in the execution time.
 Scanning 100 relations even if they are empty could account for quite a bit
 of that time, right?

 Possibly, but even if I can measure a difference doing it that way,
 it's not clear that it matters.  It's fairly certain that there will
 be a performance degradation if we measure carefully enough, but if
 that difference is imperceptible in real-world scenarios, then it's
 not worth worrying about.  Still, I probably will test it just to see.

I did some fairly careful benchmarking of EXPLAIN SELECT * FROM a1,
a2, a3, ..., a100.  I explained this query 100 times via DBD::Pg and
used time to measure how long the script took to run.  I ran the
script three times.  And the result is... the unpatched version came
out 1.7% SLOWER than the patched version.  This seems difficult to
take seriously, since it can't possibly be faster to do a syscache
lookup and parse an array than it is to fetch a constant from a known
memory address, but that's what I got.  At any rate, it seems pretty
clear that it's not hurting much.

...Robert



Re: [HACKERS] patch - per-tablespace random_page_cost/seq_page_cost

2009-12-17 Thread Robert Haas
On Thu, Nov 26, 2009 at 4:25 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't think there's even a
 solid consensus right now on which GUCs people would want to set at the
 tablespace level.

 This seems like an important point that we need to nail down.  The
 original motivation for this patch was based on seq_page_cost and
 random_page_cost, to cover the case where, for example, one tablespace
 is on an SSD and another tablespace is on a RAID array.

 Greg Stark proposed adding effective_io_concurrency, and that makes
 plenty of sense to me, but I'm sort of disinclined to attempt to
 implement that as part of this patch because I have no familiarity
 with that part of the code and no hardware that I can use to test
 either the current behavior or the modified behavior.  Since I'm
 recoding this to use the reloptions mechanism, a patch to add support
 for that should be pretty easy to write as a follow-on patch once this
 goes in.

 Any other suggestions?

Going once...  going twice...  since no one has suggested anything or
spoken against the proposal above, I'm just going to implement
seq_page_cost and random_page_cost for now.

 Current version of patch is attached.  I've revised it to use the
 reloptions stuff, but I don't think it's committable as-is because it
 currently thinks that extracting options from a pg_tablespace tuple is
 a cheap operation, which was true in the non-reloptions-based
 implementation but is less true now.  At least, some benchmarking
 needs to be done to figure out whether and to what extent this is an
 issue.

Per the email that I just sent a few minutes ago, there doesn't appear
to be a performance impact in doing this even in a relatively stupid
way - every call that requires seq_page_cost and/or random_page_cost
results in a syscache lookup and then uses the relcache machinery to
parse the returned array.

That leaves the question of what the most elegant design is here.  Tom
suggested upthread that we should tag every RelOptInfo - and,
presumably, IndexOptInfo, though it wasn't discussed - with this
information.  I don't however much like the idea of adding identically
named members in both places.  Should the number of options expand in
the future, this will become silly very quickly.  One option is to
define a struct with seq_page_cost and random_page_cost that is then
included in RelOptInfo and IndexOptInfo.  It would seem to make sense
to make the struct, rather than a pointer to the struct, the member,
because it makes the copyfuncs/equalfuncs stuff easier to handle, and
there's not really any benefit in incurring more palloc overhead.

However, I'm sort of inclined to go ahead and invent a mini-cache for
tablespaces.  It avoids the (apparently insignificant) overhead of
reparsing the array multiple times, but it also avoids bloating
RelOptInfo and IndexOptInfo with more members than really necessary.
It seems like a good idea to add one member to those structures
anyway, for reltablespace, but copying all the values into every one
we create just seems silly.  Admittedly there are only two values
right now, but again we may want to add more someday, and caching at
the tablespace level just seems like the right way to do it.

Thoughts?

...Robert



Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 7:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we're going to start moving these checks around we need a very
 well-defined notion of where permissions checks should be made, so that
 everyone knows what to expect.  I have not seen any plan for that.
 Removing one check at a time because it appears to not be necessary
 in the code paths you've looked at is not a plan.

I'm not completely familiar with the existing code structure here, but
it sort of seems like in general you might want to divide up the
processing of a statement into a parse analysis phase, a permissions
checking phase, and an execution phase.  The parse analysis seems to
be mostly separated out into transformXyz() functions, but the
permissions checking is mixed in with the execution.  Disentangling
that seems like a job and a half.

...Robert



Re: [HACKERS] Streaming replication and non-blocking I/O

2009-12-17 Thread Fujii Masao
On Thu, Dec 17, 2009 at 10:25 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Fujii Masao wrote:
 I'm thinking of making the standby send the walsender-switch-code the same 
 way
 as application_name; walreceiver always specifies the option like
 replication=on
 in conninfo string and calls PQconnectdb(), which sends the code as a part of
 startup packet. And, the environment variable for that should not be defined 
 to
 avoid user's mis-configuration, I think.

 Sounds good.

Okay. Design clarification again:

0. Begin by connecting to the master using PQconnectdb() with a new conninfo
option requesting replication. The startup packet carrying the request is
sent to the master, and the backend switches into walsender mode. The
walsender then enters its main loop and waits for requests from the
walreceiver.

1. Get the system identifier of the master.

Slave -> Master: Query message, with a query string like
GET_SYSTEM_IDENTIFIER

Master -> Slave: RowDescription, DataRow, CommandComplete, and
ReadyForQuery messages. The system identifier is returned in the DataRow
message.

2. Another query exchange like above, for the timeline ID.

Slave -> Master: Query message, with a query string like
GET_TIMELINE

Master -> Slave: RowDescription, DataRow, CommandComplete, and
ReadyForQuery messages. The timeline ID is returned in the DataRow
message.

3. Request a backup history file, if needed:

Slave -> Master: Query message, with a query string like
GET_BACKUP_HISTORY_FILE XXX, where XXX is an XLogRecPtr.

Master -> Slave: RowDescription, DataRow, CommandComplete, and
ReadyForQuery messages as usual. The file contents are returned in the
DataRow message.

In 1, 2, and 3, the walreceiver uses PQexec() to send the Query message
and receive the results.

4. Start replication

Slave -> Master: Query message, with a query string like START
REPLICATION XXX, where XXX is the XLogRecPtr of the starting point.

Master -> Slave: CopyOutResponse followed by a continuous stream of
CopyData messages with WAL contents.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 KaiGai Kohei kai...@ak.jp.nec.com writes:
  [ patch to remove EnableDisableRule's permissions check ]
 
 I don't particularly like this patch, mainly because I disagree with
 randomly removing permissions checks without any sort of plan about
 where they ought to go.  

The goal of this was to increase consistency with the rest of the code;
in particular, ATPrepCmd checks ownership rights on the table, and
anything which wants to check permissions beyond that has to do it
independently.  Do I like that?  No, not really.

 There are two principal entry points in
 rewriteDefine.c (the other one being DefineQueryRewrite), and currently
 they both do permissions checks.

DefineQueryRewrite gets called out of tcop/utility.c (either under a
CREATE VIEW or a CREATE RULE).  tcop/utility.c expects the functions it
calls to handle permissions checking (except in the one case where it
doesn't call a function: T_IndexStmt).  Interestingly, DefineIndex,
while it handles a number of other permissions checks, doesn't check
that the caller is the table owner; it expects callers to have done
that (which happens both in tcop/utility.c for CREATE INDEX and in
ATPrepCmd for ALTER TABLE ...).

 There is also a third major function
 RenameRewriteRule, currently ifdef'd out because it's not used, which
 is commented as Note that it lacks a permissions check, indicating
 that somebody (possibly me, I don't recall for sure) thought it was
 surprising that there wasn't such a check there.  This is sensible if
 you suppose that this file implements rule utility commands that are
 more or less directly called by the user, which is in fact the case for
 DefineQueryRewrite, and it's not obvious why it wouldn't be the case for
 EnableDisableRule.  Since that's a globally exposed function, it's quite
 possible that there's third-party code calling it and expecting it to do
 the appropriate permissions check.  (A quick look at Slony, in
 particular, would be a good idea here.)

Personally I find it surprising that things called from ATExecCmd()
expect some permissions checks to have been done already.

 If we're going to start moving these checks around we need a very
 well-defined notion of where permissions checks should be made, so that
 everyone knows what to expect.  I have not seen any plan for that.
 Removing one check at a time because it appears to not be necessary
 in the code paths you've looked at is not a plan.

What I suggested previously, though it may be naive, is to do the
permissions checks when you actually have enough information to do them.
At the moment we do a 'simple' check in ATPrepCmd (essentially,
ownership of the relation) and then any more complicated checks have to
be done by the function which actually understands what's going on
enough to know what else needs to be checked (eg:
ATAddForeignKeyConstraint).

As I see it, you've mainly got three steps:

Parsing the command (syntax, basic understanding)
Validation (check for object existence, get object info, etc)
Implementation (do the requested action)

I would put the permissions checking between Validation and
Implementation, ideally at the 'top' of Implementation.  This, imv,
is pretty similar to how we handle DML commands today:
parsing/validation are done first but then the permissions checks aren't
done until we actually go to run the command (of course, this also deals
with prepared queries and the like).  Right now what we have are a bunch
of checks scattered around during Validation, as we come across things
we think should be checked (gee, we know the table the user referenced,
let's check if he owns it now).
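To make that placement concrete, here is a toy mock (hypothetical names, not actual PostgreSQL code): Validation has already resolved the target object, and the ownership check sits at the very top of the Implementation function, so every caller goes through it no matter which utility path reached us:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock, not PostgreSQL source.  "Relation" and the
 * check/exec functions below are stand-ins for illustration only. */

typedef struct
{
	int		owner_id;
	bool	rule_enabled;
} Relation;

static bool
object_ownercheck(const Relation *rel, int role_id)
{
	return rel->owner_id == role_id;
}

/*
 * Implementation step for a hypothetical ENABLE/DISABLE RULE.
 * Returns false for permission denied (a real backend would
 * ereport(ERROR) instead).  The permission check is the first
 * thing done here, rather than in some caller's prep code.
 */
static bool
exec_enable_rule(Relation *rel, int role_id, bool enable)
{
	if (!object_ownercheck(rel, role_id))
		return false;			/* denied before any action is taken */
	rel->rule_enabled = enable;
	return true;
}
```

With this shape there is exactly one place to audit per command, and callers such as ALTER TABLE wrappers don't need to duplicate the check.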

Of course, this patch isn't doing that because it was intended to make
the existing code consistent, not institute a new policy for how
permissions checking should be done.  The big downside of trying to
define a new policy is that it's a lot more difficult to get agreement
on, and there are likely to be exceptions brought out that might make
the policy appear to be ill-formed.

Do people think this is a sensible approach?  Are there known exceptions
where this won't work or would cause problems?  Is it possible to make
that kind of delineation in general?  Should we work up an example patch
which moves some of these checks in that direction?

Thanks for your comments.

Stephen




Re: [HACKERS] LATERAL

2009-12-17 Thread Robert Haas
On Sun, Oct 18, 2009 at 2:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 You could probably convince me that a merge join is not going to be
 too useful (how often can you want a merge join on the inner side of a
 nested loop?

 Why not?  As Andrew pointed out, what we're really trying to accomplish
 here is consider sub-join plans that are parameterized by a value
 obtained from an outer relation.  I think we shouldn't artificially
 limit what we consider.

 But anyway I think we're on the same page here: what we ought to do is
 try implementing this scheme without any extra restrictions on what it
 considers, and see what the performance is like.  We can try to limit
 what it considers if it turns out not to work well in the simplest
 form.

You mention what we ought to do here, so - is this something that
you're planning to have a go at?

Another question I have - while generalizing the inner-indexscan
machinery is an interesting join optimization technique, I'm thinking
that it actually has very little to do with LATERAL.  Is there any
reason to suppose that one or the other needs to be done first?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Disentangling that seems like a job and a half.

Indeed it will be, but I think it would be a good thing to actually have
a defined point at which permissions checking is to be done.  Trying to
read the code and figure out what permissions you need to perform
certain actions, when some of those checks are done as 'prep work' far
up the tree, isn't fun.  Makes validation of the checks we say we do in
the documentation more difficult too.  Not to mention that if we want to
allow more granular permission granting for certain operations, it gets
even uglier.

Thanks,

Stephen




Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 10:14 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Disentangling that seems like a job and a half.

 Indeed it will be, but I think it would be a good thing to actually have
 a defined point at which permissions checking is to be done.

Completely agree...

...Robert



Re: [HACKERS] Largeobject Access Controls (r2460)

2009-12-17 Thread Robert Haas
On Thu, Dec 17, 2009 at 7:27 PM, Takahiro Itagaki
itagaki.takah...@oss.ntt.co.jp wrote:
  Another comment is I'd like to keep <link
  linkend=catalog-pg-largeobject-metadata>
  for the first <structname>pg_largeobject</structname> in each topic.
 Those two things aren't the same.  Perhaps you meant <link
 linkend=catalog-pg-largeobject>?
 Oops, yes. Thank you for the correction.

 We also have 8.4.x-series references in the core code. Do you think we
 should also rewrite those messages? One of them is a user-visible message.

Yes.  I started going through the comments tonight.  Partial patch
attached.  There were two comments that I was unable to understand and
therefore could not reword - the one at the top of
pg_largeobject_aclmask_snapshot(), and the second part of the comment
at the top of LargeObjectExists():

 * Note that LargeObjectExists always scans the system catalog
 * with SnapshotNow, so it is unavailable to use to check
 * existence in read-only accesses.

In both cases, I'm lost.  Help?

In acldefault(), there is this comment:

  /* Grant SELECT,UPDATE by default, for now */

This doesn't seem to match what the code is doing, so I think we
should remove it.

I also notice that dumpBlobComments() is now misnamed, but it seems
we've chosen to add a comment
mentioning that fact rather than fixing it.  That doesn't seem like
the right approach.

...Robert
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 809df7a..b0aea41 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -4261,9 +4261,8 @@ pg_language_ownercheck(Oid lan_oid, Oid roleid)
 /*
  * Ownership check for a largeobject (specified by OID)
  *
- * Note that we have no candidate to call this routine with a certain
- * snapshot except for SnapshotNow, so we don't provide an interface
- * with _snapshot() version now.
+ * This is only used for operations like ALTER LARGE OBJECT that are always
+ * relative to SnapshotNow.
  */
 bool
 pg_largeobject_ownercheck(Oid lobj_oid, Oid roleid)
diff --git a/src/backend/catalog/pg_largeobject.c b/src/backend/catalog/pg_largeobject.c
index ada5b88..dfbf350 100644
--- a/src/backend/catalog/pg_largeobject.c
+++ b/src/backend/catalog/pg_largeobject.c
@@ -79,10 +79,8 @@ LargeObjectCreate(Oid loid)
 }
 
 /*
- * Drop a large object having the given LO identifier.
- *
- * When we drop a large object, it is necessary to drop both of metadata
- * and data pages in same time.
+ * Drop a large object having the given LO identifier.  Both the data pages
+ * and metadata must be dropped.
  */
 void
 LargeObjectDrop(Oid loid)
@@ -191,13 +189,12 @@ LargeObjectAlterOwner(Oid loid, Oid newOwnerId)
 		if (!superuser())
 		{
 			/*
-			 * The 'lo_compat_privileges' is not checked here, because we
-			 * don't have any access control features in the 8.4.x series
-			 * or earlier release.
-			 * So, it is not a place we can define a compatible behavior.
+			 * lo_compat_privileges is not checked here, because ALTER
+			 * LARGE OBJECT ... OWNER did not exist at all prior to
+			 * PostgreSQL 8.5.
+			 *
+			 * We must be the owner of the existing object.
 			 */
-
-			/* Otherwise, must be owner of the existing object */
 			if (!pg_largeobject_ownercheck(loid, GetUserId()))
 ereport(ERROR,
 		(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
@@ -251,9 +248,8 @@ LargeObjectAlterOwner(Oid loid, Oid newOwnerId)
 /*
  * LargeObjectExists
  *
- * Currently, we don't use system cache to contain metadata of
- * large objects, because massive number of large objects can
- * consume not a small amount of process local memory.
+ * We don't use the system cache for large object metadata, for fear of
+ * using too much local memory.
  *
  * Note that LargeObjectExists always scans the system catalog
  * with SnapshotNow, so it is unavailable to use to check
diff --git a/src/backend/commands/comment.c b/src/backend/commands/comment.c
index 8f8ecc7..ece2a30 100644
--- a/src/backend/commands/comment.c
+++ b/src/backend/commands/comment.c
@@ -1449,7 +1449,7 @@ CommentLargeObject(List *qualname, char *comment)
 	 *
 	 * See the comment in the inv_create() which describes
 	 * the reason why LargeObjectRelationId is used instead
-	 * of the LargeObjectMetadataRelationId.
+	 * of LargeObjectMetadataRelationId.
 	 */
 	CreateComments(loid, LargeObjectRelationId, 0, comment);
 }
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c799b13..22b8ea7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1229,9 +1229,9 @@ static struct config_bool ConfigureNamesBool[] =
 
 	{
 		{lo_compat_privileges, PGC_SUSET, COMPAT_OPTIONS_PREVIOUS,
-			gettext_noop(Enables backward compatibility in privilege checks on large objects),
-			gettext_noop(When turned on, privilege checks on large objects perform 
-		 with backward compatibility as 8.4.x or earlier releases.)
+			gettext_noop(Enables backward compatibility mode for privilege checks on large objects),
+			

Re: [HACKERS] [PATCH] remove redundant ownership checks

2009-12-17 Thread KaiGai Kohei
(2009/12/18 9:19), Tom Lane wrote:
 KaiGai Koheikai...@ak.jp.nec.com  writes:
 [ patch to remove EnableDisableRule's permissions check ]
 
 I don't particularly like this patch, mainly because I disagree with
 randomly removing permissions checks without any sort of plan about
 where they ought to go.  There are two principal entry points in
 rewriteDefine.c (the other one being DefineQueryRewrite), and currently
 they both do permissions checks.  There is also a third major function
 RenameRewriteRule, currently ifdef'd out because it's not used, which
 is commented as "Note that it lacks a permissions check", indicating
 that somebody (possibly me, I don't recall for sure) thought it was
 surprising that there wasn't such a check there.  This is sensible if
 you suppose that this file implements rule utility commands that are
 more or less directly called by the user, which is in fact the case for
 DefineQueryRewrite, and it's not obvious why it wouldn't be the case for
 EnableDisableRule.  Since that's a globally exposed function, it's quite
 possible that there's third-party code calling it and expecting it to do
 the appropriate permissions check.  (A quick look at Slony, in
 particular, would be a good idea here.)

One possible standpoint is to consider this permission check part of
EnableDisableRule()'s specification.

However, if we adopt this standpoint, we should skip ATSimplePermissions()
for ENABLE/DISABLE RULE options in ATPrepCmd(), because ATExecCmd() calls
EnableDisableRule() which applies permission checks according to the
specification.

We don't need to apply the same checks twice; either one of them is
enough.  But I don't think skipping ATSimplePermissions() is the right
direction, because the existing PG model obviously handles rewrite rules
as properties of a relation, even though they are not stored within
pg_class.  So it is quite natural to check permission to modify the
properties of a relation in ATPrepCmd().

-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] Largeobject Access Controls (r2460)

2009-12-17 Thread Takahiro Itagaki

Robert Haas robertmh...@gmail.com wrote:

 In both cases, I'm lost.  Help?

They might be contrasted with the comments for myLargeObjectExists.
Since we use MVCC visibility in loread(), metadata for large objects
should also be visible under MVCC rules.

If I understand them, they say:
  * pg_largeobject_aclmask_snapshot requires the same snapshot that will
    be used in loread().
  * Don't use LargeObjectExists if you need MVCC visibility.
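To sketch why the second point matters, here is a toy visibility model (hypothetical, not PostgreSQL's actual tuple-visibility code): a tuple dropped after our snapshot was taken is still visible to that snapshot, but not to the latest committed state, so an existence check against "SnapshotNow" can disagree with the snapshot loread() uses:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only.  A tuple is visible to a snapshot if it was created
 * at or before the snapshot's xid and not yet deleted as of that xid.
 * "SnapshotNow" is approximated here as simply using the latest xid. */

typedef struct
{
	int		xmin;	/* creating transaction id */
	int		xmax;	/* deleting transaction id, 0 = still live */
} Tuple;

static bool
tuple_visible(const Tuple *t, int snapshot_xid)
{
	return t->xmin <= snapshot_xid &&
		   (t->xmax == 0 || t->xmax > snapshot_xid);
}
```

An existence check done with the latest xid would report the object gone even though a read using our older snapshot must still see it, which is exactly the mismatch the comments are warning about.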

 In acldefault(), there is this comment:
   /* Grant SELECT,UPDATE by default, for now */
 This doesn't seem to match what the code is doing, so I think we
 should remove it.

Ah, ACL_NO_RIGHTS is the default.

 I also notice that dumpBlobComments() is now misnamed, but it seems
 we've chosen to add a comment mentioning that fact rather than fixing it.

Hmmm, now it dumps not only comments but also ownership of large objects.
Should we rename it dumpBlobMetadata() or so?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] COPY IN as SELECT target

2009-12-17 Thread Pavel Stehule
2009/12/17 Andrew Dunstan and...@dunslane.net:

 Recently there was discussion about allowing a COPY statement to be a SELECT
 target, returning a text array, although the syntax wasn't really nailed
 down that I recall. I was thinking that  we might have

   COPY RETURNING ARRAY FROM ...

 instead of

   COPY tablename opt_column_list FROM ...


 then we possibly could do things like:

   SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY FROM
 STDIN CSV) as t;

 Thoughts?

In this case COPY doesn't return an array, so RETURNING ARRAY is a
little bit strange.

What about

SELECT .. FROM (COPY VALUES [(columns)] FROM )

Regards
Pavel


 cheers

 andrew



