Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-21 Thread Florian Pflug
On May 21, 2010, at 4:20 , Florian Pflug wrote:
 On May 19, 2010, at 2:15 , Florian Pflug wrote:
 On May 17, 2010, at 3:30 , Robert Haas wrote:
 On Sun, May 16, 2010 at 9:07 PM, Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 22:54 , Robert Haas wrote:
 On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.
 
 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.
 
 After giving this considerable thought and testing the behavior at
 some length, I think the OP has it right.  One thing I sometimes need
 to do is denormalize a copy of a field, e.g.
 
 snipped example
 
 I've whipped up a quick and still rather dirty patch that implements the 
 behavior I proposed, at least for the case of conflicts between FOR UPDATE 
 locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE 
 lock a row that has concurrently been FOR UPDATE locked will cause a 
 serialization error. (The same for an actually updated row of course, but 
 that happened before too).
 
 While this part of the patch was fairly straight forward, make FOR SHARE 
 conflict too seems to be much harder. The assumption that a lock becomes 
 irrelevant after the transaction(s) that held it completely is built 
 deeply into the multi xact machinery that powers SHARE locks. That 
 machinery therefore assumes that once all members of a multi xact have 
 completed the multi xact is dead also. But my proposal depends on a 
 SERIALIZABLE transaction being able to find if any of the lockers of a row 
 are invisible under it's snapshot - for which it'd need any multi xact 
 containing invisible xids to outlive its snapshot.
 
 Thanks for putting this together. I suggest adding it to the open
 CommitFest - even if we decide to go forward with this, I don't
 imagine anyone is going to be excited about changing it during beta.
 
 https://commitfest.postgresql.org/action/commitfest_view/open
 
 
 Will do. Thanks for the link.
 
 Here is an updated version that works for SHARE locks too.
 
 Forgetting to run make check before sending a patch is bad, as I just 
 proved :-(
 
 For the archives' and the commitfest app's sake, here is a version that 
 actually passes the regression tests.
 
 To make up for it, I also did some testing with a custom pgbench script  
 schema and proved the effectiveness of this patch. I ran this with pgbench  
 -s 10 -j 10 -c 10 -t 1000 -n -f fkbench.pgbench on both HEAD and HEAD+patch. 
 The former errors out quickly with database inconsistent while the later 
 completes the pgbench run without errors. 
 
 The patch still needs more work, at least on the comments  documentation 
 side of things, but I'm going to let this rest now while we're in beta.
 
 Patch, pgbench script and schema attached.

Great, now my mail client decided to send encode those attachments with 
MacBinary instead of sending them as plain text :-(

Not sure if MUAs other than Mail.app can open those, so I'm resending this. 
Really sorry for the noise, guys

best regards,
Florian Pflug



serializable_lock_consistency.patch
Description: Binary data


fkbench.init.sql
Description: Binary data


fkbench.pgbench
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-20 Thread Florian Pflug
On May 19, 2010, at 2:15 , Florian Pflug wrote:
 On May 17, 2010, at 3:30 , Robert Haas wrote:
 On Sun, May 16, 2010 at 9:07 PM, Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 22:54 , Robert Haas wrote:
 On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.
 
 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.
 
 After giving this considerable thought and testing the behavior at
 some length, I think the OP has it right.  One thing I sometimes need
 to do is denormalize a copy of a field, e.g.
 
 snipped example
 
 I've whipped up a quick and still rather dirty patch that implements the 
 behavior I proposed, at least for the case of conflicts between FOR UPDATE 
 locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock 
 a row that has concurrently been FOR UPDATE locked will cause a 
 serialization error. (The same for an actually updated row of course, but 
 that happened before too).
 
 While this part of the patch was fairly straight forward, make FOR SHARE 
 conflict too seems to be much harder. The assumption that a lock becomes 
 irrelevant after the transaction(s) that held it completely is built deeply 
 into the multi xact machinery that powers SHARE locks. That machinery 
 therefore assumes that once all members of a multi xact have completed the 
 multi xact is dead also. But my proposal depends on a SERIALIZABLE 
 transaction being able to find if any of the lockers of a row are invisible 
 under it's snapshot - for which it'd need any multi xact containing 
 invisible xids to outlive its snapshot.
 
 Thanks for putting this together. I suggest adding it to the open
 CommitFest - even if we decide to go forward with this, I don't
 imagine anyone is going to be excited about changing it during beta.
 
 https://commitfest.postgresql.org/action/commitfest_view/open
 
 
 Will do. Thanks for the link.
 
 Here is an updated version that works for SHARE locks too.

Forgetting to run make check before sending a patch is bad, as I just proved 
:-(

For the archives' and the commitfest app's sake, here is a version that 
actually passes the regression tests.

To make up for it, I also did some testing with a custom pgbench script  
schema and proved the effectiveness of this patch. I ran this with pgbench  -s 
10 -j 10 -c 10 -t 1000 -n -f fkbench.pgbench on both HEAD and HEAD+patch. The 
former errors out quickly with database inconsistent while the later 
completes the pgbench run without errors. 

The patch still needs more work, at least on the comments  documentation side 
of things, but I'm going to let this rest now while we're in beta.

Patch, pgbench script and schema attached.


serializable_lock_consistency.patch
Description: Binary data


fkbench.init.sql
Description: Binary data


fkbench.pgbench
Description: Binary data


best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-18 Thread Florian Pflug
On May 17, 2010, at 3:30 , Robert Haas wrote:
 On Sun, May 16, 2010 at 9:07 PM, Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 22:54 , Robert Haas wrote:
 On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.
 
 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.
 
 After giving this considerable thought and testing the behavior at
 some length, I think the OP has it right.  One thing I sometimes need
 to do is denormalize a copy of a field, e.g.
 
 snipped example
 
 I've whipped up a quick and still rather dirty patch that implements the 
 behavior I proposed, at least for the case of conflicts between FOR UPDATE 
 locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock 
 a row that has concurrently been FOR UPDATE locked will cause a 
 serialization error. (The same for an actually updated row of course, but 
 that happened before too).
 
 While this part of the patch was fairly straight forward, make FOR SHARE 
 conflict too seems to be much harder. The assumption that a lock becomes 
 irrelevant after the transaction(s) that held it completely is built deeply 
 into the multi xact machinery that powers SHARE locks. That machinery 
 therefore assumes that once all members of a multi xact have completed the 
 multi xact is dead also. But my proposal depends on a SERIALIZABLE 
 transaction being able to find if any of the lockers of a row are invisible 
 under it's snapshot - for which it'd need any multi xact containing 
 invisible xids to outlive its snapshot.
 
 Thanks for putting this together. I suggest adding it to the open
 CommitFest - even if we decide to go forward with this, I don't
 imagine anyone is going to be excited about changing it during beta.
 
 https://commitfest.postgresql.org/action/commitfest_view/open


Will do. Thanks for the link.

Here is an updated version that works for SHARE locks too.

(This message mainly serves as a way to link the updated patch to the commit 
fest entry. Is this how I'm supposed to do that, or am I doing something wrong?)

best regards,
Florian Pflug


serializable_lock_consistency.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-18 Thread Robert Haas
On Tue, May 18, 2010 at 8:15 PM, Florian Pflug f...@phlo.org wrote:
 Will do. Thanks for the link.

 Here is an updated version that works for SHARE locks too.

 (This message mainly serves as a way to link the updated patch to the commit 
 fest entry. Is this how I'm supposed to do that, or am I doing something 
 wrong?)

Yeah - just go to the existing CF entry and say New Comment then
select type Patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-16 Thread Florian Pflug
On May 14, 2010, at 22:54 , Robert Haas wrote:
 On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.
 
 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.
 
 After giving this considerable thought and testing the behavior at
 some length, I think the OP has it right.  One thing I sometimes need
 to do is denormalize a copy of a field, e.g.
 
 snipped example

I've whipped up a quick and still rather dirty patch that implements the 
behavior I proposed, at least for the case of conflicts between FOR UPDATE 
locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a 
row that has concurrently been FOR UPDATE locked will cause a serialization 
error. (The same for an actually updated row of course, but that happened 
before too).

While this part of the patch was fairly straight forward, make FOR SHARE 
conflict too seems to be much harder. The assumption that a lock becomes 
irrelevant after the transaction(s) that held it completely is built deeply 
into the multi xact machinery that powers SHARE locks. That machinery therefore 
assumes that once all members of a multi xact have completed the multi xact is 
dead also. But my proposal depends on a SERIALIZABLE transaction being able to 
find if any of the lockers of a row are invisible under it's snapshot - for 
which it'd need any multi xact containing invisible xids to outlive its 
snapshot.

best regards,
Florian Pflug




serializable_share_lock.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-16 Thread Florian Pflug

On May 14, 2010, at 16:32 , Kevin Grittner wrote:

 Florian Pflug f...@phlo.org wrote:
 
 I must admit that I wasn't able to find an explicit reference to
 Oracle's behavior in their docs, so I had to resort to
 experiments. They do have examples showing how to do FK-like
 constraints with triggers, and those don't contain any warning
 whatsoever about problems in SERIALIZABLE mode, though.  But
 still, if there is word on this from Oracle somewhere, I'd love to
 hear about it.
 
 I suspect that in trying to emulate Oracle on this, you may run into
 an issue which posed challenges for the SSI implementation which
 didn't come up in the Cahill prototype implementations: Oracle, and
 all other MVCC databases I've read about outside of PostgreSQL, use
 an update in place with a rollback log technique.  Access to any
 version of a given row or index entry goes through a single
 location, with possible backtracking through the log after that,
 which simplifies management of certain concurrency issues.  Do they
 perhaps use an in-RAM lock table, pointing to the base location of
 the row for these SELECT FOR UPDATE locks?  (Just guessing; I've
 never used Oracle, myself.)

Thanks for the heads up. I think my proposed doges this, though, since UPDATE 
as well as FOR SHARE and FOR UPDATE already follow the ctid chain to find the 
most recent tuple and fail with a serialization error (within = REPEATABLE 
READ transaction) should this tuple be inaccessible to the transaction's 
snapshot.

Btw, I've just posted a quick-and-dirty patch that implements the parts of my 
proposal that deal with FOR UPDATE vs. UPDATE conflicts in response to Robert 
Haas' mail on this thread, just in case you're interested.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-16 Thread Robert Haas
On Sun, May 16, 2010 at 9:07 PM, Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 22:54 , Robert Haas wrote:
 On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.

 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.

 After giving this considerable thought and testing the behavior at
 some length, I think the OP has it right.  One thing I sometimes need
 to do is denormalize a copy of a field, e.g.

 snipped example

 I've whipped up a quick and still rather dirty patch that implements the 
 behavior I proposed, at least for the case of conflicts between FOR UPDATE 
 locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a 
 row that has concurrently been FOR UPDATE locked will cause a serialization 
 error. (The same for an actually updated row of course, but that happened 
 before too).

 While this part of the patch was fairly straight forward, make FOR SHARE 
 conflict too seems to be much harder. The assumption that a lock becomes 
 irrelevant after the transaction(s) that held it completely is built deeply 
 into the multi xact machinery that powers SHARE locks. That machinery 
 therefore assumes that once all members of a multi xact have completed the 
 multi xact is dead also. But my proposal depends on a SERIALIZABLE 
 transaction being able to find if any of the lockers of a row are invisible 
 under it's snapshot - for which it'd need any multi xact containing invisible 
 xids to outlive its snapshot.

Thanks for putting this together. I suggest adding it to the open
CommitFest - even if we decide to go forward with this, I don't
imagine anyone is going to be excited about changing it during beta.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-15 Thread Rob Wultsch
On Fri, May 14, 2010 at 7:32 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Oracle, and all other MVCC databases I've read about outside of PostgreSQL, 
 use
 an update in place with a rollback log technique.

Have you looked at PBXT (which is explicitly NOT SERIALIZABLE)?

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-15 Thread Kevin Grittner
Rob Wultsch  wrote:
 
 Have you looked at PBXT (which is explicitly NOT SERIALIZABLE)?
 
I hadn't heard of it; so I took a quick look based on your post.  It
seems to a new engine to use with MySQL which has MVCC without a
rollback log, so it presumably uses techniques at least vaguely
similar to PostgreSQL.  Anything in particular you wanted me to
notice about it besides that?  (Of course another MySQL engine which
doesn't provide very strong integrity guarantees isn't exciting to
me as a technology in itself.)
 
-Kevin



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-15 Thread Rob Wultsch
On Sat, May 15, 2010 at 4:09 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
   Anything in particular you wanted me to notice about it besides that?

Nope. It was just a counter point to your previous comment.

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Nicolas Barbier
2010/5/14 Greg Stark gsst...@mit.edu:

 On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:

 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT

 Can you give an actual realistic example -- ie, not doing a select for
 update and then never updating the row or with an explanation of what
 the programmer is attempting to accomplish with such an unusual
 sequence? The rest of the post talks about FKs but I don't see any
 here...

The link with FKs is as follows:

* The example does not use a real FK, because the whole purpose is to
do the same as FKs while not using the FK machinery.
* The example uses only one table, because that is enough to
illustrate the problem (see next items).
* C1 locks a row, supposedly because it wants to create a reference to
it in a non-mentioned table, and wants to prevent the row from being
deleted under it.
* C2 deletes that row (supposedly after it verified that there are no
references to it; it would indeed not be able to see the reference
that C1 created/would create), and C1 fails to detect that.
* C2 also fails to detect the problem, because the lock that C1 held
is being released after C1 commits, and C2 can happily go on deleting
the row.
* The end result is that the hypothetical reference is created,
although the referent is gone.

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug

On May 14, 2010, at 2:37 , Greg Stark wrote:

 On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:
 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT
 
 
 Can you give an actual realistic example -- ie, not doing a select for
 update and then never updating the row or with an explanation of what
 the programmer is attempting to accomplish with such an unusual
 sequence? The rest of the post talks about FKs but I don't see any
 here...

The table t is supposed to represent the parent table of a FK constraint. The 
SELECT FOR UPDATE is done upon an INSERT to the child table to protect the 
parent row against concurrent deletion. I've used FOR UPDATE instead of FOR 
SHARE because I did test this against oracle also, and oracle does not support 
FOR SHARE. 

Here's a full example of a pair of FK triggers in PL/PGSQL that work correctly 
in READ COMMITTED mode but fail to enforce the constraint in SERIALIZABLE mode 
as the following sequence of commands show. With my proposal, the DELETE would 
again raise a serialization error and hence keep the constraint satisfied.

C1: BEGIN
C1: INSERT INTO child (parent_id) VALUES (1) -- Locks the parent row FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT TRUE -- Take snapshot *before* C1 commits
C1: COMMIT
C2: DELETE FROM parent WHERE parent_id = 1 -- Succeeds
C2: COMMIT

--
CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY);
CREATE TABLE child (child_id SERIAL NOT NULL PRIMARY KEY, parent_id INTEGER NOT 
NULL);

CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$
BEGIN
 PERFORM TRUE FROM child WHERE parent_id = OLD.parent_id;
 IF FOUND THEN
   RAISE SQLSTATE '23503' USING MESSAGE = 'Parent ' || OLD.parent_id || ' still 
referenced during ' || TG_OP;
 END IF;
 RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent FOR EACH ROW EXECUTE 
PROCEDURE ri_parent();

CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$
BEGIN
 PERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR UPDATE OF parent;
 IF NOT FOUND THEN
   RAISE SQLSTATE '23503' USING MESSAGE = 'Parent ' || NEW.parent_id || ' does 
not exist during ' || TG_OP;
 END IF;
 RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child FOR EACH ROW EXECUTE 
PROCEDURE ri_child();
--

best regards,

Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
[slight rearrangement]
 
Florian Pflug  wrote:
 
 I'm very exited about the work you're doing
 
Always nice to hear.  :-)
 
 I view my proposal as pretty orthogonal to that work.
 
 My proposal allows for simple FK-like constraints to be
 implemented at user-level that are correct for all isolation
 levels.
 
OK, I can see the attraction in that.
 
 True serializable transaction are much more powerful than what I
 proposed, but at a much higher price too, due to the necessity of
 SIREAD locks.
 
I think that SIREAD locks will generally be cheaper than SELECT FOR
UPDATE, since the former don't require any disk I/O and the latter
do.  I only have one benchmark so far (more on the way), but it
attempts to isolate the cost of acquiring the SIREAD locks by using
a read-only load against a fully cached database.  Benchmarks so far
show the new version of the SERIALIZABLE level as supporting 1.8%
fewer TPS than REPEATABLE READ (the existing snapshot isolation
level) in that environment.  That will probably disappear into the
noise for any load involving disk I/O.
 
Now *rollbacks*, particularly those due to false positives, might
become a more serious issue in some pessimal loads, but I'm still
working on developing meaningful benchmarks for that.
 
I guess what I'm suggesting is that unless you have a very small
database with a very large number of connections in a high
contention workload, or you can't require SERIALIZABLE transaction
isolation level, SSI might actually perform better than what you're
proposing.  Of course, that's all conjecture until there are
benchmarks; but I'd be very interested in getting any and all
alternative solutions like this worked into a benchmark -- where I
can pull out the FOR UPDATE and FOR SHARE clauses, any redundant
updates or denormalizations added just for concurrency issues, and
all explicit locking -- and compare that under SERIALIZABLE to the
original performance.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 True serializable transaction are much more powerful than what I
 proposed, but at a much higher price too, due to the necessity of
 SIREAD locks.
 
 I think that SIREAD locks will generally be cheaper than SELECT FOR
 UPDATE, since the former don't require any disk I/O and the latter
 do.  I only have one benchmark so far (more on the way), but it
 attempts to isolate the cost of acquiring the SIREAD locks by using
 a read-only load against a fully cached database.  Benchmarks so far
 show the new version of the SERIALIZABLE level as supporting 1.8%
 fewer TPS than REPEATABLE READ (the existing snapshot isolation
 level) in that environment.  That will probably disappear into the
 noise for any load involving disk I/O.

I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE 
or FOR UPDATE lock. But the number of SIREAD locks would exceed the number of 
FOR SHARE / FOR UPDATE locks by a few order of magnitude I'd think - at least 
of you ran even transaction under true serializable isolation.

I don't quite understand how SIREAD locks work if they don't involve any disk 
IO, since shared memory isn't resizable. But I guess I'll find out once you 
post the patch ;-)

 I guess what I'm suggesting is that unless you have a very small
 database with a very large number of connections in a high
 contention workload, or you can't require SERIALIZABLE transaction
 isolation level, SSI might actually perform better than what you're
 proposing.

That is entirely possible. However, unless your patch completely removes 
support for snapshot isolation (what is current called SERIALIZABLE), my 
proposal still eliminates the situation that user-level constraints are correct 
in READ COMMITTED and (true) SERIALIZABLE isolation but not in snapshot 
isolation.

Btw, the only user of FOR SHARE locks inside postgres proper are the RI 
triggers, and those do that special crosscheck when called within a 
SERIALIZABLE transactions. I do take this as evidence that the current behavior 
might not be all that useful with serializable transactions...

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 
 I think that SIREAD locks will generally be cheaper than SELECT
 FOR UPDATE, since the former don't require any disk I/O and the
 latter do.
 
 I can see how a single SIREAD lock can potentially be cheaper than
 a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks
 would exceed the number of FOR SHARE / FOR UPDATE locks by a few
 order of magnitude I'd think - at least of you ran even
 transaction under true serializable isolation.
 
 I don't quite understand how SIREAD locks work if they don't
 involve any disk IO, since shared memory isn't resizable.
 
We use a well-worn technique used by many (most?) database products
-- granularity promotion.  This is one of the things which could
cause enough false positives under some loads to cause your
technique to perform better than SSI for those loads.
 
 unless your patch completely removes support for snapshot
 isolation (what is current called SERIALIZABLE)
 
Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
isolation.  We're leaving REPEATABLE READ alone.
 
 my proposal still eliminates the situation that user-level
 constraints are correct in READ COMMITTED and (true) SERIALIZABLE
 isolation but not in snapshot isolation.
 
Agreed.  If someone wants to enforce user-level constraints using
SSI, they will somehow need to ensure that less strict isolation
levels are never used to modify data.  Your approach lifts that
burden.
 
By the way, if you can make this behave in a similar way to Oracle,
especially if the syntax is compatible, I'm sure it will help
promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
2010, I talked briefly with a couple guys from an Oracle shop who
were looking at converting to PostgreSQL, and were very concerned
about not having what you describe.  The techniques required to
ensure integrity in PostgreSQL were not, to put it mildly, appealing
to them.  I suspect that they would be satisfied with *either* SSI
or the change you describe.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 15:54 , Kevin Grittner wrote:
 Florian Pflug f...@phlo.org wrote:
 On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 unless your patch completely removes support for snapshot
 isolation (what is current called SERIALIZABLE)
 
 Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
 isolation.  We're leaving REPEATABLE READ alone.

Ah, yeah, that makes a lot of sense. I kinda had forgotten about REPEATABLE 
READ...

 my proposal still eliminates the situation that user-level
 constraints are correct in READ COMMITTED and (true) SERIALIZABLE
 isolation but not in snapshot isolation.
 
 Agreed.  If someone wants to enforce user-level constraints using
 SSI, they will somehow need to ensure that less strict isolation
 levels are never used to modify data.  Your approach lifts that
 burden.
 
 By the way, if you can make this behave in a similar way to Oracle,
 especially if the syntax is compatible, I'm sure it will help
 promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
 2010, I talked briefly with a couple guys from an Oracle shop who
 were looking at converting to PostgreSQL, and were very concerned
 about not having what you describe.  The techniques required to
 ensure integrity in PostgreSQL were not, to put it mildly, appealing
 to them.  I suspect that they would be satisfied with *either* SSI
 or the change you describe.

My proposal would make SELECT ... FOR UPDATE behave like Oracle does with 
regard to serialization conflicts. SELECT ... FOR SHARE doesn't seem to exist 
on Oracle at all - at least I couldn't find a reference to it in the docs.

The syntax isn't 100% compatible because Oracle seems to expect a list of 
columns after the FOR UPDATE clause, while postgres expects a list of tables.

I must admit that I wasn't able to find an explicit reference to Oracle's 
behavior in their docs, so I had to resort to experiments. They do have 
examples showing how to do FK-like constraints with triggers, and those don't 
contain any warning whatsoever about problems in SERIALIZABLE mode, though. But 
still, if there is word on this from Oracle somewhere, I'd love to hear about 
it.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 I must admit that I wasn't able to find an explicit reference to
 Oracle's behavior in their docs, so I had to resort to
 experiments. They do have examples showing how to do FK-like
 constraints with triggers, and those don't contain any warning
 whatsoever about problems in SERIALIZABLE mode, though.  But
 still, if there is word on this from Oracle somewhere, I'd love to
 hear about it.
 
I suspect that in trying to emulate Oracle on this, you may run into
an issue which posed challenges for the SSI implementation which
didn't come up in the Cahill prototype implementations: Oracle, and
all other MVCC databases I've read about outside of PostgreSQL, use
an update in place with a rollback log technique.  Access to any
version of a given row or index entry goes through a single
location, with possible backtracking through the log after that,
which simplifies management of certain concurrency issues.  Do they
perhaps use an in-RAM lock table, pointing to the base location of
the row for these SELECT FOR UPDATE locks?  (Just guessing; I've
never used Oracle, myself.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 5:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.

 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).  AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.

After giving this considerable thought and testing the behavior at
some length, I think the OP has it right.  One thing I sometimes need
to do is denormalize a copy of a field, e.g.

CREATE TABLE parent (id serial, mode integer not null, primary key (id));
CREATE TABLE child (id serial, parent_id integer not null references
parent (id), parent_mode integer not null);

The way I have typically implemented this in the past is:

1. Add a trigger to the parent table so that, whenever the mode column
gets updated, we do an update on the parent_mode of all children.
2. Add a trigger to the child table so that, when a new child is
inserted, it initializes parent_mode from its parent.  I do SELECT
with FOR UPDATE on the parent parent can't change under me; though FOR
SHARE ought to be enough also since we're just trying to lock out
concurrent updates.

Suppose T1 updates the parent's mode while T2 adds a new child; then
both commit.  In read committed mode, this seems to work OK regardless
of the order of T1 and T2.  If T1 grabs the lock first, then T2 sees
the updated version of the row after T1 commits.  If T2 grabs the lock
first, then the update on the parent blocks until the child commits.
Subsequently, when the trigger fires, it apparently uses an up-to-date
snapshot, so the new child is updated also.  In serializable mode,
things are not so good.  If T1 grabs the lock first, the child waits
to see whether it commits or aborts.  On commit, it complains that it
can't serialize and aborts, which is reasonable - transaction aborts
are the price you pay for serializability.  If T2 grabs the lock
first, the update on the parent blocks as before, but now the update
is done with the old snapshot and ignores the new child, so the new
child now has a value for parent_mode that doesn't match the parent's
actual mode.  That is, you get the wrong answer due to a serialization
anomaly that didn't existed at the read committed level.

Increasing the transaction isolation level is supposed to *eliminate*
serialization anomalies, not create them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.

I don't see an argument for doing that for FOR SHARE locks, and it
already happens for FOR UPDATE (at least if the row actually gets
updated).  AFAICS this proposal mainly breaks things, in pursuit of
an unnecessary and probably-impossible-anyway goal of making FK locking
work with only user-level snapshots.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 All in all, I believe that SHARE and UPDATE row-level locks should
 be changed to cause concurrent UPDATEs to fail with a
 serialization error. I can come up with a patch that does that,
 but I wanted to get some feedback on the idea before I put the
 work in.
 
Before you work on that, you might want to wait until you can review
the work that I and Dan Ports (a Ph.D. candidate from MIT) have been
doing on support for true serializable transactions.  You don't need
to use FOR SHARE or FOR UPDATE or any explicit locks as long as the
concurrent transactions are SERIALIZABLE.  We have it working, but
have been holding off on discussion or patch submission at Tom's
request -- he felt it would distract from the process of getting the
release out.
 
Whenever people are ready, I can submit a WIP patch.  All issues
discuss on this thread Just Work with the patch applied.  There's
a Wiki page and a public git repository related to this work, for
anyone who is interested and not busy with release work.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Florian Pflug
On May 13, 2010, at 23:39 , Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
 All in all, I believe that SHARE and UPDATE row-level locks should be
 changed to cause concurrent UPDATEs to fail with a serialization
 error.
 
 I don't see an argument for doing that for FOR SHARE locks, and it
 already happens for FOR UPDATE (at least if the row actually gets
 updated).

Yes, actually updating the row is a workaround. A prohibitively expensive one, 
though.

The arguments are as stated

a) SHARE or UPDATE locking a concurrently updated row *does* cause as 
serialization error, making the current behavior asymmetric

b) Locking primitives usually ensure that once you obtain the lock you see the 
most recent version of the data. This is currently true for READ COMMITTED 
transactions but not for SERIALIZABLE ones, and pretty undesirable a behavior 
for a locking primitive.

c) I fail to see how the current behavior is useful in the presence of 
SERIALIZABLE transactions. Currently, they could IMHO completely ignore FOR 
SHARE locks, without making any previously correct algorithm incorrect.

plus a weaker one:

d) Oracle does it for FOR UPDATE locks, and actually has an example of a FK 
trigger in PL/SQL in their docs.

 AFAICS this proposal mainly breaks things, in pursuit of
 an unnecessary and probably-impossible-anyway goal of making FK locking
 work with only user-level snapshots.

I don't see the breakage this'd cause. For READ COMMITTED transactions nothing 
changes. For SERIALIZABLE transactions the behavior of FOR UPDATE / FOR SHARE 
becomes much easier to grasp. In both cases a SHARE lock would then say Only 
update this row if you have seen the locking transaction's changes.

Why do you think that making FK locking work with only user-level snapshots is 
probably-impossible-anyway? With the proposed changes, simply FOR SHARE locking 
the parent row on INSERT/UPDATE of the child, plus checking for child rows on 
UPDATE/DELETE of the parent gives a 100% correct FK trigger.

I do not have a formal proof for that last assertion, but I'm not aware of any 
counter-examples either. Would love to hear of any, though.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Florian Pflug

On May 13, 2010, at 23:51 , Kevin Grittner wrote:

 Florian Pflug f...@phlo.org wrote:
 
 All in all, I believe that SHARE and UPDATE row-level locks should
 be changed to cause concurrent UPDATEs to fail with a
 serialization error. I can come up with a patch that does that,
 but I wanted to get some feedback on the idea before I put the
 work in.
 
 Before you work on that, you might want to wait until you can review
 the work that I and Dan Ports (a Ph.D. candidate from MIT) have been
 doing on support for true serializable transactions.  You don't need
 to use FOR SHARE or FOR UPDATE or any explicit locks as long as the
 concurrent transactions are SERIALIZABLE.  We have it working, but
 have been holding off on discussion or patch submission at Tom's
 request -- he felt it would distract from the process of getting the
 release out.

I'm very exited about the work you're doing there, and believe it'd be a great 
feature to have.

However, I view my proposal as pretty orthogonal to that work. True 
serializable transaction are much more powerful than what I proposed, but at a 
much higher price too, due to the necessity of SIREAD locks. My proposal allows 
for simple FK-like constraints to be implemented at user-level that are correct 
for all isolation levels.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Greg Stark
On Thu, May 13, 2010 at 10:25 PM, Florian Pflug f...@phlo.org wrote:
 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT


Can you give an actual realistic example -- ie, not doing a select for
update and then never updating the row or with an explanation of what
the programmer is attempting to accomplish with such an unusual
sequence? The rest of the post talks about FKs but I don't see any
here...

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks SERIALIZABLE transactions, postgres vs. Oracle

2010-05-13 Thread Anssi Kääriäinen

On 05/14/2010 03:37 AM, Greg Stark wrote:
 On Thu, May 13, 2010 at 10:25 PM, Florian Pflugf...@phlo.org  wrote:
 C1: BEGIN
 C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
 C2: BEGIN
 C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 C2: SELECT * FROM t -- Take snapshot before C1 commits
 C1: COMMIT
 C2: DELETE FROM t WHERE id = 1
 C2: COMMIT


 Can you give an actual realistic example -- ie, not doing a select for
 update and then never updating the row or with an explanation of what
 the programmer is attempting to accomplish with such an unusual
 sequence? The rest of the post talks about FKs but I don't see any
 here...


Doing a select for update and then never updating the row is a realistic 
example.


I am currently designing a database where this is an issue. The 
simplified schema to illustrate the problem:


create table object (
   id integer primary key
);

insert into object values(1);

create table attribute (
   object_id integer not null references object,
   attr_type integer not null, -- references attr_types
   value text not null,
   valid_from timestamp not null,
   valid_until timestamp
);

Now, I want to make sure there are no pairs of (object_id, attr_type) 
where the valid_from, valid_until times overlap.


A problematic sequence for this schema, both transactions in isolation 
level serializable:



C1: begin;
C1: select * from object where id = 1 for update;
-- check for conflicting attr_type, realistically where condition should 
have overlapping check, but left out for simplicity...

C1: select * from attribute where object_id = 1 and attr_type = 1;
-- Ok, nothing overlapping, I am able to insert.
C1: insert into attribute values (1, 1, 'Anssi', now(), null);
C2: begin;
-- This blocks.
C2: select * from object where id = 1 for update;
C1: commit;
-- Check for conflicts. This select won't see the insert C1 did.
C2: select * from attribute where object_id = 1 and attr_type = 1;
-- C2 doesn't see anything conflicting
C2: insert into attribute values (1, 1, 'Matti', now(), null);
C2: commit;
-- Inconsistency.

Now, that same sequence does work for read committed isolation level (C2 
sees the insert of C1), and that is my solution for now: require 
applications to use read committed isolation level. This could also be 
solved by issuing update object set id = id where id = 1 instead of 
using select for update. This would result in serialization error.


I know that for this particular example the upcoming exclusion 
constraints would solve the problem. But if I would want to ensure that 
if attr_value for attr_type 1 is 'Anssi' then attr_value for attr_type 2 
is 'Kääriäinen', then exclusion constraints could not be used.


--
Anssi Kääriäinen

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers